github github
stars 207
issues 8
subscribers 12
forks 3



4 months ago

🦙 nvim-llama

Ollama interfaces for Neovim: get up and running with large language models locally in Neovim.

🏗️ 👷 Warning! Under active development!! 👷 🚧


Docker is required to use nvim-llama.

And that's it! All models and clients run from within Docker to provide chat interfaces and functionality. This is an agnostic approach that works for MacOS, Linux, and Windows.


Use your favorite package manager to install the plugin:


use 'jpmcb/nvim-llama'




Plug 'jpmcb/nvim-llama'

Setup & configuration

In your init.vim, setup the plugin:

require('nvim-llama').setup {}

You can provide the following optional configuration table to the setup function:

local defaults = {
    -- See plugin debugging logs
    debug = false,

    -- The model for ollama to use. This model will be automatically downloaded.
    model = llama2,

Model library

Ollama supports an incredible number of open-source models available on

Check out their docs to learn more:

When setting the model setting, the specified model will be automatically downloaded:

Model Parameters Size Model setting
Neural Chat 7B 4.1GB model = neural-chat
Starling 7B 4.1GB model = starling-lm
Mistral 7B 4.1GB model = mistral
Llama 2 7B 3.8GB model = llama2
Code Llama 7B 3.8GB model = codellama
Llama 2 Uncensored 7B 3.8GB model = llama2-uncensored
Llama 2 13B 13B 7.3GB model = llama2:13b
Llama 2 70B 70B 39GB model = llama2:70b
Orca Mini 3B 1.9GB model = orca-mini
Vicuna 7B 3.8GB model = vicuna

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models. 70B parameter models require upwards of 64 GB of ram (if not more).


The :Llama autocommand opens a Terminal window where you can start chatting with your LLM.

To exit Terminal mode, which by default locks the focus to the terminal buffer, use the bindings Ctrl-\ Ctrl-n