
Download Models

Step 2 — Choose and install your first local AI models.

Now that Ollama is installed, it’s time to download the models your agent will use. These models run entirely on your machine — no cloud, no API keys, no external dependencies.

Download Your First Models

The next step is choosing the models that will power your local‑first AI experience. This page helps you pick the right models for your hardware and your goals, whether you're exploring Llama 3 for reasoning, Mistral for balanced performance, or Phi for ultra‑fast experimentation.

All models listed here come from the official Ollama Model Library, which provides trusted, optimized versions of today’s leading open‑source LLMs.


Recommended Models

These models are widely used across the local AI community and offer a strong balance of speed, reasoning, and memory usage. They’re ideal for benchmarking, early agent development, and understanding how different architectures behave on your hardware.

Model      Strengths                                Size
llama3     General reasoning, coding, writing       ~4–8 GB
qwen       Fast, efficient, excellent for tool‑use  ~2–7 GB
mistral    Balanced performance, strong reasoning   ~4–7 GB
phi        Very small, extremely fast               ~1–2 GB
deepseek   High‑performance reasoning               ~7–10 GB

You don’t need all of them — but downloading at least two gives you a meaningful comparison when you begin benchmarking and building your first agent.


Download Commands

Every model in Ollama is installed with a single command. Run any of these in your terminal:

ollama pull llama3
ollama pull qwen
ollama pull mistral
ollama pull phi
ollama pull deepseek

Ollama will download the model, verify it, and store it locally. Once downloaded, models load instantly — even offline.
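If you plan to grab several models, a small shell loop saves repeated typing. This is a sketch, not part of Ollama itself: the `MODELS` list and the `pull_models` helper are hypothetical names, and the script assumes `ollama` is on your PATH (it skips the pulls if it is not).

```shell
#!/bin/sh
# Hypothetical helper: pull a starter set of models in one pass.
# Edit MODELS to taste; names come from the table above.
MODELS="llama3 phi"

pull_models() {
  for m in $MODELS; do
    echo "Pulling $m ..."
    ollama pull "$m" || echo "Failed to pull $m"
  done
}

# Only run the pulls if ollama is actually installed.
if command -v ollama >/dev/null 2>&1; then
  pull_models
fi
```

Pulls are idempotent: re-running the loop on an already-downloaded model only re-verifies it rather than downloading it again.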


Verify Your Models

To confirm which models are installed, list them with:

ollama list

You should see output similar to:

NAME      SIZE
llama3    4.1 GB
phi       1.2 GB

If a model appears in this list, it’s ready to run immediately.
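In scripts, you can turn that check into a yes/no answer by parsing the first column of `ollama list`. The `have_model` function below is a hypothetical helper, not an Ollama command; it assumes the model name appears at the start of the first column (tags like `llama3:latest` still match).

```shell
#!/bin/sh
# Hypothetical check: is a given model present in `ollama list` output?
# Skips the header row, then matches the start of the NAME column.
have_model() {
  ollama list 2>/dev/null | awk 'NR > 1 {print $1}' | grep -q "^$1"
}

if have_model llama3; then
  echo "llama3 is installed"
else
  echo "llama3 is missing - run: ollama pull llama3"
fi
```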


Choosing the Right Model

Different models shine on different hardware. Here’s a quick guide to help you choose the best fit:

If you have 8 GB RAM or less:

  • phi — extremely fast and lightweight
  • qwen — efficient and great for tool‑use

If you have 16 GB RAM:

  • mistral — balanced performance
  • llama3 (smaller variants) — strong reasoning

If you have 32 GB+ RAM:

  • llama3 (full) — excellent general reasoning
  • deepseek — high‑performance reasoning
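The guide above can be collapsed into a tiny helper if you want a script to suggest a starting model. `recommend_model` is a hypothetical function, and the RAM thresholds simply mirror the tiers listed above; they are rough heuristics, not hard limits.

```shell
#!/bin/sh
# Hypothetical helper: suggest a starting model from installed RAM (in GB).
# Thresholds follow the hardware guide above.
recommend_model() {
  ram_gb=$1
  if [ "$ram_gb" -le 8 ]; then
    echo "phi"        # lightweight tier
  elif [ "$ram_gb" -le 16 ]; then
    echo "mistral"    # balanced tier
  else
    echo "llama3"     # full-size tier
  fi
}

recommend_model 16   # prints: mistral
```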

For deeper comparisons, the open‑source community maintains excellent benchmarks on Hugging Face’s Open LLM Leaderboard.


Troubleshooting

Download is slow

  • Try again — Ollama uses a global CDN and speeds vary
  • Switch networks if possible

“Error pulling model”

  • Restart Ollama: ollama serve
  • Restart your machine

Disk space issues

  • Models can be 2–10 GB each
  • Delete unused models: ollama rm modelname
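To see how much disk the model store is using before deleting anything, you can measure its directory directly. This sketch assumes the default store location (`~/.ollama/models` on Linux and macOS); if you have set the `OLLAMA_MODELS` environment variable, that path is used instead.

```shell
#!/bin/sh
# Measure disk usage of the local model store.
# Assumes the default ~/.ollama/models location unless OLLAMA_MODELS is set.
MODEL_DIR="${OLLAMA_MODELS:-$HOME/.ollama/models}"

if [ -d "$MODEL_DIR" ]; then
  du -sh "$MODEL_DIR"
else
  echo "No model directory found at $MODEL_DIR"
fi
```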

Next Step
Benchmark Your Models →