Ollama Setup Guide
Enable local AI features in Fiavaion products by installing Ollama. Your data stays on your device—no cloud processing, no API fees, complete privacy.
What You'll Need
Operating System
Windows 10/11, macOS 11+, or Linux
Memory (RAM)
8GB minimum, 16GB+ recommended
Disk Space
5GB minimum for basic models
GPU (Optional)
NVIDIA GPU with 6GB+ VRAM for faster processing
Installation
Select your operating system:
Download Ollama
Visit the official Ollama website and download the Windows installer.
Run the Installer
Double-click the downloaded OllamaSetup.exe file and follow the installation wizard. Accept the default settings.
Open Command Prompt
- Press the Windows key on your keyboard (the key with the Windows logo, usually bottom-left)
- Type cmd
- Click Command Prompt when it appears in the search results
- A black window with white text will open — this is the Command Prompt
You'll type commands into this window. After typing each command, press Enter to run it.
Verify Installation
In the Command Prompt window, type (or copy and paste) the following command and press Enter:
ollama --version
You should see a version number like ollama version 0.5.x. This confirms Ollama installed correctly.
Download Your First Model
Still in the Command Prompt, type (or copy and paste) this command and press Enter:
ollama pull qwen3:8b-q4_K_M
You'll see a progress bar; wait for it to reach 100%. This downloads a 5GB model and may take a few minutes depending on your internet speed. When it says "success", the model is ready.
Test the Model
Type this command and press Enter to verify everything works:
ollama run qwen3:8b-q4_K_M "Hello, are you working?"
If you see a text response, Ollama is ready! The command prints its answer and exits on its own, so you can then close the Command Prompt window.
Ollama Runs in Background
After installation, Ollama runs automatically as a background service. Look for the llama icon in your system tray (bottom-right of taskbar). Our apps connect to it at localhost:11434.
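You can also confirm the background service is answering from a script rather than the system tray. A minimal sketch in Python, assuming Ollama's default address of localhost:11434 (the helper name is ours, not part of Ollama):

```python
import urllib.request
import urllib.error

def ollama_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the local Ollama service answers at its default port."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            # A healthy server answers the root URL with HTTP 200
            # and the plain-text body "Ollama is running".
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the service is not reachable.
        return False

print("Ollama detected" if ollama_running() else "Ollama not detected")
```

The check fails fast (two-second timeout), so it is safe to run even when Ollama is not installed.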
Download Ollama
Visit the official Ollama website and download the macOS app.
Install the App
- Open the downloaded Ollama-darwin.zip file
- Drag Ollama.app to your Applications folder
- Double-click Ollama in Applications to launch it
- If prompted, click "Open" to allow the app from an identified developer
Allow Background Access
When Ollama first runs, it may ask for permission to run in the background. Click Allow to let it serve AI requests.
You'll see a llama icon appear in your menu bar (top-right).
Open Terminal
- Press Cmd+Space on your keyboard to open Spotlight search
- Type Terminal
- Press Enter (or click Terminal when it appears)
- A window with text will open — this is the Terminal
You'll type commands into this window. After typing each command, press Enter to run it.
Download a Model
In the Terminal window, type (or copy and paste) this command and press Enter:
ollama pull qwen3:8b-q4_K_M
You'll see a progress bar; wait for it to reach 100%. This downloads the recommended 5GB model. When it says "success", the model is ready.
Test the Model
Type this command and press Enter to verify everything works:
ollama run qwen3:8b-q4_K_M "Hello, are you working?"
If you see a text response, you're all set! The command prints its answer and exits on its own, so you can then close the Terminal window.
Mac Performance Tips
Apple Silicon (M-series): Excellent performance. The GPU is used automatically. 16GB+ unified memory recommended for larger models.
Intel Macs: Good performance on CPU. Stick to smaller models like gemma3:4b or phi3:mini for best results.
Alternative: Install via Homebrew
If you prefer Homebrew, run:
brew install ollama
ollama serve &
ollama pull qwen3:8b-q4_K_M
Install with One Command
Open your terminal and run the official install script:
curl -fsSL https://ollama.com/install.sh | sh
This automatically detects your system and installs Ollama.
Start the Ollama Service
The installer usually starts Ollama automatically. If not, run:
ollama serve
Or, to run it as a systemd service:
sudo systemctl enable ollama
sudo systemctl start ollama
Download a Model
ollama pull qwen3:8b-q4_K_M
Test the Model
ollama run qwen3:8b-q4_K_M "Hello, are you working?"
Recommended Models
Select your GPU's VRAM to see which models work best for your hardware. Apple Silicon users: match your unified memory to these tiers. The tiers below are ordered from the smallest hardware to the largest.
- gemma3:4b (3GB download, Recommended): Fast responses for basic tasks
- phi3:mini (2.3GB download, Optional): Compact and efficient alternative
Install Commands:
ollama pull gemma3:4b
ollama pull phi3:mini

- qwen3:8b-q4_K_M (5GB download, Recommended): Best JSON compliance, recommended default
- llava (4.7GB download, Optional): Image descriptions and diagram understanding
Install Commands:
ollama pull qwen3:8b-q4_K_M
ollama pull llava

- qwen3:8b-q4_K_M (5GB download, Recommended): Best JSON compliance, recommended default
- llava (4.7GB download, Recommended): Image descriptions and diagram understanding
- mistral:7b (4.1GB download, Optional): Creative writing and detailed analysis
Install Commands:
ollama pull qwen3:8b-q4_K_M
ollama pull llava
ollama pull mistral:7b

- llama3.1:8b (4.7GB download, Recommended): High quality default for all tasks
- llava:13b (8GB download, Recommended): Superior image understanding and descriptions
- gemma2:9b (5.4GB download, Optional): Strong reasoning and comprehension
Install Commands:
ollama pull llama3.1:8b
ollama pull llava:13b
ollama pull gemma2:9b

- llama3.1:13b (7.4GB download, Recommended): Excellent quality for all tasks
- llava:13b (8GB download, Recommended): Superior image understanding
- deepseek-r1:14b (8.5GB download, Optional): Advanced reasoning and problem solving
Install Commands:
ollama pull llama3.1:13b
ollama pull llava:13b
ollama pull deepseek-r1:14b

- mixtral:8x7b (26GB download, Recommended): State-of-the-art mixture-of-experts model
- llama3.1:13b (7.4GB download, Recommended): High quality, fast responses
- llava:34b (20GB download, Optional): Best-in-class image understanding
Install Commands:
ollama pull mixtral:8x7b
ollama pull llama3.1:13b
ollama pull llava:34b

Windows: Open Task Manager → Performance → GPU to see dedicated GPU memory.
Mac: Apple menu → About This Mac shows your total unified memory; subtract 4-8GB for macOS and apps to estimate what's available for models.
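To turn those numbers into a quick estimate, here is a tiny sketch that applies the guide's own rule of thumb (subtract 4-8GB of OS and app overhead from total memory). The 6GB default is simply the midpoint of that range, not a measured value:

```python
def usable_model_memory_gb(total_gb: float, overhead_gb: float = 6.0) -> float:
    """Estimate memory left for models after OS and apps.

    overhead_gb defaults to 6, the midpoint of the 4-8 GB range
    suggested above; adjust it for your own system.
    """
    return max(total_gb - overhead_gb, 0.0)

# A 16 GB machine leaves roughly 10 GB for models under this estimate,
# comfortably fitting the 7-8 GB tier of models above.
print(usable_model_memory_gb(16))
```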
Verify Connection
Fiavaion apps automatically detect Ollama. Here's how to verify it's running:
Check in Browser
Open http://localhost:11434 in your browser. You should see "Ollama is running".
Check in Terminal
Run ollama list to see your installed models.
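The same list is available over HTTP if you'd rather check from a script. A minimal sketch, assuming Ollama's /api/tags endpoint on the default port (the helper returns None when the server isn't up, so it is safe to run either way):

```python
import json
import urllib.request
import urllib.error

def installed_models(base_url: str = "http://localhost:11434"):
    """Return a list of installed model names, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=2) as resp:
            data = json.load(resp)
        # The endpoint returns {"models": [{"name": ...}, ...]}.
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = installed_models()
print(models if models is not None else "Ollama not reachable")
```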
Check in Our Apps
Open any Fiavaion app with AI features. It will show a green indicator if Ollama is connected.
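Apps talk to Ollama over the same local HTTP interface any client can use. As a rough sketch of what a single request looks like, assuming Ollama's /api/generate endpoint with streaming disabled (the function name and error handling are ours; it returns None when the server or model is unavailable):

```python
import json
import urllib.request
import urllib.error

def generate(prompt: str, model: str = "qwen3:8b-q4_K_M",
             base_url: str = "http://localhost:11434"):
    """Send one prompt to a local model; return its reply text or None on failure."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(base_url + "/api/generate", data=payload,
                                 headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            # With "stream": False the reply arrives as one JSON object
            # whose "response" field holds the generated text.
            return json.load(resp).get("response")
    except (urllib.error.URLError, OSError):
        return None

reply = generate("Hello, are you working?")
print(reply if reply is not None else "Ollama not reachable")
```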
Troubleshooting
"Ollama not detected" error
- Make sure Ollama is running (check for the llama icon in system tray/menu bar)
- Try restarting Ollama
- Verify localhost:11434 is accessible
- Check if a firewall is blocking the connection
Slow responses
- Try a smaller model: ollama pull phi3:mini
- Close other memory-intensive applications
- If using CPU only, responses will be slower than with a GPU
- Check available RAM with Task Manager (Windows) or Activity Monitor (Mac)
Model download fails
- Check your internet connection
- Ensure you have enough disk space (models are 2-8GB each)
- Try a smaller model first: ollama pull phi3:mini
- If behind a proxy, configure it in your terminal environment
Out of memory errors
- Use a smaller model (3b or mini variants)
- Close other applications to free up RAM
- Restart Ollama to clear any stuck processes
- Consider upgrading RAM if you frequently hit limits
macOS: "Cannot be opened" error
- Right-click (or Ctrl+click) the Ollama app
- Select "Open" from the context menu
- Click "Open" in the dialog that appears
- This only needs to be done once
Windows: Ollama not starting
- Check Windows Services (services.msc) for "Ollama"
- Right-click and select "Start" if it's stopped
- Try reinstalling Ollama
- Run as Administrator if needed
Your Data Stays Local
With Ollama, all AI processing happens on your computer. Your documents, text, and images are never sent to any server, which helps you meet privacy requirements for:
- Student work (FERPA)
- Healthcare data (HIPAA)
- Personal documents (GDPR)
- Sensitive business information