AI Features & Privacy
Learn how AssisT handles AI processing with local LLMs and optional cloud APIs while keeping your data private.
Overview
AssisT uses a privacy-first hybrid AI system that prioritizes local processing while offering optional cloud capabilities. Your data stays on your device by default.
Key Principles
- Local First: AI runs on your computer using Ollama—no internet required
- No Data Collection: We never see, store, or transmit your data
- Optional Cloud: Use your own API key if you want cloud AI capabilities
- Graceful Fallback: Features work even without AI (with reduced functionality)
Local AI with Ollama
AssisT integrates with Ollama, a free, open-source tool that runs AI models directly on your computer.
Why Local AI?
| Benefit | Description |
|---|---|
| Privacy | Data never leaves your device |
| Compliance | Safe for GDPR, FERPA, and HIPAA environments |
| No Cost | No API fees or subscriptions |
| Offline | Works without internet connection |
| Speed | No network latency for requests |
Supported Models
AssisT automatically detects and uses available Ollama models:
| Model | Size | Best For |
|---|---|---|
| phi3:mini | 2GB | Fast responses, basic tasks |
| llama3.2 | 5GB | Balanced performance |
| mistral | 4GB | Complex analysis, detailed responses |
| llava | 4GB | Image understanding (vision) |
Installing Ollama
- Download Ollama from ollama.ai
- Install and run Ollama on your computer
- AssisT will automatically detect it
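Detection is nothing more than a request to Ollama's local HTTP API. A minimal sketch, assuming Ollama's default port (11434) and its standard /api/tags endpoint; the detectOllama helper is illustrative, not AssisT's actual code:

```typescript
// Minimal sketch: detect a local Ollama instance by querying its /api/tags endpoint.
// The port and endpoint are Ollama defaults; the function name is illustrative only.
async function detectOllama(): Promise<string[] | null> {
  try {
    const res = await fetch("http://localhost:11434/api/tags");
    if (!res.ok) return null;
    const data = await res.json();
    // Ollama responds with { models: [{ name: "phi3:mini", ... }, ...] }
    return data.models.map((m: { name: string }) => m.name);
  } catch {
    // Connection refused: Ollama is not installed or not running.
    return null;
  }
}
```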
Installing Models
Once Ollama is running, you can install models directly from AssisT:
- Open AssisT settings
- Go to AI Settings > Local Models
- Click Install next to your preferred model
- Wait for the download to complete
Recommended Model Sets:
- Minimal (2GB): phi3:mini - Fast responses for basic tasks
- Balanced (5GB): phi3:mini + llama3.2 - Good for most users
- Full (10GB): All models including vision - Complete AI capabilities
AssisT will recommend a model set based on your system’s available memory.
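Behind the Install button, a model download amounts to a request to Ollama's pull API. A rough sketch of the idea, assuming the standard /api/pull endpoint with streaming progress; the pullModel helper is hypothetical, not AssisT's actual code:

```typescript
// Hypothetical helper: ask the local Ollama server to download a model.
// /api/pull streams newline-delimited JSON progress objects until it reports success.
async function pullModel(model: string): Promise<void> {
  const res = await fetch("http://localhost:11434/api/pull", {
    method: "POST",
    body: JSON.stringify({ model }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk holds progress lines such as {"status":"pulling ...","completed":123,"total":456}
    console.log(decoder.decode(value, { stream: true }));
  }
}
```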
How Local AI Works
Your Browser (AssisT)
↓
Message Bridge
↓
Ollama (localhost:11434)
↓
AI Response
↓
Back to AssisT
All communication happens locally on your machine. Nothing is sent to external servers.
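For example, a summarization request is a single round trip to Ollama's generate endpoint on localhost. A minimal sketch, assuming the standard /api/generate endpoint with streaming disabled; the function name and prompt are illustrative:

```typescript
// Sketch of one local inference round trip against Ollama's /api/generate endpoint.
// With stream: false, Ollama returns a single JSON object whose "response" field holds the text.
async function summarizeLocally(text: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify({
      model: "phi3:mini",
      prompt: `Summarize the following text in three sentences:\n\n${text}`,
      stream: false,
    }),
  });
  const data = await res.json();
  return data.response; // nothing in this exchange leaves localhost
}
```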
Cloud Providers (Optional)
For users who want more powerful AI capabilities, AssisT supports multiple cloud providers through API keys you provide.
Supported Providers
| Provider | Strengths | Best For |
|---|---|---|
| Anthropic (Claude) | Coding, academic writing, analysis | Text simplification, tutoring |
| OpenAI (ChatGPT) | Creative, conversational | Brainstorming, general tasks |
| Google (Gemini) | Multimodal, visual, factual | Image understanding |
| Perplexity | Real-time web, citations | Research, fact-checking |
Bringing Your Own API Key
- Get an API key from your preferred provider:
  - Anthropic Console (Claude)
  - OpenAI Platform (ChatGPT)
  - Google AI Studio (Gemini)
  - Perplexity Settings
- Open AssisT settings
- Go to AI Settings > Cloud Providers
- Select your provider and enter your API key
- Choose your preferred model
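When you then use a cloud feature, the request goes straight from your browser to the provider with your key attached. A hedged sketch using Anthropic's Messages API as an example; the askClaude helper and the model alias are illustrative, not AssisT's internal code:

```typescript
// Illustrative only: a direct call to Anthropic's Messages API with a user-supplied key.
// The key travels straight from the browser to api.anthropic.com; the model name is an example.
async function askClaude(apiKey: string, prompt: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model: "claude-3-5-haiku-latest",
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return data.content[0].text; // the Messages API returns an array of content blocks
}
```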
Cost vs Quality
| Model Type | Cost | Best For |
|---|---|---|
| Fast (Haiku, GPT-4o-mini, Flash) | Cheaper per token | Simple tasks, high volume |
| Balanced (Sonnet, GPT-4o, Pro) | Moderate | Most use cases |
| Quality (Opus, GPT-4) | Higher per token | Complex tasks, accuracy critical |
Tip: Start with faster models for simple tasks. Use larger models when you need more nuanced or accurate responses.
API Key Security
- Your API keys are stored locally in Chrome’s secure storage
- They are never sent to Fiavaion servers
- Only transmitted directly to the provider when you use cloud features
- You can remove them anytime from settings
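Conceptually, key handling looks like the sketch below, assuming Chrome's extension storage API (chrome.storage) and its typings; the storage key name and helper functions are hypothetical:

```typescript
// Illustrative sketch, not AssisT's actual code: store and remove a provider key
// using Chrome's extension storage API. The key never leaves the device here.
const KEY = "assist.cloudApiKey"; // hypothetical storage key name

async function saveApiKey(apiKey: string): Promise<void> {
  await chrome.storage.local.set({ [KEY]: apiKey });
}

async function removeApiKey(): Promise<void> {
  await chrome.storage.local.remove(KEY);
}
```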
How the Hybrid System Works
AssisT intelligently routes requests based on availability:
Feature Request
↓
Is Ollama available?
├─ YES → Use local AI (privacy preserved)
└─ NO → Is cloud API configured?
├─ YES → Use cloud AI (with your key)
└─ NO → Use fallback behavior
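The same decision expressed as code, purely as an illustration of the routing order; the function and its availability check are not AssisT's actual implementation:

```typescript
// Conceptual sketch of the routing decision described above; names are illustrative.
type Route = "local" | "cloud" | "fallback";

async function chooseRoute(cloudKeyConfigured: boolean): Promise<Route> {
  const ollamaUp = await fetch("http://localhost:11434/api/tags")
    .then((r) => r.ok)
    .catch(() => false);
  if (ollamaUp) return "local";           // privacy preserved
  if (cloudKeyConfigured) return "cloud"; // uses your own API key
  return "fallback";                      // reduced, non-AI behavior
}
```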
Task-Based Model Selection
Different features use the best available model for optimal results:
| Feature | Local Model | Recommended Cloud |
|---|---|---|
| Summarization | phi3:mini, llama3.2 | Any fast model |
| Text Simplification | llama3.2, mistral | Anthropic (clarity) |
| Socratic Tutor | mistral | Anthropic (reasoning) |
| Image Understanding | llava | Gemini or GPT-4o |
| Research & Citations | — | Perplexity (web access) |
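A simple way to picture this is a preference list per feature that falls back through whatever models are installed, roughly like the sketch below; the mapping and helper are illustrative only:

```typescript
// Illustrative mapping of features to preferred local models, mirroring the table above.
// AssisT's real selection logic may differ; this only shows the idea.
const localModelPreferences: Record<string, string[]> = {
  summarize: ["phi3:mini", "llama3.2"],
  simplify: ["llama3.2", "mistral"],
  socraticTutor: ["mistral"],
  imageUnderstanding: ["llava"],
};

function pickLocalModel(feature: string, installed: string[]): string | undefined {
  // Return the first preferred model that is actually installed.
  return (localModelPreferences[feature] ?? []).find((m) => installed.includes(m));
}
```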
Fallback Behaviors
When AI isn’t available, features gracefully degrade:
| Feature | Fallback Behavior |
|---|---|
| Summarize | Shows first paragraph |
| Simplify | Feature disabled with message |
| Image Describe | Requires vision model |
| TTS Prosody | Uses neutral tone |
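As a concrete example, the Summarize fallback can be as simple as returning the first paragraph of the text; the helper below is an illustrative sketch, not AssisT's actual code:

```typescript
// Sketch of the "first paragraph" fallback used when no AI backend is available.
function fallbackSummary(text: string): string {
  const firstParagraph = text.split(/\n\s*\n/)[0] ?? "";
  return firstParagraph.trim();
}
```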
Privacy Guarantees
What We Never Do
- Collect or store your data
- Send data to our servers
- Track your AI usage
- Share information with third parties
What Stays Local
- All text you process
- Documents you summarize
- Images you analyze
- Conversation history
GDPR/FERPA/HIPAA Compliance
Because AssisT processes everything locally:
- GDPR: No personal data is transmitted
- FERPA: Student data stays on the device
- HIPAA: Patient information never leaves the browser
This makes AssisT safe for educational institutions and healthcare settings.
Performance Tips
For Best Local AI Performance
- Use an SSD: Faster model loading
- 8GB+ RAM/VRAM: Required for larger models
- Keep Ollama Running: Faster first response
- Choose Appropriate Models: Match model size to your hardware
Why Memory Matters
- More VRAM = Better Models: With more video memory (or unified memory on Apple Silicon), you can run larger, more capable models
- More Memory = Longer Context: Additional memory allows longer context windows—the AI can “remember” more of your document
- Longer Context = Fewer Hallucinations: When AI sees more context, it makes fewer mistakes because it has more information to work with
Memory Types
| Type | What Matters | Notes |
|---|---|---|
| Dedicated GPU | VRAM (8GB good, 12GB+ great) | NVIDIA/AMD graphics cards |
| Apple Silicon | Unified memory (16GB good, 32GB+ excellent) | M1/M2/M3/M4 Macs |
| CPU-only | System RAM (16GB min, 32GB recommended) | Slower but works |
Recommended System Requirements
| Setup | RAM/VRAM | Storage | Models |
|---|---|---|---|
| Minimal | 8GB | 4GB free | phi3:mini |
| Standard | 16GB | 8GB free | phi3:mini + llama3.2 |
| Full | 32GB+ | 15GB free | All models + longer context |
Troubleshooting
Ollama Not Detected
- Ensure Ollama is installed and running
- Check that it’s accessible at localhost:11434
- Restart Ollama if needed
- Refresh the AssisT extension
Slow Responses
- Try a smaller model (phi3:mini is fastest)
- Ensure Ollama isn’t processing other requests
- Check your system’s available memory
- Close other resource-intensive applications
Model Download Failed
- Check your internet connection
- Ensure enough disk space is available
- Try downloading a smaller model first
- Restart Ollama and try again