<p>OKBrain Harness recommends using local models whenever possible. Set the <code>OLLAMA_URL</code> environment variable to the address of your Ollama instance. Ollama does not have to run on the same machine as OKBrain Harness; it can run on a different machine in your local network.</p>
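<p>As an illustration, an entry for an Ollama instance on another machine might look like the following. This assumes <code>OLLAMA_URL</code> is read from <code>.env.local</code> like the other settings on this page. The IP address is a placeholder for your own host, and <code>11434</code> is Ollama's default port.</p>
<pre><code># Placeholder address: replace with the host that runs Ollama
OLLAMA_URL=http://192.168.1.50:11434
</code></pre>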
<h3>For Vector Embeddings</h3>
<p>We strongly recommend against using a cloud model for vector embeddings. Instead, use a local model such as <code>nomic-embed-text:v1.5</code>, even if the machine has no GPU.</p>
<p>Set the following in your <code>.env.local</code>:</p>
<pre><code>VECTOR_EMBEDDING_MODEL=nomic-embed-text:v1.5
</code></pre>
<h3>As a Model Provider</h3>
<p>OKBrain Harness also has built-in support for Ollama as a model provider for conversations. To configure the available models, edit the <code>src/lib/ai/providers/ollama.ts</code> file.</p>
<h3>For Fact Extraction</h3>
<p>If the <code>qwen3.5:4b</code> model is available at the <code>OLLAMA_URL</code> endpoint, OKBrain Harness will use it for fact extraction. We have extensively tested various local models, and this one fits our fact extraction use case best. That is why it is hardcoded for now.</p>
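<p>To confirm that the model is visible at your endpoint, you can query Ollama's <code>GET /api/tags</code> route, which lists the locally installed models. The TypeScript sketch below is only an illustration: <code>hasModel</code> and <code>isModelAvailable</code> are hypothetical helper names, and this is not necessarily how OKBrain Harness performs its own check.</p>
<pre><code class="language-typescript">// Relevant part of the JSON returned by Ollama's GET /api/tags.
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Pure helper (hypothetical name): true when `model` is in the tag list.
function hasModel(tags: OllamaTagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// Hypothetical wrapper: asks the Ollama instance at OLLAMA_URL
// (falling back to Ollama's default local address) for its model list.
async function isModelAvailable(
  model: string,
  baseUrl: string = process.env.OLLAMA_URL ?? "http://localhost:11434",
) {
  try {
    const res = await fetch(new URL("/api/tags", baseUrl));
    if (!res.ok) return false;
    const tags = (await res.json()) as OllamaTagsResponse;
    return hasModel(tags, model);
  } catch {
    // Endpoint unreachable or response was not valid JSON.
    return false;
  }
}
</code></pre>
<p>For example, <code>await isModelAvailable("qwen3.5:4b")</code> resolves to <code>true</code> only when that exact name, including the tag, is installed on the Ollama instance.</p>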

