Context Harness

A local-first context ingestion and retrieval framework for AI tools.

📖 Documentation & guides · Try the demo

Context Harness ingests external knowledge (files, Git repos, S3, Lua scripts) into a local SQLite store with optional embeddings, and exposes it via the ctx CLI and an MCP-compatible HTTP server so tools like Cursor and Claude can search your context.

Features

Connector-driven ingestion — plug in any source (filesystem, Git repos, S3 buckets, Lua scripts)
Extension registries — install community connectors, tools, and agents from Git-backed repos
Local-first storage — SQLite with FTS5 for keyword search
Embedding pipeline — local (fastembed or tract), Ollama, and OpenAI embeddings with automatic batching, retry, and staleness detection
Hybrid retrieval — keyword + semantic + weighted merge (configurable alpha)
MCP server — expose context to Cursor and other AI tools via HTTP
CLI-first — everything accessible via the ctx command
Incremental sync — checkpointed, idempotent, deterministic

Quick Start

For a 5-minute walkthrough with copy-paste config, see the Quick Start guide on the docs site.

1. Install

Pre-built binaries (recommended):

Download the latest release from GitHub Releases:

# macOS (Apple Silicon)
curl -L https://github.com/parallax-labs/context-harness/releases/latest/download/ctx-macos-aarch64.tar.gz | tar xz
sudo mv ctx /usr/local/bin/

# macOS (Intel)
curl -L https://github.com/parallax-labs/context-harness/releases/latest/download/ctx-macos-x86_64.tar.gz | tar xz
sudo mv ctx /usr/local/bin/

# Linux (x86_64)
curl -L https://github.com/parallax-labs/context-harness/releases/latest/download/ctx-linux-x86_64.tar.gz | tar xz
sudo mv ctx /usr/local/bin/

# Linux (aarch64)
curl -L https://github.com/parallax-labs/context-harness/releases/latest/download/ctx-linux-aarch64.tar.gz | tar xz
sudo mv ctx /usr/local/bin/

Windows: download ctx-windows-x86_64.zip from the releases page and add ctx.exe to your PATH.

Nix (NixOS / nix-darwin):

Install straight from the repo flake — no release tarball needed.

From a clone:

# Build (full binary with local embeddings)
nix build .#default
./result/bin/ctx --version

# Or install into your user profile (on $PATH)
nix profile install .#default

Without cloning (flake URL):

nix profile install github:parallax-labs/context-harness#default

The flake provides two packages:

Package	Description
`.#default`	Full build with local embeddings (fastembed; model downloads on first use).
`.#no-local-embeddings`	Minimal binary, no local embeddings (use OpenAI or Ollama only).

Use nix develop for a development shell. To use Context Harness inside your own flake (NixOS, Home Manager), see the Nix flake guide.

From source:

Local embeddings have no system dependencies; models are downloaded on first use.

Linux: Default features use rustls (no system OpenSSL). No extra packages required for a normal cargo build.
macOS: The build links against the C++ standard library (used by some dependencies). If you see library not found for -lc++, install the Xcode Command Line Tools: xcode-select --install. If you use Nix, run nix develop first so the shell provides Zig as the C/C++ compiler; then cargo build works.

cargo install --path .

2. Configure

cp config/ctx.example.toml config/ctx.toml
# Edit config/ctx.toml with your settings

The config path defaults to ./config/ctx.toml; pass --config to override it. See the configuration reference for all options.

3. Initialize and sync

ctx init
ctx sync all          # sync all connectors in parallel
ctx sync git:platform # or sync a specific connector

4. Search

ctx search "your query"              # keyword (default)
ctx search "your query" --mode hybrid   # keyword + semantic (needs embeddings)
ctx embed pending                    # backfill embeddings if using local/ollama/openai

5. MCP server (Cursor, Claude, etc.)

ctx serve mcp

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "context-harness": {
      "url": "http://127.0.0.1:7331/mcp"
    }
  }
}
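If .cursor/mcp.json already lists other MCP servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge; the helper name `add_mcp_server` is hypothetical and is not part of the ctx CLI.

```python
import json


def add_mcp_server(config, name="context-harness", url="http://127.0.0.1:7331/mcp"):
    """Return a copy of an mcp.json dict with one server entry added or replaced.

    Hypothetical helper: other servers already present in the config are kept.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"url": url}
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {"other-tool": {"url": "http://127.0.0.1:9000/mcp"}}}
print(json.dumps(add_mcp_server(existing), indent=2))
```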

Architecture

Connectors → Normalization → Chunking → Embedding → SQLite Store → Query Engine → CLI / MCP Server

Data Flow

1. A connector pulls items from a source (filesystem, Git, S3, Lua scripts).
2. Items are normalized into a standard Document.
3. Documents are chunked and stored in SQLite.
4. An FTS5 index enables keyword search over chunks.
5. Chunks are embedded (local, Ollama, or OpenAI) and the vectors are stored as blobs.
6. The query engine supports keyword, semantic, and hybrid retrieval.
7. Results are exposed via the CLI and the MCP-compatible HTTP server.
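To make the "vectors stored as blobs" step concrete, here is a rough sketch of storing float32 vectors in a SQLite BLOB column and ranking them by cosine similarity. The table name `chunk_vectors`, its columns, and the little-endian float32 encoding are assumptions for illustration only; the real schema and byte layout used by Context Harness may differ.

```python
import math
import sqlite3
import struct


def pack_vector(vec):
    """Serialize a list of floats as little-endian float32 bytes (assumed layout)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Inverse of pack_vector: 4 bytes per float32 value."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def semantic_search(conn, query_vec, limit=5):
    """Brute-force scan: score every stored vector against the query."""
    rows = conn.execute("SELECT chunk_id, vector FROM chunk_vectors").fetchall()
    scored = [(cid, cosine(query_vec, unpack_vector(blob))) for cid, blob in rows]
    scored.sort(key=lambda r: r[1], reverse=True)
    return scored[:limit]


# Hypothetical table; not the actual Context Harness schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunk_vectors (chunk_id TEXT PRIMARY KEY, vector BLOB NOT NULL)")
conn.executemany(
    "INSERT INTO chunk_vectors VALUES (?, ?)",
    [("a", pack_vector([1.0, 0.0])), ("b", pack_vector([0.0, 1.0]))],
)
top = semantic_search(conn, [0.9, 0.1], limit=1)
print(top[0][0])
```

A full scan like this is simple and adequate for small local stores; whether the project uses a scan or an index is not stated here.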

CLI Commands

Full reference: CLI docs.

Command	Description
`ctx init`	Initialize database schema
`ctx stats`	Show database statistics (docs, chunks, embeddings)
`ctx sources`	List available connectors
`ctx sync <connector>`	Ingest from a connector (`all`, `git`, `git:name`)
`ctx search "<query>"`	Search indexed documents
`ctx search --explain`	Search with scoring breakdown per result
`ctx get <id>`	Retrieve a document by ID
`ctx embed pending`	Backfill missing embeddings
`ctx embed rebuild`	Delete and regenerate all embeddings
`ctx export`	Export index as JSON for static site search
`ctx serve mcp`	Start MCP-compatible HTTP server
`ctx connector init <name>`	Scaffold a new Lua connector
`ctx connector test <path>`	Test a connector without writing to DB
`ctx registry list`	List configured registries and available extensions
`ctx registry install`	Clone configured registries
`ctx registry update`	Pull latest changes for registries
`ctx registry search <q>`	Search extensions by name, tag, or description
`ctx registry add <ext>`	Scaffold a config entry for a registry extension
`ctx completions <shell>`	Generate shell completions (bash, zsh, fish)

HTTP API

The server exposes an MCP Streamable HTTP endpoint alongside REST endpoints. See the MCP server reference for details. REST responses follow the schemas in docs/SCHEMAS.md.

Method	Path	Description
POST	`/mcp`	MCP Streamable HTTP endpoint (JSON-RPC for Cursor, Claude, etc.)
POST	`/tools/search`	Search indexed documents (REST)
POST	`/tools/get`	Retrieve a document by ID (REST)
GET	`/tools/list`	List all registered tools (REST)
GET	`/tools/sources`	List connector status (REST)
GET	`/agents/list`	List all registered agents (REST)
POST	`/agents/{name}/prompt`	Resolve agent prompt (REST)
GET	`/health`	Health check
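As an illustration of calling the REST endpoints from a script, the sketch below builds a POST request for /tools/search using only the Python standard library. The request body shape (`{"query": ...}`) is an assumption, not taken from this README; check docs/SCHEMAS.md for the actual fields. The base URL matches the default bind address shown elsewhere in this README.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7331"  # default bind address used in this README


def build_search_request(query, base_url=BASE_URL):
    """Build (but do not send) a POST to /tools/search.

    The {"query": ...} body is an assumed shape; see docs/SCHEMAS.md.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tools/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_search_request("deployment runbook")
print(req.get_method(), req.full_url)
```

With `ctx serve mcp` running, `send(req)` would perform the call; building and sending are kept separate so the request can be inspected first.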

Errors follow a consistent format:

{
  "error": {
    "code": "not_found",
    "message": "document not found: abc-123"
  }
}
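Because every error shares this envelope, a client can handle failures in one place. The sketch below turns the envelope into a Python exception; `ContextHarnessError` and `raise_for_error` are hypothetical names, and only the `error.code` and `error.message` fields shown above are assumed.

```python
import json


class ContextHarnessError(Exception):
    """Hypothetical client-side exception mirroring the error envelope."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def raise_for_error(payload):
    """Raise if the decoded JSON carries an "error" object; otherwise return it unchanged."""
    err = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(err, dict):
        raise ContextHarnessError(err.get("code", "unknown"), err.get("message", ""))
    return payload


body = '{"error": {"code": "not_found", "message": "document not found: abc-123"}}'
try:
    raise_for_error(json.loads(body))
except ContextHarnessError as exc:
    print(exc.code, "|", exc.message)
```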

Connector Configuration

All connector types support named instances, so you can configure several of each type. Full reference: Built-in connectors.

Filesystem Connector

[connectors.filesystem.docs]
root = "./docs"
include_globs = ["**/*.md", "**/*.txt"]
exclude_globs = ["**/drafts/**"]
follow_symlinks = false

[connectors.filesystem.notes]
root = "./notes"
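To show how include_globs and exclude_globs might interact, here is a rough sketch of glob filtering over relative paths. The matching rules are assumptions: `**/` matches zero or more directories, `*` does not cross `/`, an empty include list means "include everything", and excludes win over includes. The connector's actual glob engine may behave differently in edge cases.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob (**, *, ?) into an anchored regex. Illustrative only."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_included(path, include_globs, exclude_globs):
    """Keep a path if it matches an include (or none are given) and no exclude."""
    if any(glob_to_regex(g).match(path) for g in exclude_globs):
        return False
    if not include_globs:
        return True
    return any(glob_to_regex(g).match(path) for g in include_globs)


include = ["**/*.md", "**/*.txt"]
exclude = ["**/drafts/**"]
for p in ["README.md", "guide/intro.md", "guide/drafts/wip.md", "img/logo.png"]:
    print(p, is_included(p, include, exclude))
```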

Git Connector

Ingest documentation from any Git repository — point it at a repo URL and subdirectory:

[connectors.git.platform]
url = "https://github.com/acme/platform.git"   # or git@... or local path
branch = "main"
root = "docs/"                                  # scan this subdirectory
include_globs = ["**/*.md", "**/*.rst"]
shallow = true                                  # --depth 1 clone

[connectors.git.auth-service]
url = "https://github.com/acme/auth-service.git"
branch = "main"

Features:

Clones on first sync, pulls on subsequent syncs
Per-file last commit timestamp and author from git log
GitHub/GitLab web URLs auto-generated for each file
Shallow clone support to minimize disk usage
Incremental sync via checkpoint timestamps
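The idea behind checkpointed incremental sync can be sketched as follows: only items modified after the stored checkpoint are processed, and the checkpoint then advances to the newest timestamp seen. The `Item` type and `incremental_sync` function are hypothetical; the connector's real checkpoint format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical source item: a path plus its last-modified time (Unix seconds)."""
    path: str
    modified: int


def incremental_sync(items, checkpoint):
    """Return (items changed since checkpoint, new checkpoint).

    Sorting by (modified, path) keeps the output deterministic.
    """
    changed = sorted(
        (it for it in items if it.modified > checkpoint),
        key=lambda it: (it.modified, it.path),
    )
    new_checkpoint = max((it.modified for it in changed), default=checkpoint)
    return changed, new_checkpoint


items = [Item("docs/a.md", 100), Item("docs/b.md", 250), Item("docs/c.md", 180)]
first, cp = incremental_sync(items, checkpoint=0)
second, cp2 = incremental_sync(items, checkpoint=cp)
print([it.path for it in first], cp, second, cp2)
```

Running the sync twice with no source changes yields an empty second batch and an unchanged checkpoint, which is the idempotence property. A strict `>` comparison can miss an item that later appears with exactly the checkpoint's timestamp; real implementations often handle that case separately.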

S3 Connector

Ingest documentation from Amazon S3 buckets:

[connectors.s3.runbooks]
bucket = "acme-docs"
prefix = "engineering/runbooks/"
region = "us-east-1"
include_globs = ["**/*.md", "**/*.json"]
# endpoint_url = "http://localhost:9000"   # for MinIO / LocalStack

Set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables to supply credentials.

Features:

Pagination for large buckets (1000+ objects)
LastModified timestamps for incremental sync
ETag tracking in metadata
Custom endpoint URL for S3-compatible services (MinIO, LocalStack)
Glob-based include/exclude filtering on object keys

Lua Script Connectors

Write custom connectors in Lua — no recompilation needed. Scripts have access to HTTP, JSON, environment variables, filesystem, base64, crypto, and logging APIs:

[connectors.script.jira]
path = "connectors/jira.lua"
timeout = 600
url = "https://mycompany.atlassian.net"
api_token = "${JIRA_API_TOKEN}"
project_key = "ENG"

# Scaffold a new connector
ctx connector init jira

# Test it
ctx connector test connectors/jira.lua

# Sync it
ctx sync script:jira

See examples/connectors/github-issues.lua for a complete example.

Extension Registries

Install community connectors, tools, and agents from Git-backed repositories. See Registry overview and Usage guide on the docs site.

Install the Community Registry

ctx registry init --config ./config/ctx.toml

Alternatively, on first run ctx init offers to install it automatically.

Browse and Install Extensions

# List all available extensions
ctx registry list --config ./config/ctx.toml

# Search for a connector
ctx registry search jira --config ./config/ctx.toml

# See details
ctx registry info connectors/jira --config ./config/ctx.toml

# Add it to your config (scaffolds the TOML entry with placeholders)
ctx registry add connectors/jira --config ./config/ctx.toml

Tools and agents from registries are auto-discovered at server startup — they appear in GET /tools/list and GET /agents/list without explicit config. Connectors need credentials, so they require explicit activation via ctx registry add.

Configure Multiple Registries

[registries.community]
url = "https://github.com/parallax-labs/ctx-registry.git"
path = "~/.ctx/registries/community"
readonly = true
auto_update = true

[registries.company]
url = "git@github.com:myorg/ctx-extensions.git"
path = "~/.ctx/registries/company"
readonly = true

Registries are resolved with precedence: explicit config > .ctx/ project-local > personal > company > community.
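One way to picture this precedence order: when the same extension exists in several places, the first source in the list wins. The sketch below is illustrative only; the source labels and the `resolve` helper are hypothetical and do not reflect the tool's internal data structures.

```python
# Highest precedence first, following the order stated above.
PRECEDENCE = ["explicit", "project-local", "personal", "company", "community"]


def resolve(extension, available):
    """Return the highest-precedence source providing the extension, or None.

    `available` maps a source label to the set of extension ids it offers.
    """
    for source in PRECEDENCE:
        if extension in available.get(source, set()):
            return source
    return None


available = {
    "community": {"connectors/jira", "tools/summarize"},
    "company": {"connectors/jira"},
    "project-local": {"tools/summarize"},
}
print(resolve("connectors/jira", available))
print(resolve("tools/summarize", available))
```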

Project-Local Extensions

Place a .ctx/ directory in your project root with Lua scripts organized as connectors/<name>/connector.lua, tools/<name>/tool.lua, or agents/<name>/agent.lua. They are auto-discovered from any subdirectory.

Customize an Extension

# Copy to a writable registry for editing
ctx registry override connectors/jira --config ./config/ctx.toml

See the registry docs for the full specification.

Embedding Configuration

Context Harness supports three embedding providers:

Provider	Description	Requires
`local`	Built-in models via fastembed (primary) or tract (musl/Intel Mac) — fully offline	No system deps; model downloads on first use
`ollama`	Local Ollama instance	Running Ollama with an embedding model
`openai`	OpenAI API	`OPENAI_API_KEY` env var

Local (recommended for offline use)

[embedding]
provider = "local"
# model = "all-minilm-l6-v2"  # default, 384 dims — no config needed

Supported models: all-minilm-l6-v2 (384d), bge-small-en-v1.5 (384d), bge-base-en-v1.5 (768d), bge-large-en-v1.5 (1024d), nomic-embed-text-v1 (768d), nomic-embed-text-v1.5 (768d), multilingual-e5-small (384d), multilingual-e5-base (768d), multilingual-e5-large (1024d).

Ollama

[embedding]
provider = "ollama"
model = "nomic-embed-text"
dims = 768
# url = "http://localhost:11434"  # default

OpenAI

[embedding]
provider = "openai"
model = "text-embedding-3-small"
dims = 1536

Set the OPENAI_API_KEY environment variable before using embedding commands.

Platform support (release binaries)

Pre-built release binaries are built for six targets. The local embedding provider is included on all targets: primary platforms use fastembed (bundled ORT); Linux musl and macOS Intel use a pure-Rust (tract) backend.

Binary	Local embeddings	OpenAI / Ollama
Linux x86_64 (glibc)	✅ fastembed	✅
Linux x86_64 (musl)	✅ tract	✅
Linux aarch64	✅ fastembed	✅
macOS x86_64 (Intel)	✅ tract	✅
macOS aarch64 (Apple Silicon)	✅ fastembed	✅
Windows x86_64	✅ fastembed	✅

Minimal binary (no local embeddings): cargo install --path . --no-default-features
From source on musl or Intel Mac (tract backend): cargo build --no-default-features --features local-embeddings-tract
CI/release: Linux cross-builds (musl, aarch64) use Zig and cargo-zigbuild; no cross-rs or system OpenSSL.

See the configuration docs for full platform notes.

Hybrid Search

Hybrid search merges keyword (FTS5/BM25) and semantic (cosine similarity) signals with a configurable alpha:

[retrieval]
hybrid_alpha = 0.6  # 0.0 = keyword only, 1.0 = semantic only

Use ctx search "query" --mode hybrid --explain to see score breakdowns. See the search reference and docs/HYBRID_SCORING.md for the full specification.
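As a rough illustration of an alpha-weighted merge, the sketch below min-max normalizes each signal and combines them as (1 - alpha) * keyword + alpha * semantic, which agrees with the endpoints given above (0.0 = keyword only, 1.0 = semantic only). The normalization step, the treatment of missing scores as 0.0, and the "higher is better" convention for both inputs are assumptions; docs/HYBRID_SCORING.md remains the authoritative specification.

```python
def normalize(scores):
    """Min-max normalize a {doc_id: score} dict into [0, 1] (assumed scheme)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hybrid_merge(keyword, semantic, alpha=0.6):
    """Blend keyword and semantic scores; a doc missing from one side scores 0.0 there."""
    kw, sem = normalize(keyword), normalize(semantic)
    merged = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * sem.get(doc_id, 0.0)
        for doc_id in set(kw) | set(sem)
    }
    # Sort by score descending, then id ascending for deterministic ties.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


keyword = {"a": 10.0, "b": 2.0}   # higher = better keyword match (assumed)
semantic = {"b": 0.9, "c": 0.3}   # cosine similarity
for doc_id, score in hybrid_merge(keyword, semantic, alpha=0.6):
    print(doc_id, round(score, 3))
```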

Server Configuration

[server]
bind = "127.0.0.1:7331"

For production (Docker, systemd, CI), see Deployment.

Configuration

See config/ctx.example.toml for a complete example, or the configuration reference on the docs site.

Documentation

Website: parallax-labs.github.io/context-harness

Link	Description
Getting started	Quick Start, Installation, Nix flake
Configuration	Full `ctx.toml` reference, embedding providers, platform table
CLI reference	Every command and flag
Connectors & registry	Built-in connectors, Lua connectors, extension registry
Guides	Agent integration, Cursor, RAG, multi-repo, deployment
API (Rustdoc)	Generated Rust API docs
Live demo	Search a pre-built knowledge base in the browser

The site also documents the search widget (ctx-search.js) for adding ⌘K search to static sites — see the docs for an example.

If Context Harness is useful to you, consider starring the repo.

License

MIT — see LICENSE

Contributing

Contributions welcome! Please see CONTRIBUTING.md for guidelines.
