AuraSDK

Cognitive Memory Engine for AI Agents

Sub-millisecond recall · No LLM calls · No cloud · Pure Rust · ~3 MB

LLMs forget everything. Every conversation starts from zero. Existing memory solutions — Mem0, Zep, Cognee — require LLM calls for basic recall, adding latency, cloud dependency, and cost to every operation.

Aura gives your AI agent persistent, hierarchical memory that decays, consolidates, and evolves — like a human brain. One pip install, works fully offline.

pip install aura-memory

from aura import Aura, Level

brain = Aura("./agent_memory")

brain.store("User prefers dark mode", level=Level.Identity, tags=["ui"])
brain.store("Deploy to staging first", level=Level.Decisions, tags=["workflow"])

context = brain.recall("user preferences")  # <1ms — inject into any LLM prompt

Your agent now remembers. No API keys. No embeddings. No config.

⭐ If AuraSDK is useful to you, a GitHub star helps us get funding to continue development from Kyiv.

Why Aura?

Feature	Aura	Mem0	Zep	Cognee	Letta/MemGPT
LLM required	No	Yes	Yes	Yes	Yes
Recall latency	<1ms	~200ms+	~200ms	LLM-bound	LLM-bound
Works offline	Fully	Partial	No	No	With local LLM
Cost per operation	$0	API billing	Credit-based	LLM + DB cost	LLM cost
Binary size	~3 MB	~50 MB+	Cloud service	Heavy (Neo4j+)	Python pkg
Memory decay & promotion	Built-in	Via LLM	Via LLM	No	Via LLM
Trust & provenance	Built-in	No	No	No	No
Encryption at rest	ChaCha20 + Argon2	No	No	No	No
Language	Rust	Python	Proprietary	Python	Python

Performance

Benchmarked on 1,000 records (Windows 10 / Ryzen 7):

Operation	Latency	vs Mem0
Store	0.09 ms	~same
Recall (structured)	0.74 ms	~270× faster
Recall (cached)	0.48 µs	~400,000× faster
Maintenance cycle	1.1 ms	No equivalent

Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.
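
The speedup column can be reproduced from the latencies above. The sketch below is a generic timing harness, not the project's benchmark script: `median_latency_ms` and `speedup` are hypothetical helper names, and the 200 ms baseline is the approximate Mem0 figure quoted in this section.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(fn: Callable[[], object], runs: int = 1000) -> float:
    """Return the median wall-clock latency of fn() in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Ratios implied by the table, taking ~200 ms as the Mem0 recall baseline.
structured = speedup(200.0, 0.74)    # structured recall, 0.74 ms
cached = speedup(200.0, 0.00048)     # cached recall, 0.48 µs = 0.00048 ms
print(round(structured), round(cached))
```

To time a real call, pass it as a zero-argument callable, for example `median_latency_ms(lambda: brain.recall("user preferences"))`.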

How Memory Works

Aura organizes memories into 4 levels across 2 tiers. Important memories persist; trivial ones decay naturally:

CORE TIER (slow decay — weeks to months)
  Identity  [0.99]  Who the user is. Preferences. Personality.
  Domain    [0.95]  Learned facts. Domain knowledge.

COGNITIVE TIER (fast decay — hours to days)
  Decisions [0.90]  Choices made. Action items.
  Working   [0.80]  Current tasks. Recent context.

One call runs the full lifecycle — decay, promote, merge duplicates, archive expired:

report = brain.run_maintenance()  # 8 phases, <1ms
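
The bracketed values in the tier listing read like per-cycle retention factors. That is an inference, not something this README states, but it matches the cookbook figures: 0.99^7 ≈ 0.93 and 0.80^7 ≈ 0.21. A minimal sketch, assuming purely multiplicative decay with one maintenance cycle per day and no promotion or feedback (`RETENTION` and `strength_after` are illustrative names, not part of the Aura API):

```python
# Assumed per-cycle retention factors, taken from the bracketed values above.
RETENTION = {
    "Identity": 0.99,
    "Domain": 0.95,
    "Decisions": 0.90,
    "Working": 0.80,
}


def strength_after(level: str, cycles: int, initial: float = 1.0) -> float:
    """Strength left after `cycles` maintenance runs under multiplicative decay."""
    return initial * RETENTION[level] ** cycles


for level in RETENTION:
    print(f"{level:<10} {strength_after(level, 7):.2f}")
```

Under these assumptions an Identity memory keeps about 93% of its strength after seven cycles, while a Working memory drops to about 21%.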

Key Features

Core Memory Engine

RRF Fusion Recall — Multi-signal ranking: SDR + MinHash + Tag Jaccard (+ optional embeddings)
Two-Tier Memory — Cognitive (ephemeral) + Core (permanent) with decay, promotion, and archival
Background Maintenance — 8-phase lifecycle: decay, reflect, insights, consolidation, archival
Namespace Isolation — namespace="sandbox" keeps test data invisible to production recall
Pluggable Embeddings — Optional 4th RRF signal: bring your own embedding function
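
Reciprocal Rank Fusion (RRF) combines several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below shows the standard formula with k=60, the constant named in the Architecture section. It is a generic illustration rather than Aura's internal code; `rrf_fuse` and the memory ids are made up for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Three hypothetical per-signal rankings (best match first).
sdr_rank = ["m1", "m2", "m3"]
minhash_rank = ["m2", "m1", "m4"]
tag_rank = ["m2", "m3", "m1"]

fused = rrf_fuse([sdr_rank, minhash_rank, tag_rank])
print([memory_id for memory_id, _ in fused])  # ['m2', 'm1', 'm3', 'm4']
```

Because only ranks are used, the signals do not need comparable score scales, which is what makes an optional fourth signal such as embeddings easy to add.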

Trust & Safety

Trust & Provenance — Source authority scoring: user input outranks web scrapes, automatically
Source Type Tracking — Every memory carries provenance: recorded, retrieved, inferred, generated
Auto-Protect Guards — Detects phone numbers, emails, wallets, API keys automatically
Encryption — ChaCha20-Poly1305 with Argon2id key derivation
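
As a rough illustration of pattern-based detection like the Auto-Protect guards, the sketch below flags a few sensitive categories with regular expressions. The patterns and the `detect_sensitive` helper are simplified assumptions for this example. Aura's actual rules are not documented here and are likely stricter.

```python
import re

# Deliberately simple patterns; real-world detectors need more care.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "eth_wallet": re.compile(r"\b0x[a-fA-F0-9]{40}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9_-]{16,}\b"),
}


def detect_sensitive(text: str) -> set[str]:
    """Return the names of all sensitive categories found in the text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


print(detect_sensitive("Reach me at ana@example.com or +380 44 123 4567"))
print(detect_sensitive("User prefers dark mode"))
```

A store call could consult such a check and mark matching memories as protected before they are written.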

Adaptive Memory

Feedback Learning — brain.feedback(id, useful=True) boosts useful memories, weakens noise
Semantic Versioning — brain.supersede(old_id, new_content) with full version chains
Snapshots & Rollback — brain.snapshot("v1") / brain.rollback("v1") / brain.diff("v1","v2")
Agent-to-Agent Sharing — export_context() / import_context() with trust metadata

Enterprise & Integrations

Multimodal Stubs — store_image() / store_audio_transcript() with media provenance
Prometheus Metrics — /metrics endpoint with 10+ business-level counters and histograms
OpenTelemetry — telemetry feature flag with OTLP export and 17 instrumented spans
MCP Server — Claude Desktop integration out of the box
WASM-Ready — StorageBackend trait abstraction (FsBackend + MemoryBackend)
Pure Rust Core — No Python dependencies, no external services

Quick Start

Trust & Provenance

from aura import Aura, TrustConfig

brain = Aura("./data")

tc = TrustConfig()
tc.source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}
brain.set_trust_config(tc)

# User facts always rank higher than scraped data in recall
brain.store("User is vegan", channel="user")
brain.store("User might like steak restaurants", channel="web_scrape")

results = brain.recall_structured("food preferences", top_k=5)
# -> "User is vegan" scores higher, always
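
One plausible way for trust weights to affect ranking is to scale each candidate's relevance by the trust value of its channel. The sketch below illustrates that idea only. The `Candidate` type, the `trust_weighted` function, and the multiplication rule are assumptions, not Aura's documented scoring.

```python
from dataclasses import dataclass

# Same weights as the TrustConfig example above.
SOURCE_TRUST = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}


@dataclass
class Candidate:
    text: str
    channel: str
    relevance: float  # hypothetical raw match score in [0, 1]


def trust_weighted(candidates: list[Candidate], default_trust: float = 0.5) -> list[Candidate]:
    """Sort candidates by relevance scaled by the trust of their source channel."""
    return sorted(
        candidates,
        key=lambda c: c.relevance * SOURCE_TRUST.get(c.channel, default_trust),
        reverse=True,
    )


ranked = trust_weighted([
    Candidate("User might like steak restaurants", "web_scrape", 0.9),
    Candidate("User is vegan", "user", 0.6),
])
print([c.text for c in ranked])
```

Here the scraped record matches the query better (0.9 vs 0.6) but still ranks second, because its weighted score is 0.45 against 0.60 for the user-provided fact.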

Pluggable Embeddings (Optional)

from aura import Aura
from sentence_transformers import SentenceTransformer

brain = Aura("./data")

# Plug in any embedding function: OpenAI, Ollama, sentence-transformers, etc.
model = SentenceTransformer("all-MiniLM-L6-v2")
brain.set_embedding_fn(lambda text: model.encode(text).tolist())

# Now "login problems" matches "Authentication failed" via semantic similarity
brain.store("Authentication failed for user admin")
results = brain.recall_structured("login problems", top_k=5)

Without embeddings, Aura falls back to SDR + MinHash + Tag Jaccard — still fast, still effective.
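
For readers unfamiliar with two of the fallback signals, the sketch below shows plain Jaccard similarity over tag sets and a MinHash estimate of Jaccard similarity over character n-grams. It is a textbook illustration. The n-gram size, the number of hash functions, and the choice of BLAKE2b are assumptions, not Aura's parameters.

```python
import hashlib


def jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Lower-cased character n-grams of the text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """For each seeded hash function, keep the minimum hash over all items."""
    if not items:
        raise ValueError("cannot build a MinHash signature from an empty set")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def minhash_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


print(jaccard({"ui", "prefs"}, {"ui"}))
sig_a = minhash_signature(char_ngrams("Authentication failed for user admin"))
sig_b = minhash_signature(char_ngrams("Authentication failure for admin user"))
print(minhash_similarity(sig_a, sig_b))
```

Both signals are purely lexical, which is why a query like "login problems" needs the optional embedding signal to match "Authentication failed".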

Encryption

from aura import Aura

brain = Aura("./secret_data", password="my-secure-password")
brain.store("Top secret information")
assert brain.is_encrypted()  # ChaCha20-Poly1305 + Argon2id

Namespace Isolation

from aura import Aura

brain = Aura("./data")

brain.store("Real preference: dark mode", namespace="default")
brain.store("Test: user likes light mode", namespace="sandbox")

# Recall only sees "default" namespace — sandbox is invisible
results = brain.recall_structured("user preference", top_k=5)

Cookbook: Personal Assistant That Remembers

The killer use case: an agent that remembers your preferences after a week offline, with zero API calls.

See examples/personal_assistant.py for the full runnable script.

from aura import Aura, Level

brain = Aura("./assistant_memory")

# Day 1: User tells the agent about themselves
brain.store("User is vegan", level=Level.Identity, tags=["diet"])
brain.store("User loves jazz music", level=Level.Identity, tags=["music"])
brain.store("User works 10am-6pm", level=Level.Identity, tags=["schedule"])
brain.store("Discuss quarterly report tomorrow", level=Level.Working, tags=["task"])

# Simulate a week passing — run maintenance cycles
for _ in range(7):
    brain.run_maintenance()  # decay + reflect + consolidate + archive

# Day 8: What does the agent remember?
context = brain.recall("user preferences and personality")
# -> Still remembers: vegan, jazz, schedule (Identity, strength ~0.93)
# -> "quarterly report" decayed heavily (Working, strength ~0.21)

Identity persists. Tasks fade. Important patterns get promoted. Like a real brain.

MCP Server (Claude Desktop)

Give Claude persistent memory across conversations:

pip install aura-memory

Add to Claude Desktop config (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "C:\\Users\\YOUR_NAME\\aura_brain"]
    }
  }
}

Provides 8 tools: recall, recall_structured, store, store_code, store_decision, search, insights, consolidate.
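
The config above uses a Windows path. As a convenience, the sketch below builds the same entry with a path that suits the current platform and prints it as JSON, with backslashes escaped automatically. The `aura_mcp_entry` helper and the `aura_brain` directory under the home folder are assumptions for illustration.

```python
import json
from pathlib import Path


def aura_mcp_entry(brain_dir: Path) -> dict:
    """Build the mcpServers entry shown above for a given brain directory."""
    return {
        "mcpServers": {
            "aura": {
                "command": "python",
                "args": ["-m", "aura", "mcp", str(brain_dir)],
            }
        }
    }


# Hypothetical location: an "aura_brain" folder in the user's home directory.
config = aura_mcp_entry(Path.home() / "aura_brain")
print(json.dumps(config, indent=2))
```

If the Claude Desktop config already lists other servers, merge the "aura" key into the existing "mcpServers" object rather than replacing the file.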

Dashboard UI

Aura includes a standalone web dashboard for visual memory management. Download from GitHub Releases.

./aura-dashboard ./my_brain --port 8000

Features: Analytics · Memory Explorer with filtering · Recall Console with live scoring · Batch ingest

Platform	Binary
Windows x64	`aura-dashboard-windows-x64.exe`
Linux x64	`aura-dashboard-linux-x64`
macOS ARM	`aura-dashboard-macos-arm64`
macOS x64	`aura-dashboard-macos-x64`

Integrations & Examples

Try now: zero install, runs in browser

Integration	Description	Link
Ollama	Fully local AI assistant, no API key needed	`ollama_agent.py`
LangChain	Drop-in Memory class + prompt injection	`langchain_agent.py`
LlamaIndex	Chat engine with persistent memory recall	`llamaindex_agent.py`
OpenAI Agents	Dynamic instructions with persistent memory	`openai_agents.py`
Claude SDK	System prompt injection + tool use patterns	`claude_sdk_agent.py`
CrewAI	Tool-based recall/store for crew agents	`crewai_agent.py`
AutoGen	Memory protocol implementation	`autogen_agent.py`
FastAPI	Per-user memory middleware with namespace isolation	`fastapi_middleware.py`

FFI (C/Go/C#): aura.h · go/main.go · csharp/Program.cs

More examples: basic_usage.py · encryption.py · agent_memory.py · edge_device.py · maintenance_daemon.py · research_bot.py

Architecture

52 Rust modules · ~23,500 lines · 272 Rust + 347 Python = 619 tests

Python  ──  from aura import Aura  ──▶  aura._core (PyO3)
                                              │
Rust    ──────────────────────────────────────┘
        ┌─────────────────────────────────────────────┐
        │  Aura Engine                                │
        │                                             │
        │  Two-Tier Memory                            │
        │  ├── Cognitive Tier (Working + Decisions)   │
        │  └── Core Tier (Domain + Identity)          │
        │                                             │
        │  Recall Engine (RRF Fusion, k=60)           │
        │  ├── SDR similarity (256k bit)              │
        │  ├── MinHash N-gram                         │
        │  ├── Tag Jaccard                            │
        │  └── Embedding (optional, pluggable)        │
        │                                             │
        │  Adaptive Memory                            │
        │  ├── Feedback learning (boost/weaken)       │
        │  ├── Snapshots & rollback                   │
        │  ├── Supersede (version chains)             │
        │  └── Agent-to-agent sharing protocol        │
        │                                             │
        │  Knowledge Graph · Living Memory            │
        │  Trust & Provenance · PII Guards            │
        │  Encryption (ChaCha20 + Argon2id)           │
        │  StorageBackend (Fs / Memory / WASM)        │
        │  Telemetry (Prometheus + OpenTelemetry)     │
        └─────────────────────────────────────────────┘
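
To give a feel for the "SDR similarity" signal in the diagram, the sketch below encodes text as a sparse set of active bit positions and compares two encodings by overlap. It is a generic sparse-distributed-representation toy, not Aura's encoder. The width of 256 × 1024 bits is one reading of "256k bit", and the 8 bits per token and the SHA-256 hashing are assumptions.

```python
import hashlib

SDR_BITS = 256 * 1024   # assumed width of the bit space
BITS_PER_TOKEN = 8      # assumed number of active bits per token


def encode_sdr(text: str) -> set[int]:
    """Map each token to a few pseudo-random bit positions; return all active bits."""
    active: set[int] = set()
    for token in text.lower().split():
        for i in range(BITS_PER_TOKEN):
            digest = hashlib.sha256(f"{token}:{i}".encode()).digest()
            active.add(int.from_bytes(digest[:4], "big") % SDR_BITS)
    return active


def overlap_similarity(a: set[int], b: set[int]) -> float:
    """Shared active bits divided by the size of the smaller representation."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


query = encode_sdr("user prefers dark mode")
related = overlap_similarity(query, encode_sdr("dark mode enabled"))
unrelated = overlap_similarity(query, encode_sdr("deploy to staging"))
print(related > unrelated)
```

Storing only the active positions keeps each representation small even though the nominal bit space is large.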

API Reference

See docs/API.md for the complete API reference (40+ methods).

Roadmap

See docs/ROADMAP.md for the full development roadmap.

Completed (6 phases):

Phase 1 — Community & Trust: benchmarks, CONTRIBUTING.md, issue templates
Phase 2 — Ecosystem Gaps: LlamaIndex, temporal queries, event callbacks
Phase 3 — Drop-in Adoption: LangChain Memory class, FastAPI middleware, Claude SDK
Phase 4 — New Markets: C FFI + Go/C# examples, WASM storage abstraction
Phase 5 — Enterprise: Prometheus + OpenTelemetry, multimodal stubs, stress tests (100K/1M)
Phase 6 — Competitive Moat: adaptive recall, snapshots, agent sharing, semantic versioning

Remaining:

TypeScript/WASM build via wasm-pack + NPM package (storage abstraction done)
Cloudflare Workers edge runtime (depends on WASM)
Java FFI example, PyPI publish, benchmark CI

Resources

Demo Video (30s) — Quick overview
API Reference — Complete API docs
Examples — Ready-to-run scripts
Roadmap — Development plan
Landing Page — Project overview

Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines, or check the open issues.

⭐ If Aura saves you time, a GitHub star helps others discover it and helps us continue development.

License & Intellectual Property

Code License: MIT — see LICENSE.
Patent Notice: The core cognitive architecture (DNA Layering, Cognitive Crystallization, SDR Indexing, Synaptic Synthesis) is Patent Pending (US Provisional Application No. 63/969,703). See PATENT for details. Commercial integration of these architectural concepts into enterprise products requires a commercial license. The open-source SDK is freely available under MIT for non-commercial, academic, and standard agent integrations.

Built in Kyiv, Ukraine 🇺🇦 — including during power outages.
Solo developer project. If you find this useful, your star means more than you think.
