
aura-memory

Persistent cognitive memory for AI agents. Sub-millisecond recall, fully offline, encrypted. 8 tools: recall, recall_structured, store, store_code, store_decision, search, insights, consolidate.


AuraSDK

Cognitive Memory Engine for AI Agents

Sub-millisecond recall · No LLM calls · No cloud · Pure Rust · ~3 MB


Open In Colab   Demo Video   Website


LLMs forget everything. Every conversation starts from zero. Existing memory solutions — Mem0, Zep, Cognee — require LLM calls for basic recall, adding latency, cloud dependency, and cost to every operation.

Aura gives your AI agent persistent, hierarchical memory that decays, consolidates, and evolves — like a human brain. One pip install, works fully offline.

pip install aura-memory
from aura import Aura, Level

brain = Aura("./agent_memory")

brain.store("User prefers dark mode", level=Level.Identity, tags=["ui"])
brain.store("Deploy to staging first", level=Level.Decisions, tags=["workflow"])

context = brain.recall("user preferences")  # <1ms — inject into any LLM prompt

Your agent now remembers. No API keys. No embeddings. No config.
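Recalled context is just a string, so "inject into any LLM prompt" is plain string assembly. A minimal sketch with a hypothetical `build_prompt` helper (not part of the SDK):

```python
def build_prompt(context: str, question: str) -> str:
    """Prepend recalled memories to the user's message as a preamble."""
    return (
        "Known facts about the user:\n"
        f"{context}\n\n"
        f"User: {question}"
    )

# With Aura, context comes straight from recall:
#   context = brain.recall("user preferences")
prompt = build_prompt("User prefers dark mode", "Pick an editor theme for me.")
print(prompt)
```

The same pattern works for a system prompt, a chat message, or a tool result; Aura stays out of the inference path entirely.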

⭐ If AuraSDK is useful to you, a GitHub star helps us get funding to continue development from Kyiv.


Why Aura?

| | Aura | Mem0 | Zep | Cognee | Letta/MemGPT |
|---|---|---|---|---|---|
| LLM required | No | Yes | Yes | Yes | Yes |
| Recall latency | <1ms | ~200ms+ | ~200ms | LLM-bound | LLM-bound |
| Works offline | Fully | Partial | No | No | With local LLM |
| Cost per operation | $0 | API billing | Credit-based | LLM + DB cost | LLM cost |
| Binary size | ~3 MB | ~50 MB+ | Cloud service | Heavy (Neo4j+) | Python pkg |
| Memory decay & promotion | Built-in | Via LLM | Via LLM | No | Via LLM |
| Trust & provenance | Built-in | No | No | No | No |
| Encryption at rest | ChaCha20 + Argon2 | No | No | No | No |
| Language | Rust | Python | Proprietary | Python | Python |

Performance

Benchmarked on 1,000 records (Windows 10 / Ryzen 7):

| Operation | Latency | vs Mem0 |
|---|---|---|
| Store | 0.09 ms | ~same |
| Recall (structured) | 0.74 ms | ~270× faster |
| Recall (cached) | 0.48 µs | ~400,000× faster |
| Maintenance cycle | 1.1 ms | No equivalent |

Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.


How Memory Works

Aura organizes memories into 4 levels across 2 tiers. Important memories persist, trivial ones decay naturally:

CORE TIER (slow decay — weeks to months)
  Identity  [0.99]  Who the user is. Preferences. Personality.
  Domain    [0.95]  Learned facts. Domain knowledge.

COGNITIVE TIER (fast decay — hours to days)
  Decisions [0.90]  Choices made. Action items.
  Working   [0.80]  Current tasks. Recent context.
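If the bracketed numbers are read as per-cycle retention factors, the week-long behavior in the cookbook further down follows directly. That reading is an assumption, not the engine's actual formula; a minimal sketch:

```python
# Per-cycle retention factors as read off the tier diagram above
# (assumption: strength is multiplied by the factor each maintenance cycle).
DECAY = {"Identity": 0.99, "Domain": 0.95, "Decisions": 0.90, "Working": 0.80}

def strength_after(level: str, cycles: int, initial: float = 1.0) -> float:
    """Geometric decay: one multiplication per maintenance cycle."""
    return initial * DECAY[level] ** cycles

print(round(strength_after("Identity", 7), 2))  # 0.93, survives a week
print(round(strength_after("Working", 7), 2))   # 0.21, mostly faded
```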

One call runs the full lifecycle — decay, promote, merge duplicates, archive expired:

report = brain.run_maintenance()  # 8 phases, <1ms

Key Features

Core Memory Engine

  • RRF Fusion Recall — Multi-signal ranking: SDR + MinHash + Tag Jaccard (+ optional embeddings)
  • Two-Tier Memory — Cognitive (ephemeral) + Core (permanent) with decay, promotion, and archival
  • Background Maintenance — 8-phase lifecycle: decay, reflect, insights, consolidation, archival
  • Namespace Isolation — `namespace="sandbox"` keeps test data invisible to production recall
  • Pluggable Embeddings — Optional 4th RRF signal: bring your own embedding function
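RRF (reciprocal rank fusion) merges the per-signal rankings by summed reciprocal rank: each item scores Σ 1/(k + rank). A generic, self-contained sketch of the formula using the documented k=60 — illustrative, not Aura's internal code:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each item scores sum of 1/(k + rank) per list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three signals ranking the same memory IDs differently:
sdr     = ["m1", "m3", "m2"]
minhash = ["m3", "m1", "m2"]
jaccard = ["m1", "m2", "m3"]
print(rrf_fuse([sdr, minhash, jaccard]))  # ['m1', 'm3', 'm2']
```

The large k dampens any single signal's top pick, so an item ranked well by several signals beats one ranked first by only one.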

Trust & Safety

  • Trust & Provenance — Source authority scoring: user input outranks web scrapes, automatically
  • Source Type Tracking — Every memory carries provenance: recorded, retrieved, inferred, generated
  • Auto-Protect Guards — Detects phone numbers, emails, wallets, API keys automatically
  • Encryption — ChaCha20-Poly1305 with Argon2id key derivation

Adaptive Memory

  • Feedback Learning — `brain.feedback(id, useful=True)` boosts useful memories, weakens noise
  • Semantic Versioning — `brain.supersede(old_id, new_content)` with full version chains
  • Snapshots & Rollback — `brain.snapshot("v1")` / `brain.rollback("v1")` / `brain.diff("v1","v2")`
  • Agent-to-Agent Sharing — `export_context()` / `import_context()` with trust metadata

Enterprise & Integrations

  • Multimodal Stubs — `store_image()` / `store_audio_transcript()` with media provenance
  • Prometheus Metrics — `/metrics` endpoint with 10+ business-level counters and histograms
  • OpenTelemetry — `telemetry` feature flag with OTLP export and 17 instrumented spans
  • MCP Server — Claude Desktop integration out of the box
  • WASM-Ready — `StorageBackend` trait abstraction (FsBackend + MemoryBackend)
  • Pure Rust Core — No Python dependencies, no external services

Quick Start

Trust & Provenance

from aura import Aura, TrustConfig

brain = Aura("./data")

tc = TrustConfig()
tc.source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}
brain.set_trust_config(tc)

# User facts always rank higher than scraped data in recall
brain.store("User is vegan", channel="user")
brain.store("User might like steak restaurants", channel="web_scrape")

results = brain.recall_structured("food preferences", top_k=5)
# -> "User is vegan" scores higher, always

Pluggable Embeddings (Optional)

from aura import Aura

brain = Aura("./data")

# Plug in any embedding function: OpenAI, Ollama, sentence-transformers, etc.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
brain.set_embedding_fn(lambda text: model.encode(text).tolist())

# Now "login problems" matches "Authentication failed" via semantic similarity
brain.store("Authentication failed for user admin")
results = brain.recall_structured("login problems", top_k=5)

Without embeddings, Aura falls back to SDR + MinHash + Tag Jaccard — still fast, still effective.

Encryption

brain = Aura("./secret_data", password="my-secure-password")
brain.store("Top secret information")
assert brain.is_encrypted()  # ChaCha20-Poly1305 + Argon2id

Namespace Isolation

brain = Aura("./data")

brain.store("Real preference: dark mode", namespace="default")
brain.store("Test: user likes light mode", namespace="sandbox")

# Recall only sees "default" namespace — sandbox is invisible
results = brain.recall_structured("user preference", top_k=5)

Cookbook: Personal Assistant That Remembers

The killer use case: an agent that remembers your preferences after a week offline, with zero API calls.

See examples/personal_assistant.py for the full runnable script.

from aura import Aura, Level

brain = Aura("./assistant_memory")

# Day 1: User tells the agent about themselves
brain.store("User is vegan", level=Level.Identity, tags=["diet"])
brain.store("User loves jazz music", level=Level.Identity, tags=["music"])
brain.store("User works 10am-6pm", level=Level.Identity, tags=["schedule"])
brain.store("Discuss quarterly report tomorrow", level=Level.Working, tags=["task"])

# Simulate a week passing — run maintenance cycles
for _ in range(7):
    brain.run_maintenance()  # decay + reflect + consolidate + archive

# Day 8: What does the agent remember?
context = brain.recall("user preferences and personality")
# -> Still remembers: vegan, jazz, schedule (Identity, strength ~0.93)
# -> "quarterly report" decayed heavily (Working, strength ~0.21)

Identity persists. Tasks fade. Important patterns get promoted. Like a real brain.


MCP Server (Claude Desktop)

Give Claude persistent memory across conversations:

pip install aura-memory

Add to Claude Desktop config (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "C:\\Users\\YOUR_NAME\\aura_brain"]
    }
  }
}

Provides 8 tools: recall, recall_structured, store, store_code, store_decision, search, insights, consolidate.


Dashboard UI

Aura includes a standalone web dashboard for visual memory management. Download from GitHub Releases.

./aura-dashboard ./my_brain --port 8000

Features: Analytics · Memory Explorer with filtering · Recall Console with live scoring · Batch ingest

| Platform | Binary |
|---|---|
| Windows x64 | aura-dashboard-windows-x64.exe |
| Linux x64 | aura-dashboard-linux-x64 |
| macOS ARM | aura-dashboard-macos-arm64 |
| macOS x64 | aura-dashboard-macos-x64 |

Integrations & Examples

Try now: Open In Colab — zero install, runs in browser

| Integration | Description | Link |
|---|---|---|
| Ollama | Fully local AI assistant, no API key needed | ollama_agent.py |
| LangChain | Drop-in Memory class + prompt injection | langchain_agent.py |
| LlamaIndex | Chat engine with persistent memory recall | llamaindex_agent.py |
| OpenAI Agents | Dynamic instructions with persistent memory | openai_agents.py |
| Claude SDK | System prompt injection + tool use patterns | claude_sdk_agent.py |
| CrewAI | Tool-based recall/store for crew agents | crewai_agent.py |
| AutoGen | Memory protocol implementation | autogen_agent.py |
| FastAPI | Per-user memory middleware with namespace isolation | fastapi_middleware.py |

FFI (C/Go/C#): aura.h · go/main.go · csharp/Program.cs

More examples: basic_usage.py · encryption.py · agent_memory.py · edge_device.py · maintenance_daemon.py · research_bot.py


Architecture

52 Rust modules · ~23,500 lines · 272 Rust + 347 Python = 619 tests

Python  ──  from aura import Aura  ──▶  aura._core (PyO3)
                                              │
Rust    ──────────────────────────────────────┘
        ┌─────────────────────────────────────────────┐
        │  Aura Engine                                │
        │                                             │
        │  Two-Tier Memory                            │
        │  ├── Cognitive Tier (Working + Decisions)   │
        │  └── Core Tier (Domain + Identity)          │
        │                                             │
        │  Recall Engine (RRF Fusion, k=60)           │
        │  ├── SDR similarity (256k bit)              │
        │  ├── MinHash N-gram                         │
        │  ├── Tag Jaccard                            │
        │  └── Embedding (optional, pluggable)        │
        │                                             │
        │  Adaptive Memory                            │
        │  ├── Feedback learning (boost/weaken)       │
        │  ├── Snapshots & rollback                   │
        │  ├── Supersede (version chains)             │
        │  └── Agent-to-agent sharing protocol        │
        │                                             │
        │  Knowledge Graph · Living Memory            │
        │  Trust & Provenance · PII Guards            │
        │  Encryption (ChaCha20 + Argon2id)           │
        │  StorageBackend (Fs / Memory / WASM)        │
        │  Telemetry (Prometheus + OpenTelemetry)     │
        └─────────────────────────────────────────────┘

API Reference

See docs/API.md for the complete API reference (40+ methods).

Roadmap

See docs/ROADMAP.md for the full development roadmap.

Completed (6 phases):

  • Phase 1 — Community & Trust: benchmarks, CONTRIBUTING.md, issue templates
  • Phase 2 — Ecosystem Gaps: LlamaIndex, temporal queries, event callbacks
  • Phase 3 — Drop-in Adoption: LangChain Memory class, FastAPI middleware, Claude SDK
  • Phase 4 — New Markets: C FFI + Go/C# examples, WASM storage abstraction
  • Phase 5 — Enterprise: Prometheus + OpenTelemetry, multimodal stubs, stress tests (100K/1M)
  • Phase 6 — Competitive Moat: adaptive recall, snapshots, agent sharing, semantic versioning

Remaining:

  • TypeScript/WASM build via wasm-pack + NPM package (storage abstraction done)
  • Cloudflare Workers edge runtime (depends on WASM)
  • Java FFI example, PyPI publish, benchmark CI


Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines, or check the open issues.

If Aura saves you time, a GitHub star helps others discover it and helps us continue development.


License & Intellectual Property

  • Code License: MIT — see LICENSE.
  • Patent Notice: The core cognitive architecture (DNA Layering, Cognitive Crystallization, SDR Indexing, Synaptic Synthesis) is Patent Pending (US Provisional Application No. 63/969,703). See PATENT for details. Commercial integration of these architectural concepts into enterprise products requires a commercial license. The open-source SDK is freely available under MIT for non-commercial, academic, and standard agent integrations.

Built in Kyiv, Ukraine 🇺🇦 — including during power outages.
Solo developer project. If you find this useful, your star means more than you think.
