🧠 Cuba-Memorys
Persistent memory for AI agents — A Model Context Protocol (MCP) server that gives AI coding assistants long-term memory with a knowledge graph, Hebbian learning, GraphRAG enrichment, and anti-hallucination grounding.
12 tools with Cuban soul. Zero manual setup. Mathematically rigorous.
v1.6.0 — KG-neighbor query expansion, embedding LRU cache, community summaries, batch access tracking, pool race fix, session-aware search.
Why Cuba-Memorys?
AI agents forget everything between conversations. Cuba-Memorys solves this by giving them:
- A knowledge graph — Entities, observations, and relations that persist across sessions
- Error memory — Never repeat the same mistake twice (anti-repetition guard)
- Hebbian learning — Memories strengthen with use and fade adaptively (FSRS v4 spaced repetition)
- Anti-hallucination — Verify claims against stored knowledge with graduated confidence + source triangulation
- Semantic search — 4-signal RRF fusion (TF-IDF + pg_trgm + full-text + optional pgvector HNSW)
- KG-neighbor query expansion — Auto-expands low-recall searches via graph neighbors (PRF + Zep/Graphiti)
- GraphRAG — Top results enriched with degree-1 graph neighbors for topological context
- REM Sleep — Autonomous background consolidation (FSRS decay + prune + PageRank) after 15min idle
- Graph intelligence — Personalized PageRank, Louvain community detection with summaries, betweenness centrality
- Embedding cache — LRU cache eliminates redundant ONNX inference (~84% reduction)
- Session-aware search — Results matching active session goals boosted by 15%
| Feature | Cuba-Memorys | Basic Memory MCPs |
|---|---|---|
| Knowledge graph with relations | ✅ | ❌ |
| Hebbian learning (Oja's rule) | ✅ | ❌ |
| FSRS v4 adaptive spaced repetition | ✅ | ❌ |
| 4-signal RRF fusion search | ✅ | ❌ |
| KG-neighbor query expansion (V9) | ✅ | ❌ |
| Embedding LRU cache (V10) | ✅ | ❌ |
| GraphRAG topological enrichment | ✅ | ❌ |
| REM Sleep autonomous consolidation | ✅ | ❌ |
| Community summaries (V12) | ✅ | ❌ |
| Source triangulation (V3) | ✅ | ❌ |
| Session-aware search (V13) | ✅ | ❌ |
| Batch access tracking (V14) | ✅ | ❌ |
| Token-budget truncation (V1) | ✅ | ❌ |
| Post-fusion deduplication (V2) | ✅ | ❌ |
| Adaptive confidence weights (V4) | ✅ | ❌ |
| Information density gating (V5) | ✅ | ❌ |
| Session-aware FSRS decay (V6) | ✅ | ❌ |
| Conditional pgvector + HNSW | ✅ | ❌ |
| Modular architecture (CC avg A) | ✅ | ❌ |
| Optional BGE embeddings (ONNX) | ✅ | ❌ |
| Contradiction detection | ✅ | ❌ |
| Graduated confidence scoring | ✅ | ❌ |
| Personalized PageRank | ✅ | ❌ |
| Louvain community detection + summaries | ✅ | ❌ |
| Betweenness centrality (bridges) | ✅ | ❌ |
| Shannon entropy (knowledge diversity) | ✅ | ❌ |
| Chi-squared concept drift detection | ✅ | ❌ |
| Error pattern detection + MTTR | ✅ | ❌ |
| Entity duplicate detection (SQL similarity) | ✅ | ❌ |
| Observation versioning (audit trail) | ✅ | ❌ |
| Temporal validity (valid_from/valid_until) | ✅ | ❌ |
| Write-time dedup gate | ✅ | ❌ |
| Auto-supersede contradictions | ✅ | ❌ |
| Full JSON export/backup (bounded) | ✅ | ❌ |
| Fuzzy search (typo-tolerant) | ✅ | ❌ |
| Spreading activation | ✅ | ❌ |
| Batch observations (10x fewer calls) | ✅ | ❌ |
| Entity type validation | ✅ | ❌ |
| Graceful shutdown (SIGTERM/SIGINT) | ✅ | ❌ |
| Auto-provisions its own DB | ✅ | ❌ |
Quick Start
1. Prerequisites
- Python 3.11+
- Docker (for PostgreSQL)
2. Install
git clone https://github.com/lENADRO1910/cuba-memorys.git
cd cuba-memorys
docker compose up -d
pip install -e .
# Optional: BGE embeddings for semantic search (~130MB model)
pip install -e ".[embeddings]"
3. Configure your AI editor
Add to your MCP configuration:
{
"mcpServers": {
"cuba-memorys": {
"command": "/path/to/cuba_memorys_launcher.sh",
"disabled": false
}
}
}
Or run directly:
DATABASE_URL="postgresql://cuba:memorys2026@127.0.0.1:5488/brain" python -m cuba_memorys
The server auto-creates the brain database and all tables on first run.
🇨🇺 Las 12 Herramientas
Every tool is named after Cuban culture — memorable, professional, and meaningful.
Knowledge Graph
| Tool | Meaning | Description |
|---|---|---|
cuba_alma | Alma — soul, essence | CRUD knowledge entities. Types: concept, project, technology, person, pattern, config. Triggers spreading activation on neighbors. |
cuba_cronica | Crónica — chronicle | Attach observations to entities with contradiction detection, dedup gate, and information density gating (V5). Supports batch_add with density parity. Types: fact, decision, lesson, preference, context, tool_usage. |
cuba_puente | Puente — bridge | Connect entities with typed relations (uses, causes, implements, depends_on, related_to). Traverse walks the graph. Infer discovers transitive connections (A→B→C). |
Search & Verification
| Tool | Meaning | Description |
|---|---|---|
cuba_faro | Faro — lighthouse | Search with 4-signal RRF fusion + KG-neighbor expansion (V9) when recall is low. verify mode with source triangulation (V3) and adaptive confidence (V4). All results get batch access tracking (V14) for FSRS accuracy. Token-budget truncation (V1) + post-fusion dedup (V2). Session-aware: boosts results matching active goals (V13). |
Error Memory
| Tool | Meaning | Description |
|---|---|---|
cuba_alarma | Alarma — alarm | Report errors immediately. Auto-detects patterns (≥3 similar = warning). Hebbian boosting for retrieval. |
cuba_remedio | Remedio — remedy | Mark an error as resolved. Cross-references similar unresolved errors. |
cuba_expediente | Expediente — case file | Search past errors/solutions. Anti-repetition guard: warns if a similar approach previously failed. |
Sessions & Decisions
| Tool | Meaning | Description |
|---|---|---|
cuba_jornada | Jornada — workday | Track working sessions with goals and outcomes. Goals used for session-aware decay (V6) and search boost (V13). |
cuba_decreto | Decreto — decree | Record architecture decisions with context, alternatives, and rationale. |
Memory Maintenance
| Tool | Meaning | Description |
|---|---|---|
cuba_zafra | Zafra — sugar harvest 🎯 | Memory consolidation: decay (FSRS v4 adaptive, session-aware V6), prune, merge, summarize, pagerank (personalized), find_duplicates, export (bounded JSON backup), stats, backfill. |
cuba_eco | Eco — echo | RLHF feedback: positive (Oja's rule boost), negative (decrease), correct (update with versioning). |
cuba_vigia | Vigía — watchman | Graph analytics: summary (counts + token estimate), health (staleness, Shannon entropy, MTTR, DB size), drift (chi-squared), communities (Louvain + V12 summaries for ≥3 members), bridges (betweenness centrality). |
Mathematical Foundations
Cuba-Memorys is built on peer-reviewed algorithms, not ad-hoc heuristics:
FSRS v4 Adaptive Decay — Wozniak (1987) / Ye (2023)
R(t, S) = (1 + t/(9·S))^(-1)
S_new = S · (1 + e^0.1 · (11 - D) · S^(-0.2) · (e^((1-R)·0.9) - 1))
FSRS v4 provides adaptive memory decay. Stability grows with successful recalls — memories that are reinforced survive longer. V6: Active session goals exempt observations from decay.
Oja's Rule (1982) — Hebbian Learning
Positive: Δw = η · (1 - w²) → converges to 1.0, cannot explode
Negative: Δw = η · (1 + w²) → converges to 0.01 (floor)
Where η = 0.05. The w² term provides natural saturation — self-normalizing without explicit clipping.
TF-IDF + RRF Fusion — Salton (1975) / Cormack (2009)
tfidf(t, d) = tf(t, d) · log(N / df(t))
RRF(d) = Σ 1/(k + rank_i(d)) where k = 60
Reciprocal Rank Fusion combines multiple ranked lists from independent signals into a single robust ranking. V2: Post-fusion deduplication removes semantic duplicates via word-overlap ratio.
KG-Neighbor Query Expansion (V9) — Zep/Graphiti (2025)
R₁ = φ(q) # Initial search
q' = q ∪ {e.name : e ∈ neighbors(R₁, depth=1)} # Expand via KG
R₂ = φ_trgm(q') # Re-search
Final = RRF(R₁, R₂) # Fuse
Inspired by Zep/Graphiti BFS search (arXiv 2501.13956 §3.1). Only triggered when initial results < limit (low recall). Caps neighbor names to 5 to prevent query bloat. Literature shows +15-25% recall improvement.
Source Triangulation (V3) — Multi-Source Confidence
diversity = unique_entities / total_evidence
confidence_adjusted = confidence × (0.7 + 0.3 × diversity)
Penalizes single-source evidence to reduce confirmation bias. Multiple independent entities providing evidence increases confidence.
Shannon Entropy — Information Density Gating (V5)
H = -Σ (c/n) · log₂(c/n)
density = H / H_max ∈ [0, 1]
Observations with density < 0.3 get reduced initial importance (0.3 instead of 0.5). Prevents low-information noise from polluting the knowledge base. Applied in both add and batch_add with parity.
Optional BGE Embeddings — BAAI (2023)
model: Qdrant/bge-small-en-v1.5-onnx-Q (quantized, ~130MB)
runtime: ONNX (no PyTorch dependency)
similarity: cosine(embed(query), embed(observation))
Auto-downloads on first use. Falls back to TF-IDF if not installed. V10: LRU cache (256 entries) eliminates redundant ONNX calls (~84% reduction per session).
GraphRAG Enrichment
Top-3 search results are enriched with degree-1 graph neighbors via a single batched SQL query. Each result gets a graph_context array containing neighbor name, entity type, relation type, and Hebbian strength.
REM Sleep Daemon
After 15 minutes of user inactivity, an autonomous consolidation coroutine runs:
- FSRS Decay — Applies memory decay using Ye (2023) v4 algorithm. V6: Excludes observations linked to active session goals
- Prune — Removes low-importance (< 0.1), rarely-accessed observations
- PageRank — Recalculates personalized importance scores
- Relation Decay — Weakens unused relation strengths
- TF-IDF Rebuild — Refreshes the TF-IDF index
- Cache Clear — Clears search cache + V10 embedding LRU cache
Cancels immediately on new user activity. Prevents concurrent runs.
Conditional pgvector
IF pgvector extension detected:
→ Migrate embedding column: float4[] → vector(384)
→ Create HNSW index (m=16, ef_construction=64, vector_cosine_ops)
→ Add vector cosine distance as 4th RRF signal
→ Persist embeddings on observation insert (async, V7/V11)
ELSE:
→ Graceful degradation: TF-IDF + trigrams (unchanged)
Zero-downtime: auto-detects at startup, no configuration needed.
Personalized PageRank — Brin & Page (1998)
PR(v) = (1-α)/N + α · Σ PR(u)/deg(u) where α = 0.85
personalization: biased toward recently active entities
final_importance = 0.6·PR + 0.4·current_importance
Community Detection + Summaries (V12)
communities = Louvain(G, resolution=1.0) — Blondel et al. (2008)
For communities ≥3 entities, V12 generates structured summaries from top-3 observations per entity (by importance). Inspired by Zep/Graphiti hierarchical KG (arXiv 2501.13956 §2.3).
Other Algorithms
- Shannon Entropy — Knowledge diversity scoring (
diversity = H/H_max ∈ [0,1]) - Spreading Activation — Collins & Loftus (1975): neighbors get 0.6% importance boost per access
- Chi-Squared Drift — Pearson (1900): detects error type distribution changes
- Contradiction Detection — TF-IDF overlap (>0.7) + negation patterns (EN + ES)
Architecture
v1.6.0 uses a modular architecture — decomposed into focused modules with cyclomatic complexity avg grade A.
cuba-memorys/
├── docker-compose.yml # Dedicated PostgreSQL (port 5488)
├── pyproject.toml # Package metadata + optional deps
├── README.md
└── src/cuba_memorys/
├── __init__.py # Version (1.6.0)
├── __main__.py # Entry point
├── server.py # Thin re-export (24 LOC)
├── protocol.py # JSON-RPC transport, event loop, REM Sleep daemon
├── handlers.py # 12 MCP tool handlers (CC-reduced via sub-function extraction)
├── constants.py # Tool definitions, thresholds, enums
├── db.py # asyncpg pool + orjson + pgvector detection + V15 race fix
├── schema.sql # 5 tables, 15+ indexes, pg_trgm, versioning, V8 temporal index
├── hebbian.py # FSRS v4, Oja's rule, spreading activation, info density
├── search.py # LRU cache, RRF fusion + V2 dedup, NEIGHBORS_SQL
├── tfidf.py # TF-IDF semantic search (scikit-learn)
└── embeddings.py # Optional BGE embeddings (ONNX) + V10 LRU cache
Database Schema
| Table | Purpose | Key Features |
|---|---|---|
brain_entities | Knowledge graph nodes | tsvector + pg_trgm indexes, importance ∈ [0,1], FSRS stability/difficulty |
brain_observations | Facts attached to entities | 9 types, provenance, versioning, temporal validity, vector(384) embedding (if pgvector) |
brain_relations | Graph edges | 5 types, bidirectional delete, Hebbian strength |
brain_errors | Error memory | JSONB context, synapse weight, MTTR tracking |
brain_sessions | Working sessions | Goals (JSONB), outcome tracking |
Search Pipeline
Cuba-Memorys uses Reciprocal Rank Fusion (RRF, k=60) to combine up to 4 independent ranked signals:
| # | Signal | Source | Condition |
|---|---|---|---|
| 1 | Entities (ts_rank + trigrams + importance + freshness) | brain_entities | Always |
| 2 | Observations (ts_rank + trigrams + TF-IDF + importance) | brain_observations | Always |
| 3 | Errors (ts_rank + trigrams + synapse_weight) | brain_errors | Always |
| 4 | Vector cosine distance (HNSW) | brain_observations.embedding | pgvector installed |
Each signal produces an independent ranking. RRF fuses them: score(d) = Σ 1/(60 + rank_i(d)).
Post-fusion pipeline:
- V2 Deduplication — Removes semantic duplicates via word-overlap ratio
- V9 KG-neighbor expansion — When results < limit, expands query via graph neighbors
- V13 Session boost — Active session goal keywords boost matching results by 15%
- GraphRAG enrichment — Top-3 results get degree-1 neighbor context
- V1 Token-budget truncation — Character budget allocated proportionally to RRF score
- V14 Batch access tracking — All returned observations get
access_countbump for FSRS accuracy
Dependencies
Core:
asyncpg— PostgreSQL async driverorjson— Fast JSON serialization (handles UUID/datetime)scikit-learn— TF-IDF vectorizationnetworkx— PageRank + Louvain + betweenness centralityscipy— Chi-squared statistical testsrapidfuzz— Entity duplicate detectionnumpy— Numerical operations
Optional (pip install -e ".[embeddings]"):
onnxruntime— ONNX model inferencehuggingface-hub— Auto-download BGE modeltokenizers— Fast tokenization
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
DATABASE_URL | — | PostgreSQL connection string (required) |
Docker Compose
Runs a dedicated PostgreSQL 18 Alpine instance:
- Port: 5488 (avoids conflicts with 5432/5433)
- Resources: 256MB RAM, 0.5 CPU
- Restart: always (auto-starts on boot)
- Healthcheck:
pg_isreadyevery 10s
How It Works in Practice
1. The agent learns from your project
Agent: I learned that FastAPI endpoints must use async def with response_model.
→ cuba_alma(create, "FastAPI", technology)
→ cuba_cronica(add, "FastAPI", "All endpoints must be async def with response_model")
2. Error memory prevents repeated mistakes
Agent: I got IntegrityError: duplicate key on numero_parte.
→ cuba_alarma("IntegrityError", "duplicate key on numero_parte")
→ Similar error found! Solution: "Add SELECT EXISTS before INSERT with FOR UPDATE"
3. Anti-hallucination grounding
Agent: Let me verify this claim before responding...
→ cuba_faro("FastAPI uses Django ORM", mode="verify")
→ confidence: 0.0, level: "unknown"
→ recommendation: "No supporting evidence found. High hallucination risk."
4. Memories adapt with FSRS v4
Initial stability: S = 1.0 (decays in ~9 days)
After 5 reviews: S = 8.2 (decays in ~74 days)
After 20 reviews: S = 45.0 (survives ~13 months)
5. KG-neighbor query expansion (V9)
Query: "deployment strategy" → 2 results (below limit of 5)
→ KG neighbors: [Docker, CI/CD, NGINX]
→ Expanded: "deployment strategy Docker CI/CD NGINX"
→ 5 results (recall improved by graph context)
6. Community summaries (V12)
→ cuba_vigia(metric="communities")
→ Community 0 (4 members): [FastAPI, Pydantic, SQLAlchemy, PostgreSQL]
Summary: "FastAPI: All endpoints async def; Pydantic: V2 strict models; ..."
→ Community 1 (3 members): [React, Next.js, TypeScript]
Summary: "React: 19 hooks patterns; Next.js: App Router; ..."
Verification
Tested with NEMESIS protocol (3-tier) — v1.6.0 results:
🟢 Normal (12/12) — All 12 tools, all CRUD actions, GraphRAG enrichment,
Hebbian cross-reference, RRF fusion, Oja's rule,
dedup gate, anti-repetition guard, batch_add,
V14 access_count bump confirmed, V12 community summaries,
V9 KG expansion, V10 embedding cache transparent
🟡 Pessimist (4/4) — Empty queries, non-existent entities, duplicate observations
(dedup gate: similarity 0.85), ungrounded claims
(confidence: 0.0, level: "unknown", hallucination warning)
🔴 Extreme (3/3) — SQL injection (parametrized queries — stored as text, not executed),
XSS (HTML entities escaped), repetitive content AAA×430
(density gating: importance reduced to 0.3)
Previous versions: v1.3.0 (12/12, 8/8, 9/9), v1.1.0 (13/13), v1.0.1 (18/18).
Performance
| Operation | Avg latency |
|---|---|
| RRF hybrid search | < 5ms |
| Analytics | < 2.5ms |
| Entity CRUD | < 1ms |
| PageRank (100 entities) | < 50ms |
| GraphRAG enrichment | < 2ms |
Version History
| Version | Key Changes |
|---|---|
| 1.6.0 | V9 KG-neighbor query expansion, V10 embedding LRU cache, V11 async rebuild_embeddings, V12 community summaries, V14 batch access tracking, V15 pool race fix |
| 1.5.0 | V1 token-budget truncation, V2 post-fusion dedup, V3 source triangulation, V4 adaptive confidence, V5 batch_add density parity, V6 session-aware decay, V7 async embed, V8 temporal index |
| 1.3.0 | Modular architecture (CC avg D→A), 87% CC reduction |
| 1.1.0 | GraphRAG, REM Sleep, conditional pgvector, 4-signal RRF |
| 1.0.0 | Initial release: 12 tools, Hebbian learning, FSRS |
License
CC BY-NC 4.0 — Free to use and modify, not for commercial use.
Author
Leandro Pérez G.
- GitHub: @lENADRO1910
- Email: leandropatodo@gmail.com
Credits
Mathematical foundations: Wozniak (1987), Ye (2023, FSRS v4), Oja (1982), Salton (1975, TF-IDF), Cormack (2009, RRF), Brin & Page (1998, PageRank), Collins & Loftus (1975), Shannon (1948), Pearson (1900, χ²), Blondel et al. (2008, Louvain), McCabe (1976, CC), BAAI (2023, BGE), Malkov & Yashunin (2018, HNSW), Karpicke & Roediger (2008, testing effect), Zep/Graphiti (2025, arXiv 2501.13956).