Cognio
Persistent semantic memory server for AI assistants via Model Context Protocol (MCP)
Cognio is a Model Context Protocol (MCP) server that provides persistent semantic memory for AI assistants. Unlike ephemeral chat history, Cognio stores context permanently and enables semantic search across conversations.
Built for:
- Personal knowledge base that grows over time
- Multi-project context management
- Research notes and learning journal
- Conversation history with semantic retrieval
Features
- Semantic Search: Find memories by meaning using sentence-transformers
- Multilingual Support: Search in 100+ languages seamlessly
- Persistent Storage: SQLite-based storage that survives across sessions
- Project Organization: Organize memories by project and tags
- Auto-Tagging: Automatic tag generation via LLM (GPT-4, Groq, etc.)
- Text Summarization: Extractive and abstractive summarization for long texts
- MCP Integration: One-click setup for VS Code, Claude, Cursor, and more
- RESTful API: Standard HTTP API with OpenAPI documentation
- Export Capabilities: Export to JSON or Markdown format
- Docker Support: Simple deployment with docker-compose
Quick Start
1. Start the Server
git clone https://github.com/0xReLogic/Cognio.git
cd Cognio
docker-compose up -d
Server runs at http://localhost:8080
2. Auto-Configure AI Clients
The MCP server automatically configures supported AI clients on first start:
Supported Clients:
- Claude Desktop
- Claude Code (CLI)
- VS Code (GitHub Copilot)
- Cursor
- Continue.dev
- Cline
- Windsurf
- Kiro
- Gemini CLI
Quick Setup:
Run the auto-setup script to configure all clients at once:
cd mcp-server
npm run setup
This generates MCP configs for all 9 supported clients automatically.
Manual Configuration:
See mcp-server/README.md for client-specific MCP configuration examples.
On first run, Cognio auto-generates cognio.md in your workspace with usage guide for AI tools.
3. Test It
# Save a memory
curl -X POST http://localhost:8080/memory/save \
-H "Content-Type: application/json" \
-d '{"text": "Docker allows running apps in containers", "project": "LEARNING"}'
# Search memories
curl "http://localhost:8080/memory/search?q=containers"
Or use naturally in your AI client:
"Search my memories for Docker information"
"Remember this: FastAPI is a modern Python web framework"
4. Web UI Dashboard
Access the interactive memory dashboard:
http://localhost:8080/ui
Features:
- Browse and search all memories
- Add/edit memories with markdown preview
- View statistics and insights
- Organize by project and tags
- Bulk operations (select, delete)
- Dark/light theme toggle
- Works locally and in Docker
The dashboard auto-detects the API server, so it works on localhost, Docker containers, and remote deployments.
Documentation
- API Reference - Complete endpoint documentation
- Examples - Usage patterns and integrations
- Quickstart - Installation and configuration
MCP Tools
When using the MCP server, you have access to 11 specialized tools:
| Tool | Description |
|---|---|
| save_memory | Save text with optional project/tags (auto-tagging enabled) |
| search_memory | Semantic search with project filtering |
| list_memories | List memories with pagination and filters |
| get_memory_stats | Get storage statistics and insights |
| archive_memory | Soft delete a memory (recoverable) |
| delete_memory | Permanently delete a memory by ID |
| export_memories | Export memories to JSON or Markdown |
| summarize_text | Summarize long text (extractive or LLM-based) |
| set_active_project | Set active project context (auto-applies to all operations) |
| get_active_project | View currently active project |
| list_projects | List all available projects from database |
Active Project Workflow:
1. list_projects() → See: Helios-LoadBalancer (45), Cognio-Memory (23), ...
2. set_active_project("Helios-LoadBalancer")
3. save_memory("Cache TTL is 300s") → Auto-saves to Helios-LoadBalancer
4. search_memory("cache settings") → Auto-searches in Helios-LoadBalancer only
5. list_memories() → Lists only Helios-LoadBalancer memories
Project Isolation:
Always specify a project name OR use set_active_project to keep memories organized and prevent mixing contexts between different workspaces.
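The workflow above can be pictured as a thin context layer on the client side. The sketch below is a toy in-memory model — not Cognio's actual implementation — showing how an active project becomes the default scope for save and list operations:

```python
class MemoryContext:
    """Toy model of the active-project workflow: once a project is
    set, it is applied to every save/list call that does not pass an
    explicit project of its own."""

    def __init__(self):
        self.active_project = None
        self._store = []  # list of (project, text) pairs

    def set_active_project(self, name):
        self.active_project = name

    def save_memory(self, text, project=None):
        # An explicit project wins; otherwise fall back to the active one.
        self._store.append((project or self.active_project, text))

    def list_memories(self, project=None):
        scope = project or self.active_project
        return [text for proj, text in self._store
                if scope is None or proj == scope]

ctx = MemoryContext()
ctx.set_active_project("Helios-LoadBalancer")
ctx.save_memory("Cache TTL is 300s")               # auto-scoped
ctx.save_memory("Pickle cache path", project="Cognio-Memory")
print(ctx.list_memories())  # → ['Cache TTL is 300s']
```

The same precedence rule (explicit argument over active context) is what keeps memories from leaking between workspaces.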
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /memory/save | Save new memory |
| GET | /memory/search | Semantic/Hybrid search |
| GET | /memory/list | List memories with filters |
| DELETE | /memory/{id} | Delete memory by ID |
| POST | /memory/bulk-delete | Bulk delete by project |
| GET | /memory/stats | Get statistics |
| GET | /memory/export | Export memories |
| POST | /memory/summarize | Summarize long text |
Interactive docs: http://localhost:8080/docs
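As a concrete example of calling the list endpoint, the helper below builds a `/memory/list` URL with pagination and a project filter. The query parameter names (`limit`, `offset`, `project`) are assumptions based on the table above — verify them in the interactive docs before relying on them.

```python
import urllib.parse

def build_list_url(base_url="http://localhost:8080",
                   project=None, limit=20, offset=0):
    """Build a GET /memory/list URL.

    NOTE: limit/offset/project parameter names are assumed, not
    taken from the API reference — check /docs for the real schema.
    """
    params = {"limit": limit, "offset": offset}
    if project:
        params["project"] = project
    return f"{base_url}/memory/list?{urllib.parse.urlencode(params)}"
```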
Configuration
Environment variables (see .env.example):
Copy the example and edit your local overrides:
cp .env.example .env
# Database
DB_PATH=./data/memory.db
# Embeddings
EMBED_MODEL=all-MiniLM-L6-v2
EMBED_DEVICE=cpu
EMBEDDING_CACHE_PATH=./data/embedding_cache.pkl
# API
API_HOST=0.0.0.0
API_PORT=8080
# Optional API key for auth
API_KEY=your-secret-key
# Search
DEFAULT_SEARCH_LIMIT=5
SIMILARITY_THRESHOLD=0.4
HYBRID_ENABLED=true
HYBRID_MODE=rerank # candidate | rerank
HYBRID_ALPHA=0.6 # 0..1, higher = more semantic
HYBRID_RERANK_TOPK=100 # rerank candidate pool size
# Summarization
SUMMARIZATION_ENABLED=true
SUMMARIZATION_METHOD=abstractive # extractive | abstractive
SUMMARIZATION_EMBED_MODEL=all-MiniLM-L6-v2
# Auto-tagging (Optional)
AUTOTAG_ENABLED=true
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-key
GROQ_MODEL=openai/gpt-oss-120b
# OPENAI_API_KEY=your-openai-api-key
# OPENAI_MODEL=gpt-4o-mini
# Performance
MAX_TEXT_LENGTH=10000
BATCH_SIZE=32
SUMMARIZE_THRESHOLD=50
# Logging
LOG_LEVEL=info
Auto-Tagging Models:
- openai/gpt-oss-120b - High quality
- gpt-4o-mini - OpenAI, fast and cheap
- llama-3.3-70b-versatile - Groq, balanced
- llama-3.1-8b-instant - Groq, fastest
See .env.example for all available options and recommendations.
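To make HYBRID_ALPHA and HYBRID_RERANK_TOPK concrete, here is a hypothetical sketch of rerank-style hybrid scoring: take a keyword-ranked candidate pool of size topk, then reorder it by a linear blend of semantic and keyword scores. This illustrates what the knobs control, not Cognio's exact internals — the real scoring pipeline may normalize or combine scores differently.

```python
def hybrid_score(semantic, keyword, alpha=0.6):
    """Linear blend: higher alpha weights the semantic
    (embedding) score more, per the HYBRID_ALPHA comment."""
    return alpha * semantic + (1 - alpha) * keyword

def rerank(candidates, alpha=0.6, topk=100):
    """Assumed rerank mode: keyword search selects the candidate
    pool, the blended score decides the final order."""
    pool = sorted(candidates, key=lambda c: c["keyword"], reverse=True)[:topk]
    return sorted(pool,
                  key=lambda c: hybrid_score(c["semantic"], c["keyword"], alpha),
                  reverse=True)

docs = [
    {"id": 1, "semantic": 0.9, "keyword": 0.20},  # strong meaning match
    {"id": 2, "semantic": 0.3, "keyword": 0.95},  # strong exact-term match
]
print([d["id"] for d in rerank(docs, alpha=0.6)])  # → [1, 2]
print([d["id"] for d in rerank(docs, alpha=0.0)])  # → [2, 1]
```

With alpha=0.6 the semantic match wins; dropping alpha toward 0 flips the ranking to favor exact keyword hits.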
Project Structure
cognio/
├── src/ # Core application
│ ├── main.py # FastAPI app
│ ├── config.py # Environment config
│ ├── models.py # Data schemas
│ ├── database.py # SQLite operations
│ ├── embeddings.py # Semantic search
│ ├── memory.py # Memory CRUD
│ ├── autotag.py # Auto-tagging
│ └── utils.py # Helpers
│
├── mcp-server/ # MCP integration
│ ├── index.js # MCP server
│ └── package.json # Dependencies
│
├── scripts/ # Utilities
│ ├── setup-clients.js # Auto-config AI clients
│ ├── backup.sh # Database backup
│ └── migrate.py # Schema migrations
│
├── tests/ # Test suite
├── docs/ # Documentation
└── examples/ # Usage examples
Development
# Install dependencies
poetry install
# Run tests
pytest
# Start development server
uvicorn src.main:app --reload
Tech Stack
- Backend: Python 3.11+, FastAPI, Uvicorn
- Database: SQLite with JSON support
- Embeddings: sentence-transformers (paraphrase-multilingual-mpnet-base-v2, 768-dim)
- MCP Server: Node.js, @modelcontextprotocol/sdk
- Auto-Tagging: LLM APIs (Groq, OpenAI)
- Testing: pytest, pytest-asyncio, pytest-cov
- Deployment: Docker, docker-compose
Performance
| Operation | Time | Notes |
|---|---|---|
| Save memory | ~20ms | Including embedding |
| Search (1k memories) | ~15ms | Semantic similarity |
| Search (10k memories) | ~50ms | Still fast |
| Model load | ~3s | One-time on startup |
License
MIT License - see LICENSE
Links
- Documentation: docs/
- Issues: GitHub Issues
- Releases: GitHub Releases
Built for better AI conversations