


Entroly

The Context Engineering Engine for AI Coding Agents
Your AI sees 5% of your codebase. Entroly shows it everything — 78% fewer tokens.

pip install entroly  —  Works with Cursor, Claude Code, Copilot, Windsurf, OpenClaw

Install  •  Demo  •  Integrations  •  OpenClaw  •  CLI  •  Discuss



See It In Action

Entroly Demo — 78% token savings in 8ms

Run it yourself: pip install entroly && entroly demo

Open the interactive HTML demo for the full animated experience, or generate your own with python docs/generate_demo.py.


The Value

The Entroly Difference — before and after comparison

Every AI coding tool — Cursor, Copilot, Claude Code, Cody — stuffs tokens into the context window until it's full, then truncates. Your AI sees 5-10 files; the rest of your codebase is invisible.

Entroly fixes this. It compresses your entire codebase into the context window at variable resolution, removes duplicates and boilerplate, and learns which context produces better AI responses over time.

You install it once. It runs invisibly. Your AI gives better answers and you spend less on tokens.

| Benefit | Details |
| --- | --- |
| 78% fewer tokens per request | Duplicate code, boilerplate, and low-information content are stripped automatically |
| 100% codebase visibility | Every file is represented — critical files in full, supporting files as signatures, peripheral files as references |
| AI responses improve over time | Reinforcement learning adjusts context selection weights from session outcomes — no manual tuning |
| Built-in security scanning | 55 SAST rules catch hardcoded secrets, SQL injection, command injection, and 5 more CWE categories |
| Codebase health grades | Clone detection, dead symbol finder, god file detection — get an A-F health grade for your project |
| < 10ms overhead | The Rust engine adds under 10ms per request. You won't notice it |
| Works with any AI tool | MCP server for Cursor/Claude Code, or transparent HTTP proxy for anything else |
| Runs on Linux, macOS, and Windows | Native support. No WSL required on Windows. Docker optional on all platforms |

Context Engineering, Automated

"The LLM is the CPU, the context window is RAM."

Today, every AI coding tool fills that RAM manually — you craft system prompts, configure RAG, curate docs. Entroly automates the entire process.

| Layer | What it solves |
| --- | --- |
| Documentation tools | Give your agent up-to-date API docs |
| Memory systems | Remember things across conversations |
| RAG / retrieval | Find relevant code chunks |
| Entroly (optimization) | Makes everything fit — optimally compresses your entire codebase + docs + memory into the token budget |

These layers are complementary. Doc tools give you better docs. Memory gives you persistence. RAG retrieves relevant chunks. Entroly is the optimization layer that makes sure all of it actually fits in your context window without wasting tokens.

Entroly works standalone, or on top of any doc tool, memory system, RAG pipeline, or context source.


Install

pip install entroly

That's it. One command. Works on Linux, macOS, and Windows.

Windows users: If pip is not on your PATH, use python -m pip install entroly.

Connect to your AI tool

Cursor — run entroly init in your project. It generates .cursor/mcp.json automatically.

Claude Code — run claude mcp add entroly -- entroly.

VS Code / Windsurf — run entroly init. Auto-detected.

Any other AI tool — use proxy mode:

pip install entroly[proxy]
entroly proxy --quality balanced

Then point your AI tool's API base URL to http://localhost:9377/v1. Done.
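For any OpenAI-compatible client, routing through the proxy is just a base-URL swap. A minimal standard-library sketch (the model name and prompt are illustrative; the request body is the standard chat-completions shape, not anything Entroly-specific):

```python
import json
import urllib.request

# Entroly's proxy speaks the OpenAI-compatible API, so only the
# base URL changes -- the request body stays exactly the same.
PROXY_BASE = "http://localhost:9377/v1"

payload = {
    "model": "gpt-4o",  # illustrative; use whatever model your provider serves
    "messages": [{"role": "user", "content": "Explain the auth flow in this repo"}],
}

req = urllib.request.Request(
    f"{PROXY_BASE}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment with the proxy running
```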

Verify it's working

entroly status     # check if the server/proxy is running
entroly demo       # see a before/after comparison of token savings on your project
entroly dashboard  # open the live metrics dashboard at localhost:9378

Install options

pip install entroly           # Core — MCP server + Python fallback engine
pip install entroly[proxy]    # Add proxy mode (transparent HTTP interception)
pip install entroly[native]   # Add native Rust engine (50-100x faster)
pip install entroly[full]     # Everything

Docker

docker pull ghcr.io/juyterman1000/entroly:latest
docker run --rm -p 9377:9377 -p 9378:9378 -v .:/workspace:ro ghcr.io/juyterman1000/entroly:latest

Multi-arch: linux/amd64 and linux/arm64 (Apple Silicon, AWS Graviton).

Or with Docker Compose: docker compose up -d


Works With Everything

| AI Tool | Setup | Method |
| --- | --- | --- |
| Cursor | `entroly init` | MCP server |
| Claude Code | `claude mcp add entroly -- entroly` | MCP server |
| VS Code + Copilot | `entroly init` | MCP server |
| Windsurf | `entroly init` | MCP server |
| Cline | `entroly init` | MCP server |
| OpenClaw | See below | Context Engine |
| Cody | `entroly proxy` | HTTP proxy |
| Any LLM API | `entroly proxy` | HTTP proxy |

OpenClaw Integration

OpenClaw users get the deepest integration. Entroly plugs in as a Context Engine that optimizes every agent type automatically:

| Agent Type | What Entroly Does | Token Savings |
| --- | --- | --- |
| Main agent | Full codebase visibility at variable resolution | ~78% |
| Heartbeat | Only loads what changed since last check | ~90% |
| Subagents | Parent context inherited + budget-split via Nash bargaining | ~70% per agent |
| Cron jobs | Minimal context — just relevant memories + schedule | ~85% |
| Group chat | Entropy-based message filtering — only high-signal kept | ~60% |
| ACP sessions | Cross-agent context sharing without duplication | ~75% |

When OpenClaw spawns multiple agents, Entroly's multi-agent budget allocator splits your token budget optimally across all of them. No agent starves. No tokens wasted.

from entroly.context_bridge import MultiAgentContext

ctx = MultiAgentContext(workspace_path="~/.openclaw/workspace")
ctx.ingest_workspace()

# Spawn subagents with automatic budget splitting
sub = ctx.spawn_subagent("main", "researcher", "find auth bugs")

# Schedule background checks
ctx.schedule_cron("email_checker", "check inbox", interval_seconds=900)

# Every agent gets optimized context automatically

How It Works

Entroly Pipeline — 5-stage context optimization

  1. Ingest — Indexes your codebase, builds dependency graphs, fingerprints every fragment for instant dedup
  2. Score — Ranks fragments by information density — high-value code scores high, boilerplate scores low
  3. Select — Picks the mathematically optimal subset that fits your token budget, with diversity (auth + DB + API, not 3x auth files)
  4. Deliver — 3 resolution levels: critical files in full, supporting files as signatures, peripheral files as one-line references
  5. Learn — Tracks which context produced good AI responses, improves selection weights over time
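Steps 2-3 can be sketched as a toy density-greedy selector (the real engine uses KKT dual bisection with submodular diversity; the fragments and scores below are invented):

```python
def select_fragments(fragments, budget):
    """Greedy knapsack: take fragments by information density (score per
    token) until the token budget is exhausted. A toy stand-in for
    Entroly's KKT-optimal selector."""
    ranked = sorted(fragments, key=lambda f: f["score"] / f["tokens"], reverse=True)
    chosen, used = [], 0
    for frag in ranked:
        if used + frag["tokens"] <= budget:
            chosen.append(frag["path"])
            used += frag["tokens"]
    return chosen, used

frags = [
    {"path": "auth.py",   "tokens": 400, "score": 9.0},  # dense, critical
    {"path": "models.py", "tokens": 300, "score": 6.0},
    {"path": "gen.py",    "tokens": 900, "score": 1.0},  # boilerplate
]
print(select_fragments(frags, budget=800))  # -> (['auth.py', 'models.py'], 700)
```

The low-density boilerplate file is dropped even though it is the largest; the real selector additionally represents dropped files at lower resolution instead of omitting them.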

Why Not Just RAG?

Most AI tools use embedding-based retrieval (RAG). Entroly takes a fundamentally different approach:

| | RAG (vector search) | Entroly |
| --- | --- | --- |
| Picks context by | Cosine similarity to your query | Information-theoretic optimization |
| Codebase coverage | Top-K similar files only | 100% — every file represented at some resolution |
| Handles duplicates | Sends the same code 3x | SimHash dedup catches copies in O(1) |
| Learns from usage | No | Yes — RL updates weights from AI response quality |
| Dependency-aware | No | Yes — includes auth_config.py when you include auth.py |
| Budget optimal | Approximate (top-K) | Mathematically optimal (knapsack solver) |
| Needs embeddings API | Yes (cost + latency) | No — runs locally in <10ms |
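The dedup row can be illustrated with a toy 64-bit SimHash: near-duplicate fragments land within a small Hamming distance of each other (Entroly uses threshold 3; the whitespace tokenization here is a simplification):

```python
import hashlib

def simhash64(text):
    """Toy 64-bit SimHash over whitespace tokens: each token votes on every
    bit position via its hash; the sign of the vote sum sets the bit."""
    votes = [0] * 64
    for token in text.split():
        h = int.from_bytes(hashlib.blake2b(token.encode(), digest_size=8).digest(), "big")
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(64) if votes[bit] > 0)

def hamming(a, b):
    return bin(a ^ b).count("1")

original = "def login(user): return check_password(user)"
copy     = "def login(user): return check_password(user)  "  # whitespace-only change
other    = "class OrderQueue: pass"

assert hamming(simhash64(original), simhash64(copy)) <= 3   # flagged as duplicate
assert hamming(simhash64(original), simhash64(other)) > 3   # genuinely different
```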

Platform Support

| | Linux | macOS | Windows |
| --- | --- | --- | --- |
| Python 3.10+ | Yes | Yes | Yes |
| Pre-built Rust wheel | Yes | Yes (Intel + Apple Silicon) | Yes |
| Docker | Optional | Optional (Docker Desktop) | Optional (Docker Desktop) |
| WSL required | N/A | N/A | No |
| Admin rights required | No | No | No |

CLI Commands

| Command | What it does |
| --- | --- |
| `entroly init` | Detects your project and AI tool, generates config — one-command setup |
| `entroly proxy` | Starts the invisible proxy. Point your AI tool to localhost:9377 |
| `entroly demo` | Shows before/after token savings on your actual project |
| `entroly doctor` | Runs 7 diagnostic checks — finds problems before you do |
| `entroly dashboard` | Live metrics: tokens saved, cost reduction, health grade, security findings |
| `entroly health` | Codebase health grade (A-F): clones, dead code, god files, architecture violations |
| `entroly role` | Weight presets for your workflow: frontend, backend, sre, data, fullstack |
| `entroly autotune` | Auto-optimizes engine parameters using mutation-based search |
| `entroly digest` | Weekly summary of value delivered — tokens saved, cost reduction, improvements |
| `entroly status` | Check if server/proxy/dashboard are running |
| `entroly migrate` | Upgrades config and checkpoints when you update Entroly |
| `entroly clean` | Clear cached state and start fresh |
| `entroly benchmark` | Run competitive benchmark: Entroly vs raw context vs top-K retrieval |
| `entroly completions` | Generate shell completions for bash, zsh, or fish |

Production Ready

Entroly is built for real-world reliability, not demos.

  • Connection recovery — auto-reconnects dropped connections without restarting
  • Large file protection — 500 KB ceiling prevents out-of-memory on giant logs or vendor files
  • Binary file detection — 40+ file types (images, audio, video, archives, databases) are auto-skipped
  • Crash recovery — gzipped checkpoints restore state in under 100ms
  • Cross-platform file locking — safe to run multiple instances
  • Schema migration — entroly migrate handles config upgrades between versions
  • Fragment feedback — POST /feedback lets your AI tool rate context quality, improving future selections
  • Explainable decisions — GET /explain shows exactly why each code fragment was included or excluded
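The crash-recovery behaviour can be sketched with stdlib gzip + JSON (the actual checkpoint schema is internal to Entroly; this only shows the save/restore shape):

```python
import gzip
import json
import tempfile
from pathlib import Path

def save_checkpoint(state, path):
    # Gzipped JSON: small on disk, cheap to write, fast to restore.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(state, f)

def load_checkpoint(path):
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)

# Invented session state for illustration.
state = {"session": 42, "weights": [0.3, 0.5, 0.2], "fragments_indexed": 1287}
path = Path(tempfile.mkdtemp()) / "checkpoint.json.gz"
save_checkpoint(state, path)
assert load_checkpoint(path) == state  # round-trips losslessly
```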

Need Help?

Self-service:

entroly doctor    # runs 7 diagnostic checks automatically
entroly --help    # see all available commands

Get support:

If you run into any issue, email autobotbugfix@gmail.com with:

  1. The output of entroly doctor
  2. A screenshot of the error
  3. Your OS (Windows/macOS/Linux) and Python version

We respond within 24 hours.

Common issues:

macOS: "externally-managed-environment" error

Homebrew Python requires a virtual environment:

python3 -m venv ~/.venvs/entroly
source ~/.venvs/entroly/bin/activate
pip install entroly[full]

Windows: pip not found

python -m pip install entroly
Port 9377 already in use

entroly proxy --port 9380    # 9378 is taken by the dashboard
Rust engine not loading

Entroly falls back to the Python engine automatically. For the Rust speedup:

pip install entroly[native]

If no pre-built wheel exists for your platform, install the Rust toolchain first.


Part of the Ebbiforge Ecosystem

Entroly integrates with hippocampus-sharp-memory for persistent cross-session memory and Ebbiforge for TF embeddings and RL weight learning. Both are optional.


Quality Presets

Control the speed vs. quality tradeoff:

entroly proxy --quality speed       # minimal optimization, lowest latency
entroly proxy --quality fast        # light optimization
entroly proxy --quality balanced    # recommended for most projects
entroly proxy --quality quality     # deeper analysis, more context diversity
entroly proxy --quality max         # full pipeline, best results
entroly proxy --quality 0.7         # or any float from 0.0 to 1.0

Environment Variables

| Variable | Default | What it does |
| --- | --- | --- |
| `ENTROLY_QUALITY` | 0.5 | Quality dial (0.0-1.0 or preset name) |
| `ENTROLY_PROXY_PORT` | 9377 | Proxy port |
| `ENTROLY_MAX_FILES` | 5000 | Max files to auto-index |
| `ENTROLY_RATE_LIMIT` | 0 | Max requests/min (0 = unlimited) |
| `ENTROLY_NO_DOCKER` | - | Skip Docker, run natively |
| `ENTROLY_MCP_TRANSPORT` | stdio | MCP transport (stdio or sse) |
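ENTROLY_QUALITY accepts either a preset name or a float; a plausible parse, with a preset-to-float mapping that is my guess rather than Entroly's actual values:

```python
import os

# Hypothetical preset mapping -- the real values are internal to Entroly.
PRESETS = {"speed": 0.1, "fast": 0.3, "balanced": 0.5, "quality": 0.7, "max": 1.0}

def parse_quality(raw):
    """Accept a preset name or a float string in [0.0, 1.0]."""
    if raw in PRESETS:
        return PRESETS[raw]
    value = float(raw)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"quality must be in [0.0, 1.0], got {value}")
    return value

quality = parse_quality(os.environ.get("ENTROLY_QUALITY", "0.5"))
```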

Technical Deep Dive

How Entroly Compares

| | Cody / Copilot | Entroly |
| --- | --- | --- |
| Approach | Embedding similarity search | Information-theoretic compression + online RL |
| Coverage | 5-10 files (the rest is invisible) | 100% codebase at variable resolution |
| Selection | Top-K by cosine distance | KKT-optimal bisection with submodular diversity |
| Dedup | None | SimHash + LSH in O(1) |
| Learning | Static | REINFORCE with KKT-consistent baseline |
| Security | None | Built-in SAST (55 rules, taint-aware) |
| Temperature | User-set | Self-calibrating (no tuning needed) |
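The security row can be illustrated with two toy pattern rules; this is illustrative only, since Entroly's 55 rules and taint-flow analysis are far more involved than regex matching:

```python
import re

# Two invented rules in the spirit of a SAST scanner -- not Entroly's rule set.
RULES = [
    ("hardcoded-secret", re.compile(r"(?i)(api_key|password|secret)\s*=\s*['\"][^'\"]+['\"]")),
    ("sql-injection",    re.compile(r"execute\([^)]*%s|execute\([^)]*\+\s*\w+")),
]

def scan(source):
    """Return (rule_name, line_number) for every rule that fires."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES:
            if pattern.search(line):
                findings.append((name, lineno))
    return findings

code = 'api_key = "sk-live-123"\ncursor.execute("SELECT * FROM t WHERE id=" + uid)\n'
print(scan(code))  # -> [('hardcoded-secret', 1), ('sql-injection', 2)]
```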

Architecture

Hybrid Rust + Python. All math runs in Rust via PyO3 (50-100x faster). MCP protocol and orchestration run in Python.

+-------------------------------------------------------------+
|  IDE (Cursor / Claude Code / Cline / Copilot)               |
|                                                             |
|  +---- MCP mode ------+      +---- Proxy mode ------+       |
|  | entroly MCP server |      | localhost:9377       |       |
|  | (JSON-RPC stdio)   |      | (HTTP reverse proxy) |       |
|  +---------+----------+      +----------+-----------+       |
|            |                            |                   |
|  +---------v----------------------------v----------+        |
|  |            Entroly Engine (Python)               |       |
|  |  +--------------------------------------+        |       |
|  |  |  entroly-core (Rust via PyO3)        |        |       |
|  |  |  21 modules . 380 KB . 249 tests     |        |       |
|  |  +--------------------------------------+        |       |
|  +--------------------------------------------------+       |
+-------------------------------------------------------------+

Rust Core (21 modules)

| Module | What | How |
| --- | --- | --- |
| `hierarchical.rs` | 3-level codebase compression | Skeleton map + dep-graph expansion + knapsack-optimal fragments |
| `knapsack.rs` | Context subset selection | KKT dual bisection O(30N) or exact 0/1 DP |
| `knapsack_sds.rs` | Information-Optimal Selection | Submodular diversity + multi-resolution knapsack |
| `prism.rs` | Weight optimizer | Spectral natural gradient on 4x4 gradient covariance |
| `entropy.rs` | Information density scoring | Shannon entropy + boilerplate detection + redundancy |
| `depgraph.rs` | Dependency graph | Auto-linking imports, type refs, function calls |
| `skeleton.rs` | Code skeleton extraction | Preserves signatures, strips bodies (60-80% reduction) |
| `dedup.rs` | Duplicate detection | 64-bit SimHash, Hamming threshold 3, LSH buckets |
| `lsh.rs` | Semantic recall index | 12-table multi-probe LSH, ~3 µs over 100K fragments |
| `sast.rs` | Security scanning | 55 rules, 8 CWE categories, taint-flow analysis |
| `health.rs` | Codebase health | Clone detection, dead symbols, god files, arch violations |
| `guardrails.rs` | Safety-critical pinning | Criticality levels with task-aware budget multipliers |
| `query.rs` | Query analysis | Vagueness scoring, keyword extraction, intent classification |
| `query_persona.rs` | Query archetype discovery | RBF kernel + Pitman-Yor process + per-archetype weights |
| `anomaly.rs` | Entropy anomaly detection | MAD-based robust Z-scores, grouped by directory |
| `semantic_dedup.rs` | Semantic redundancy removal | Greedy marginal information gain, (1-1/e) optimal |
| `utilization.rs` | Response utilization scoring | Trigram + identifier overlap feedback loop |
| `nkbe.rs` | Multi-agent budget allocation | Arrow-Debreu KKT bisection + Nash bargaining + REINFORCE |
| `cognitive_bus.rs` | Event routing for agent swarms | ISA routing, Poisson rate models, Welford spike detection |
| `fragment.rs` | Core data structure | Content, metadata, scoring dimensions, SimHash fingerprint |
| `lib.rs` | PyO3 bridge | All modules exposed to Python, 249 tests |
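The entropy.rs row hints at how information-density scoring works. A minimal Python analogue over character distributions (the real scorer also folds in boilerplate detection and redundancy):

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Bits per character of the text's character distribution.
    Repetitive boilerplate scores low; varied, information-dense code
    scores high."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

boilerplate = "x = 0\n" * 50  # highly repetitive
dense = "def rotate(theta): return (cos(theta), -sin(theta))"

assert shannon_entropy(boilerplate) < shannon_entropy(dense)
```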

Python Layer

| Module | What |
| --- | --- |
| `proxy.py` | Invisible HTTP reverse proxy |
| `proxy_transform.py` | Request parsing, context formatting, temperature calibration |
| `server.py` | MCP server with 10+ tools and Python fallbacks |
| `auto_index.py` | File-system crawler for automatic codebase indexing |
| `checkpoint.py` | Gzipped JSON state serialization |
| `prefetch.py` | Predictive context pre-loading |
| `provenance.py` | Hallucination risk detection |
| `multimodal.py` | Image OCR, diagram parsing, voice transcript extraction |
| `context_bridge.py` | Multi-agent orchestration for OpenClaw (LOD, HCC, AutoTune) |

MCP Tools

| Tool | Purpose |
| --- | --- |
| `remember_fragment` | Store context with auto-dedup, entropy scoring, dep linking |
| `optimize_context` | Select optimal context subset for a token budget |
| `recall_relevant` | Sub-linear semantic recall via multi-probe LSH |
| `record_outcome` | Feed the reinforcement learning loop |
| `explain_context` | Per-fragment scoring breakdown |
| `checkpoint_state` | Save full session state |
| `resume_state` | Restore from checkpoint |
| `prefetch_related` | Predict and pre-load likely-needed context |
| `get_stats` | Session statistics and cost savings |
| `health_check` | Clone detection, dead symbols, god files |
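record_outcome feeds the learning loop; a toy multiplicative weight update conveys the idea (the real mechanism is REINFORCE with a KKT-consistent baseline, and the fragment names here are invented):

```python
def update_weights(weights, included, reward, lr=0.1):
    """Toy outcome feedback: nudge the selection weights of fragments that
    were in context for a good response up, and down after a bad one."""
    return {
        frag: w * (1 + lr * reward) if frag in included else w
        for frag, w in weights.items()
    }

w = {"auth.py": 1.0, "models.py": 1.0, "gen.py": 1.0}
w = update_weights(w, included={"auth.py", "models.py"}, reward=+1.0)
print(w)  # -> {'auth.py': 1.1, 'models.py': 1.1, 'gen.py': 1.0}
```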

Novel Algorithms

Entropic Context Compression (ECC) — 3-level hierarchical codebase representation. L1: skeleton map of all files (5% budget). L2: dependency cluster expansion (25%). L3: submodular diversified full fragments (70%).
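The split itself is simple arithmetic; a sketch dividing a token budget by the 5/25/70 ratios described above:

```python
def ecc_budget_split(total_tokens):
    """Split a context budget across ECC's three resolution levels:
    L1 skeleton map (5%), L2 dependency clusters (25%), L3 full fragments (70%)."""
    return {
        "L1_skeleton": total_tokens * 5 // 100,
        "L2_clusters": total_tokens * 25 // 100,
        "L3_fragments": total_tokens * 70 // 100,
    }

print(ecc_budget_split(128_000))
# -> {'L1_skeleton': 6400, 'L2_clusters': 32000, 'L3_fragments': 89600}
```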

IOS (Information-Optimal Selection) — Combines Submodular Diversity Selection with Multi-Resolution Knapsack in one greedy pass. (1-1/e) optimality guarantee.

KKT-REINFORCE — The dual variable from the forward budget constraint serves as a per-item REINFORCE baseline. Forward and backward use the same probability.

PRISM — Natural gradient preconditioning via exact Jacobi eigendecomposition of the 4x4 gradient covariance.

PSM (Persona Spectral Manifold) — RBF kernel mean embedding in RKHS for automatic query archetype discovery. Each archetype learns specialized selection weights via Pitman-Yor process.

ADGT — Duality gap as a self-regulating temperature signal. No decay constant needed.

PCNT — PRISM spectral condition number as a weight-uncertainty-aware temperature modulator.

NKBE (Nash-KKT Budgetary Equilibrium) — Game-theoretic multi-agent token allocation. Arrow-Debreu KKT bisection finds the dual price, Nash bargaining ensures fairness, REINFORCE gradient learns from outcomes.
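A toy version of the allocation idea: bisect a shared dual price until the agents' capped demands exactly fill the budget (the real NKBE adds Nash bargaining and learned weights; the agent weights and caps below are invented):

```python
def allocate_budget(weights, caps, budget, iters=60):
    """Bisect a dual price: each agent demands min(cap, weight / price)
    tokens, and the price that makes total demand equal the budget clears
    the market. A toy stand-in for NKBE's Arrow-Debreu KKT bisection."""
    def total_demand(price):
        return sum(min(cap, w / price) for w, cap in zip(weights, caps))
    lo, hi = 1e-9, 1e9
    for _ in range(iters):
        mid = (lo + hi) / 2
        if total_demand(mid) > budget:
            lo = mid  # demand too high -> raise the price
        else:
            hi = mid
    price = (lo + hi) / 2
    return [round(min(cap, w / price)) for w, cap in zip(weights, caps)]

# Three agents: the main agent is weighted highest; each has a hard cap.
alloc = allocate_budget(weights=[3.0, 1.0, 1.0], caps=[6000, 6000, 6000], budget=8000)
print(alloc)  # -> [4800, 1600, 1600]
assert sum(alloc) <= 8000 + 2  # rounding slack
```

No agent's allocation exceeds its cap, and as long as demand exceeds supply the whole budget is spent, which is the "no agent starves, no tokens wasted" property in miniature.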

ISA Cognitive Bus — Information-Surprise-Adaptive event routing for agent swarms. Poisson rate models compute KL divergence surprise. Welford accumulators detect anomalous spikes in real-time.

References

Shannon (1948), Charikar (2002), Ebbinghaus (1885), Nemhauser-Wolsey-Fisher (1978), Sviridenko (2004), Boyd & Vandenberghe (Convex Optimization), Williams (1992), Muandet-Fukumizu-Sriperumbudur (2017), LLMLingua (EMNLP 2023), RepoFormer (ICML 2024), FILM-7B (NeurIPS 2024), CodeSage (ICLR 2024).


License

MIT
