Curate-Ipsum

A graph-spectral MCP server for verified code synthesis through belief revision

Curate-Ipsum bridges the gap between LLM-generated code (fast, plausible, unverified) and formally verified patches (slow, correct, trustworthy). It treats mutation testing as one component of a larger system for maintaining robust, self-healing codebase metadata that supports reachability analysis, symbolic execution, and automated test generation.

Install

# PyPI
pip install curate-ipsum

# or with uv
uv pip install curate-ipsum

# Docker (includes baked-in embedding model)
docker pull ghcr.io/egoughnour/curate-ipsum:latest

Claude Desktop / MCP Client

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "curate-ipsum": {
      "command": "uvx",
      "args": ["curate-ipsum"]
    }
  }
}

Or with Docker (embedding model pre-loaded, no Python needed):

{
  "mcpServers": {
    "curate-ipsum": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "ghcr.io/egoughnour/curate-ipsum:latest"]
    }
  }
}

MCP Tools

Curate-Ipsum exposes 30 tools over the MCP stdio transport, organised into six groups:

Testing — run_unit_tests, run_integration_tests, run_mutation_tests, get_run_history, get_region_metrics, detect_frameworks, parse_region, check_region_relationship, create_region

Belief Revision — add_assertion, contract_assertion, revise_theory, get_entrenchment, list_assertions, get_theory_snapshot, store_evidence, get_provenance, why_believe, belief_stability

Rollback & Failure — rollback_to, undo_last_operations, analyze_failure, list_world_history

Graph-Spectral — extract_call_graph, compute_partitioning, query_reachability, get_hierarchy, find_function_partition, incremental_update, persistent_graph_stats, graph_query

Verification — verify_property (Z3/angr), verify_with_orchestrator (CEGAR budget escalation), list_verification_backends

Synthesis & RAG — synthesize_patch (CEGIS + genetic + LLM), synthesis_status, cancel_synthesis, list_synthesis_runs, rag_index_nodes, rag_search, rag_stats

Current Status

Last Updated: 2026-02-08

Component	Status
Multi-framework parsing (5 frameworks)	Complete
Graph Infrastructure (Spectral/Kameda)	Complete
Belief Revision Engine (AGM/Provenance)	Complete
Synthesis Loop (CEGIS/Genetic)	Complete
Verification Backends (Z3/angr)	Complete
Graph Persistence (SQLite/Kuzu)	Complete
RAG / Semantic Search (Chroma)	Complete

The Problem

LLMs produce code that is:

✅ Syntactically valid (usually)
✅ Statistically plausible
❌ Semantically correct (sometimes)
❌ Type-safe (by accident)
❌ Formally verified (never)

Current approaches either trust LLM output blindly or reject it entirely. Neither is optimal.

The Solution

Use LLMs for cheap candidate generation, then invest computational resources to achieve formal guarantees:

LLM Candidates (k samples)
        ↓
   Seed Population
        ↓
┌───────────────────────────┐
│  CEGIS + CEGAR + Genetic  │  ← Verification loop
│  + Belief Revision        │
└───────────────────────────┘
        ↓
  Strongly Typed Patch
  (with proof certificate)

Key Differentiators from State of the Art

vs. Traditional Mutation Testing (Stryker, mutmut, cosmic-ray)

Traditional	Curate-Ipsum
Single tool, single language	Multi-framework orchestration
Flat file-level analysis	Hierarchical graph-spectral decomposition
Mutation score as output	Mutation testing as input to synthesis
No formal verification	CEGIS/CEGAR verification loop
Manual test writing	Automated patch generation

vs. LLM Code Generation (Copilot, Claude, GPT)

LLM-only	Curate-Ipsum
Trust model output	Verify model output
Single sample or best-of-k	Population-based refinement
No formal guarantees	Proof certificates
Stateless generation	Belief revision with provenance
Plausible code	Provably correct code

vs. Program Synthesis (Sketch, Rosette, SyGuS)

Traditional Synthesis	Curate-Ipsum
Hand-written sketches	LLM-generated candidates
Cold-start search	Warm-start from LLM population
No learning across runs	Totalizing theory accumulates knowledge
Single specification	Multi-framework implicit regions

vs. Symbolic Execution (KLEE, S2E)

Symbolic Execution	Curate-Ipsum
Path exploration only	Integrated with synthesis
Boolean constraint solving	Mathematical reformulation (SymPy)
Single-tool analysis	Graph DB + SMT + mutation orchestration
No code generation	Generates verified patches

Novel Contributions

Graph-Spectral Code Decomposition
- Fiedler vector partitioning for optimal reachability
- Hierarchical SCC condensation
- Planar subgraph identification → O(1) Kameda queries
- Kuratowski subgraphs as atomic non-planar units
Belief Revision for Synthesis
- AGM-compliant theory revision
- Entrenchment ordering for minimal contraction
- Provenance DAG for failure mode analysis
- Rollback sharpens validity (failures refine the universal model)
Implicit Region Detection
- Spectral anomalies reveal undertested code
- Cross-framework mutation resistance identifies critical regions
- Historical mutability guides partition optimization
Mathematical Constraint Reformulation
- Boolean-intractable → differential/root-finding
- SymPy path condition encoding
- Hybrid SMT + numerical solving

Architecture

flowchart TB
    subgraph MCP["MCP Interface"]
        direction TB

        subgraph Sources["Analysis Sources"]
            direction LR
            MUT["🧬 Mutation<br/>Orchestrator<br/><small>Stryker<br/>mutmut<br/>cosmic-ray</small>"]
            SYM["🔬 Symbolic<br/>Execution<br/><small>KLEE · Z3<br/>SymPy</small>"]
            GRAPH["📊 Graph<br/>Analysis<br/><small>Joern<br/>Neo4j<br/>Fiedler</small>"]
        end

        MUT --> BRE
        SYM --> BRE
        GRAPH --> BRE

        BRE["🧠 Belief Revision Engine<br/><small>AGM Theory · Entrenchment · Provenance DAG</small>"]

        BRE --> SYNTH

        SYNTH["⚙️ Synthesis Loop<br/><small>CEGIS · CEGAR · Genetic Algorithm</small>"]

        SYNTH --> |"counterexample"| BRE

        SYNTH --> OUTPUT

        OUTPUT["✅ Strongly Typed Patch<br/><small>Proof Certificate ·Type Signature<br/>Pre/Post Conditions</small>"]
    end

    LLM["🤖 LLM Candidates<br/><small>top-k samples</small>"] --> SYNTH

    style MCP fill:#1a1a2e,stroke:#16213e,color:#eee
    style Sources fill:#16213e,stroke:#0f3460,color:#eee
    style MUT fill:#0f3460,stroke:#e94560,color:#eee
    style SYM fill:#0f3460,stroke:#e94560,color:#eee
    style GRAPH fill:#0f3460,stroke:#e94560,color:#eee
    style BRE fill:#533483,stroke:#e94560,color:#eee
    style SYNTH fill:#e94560,stroke:#ff6b6b,color:#fff
    style OUTPUT fill:#06d6a0,stroke:#118ab2,color:#000
    style LLM fill:#ffd166,stroke:#ef476f,color:#000

Roadmap

Phase 1: Foundation ✅

MCP server infrastructure
Stryker report parsing
Run history and PID metrics
Flexible region model (hierarchical: file → class → function → lines)
mutmut parser integration
Framework auto-detection
Unified parser interface

Phase 2: Graph Infrastructure ✅

Phase 3: Multi-Framework Orchestration ✅

Unified mutation framework interface
cosmic-ray parser
poodle parser
universalmutator parser

Phase 4: Belief Revision Engine ✅

py-brs library integration (AGM core)
Evidence adapter (mutation results → beliefs)
Theory manager for curate-ipsum
AGM contraction (py-brs v2.0.0 released)
Entrenchment calculation (py-brs v2.0.0)
Provenance DAG storage and queries
Failure mode analyzer
Rollback mechanism

Phase 5: Synthesis Loop ✅

CEGIS implementation with LLM seeding
Genetic algorithm with AST-aware crossover
Entropy monitoring and diversity injection
Counterexample-directed mutation
CEGAR budget escalation (10s → 30s → 120s)

Phase 6: Verification Backends ✅

Z3 integration for SMT solving (default backend)
angr Docker symbolic execution (expensive tier)
CEGAR orchestrator with budget escalation
Verification harness builder (C source generation)
Mock backend for testing
Alternative solvers (CVC5, Boolector)
SymPy path condition encoding

Phase 7: Graph Persistence ✅

Phase 8: RAG / Semantic Search ✅

ChromaDB vector store integration
sentence-transformers embedding provider (all-MiniLM-L6-v2)
Graph-expanded RAG pipeline (vector top-k → neighbor expansion → rerank)
Decay scoring for temporal relevance
CEGIS integration for context-aware synthesis

Phase 9: Production Hardening ✅

CI/CD (GitHub Actions — lint, test matrix, integration, lockfile)
Release pipeline (tag push → PyPI + GHCR + MCP registry)
uv lockfile (149 packages)
pre-commit hooks (ruff format + lint + lock check)
MCP bundle packaging (server.json, smithery.yaml, manifest.json)
HTML/SARIF reporting
IDE extensions (VSCode)
Regression detection and alerting

Future Work

Advanced Orchestration (Deferred)

Implicit region detection (spectral anomalies)
Non-contradictory framework assignment
Cross-framework survival analysis

Semantic Search & RAG

Code Graph RAG for semantic search
Semantic search index (ChromaDB)
RAG retrieval pipeline with graph expansion
Text-to-Cypher queries

Quick Start

# Clone and install (dev)
git clone https://github.com/egoughnour/curate-ipsum.git
cd curate-ipsum
uv sync --extra dev --extra verify --extra rag --extra graph --extra synthesis

# Run the MCP server
uv run curate-ipsum

# Or run tests
make test                     # fast suite (no Docker/model needed)
make test-all                 # including integration tests

Configuration

All configuration is via environment variables (see .env.example):

CURATE_IPSUM_GRAPH_BACKEND=sqlite   # or kuzu
MUTATION_TOOL_DATA_DIR=.mutation_tool_data
MUTATION_TOOL_LOG_LEVEL=INFO
CHROMA_HOST=                         # empty = in-process, or localhost:8000
EMBEDDING_MODEL=all-MiniLM-L6-v2

For the full service stack (ChromaDB + angr runner):

make docker-up-verify         # starts Chroma + angr via Docker Compose

Documentation

Planning & Design

Phase 2 Plan - Active: Graph-spectral infrastructure (9 steps)
Progress - Current status, what's done, what's next
Decisions - Architectural decisions with reasoning (D-001 through D-008)
M1 Multi-Framework Plan - Region model & parser design (done)
BRS Integration Plan - Belief revision integration
BRS v2 Refactoring Plan - Modular architecture
ROADMAP - Full milestone tracker

Architecture

Architectural Vision - Graph-spectral framework
Synthesis Framework - CEGIS/CEGAR/genetic approach
Belief Revision - AGM theory and provenance

Reference

Summary - Functionality catalog
Potential Directions - Enhancement roadmap
Synergies - Tool ecosystem integration
CONTEXT - Session context for AI assistants
DOCS_INDEX - Documentation quick reference

Key References

Alchourrón, Gärdenfors, Makinson (1985). On the Logic of Theory Change
Fiedler (1973). Algebraic Connectivity of Graphs
Kameda (1975). On the Vector Representation of Reachability in Planar Directed Graphs
Solar-Lezama (2008). Program Synthesis by Sketching (CEGIS)
Clarke et al. (2000). Counterexample-Guided Abstraction Refinement (CEGAR)

License

MIT License - see LICENSE

Curate-Ipsum

Quick Install

Curate-Ipsum

Install

Claude Desktop / MCP Client

MCP Tools

Current Status

The Problem

The Solution

Key Differentiators from State of the Art

vs. Traditional Mutation Testing (Stryker, mutmut, cosmic-ray)

vs. LLM Code Generation (Copilot, Claude, GPT)

vs. Program Synthesis (Sketch, Rosette, SyGuS)

vs. Symbolic Execution (KLEE, S2E)

Novel Contributions

Architecture

Roadmap

Phase 1: Foundation ✅

Phase 2: Graph Infrastructure ✅

Phase 3: Multi-Framework Orchestration ✅

Phase 4: Belief Revision Engine ✅

Phase 5: Synthesis Loop ✅

Phase 6: Verification Backends ✅

Phase 7: Graph Persistence ✅

Phase 8: RAG / Semantic Search ✅

Phase 9: Production Hardening ✅

Future Work

Advanced Orchestration (Deferred)

Semantic Search & RAG

Quick Start

Configuration

Documentation

Planning & Design

Architecture

Reference

Key References

License

Reviews