roam-code

Architectural intelligence layer for AI coding agents — structural graph, architecture governance, multi-agent orchestration, vulnerability mapping. 94 commands, 26 languages, 100% local.

Stars: 338 · Forks: 28 · Updated: Feb 24, 2026 · Validated: Feb 26, 2026

Roam Code

The architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping, runtime analysis -- one CLI, zero API keys.

136 canonical commands (+1 legacy alias = 137 invokable names) · 26 languages · architecture OS · 100% local

PyPI version · GitHub stars · CI · Python 3.9+ · License: MIT


What is Roam?

Roam is a structural intelligence engine for software. It pre-indexes your codebase into a semantic graph -- symbols, dependencies, call graphs, architecture layers, git history, and runtime traces -- stored in a local SQLite DB. Agents query it via CLI or MCP instead of repeatedly grepping files and guessing structure.

Unlike LSPs (editor-bound, language-specific) or Sourcegraph (hosted search), Roam provides architecture-level graph queries -- offline, cross-language, and compact. It goes beyond comprehension: Roam governs architecture through budget gates, simulates refactoring outcomes, orchestrates multi-agent swarms with zero-conflict guarantees, maps vulnerability reachability paths, and enables graph-level code editing without syntax errors.

Codebase ──> [Index] ──> Semantic Graph ──> 136 Commands ──> AI Agent
              │              │                  │
           tree-sitter    symbols            comprehend
           26 languages   + edges            govern
           git history    + metrics          refactor
           runtime traces + architecture     orchestrate
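The index-then-query idea above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the two-table schema (symbols, edges) and the helper names are assumptions, since this README does not show Roam's actual database layout. The sample rows reuse the file paths from the roam context output below.

```python
import sqlite3

# Hypothetical schema for illustration; Roam's real schema is not documented here.
SCHEMA = """
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT, line INTEGER);
CREATE TABLE edges (src INTEGER REFERENCES symbols(id),
                    dst INTEGER REFERENCES symbols(id),
                    kind TEXT);
"""


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph with one class and two callers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO symbols (id, name, kind, file, line) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Flask", "cls", "src/flask/app.py", 76),
            (2, "FlaskClient.__init__", "meth", "src/flask/testing.py", 22),
            (3, "test_app_factory", "fn", "tests/test_basic.py", 12),
        ],
    )
    # An edge src -> dst means "src calls dst".
    conn.executemany(
        "INSERT INTO edges (src, dst, kind) VALUES (?, ?, ?)",
        [(2, 1, "call"), (3, 1, "call")],
    )
    return conn


def callers_of(conn: sqlite3.Connection, name: str):
    """Return (caller name, file, line) for every symbol that calls `name`."""
    return conn.execute(
        """
        SELECT s.name, s.file, s.line
        FROM edges e
        JOIN symbols s ON s.id = e.src
        JOIN symbols t ON t.id = e.dst
        WHERE t.name = ?
        ORDER BY s.file
        """,
        (name,),
    ).fetchall()


conn = build_demo_graph()
for caller, file, line in callers_of(conn, "Flask"):
    print(f"{file}:{line}  caller: {caller}")
```

One indexed join answers "who calls Flask?" without opening any source file, which is the property the paragraph above relies on.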

The problem

Coding agents explore codebases inefficiently: dozens of grep/read cycles, high token cost, no structural understanding. Roam replaces this with one graph query:

$ roam context Flask
Callers: 47  Callees: 3
Affected tests: 31

Files to read:
  src/flask/app.py:76-963              # definition
  src/flask/__init__.py:1-15           # re-export
  src/flask/testing.py:22-45           # caller: FlaskClient.__init__
  tests/test_basic.py:12-30            # caller: test_app_factory
  ...12 more files

Core commands

$ roam understand              # full codebase briefing
$ roam context <name>          # files-to-read with exact line ranges
$ roam preflight <name>        # blast radius + tests + complexity + architecture rules
$ roam health                  # composite score (0-100)
$ roam diff                    # blast radius of uncommitted changes

What's New in v11

MCP v2 for Agent-First Workflows

  • In-process MCP execution removes per-call subprocess overhead.
  • 4 compound operations (roam_explore, roam_prepare_change, roam_review_change, roam_diagnose_issue) reduce multi-step agent workflows to single calls.
  • Preset-based tool surfacing (core, review, refactor, debug, architecture, full) keeps default tool choice tight for agents while retaining full depth on demand.
  • MCP tools now expose structured schemas and richer annotations for safer planner behavior.
  • MCP token overhead for default core context dropped from ~36K to <3K tokens (about 92% reduction).

Performance and Retrieval

  • Symbol search moved to SQLite FTS5/BM25: typical search moved from seconds to milliseconds (about 1000x on benchmarked paths).
  • Incremental indexing shifted from O(N) full-edge rebuild behavior to O(changed) updates.
  • DB/runtime optimizations (mmap_size, safer large-graph guards, batched writes) reduce first-run and reindex friction on larger repos.
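To make the O(N) versus O(changed) distinction concrete, here is a minimal sketch of change detection by content hash. The README does not say how Roam decides which files changed, so the hashing approach and the names file_digest and plan_reindex are assumptions for illustration only.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Stable fingerprint of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(previous, current_files):
    """Compare stored digests with the current tree.

    previous:      {path: digest} saved by the last index run
    current_files: {path: bytes} as read from disk now
    Returns (changed, removed, new_state); only `changed` needs re-parsing.
    """
    new_state = {path: file_digest(data) for path, data in current_files.items()}
    changed = sorted(p for p, digest in new_state.items() if previous.get(p) != digest)
    removed = sorted(p for p in previous if p not in new_state)
    return changed, removed, new_state


# First run: nothing is stored yet, so every file counts as changed.
tree = {
    "app.py": b"def run(): ...",
    "util.py": b"def helper(): ...",
    "old.py": b"def legacy(): ...",
}
changed, removed, state = plan_reindex({}, tree)
print(changed)  # ['app.py', 'old.py', 'util.py']

# Second run: one edit, one deletion, one untouched file.
tree = {"app.py": b"def run(): return 1", "util.py": b"def helper(): ..."}
changed, removed, state = plan_reindex(state, tree)
print(changed, removed)  # ['app.py'] ['old.py']
```

Only the paths in changed would be re-parsed and have their edges rebuilt, so the work scales with the size of the edit rather than the size of the repository.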

CI, Governance, and Delivery

  • GitHub Action supports quality gates, SARIF upload, sticky PR comments, and cache-aware execution.
  • CI hardening includes changed-only analysis mode, trend-aware gates, and SARIF pre-upload guardrails (size/result caps + truncation signaling).
  • Agent governance expanded with verification and AI-quality tooling (roam verify, roam vibe-check, roam ai-readiness, roam ai-ratio) for teams managing agent-written code.

Best for

  • Agent-assisted coding -- structured answers that reduce token usage vs raw file exploration
  • Large codebases (100+ files) -- graph queries beat linear search at scale
  • Architecture governance -- health scores, CI quality gates, budget enforcement, fitness functions
  • Safe refactoring -- blast radius, affected tests, pre-change safety checks, graph-level editing
  • Multi-agent orchestration -- partition codebases for parallel agent work with zero-conflict guarantees
  • Security analysis -- vulnerability reachability mapping, auth gaps, CVE path tracing
  • Algorithm optimization -- detect O(n^2) loops, N+1 queries, and 21 other anti-patterns with suggested fixes
  • Backend quality -- auth gaps, missing indexes, over-fetching models, non-idempotent migrations, orphan routes, API drift
  • Runtime analysis -- overlay production trace data onto the static graph for hotspot detection
  • Multi-repo projects -- cross-repo API edge detection between frontend and backend

When NOT to use Roam

  • Real-time type checking -- use an LSP (pyright, gopls, tsserver). Roam is static and offline.
  • Small scripts (<10 files) -- just read the files directly.
  • Pure text search -- ripgrep is faster for raw string matching.

Why use Roam

Speed. One command replaces 5-10 tool calls (in typical workflows). Under 0.5s for any query.

Dependency-aware. Computes structure, not string matches. Knows Flask has 47 dependents and 31 affected tests. grep knows it appears 847 times.

LLM-optimized output. Plain ASCII, compact abbreviations (fn, cls, meth), --json envelopes. Designed for agent consumption, not human decoration.

Fully local. No API keys, telemetry, or network calls. Works in air-gapped environments.

Algorithm-aware. Built-in catalog of 23 anti-patterns. Detects suboptimal algorithms (quadratic loops, N+1 queries, unbounded recursion) and suggests fixes with Big-O improvements and confidence scores. Receiver-aware loop-invariant analysis minimizes false positives.

CI-ready. --json output, --gate quality gates, GitHub Action, SARIF 2.1.0.

| Metric | Without Roam | With Roam |
| --- | --- | --- |
| Tool calls | 8 | 1 |
| Wall time | ~11s | <0.5s |
| Tokens consumed | ~15,000 | ~3,000 |

Measured on a typical agent workflow in a 200-file Python project (Flask). See benchmarks for more.

Table of Contents

Getting Started: What is Roam? · What's New in v11 · Best for · Why use Roam · Install · Quick Start

Using Roam: Commands · Walkthrough · AI Coding Tools · MCP Server

Operations: CI/CD Integration · SARIF Output · For Teams

Reference: Language Support · Performance · How It Works · How Roam Compares · FAQ

More: Limitations · Troubleshooting · Update / Uninstall · Development · Contributing

Install

pip install roam-code

# Recommended: isolated environment
pipx install roam-code
# or
uv tool install roam-code

# From source
pip install git+https://github.com/Cranot/roam-code.git

Requires Python 3.9+. Works on Linux, macOS, and Windows.

Windows: If roam is not found after installing with uv, run uv tool update-shell and restart your terminal.

Docker (alpine-based)

docker build -t roam-code .
docker run --rm -v "$PWD:/workspace" roam-code index
docker run --rm -v "$PWD:/workspace" roam-code health

Quick Start

cd your-project
roam init                  # indexes codebase, creates config + CI workflow
roam understand            # full codebase briefing

First index takes ~5s for 200 files, ~15s for 1,000 files. Subsequent runs are incremental and near-instant.

Next steps:

  • Set up your AI agent: roam describe --write (auto-detects CLAUDE.md, AGENTS.md, .cursor/rules, etc. — see integration instructions)
  • Explore: roam health, roam weather, roam map
  • Add to CI: roam init already generated a GitHub Action

Try it on Roam itself

git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
roam init
roam understand
roam health

Works With

Claude Code · Cursor · Windsurf · GitHub Copilot · Aider · Cline · Gemini CLI · OpenAI Codex CLI · MCP · GitHub Actions · GitLab CI · Azure DevOps

Commands

The 5 core commands shown above cover ~80% of agent workflows. 136 canonical commands (+1 legacy alias = 137 invokable names) are organized into 7 categories.

Full command reference

Getting Started

| Command | Description |
| --- | --- |
| roam index [--force] [--verbose] | Build or rebuild the codebase index |
| roam watch [--interval N] [--debounce N] [--webhook-port P] [--guardian] | Long-running index daemon: poll/webhook-triggered refreshes plus optional continuous architecture-guardian snapshots and JSONL compliance artifacts |
| roam init | Guided onboarding: creates .roam/fitness.yaml, CI workflow, runs index, shows health |
| roam hooks [--install] [--uninstall] | Manage git hooks for automated roam index updates and health gates |
| roam doctor | Diagnose installation and environment: verify tree-sitter grammars, SQLite, git, and config health |
| roam reset [--hard] | Reset the roam index and cached data. --hard removes all .roam/ artifacts |
| roam clean [--all] | Remove stale or orphaned index entries without a full rebuild |
| roam understand | Full codebase briefing: tech stack, architecture, key abstractions, health, conventions, complexity overview, entry points |
| roam onboard | Structured onboarding guide: architecture map, key files, suggested reading order, and first tasks |
| roam tour [--write PATH] | Auto-generated onboarding guide: top symbols, reading order, entry points, language breakdown. --write saves to Markdown |
| roam describe [--write] [--force] [-o PATH] [--agent-prompt] | Auto-generate project description for AI agents. --write auto-detects your agent's config file. --agent-prompt returns a compact (<500 token) system prompt |
| roam agent-export [--format F] [--write] | Generate agent-context bundle from project analysis (AGENTS.md + provider-specific overlays) |
| roam minimap [--update] [-o FILE] [--init-notes] | Compact annotated codebase snapshot for CLAUDE.md injection: stack, annotated directory tree, key symbols by PageRank, high fan-in symbols to avoid touching, hotspots, conventions. Sentinel-based in-place updates |
| roam config [--set-db-dir PATH] [--semantic-backend MODE] | Manage .roam/config.json (DB path, excludes, optional ONNX semantic settings) |
| roam map [-n N] [--full] [--budget N] | Project skeleton: files, languages, entry points, top symbols by PageRank. --budget caps output to N tokens |
| roam schema [--diff] [--version V] | JSON envelope schema versioning: view, diff, and validate output schemas |
| roam mcp [--list-tools] [--transport T] | Start MCP server (stdio/SSE/streamable-http), inspect available tools, and expose roam to coding agents |
| roam mcp-setup <platform> | Generate MCP config snippets for AI platforms: claude-code, cursor, windsurf, vscode, gemini-cli, codex-cli |

Daily Workflow

| Command | Description |
| --- | --- |
| roam file <path> [--full] [--changed] [--deps-of PATH] | File skeleton: all definitions with signatures, cognitive load index, health score |
| roam symbol <name> [--full] | Symbol definition + callers + callees + metrics. Supports file:symbol disambiguation |
| roam context <symbol> [--task MODE] [--for-file PATH] | AI-optimized context: definition + callers + callees + files-to-read with line ranges |
| roam search <pattern> [--kind KIND] | Find symbols by name pattern, PageRank-ranked |
| roam grep <pattern> [-g glob] [-n N] | Text search annotated with enclosing symbol context |
| roam deps <path> [--full] | What a file imports and what imports it |
| roam trace <source> <target> [-k N] | Dependency paths with coupling strength and hub detection |
| roam impact <symbol> | Blast radius: what breaks if a symbol changes (Personalized PageRank weighted) |
| roam diff [--staged] [--full] [REV_RANGE] | Blast radius of uncommitted changes or a commit range |
| roam pr-risk [REV_RANGE] | PR risk score (0-100, multiplicative model) + structural spread + suggested reviewers |
| roam pr-diff [--staged] [--range R] [--format markdown] | Structural PR diff: metric deltas, edge analysis, symbol changes, footprint. Not text diff -- graph delta |
| roam api-changes [REV_RANGE] | API change classifier: breaking/non-breaking changes, severity, and affected contracts |
| roam semantic-diff [REV_RANGE] | Structural change summary: symbols added/removed/modified and changed call edges |
| roam test-gaps [REV_RANGE] | Changed-symbol test gap detection: what changed and what still lacks test coverage |
| roam affected [REV_RANGE] | Monorepo/package impact analysis: what components are affected by a change |
| roam attest [REV_RANGE] [--format markdown] [--sign] | Proof-carrying PR attestation: bundles blast radius, risk, breaking changes, fitness, budget, tests, effects into one verifiable artifact |
| roam annotate <symbol> <note> | Attach persistent notes to symbols (agentic memory across sessions) |
| roam annotations [--file F] [--symbol S] | View stored annotations |
| roam diagnose <symbol> [--depth N] | Root cause analysis: ranks suspects by z-score normalized risk |
| roam preflight <symbol\|file> | Compound pre-change check: blast radius + tests + complexity + coupling + fitness |
| roam guard <symbol> | Compact sub-agent preflight bundle: definition, 1-hop callers/callees, test files, breaking-risk score, and layer signals |
| roam agent-plan --agents N | Decompose partitions into dependency-ordered agent tasks with merge sequencing and handoffs |
| roam agent-context --agent-id N [--agents M] | Generate per-agent execution context: write scope, read-only dependencies, and interface contracts |
| roam syntax-check [--changed] [PATHS...] | Tree-sitter syntax integrity check for changed files and multi-agent judge workflows |
| roam verify [--threshold N] | Pre-commit AI-code consistency check across naming, imports, error handling, and duplication signals |
| roam verify-imports [--file F] | Import hallucination firewall: validate all imports against indexed symbol table, suggest corrections via FTS5 fuzzy matching |
| roam safe-delete <symbol> | Safe deletion check: SAFE/REVIEW/UNSAFE verdict |
| roam test-map <name> | Map a symbol or file to its test coverage |
| roam adversarial [--staged] [--range R] | Adversarial architecture review: generates targeted challenges based on changes |
| roam plan [--staged] [--range R] [--agents N] | Agent work planner: decompose changes into sequenced, dependency-aware steps |
| roam closure <symbol> [--rename] [--delete] | Minimal-change synthesis: all files to touch for a safe rename/delete |
| roam mutate move\|rename\|add-call\|extract | Graph-level code editing: move symbols, rename across codebase, add calls, extract functions. Dry-run by default |

Codebase Health

| Command | Description |
| --- | --- |
| roam health [--no-framework] [--gate] | Composite health score (0-100): weighted geometric mean of tangle ratio, god components, bottlenecks, layer violations. --gate runs quality gate checks from .roam-gates.yml (exit 5 on failure) |
| roam smells [--file F] [--min-severity S] | Code smell detection: 15 deterministic detectors (brain methods, god classes, feature envy, shotgun surgery, data clumps, etc.) with per-file health scores |
| roam dashboard | Unified single-screen project status: health, hotspots, risks, ownership, and AI-rot indicators |
| roam vibe-check [--threshold N] | AI-rot auditor: 8-pattern taxonomy with composite risk score and prioritized findings |
| roam ai-readiness | 0-100 score for how well this codebase supports AI coding agents |
| roam ai-ratio [--since N] | Statistical estimate of AI-generated code ratio using commit-behavior signals |
| roam trends [--record] [--days N] [--metric M] | Historical metrics snapshots with sparklines and trend deltas |
| roam complexity [--bumpy-road] | Per-function cognitive complexity (SonarSource-compatible, triangular nesting penalty) + Halstead metrics (volume, difficulty, effort, bugs) + cyclomatic density |
| roam algo [--task T] [--confidence C] [--profile P] | Algorithm anti-pattern detection: 23-pattern catalog detects suboptimal algorithms (O(n^2) loops, N+1 queries, quadratic string building, branching recursion, loop-invariant calls) and suggests better approaches with Big-O improvements. Confidence calibration via caller-count + runtime traces, evidence paths, impact scoring, framework-aware N+1 packs, and language-aware fix templates. Alias: roam math |
| roam n1 [--confidence C] [--verbose] | Implicit N+1 I/O detection: finds ORM model computed properties ($appends/accessors) that trigger lazy-loaded DB queries in collection contexts. Cross-references with eager loading config. Supports Laravel, Django, Rails, SQLAlchemy, JPA |
| roam over-fetch [--threshold N] [--confidence C] | Detect models serializing too many fields: large $fillable without $hidden/$visible, direct controller returns bypassing API Resources, poor exposed-to-hidden ratio |
| roam missing-index [--table T] [--confidence C] | Find queries on non-indexed columns: cross-references WHERE/ORDER BY clauses, foreign keys, and paginated queries against migration-defined indexes |
| roam weather [-n N] | Hotspots ranked by geometric mean of churn x complexity (percentile-normalized) |
| roam debt [--roi] | Hotspot-weighted tech debt prioritization with SQALE remediation costs and optional refactoring ROI estimates |
| roam fitness [--explain] | Architectural fitness functions from .roam/fitness.yaml |
| roam alerts | Health degradation trend detection (Mann-Kendall + Sen's slope) |
| roam snapshot [--tag TAG] | Persist health metrics snapshot for trend tracking |
| roam trend | Health score history with sparkline visualization |
| roam digest [--brief] [--since TAG] | Compare current metrics against last snapshot |
| roam forecast [--symbol S] [--horizon N] [--alert-only] | Predict when metrics will exceed thresholds: Theil-Sen regression on snapshot history + churn-weighted per-symbol risk |
| roam budget [--init] [--staged] [--range R] | Architectural budget enforcement: per-PR delta limits on health, cycles, complexity. CI gate (exit 1 on violation) |
| roam bisect [--metric M] [--range R] | Architectural git bisect: find the commit that degraded a specific metric |
| roam ingest-trace <file> [--otel\|--jaeger\|--zipkin\|--generic] | Ingest runtime trace data (OpenTelemetry, Jaeger, Zipkin) for hotspot overlay |
| roam hotspots [--runtime] [--discrepancy] | Runtime hotspot analysis: find symbols missed by static analysis but critical at runtime |

roam algo -- algorithm anti-pattern catalog (23 patterns)

roam algo scans every indexed function against a 23-pattern catalog, ranks findings by runtime-aware impact score, and shows the exact Big-O improvement available. Findings include semantic evidence paths, precision metadata, and language-aware tips/fixes (Python, JS, Go, Rust, Java, etc.):

$ roam algo
VERDICT: 8 algorithmic improvements found (3 high, 4 medium, 1 low)
Ordering: highest impact first
Profile: balanced (filtered 0 low-signal findings)

Nested loop lookup (2):
  fn   resolve_permissions          src/auth/rbac.py:112     [high, impact=86.4]
        Current: Nested iteration -- O(n*m)
        Better:  Hash-map join -- O(n+m)
        Tip: Build a dict/set from one collection, iterate the other

  fn   find_matching_rule           src/rules/engine.py:67   [high, impact=78.1]
        Current: Nested iteration -- O(n*m)
        Better:  Hash-map join -- O(n+m)
        Tip: Build a dict/set from one collection, iterate the other

String building (1):
  meth build_query                  src/db/query.py:88       [high, impact=74.0]
        Current: Loop concatenation -- O(n^2)
        Better:  Join / StringBuilder -- O(n)
        Tip: Collect parts in a list, join once at the end

Branching recursion without memoization (1):
  fn   compute_cost                 src/pricing/calc.py:34   [medium, impact=49.5]
        Current: Naive branching recursion -- O(2^n)
        Better:  Memoized / iterative DP -- O(n)
        Tip: Add @cache / @lru_cache, or convert to iterative with a table

Full catalog — 23 patterns:

| Pattern | Anti-pattern detected | Better approach | Improvement |
| --- | --- | --- | --- |
| Nested loop lookup | for x in a: for y in b: if x==y | Hash-map join | O(n·m) → O(n+m) |
| Membership test | if x in list in a loop | Set lookup | O(n) → O(1) per check |
| Sorting | Bubble / selection sort | Built-in sort | O(n²) → O(n log n) |
| Search in sorted data | Linear scan on sorted sequence | Binary search | O(n) → O(log n) |
| String building | s += chunk in loop | join() / StringBuilder | O(n²) → O(n) |
| Deduplication | Nested loop dedup | set() / dict.fromkeys | O(n²) → O(n) |
| Max / min | Manual tracking loop | max() / min() | idiom |
| Accumulation | Manual accumulator | sum() / reduce() | idiom |
| Group by key | Manual key-existence check | defaultdict / groupingBy | idiom |
| Fibonacci | Naive recursion | Iterative / @lru_cache | O(2ⁿ) → O(n) |
| Exponentiation | Loop multiplication | pow(b, e, mod) | O(n) → O(log n) |
| GCD | Manual loop | math.gcd() | O(n) → O(log n) |
| Matrix multiply | Naive triple loop | NumPy / BLAS | same asymptotic, ~1000× faster via SIMD |
| Busy wait | while True: sleep() poll | Event / condition variable | O(k) → O(1) wake-up |
| Regex in loop | re.match() compiled per iteration | Pre-compiled pattern | O(n·(p+m)) → O(p + n·m) |
| N+1 query | Per-item DB / API call in loop | Batch WHERE IN (...) | n round-trips → 1 |
| List front operations | list.insert(0, x) in loop | collections.deque | O(n) → O(1) per op |
| Sort to select | sorted(x)[0] or sorted(x)[:k] | min() / heapq.nsmallest | O(n log n) → O(n) or O(n log k) |
| Repeated lookup | .index() / .contains() inside loop | Pre-built set / dict | O(m) → O(1) per lookup |
| Branching recursion | Naive f(n-1) + f(n-2) without cache | @cache / iterative DP | O(2ⁿ) → O(n) |
| Quadratic string building | result += chunk across multiple scopes | parts.append + join at end | O(n²) → O(n) |
| Loop-invariant call | get_config() / compile_schema() inside loop body | Hoist before loop | per-iter cost → O(1) |
| String reversal | Manual char-by-char loop | s[::-1] / .reverse() | idiom |
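As an illustration of the first catalog row, the sketch below shows a nested loop lookup next to the hash-map join that the tip "Build a dict/set from one collection, iterate the other" describes. The data and function names are made up for the example; this is not output from roam algo or one of its fix templates.

```python
def match_nested(users, orders):
    """O(n*m): for every user, scan every order."""
    pairs = []
    for user in users:
        for order in orders:
            if user["id"] == order["user_id"]:
                pairs.append((user["name"], order["total"]))
    return pairs


def match_hashed(users, orders):
    """O(n+m): index the orders once, then look each user up in the index."""
    totals_by_user = {}
    for order in orders:
        totals_by_user.setdefault(order["user_id"], []).append(order["total"])
    return [
        (user["name"], total)
        for user in users
        for total in totals_by_user.get(user["id"], [])
    ]


users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}]
orders = [
    {"user_id": 2, "total": 30},
    {"user_id": 1, "total": 10},
    {"user_id": 2, "total": 5},
]

print(match_nested(users, orders))
print(match_hashed(users, orders))
```

Both functions return the same pairs in the same order; the second touches each order once instead of once per user.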

Filtering:

roam algo --task nested-lookup       # one pattern type only
roam algo --confidence high          # high-confidence findings only
roam algo --profile strict           # precision-first filtering
roam algo --task io-in-loop -n 5    # top 5 N+1 query sites
roam --json algo                     # machine-readable output
roam --sarif algo > roam-algo.sarif  # SARIF with fingerprints + fixes

Confidence calibration: high = strong structural signal (unbounded loop + high caller/runtime impact + pattern confirmed); medium = pattern matched but uncertainty remains; low = heuristic signal only.

Profiles: balanced (default), strict (precision-first), aggressive (surface more candidates).

roam minimap — annotated codebase snapshot for CLAUDE.md

roam minimap generates a compact block (stack, annotated directory tree, key symbols, hotspots, conventions) wrapped in sentinel comments for in-place CLAUDE.md updates:

$ roam minimap
<!-- roam:minimap generated=2026-02-18 -->
**Stack:** Python · JavaScript · YAML

.github/ (4 files)
benchmarks/ (75 files)
src/
  roam/
    bridges/
      base.py               # LanguageBridge
      registry.py           # register_bridge, detect_bridges
    commands/ (93 files)    # is_test_file, get_changed_files
    db/
      connection.py         # find_project_root, batched_in
      schema.py
    graph/
      builder.py            # build_symbol_graph, build_file_graph
      pagerank.py           # compute_pagerank, compute_centrality
    languages/ (18 files)   # ApexExtractor
    output/
      formatter.py          # to_json, json_envelope
    cli.py                  # cli, LazyGroup
    mcp_server.py
tests/ (70 files)

Key symbols (PageRank): open_db · ensure_index · json_envelope · to_json · LanguageExtractor

Touch carefully (fan-in >= 15): to_json (116 callers) · json_envelope (116 callers) · open_db (105 callers) · ensure_index (100 callers)

Hotspots (churn x complexity): cmd_context.py · csharp_lang.py · cmd_dead.py

Conventions: snake_case fns, PascalCase classes
<!-- /roam:minimap -->

Workflow:

```bash
roam minimap                    # print to stdout
roam minimap --update           # replace sentinel block in CLAUDE.md in-place
roam minimap -o docs/AGENTS.md  # target a different file
roam minimap --init-notes       # scaffold .roam/minimap-notes.md for project gotchas
```

The sentinel pair <!-- roam:minimap --> / <!-- /roam:minimap --> is replaced on each run — surrounding content is left intact. Add project-specific gotchas to .roam/minimap-notes.md and they appear in every subsequent output.

Tree annotations come from the top exported symbols by fan-in per file. Non-source root directories (.github/, benchmarks/, docs/) are collapsed immediately. Large subdirectories (e.g. commands/, languages/) are collapsed at depth 2+ with a file count.
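The sentinel-based update described above can be sketched as a single regular-expression substitution. This is an illustrative approximation, not Roam's implementation: the function name update_minimap is hypothetical, and appending the block when no sentinel exists is an assumption, since the README does not describe that case.

```python
import re

# Matches the opening sentinel (with or without attributes such as
# generated=...), everything in between, and the closing sentinel.
SENTINEL_BLOCK = re.compile(
    r"<!-- roam:minimap\b[^>]*-->.*?<!-- /roam:minimap -->",
    re.DOTALL,
)


def update_minimap(document: str, new_block: str) -> str:
    """Replace the sentinel-delimited block, leaving surrounding text intact."""
    wrapped = f"<!-- roam:minimap -->\n{new_block.strip()}\n<!-- /roam:minimap -->"
    if SENTINEL_BLOCK.search(document):
        # A callable replacement avoids backslash-escape handling in new_block.
        return SENTINEL_BLOCK.sub(lambda _match: wrapped, document, count=1)
    # Assumption: with no sentinel present, append the block at the end.
    return document.rstrip("\n") + "\n\n" + wrapped + "\n"


claude_md = (
    "# Project notes\n\n"
    "<!-- roam:minimap generated=2026-02-18 -->\n"
    "old snapshot\n"
    "<!-- /roam:minimap -->\n\n"
    "Hand-written section.\n"
)
updated = update_minimap(claude_md, "Stack: Python")
print(updated)
```

Everything outside the sentinel pair, here the heading and the hand-written section, comes back unchanged.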

Architecture

| Command | Description |
| --- | --- |
| roam clusters [--min-size N] | Community detection vs directory structure. Modularity Q-score (Newman 2004) + per-cluster conductance |
| roam spectral [--depth N] [--compare] [--gap-only] [--k K] | Spectral bisection: Fiedler vector partition tree with algebraic connectivity gap verdict |
| roam layers | Topological dependency layers + upward violations + Gini balance |
| roam dead [--all] [--summary] [--clusters] | Unreferenced exported symbols with safety verdicts + confidence scoring (60-95%) |
| roam fan [symbol\|file] [-n N] [--no-framework] | Fan-in/fan-out: most connected symbols or files |
| roam risk [-n N] [--domain KW] [--explain] | Domain-weighted risk ranking |
| roam why <name> [name2 ...] | Role classification (Hub/Bridge/Core/Leaf), reach, criticality |
| roam split <file> | Internal symbol groups with isolation % and extraction suggestions |
| roam entry-points | Entry point catalog with protocol classification |
| roam patterns | Architectural pattern recognition: Strategy, Factory, Observer, etc. |
| roam visualize [--format mermaid\|dot] [--focus NAME] [--limit N] | Generate Mermaid or DOT architecture diagrams. Smart filtering via PageRank, cluster grouping, cycle highlighting |
| roam effects [TARGET] [--file F] [--type T] | Side-effect classification: DB writes, network I/O, filesystem, global mutation. Direct + transitive effects through call graph |
| roam dark-matter [--min-cochanges N] | Detect hidden co-change couplings not explained by import/call edges |
| roam simulate move\|extract\|merge\|delete | Counterfactual architecture simulator: test refactoring ideas in-memory, see metric deltas before writing code |
| roam orchestrate --agents N [--files P] | Multi-agent swarm partitioning: split codebase for parallel agents with zero-conflict guarantees |
| roam partition [--agents N] | Multi-agent partition manifest: conflict risk, complexity, and suggested ownership splits |
| roam fingerprint [--compact] [--compare F] | Topology fingerprint: extract/compare architectural signatures across repos |
| roam cut <target> [--depth N] | Minimum graph cuts: find critical edges whose removal disconnects components |
| roam safe-zones | Graph-based containment boundaries |
| roam coverage-gaps | Unprotected entry points with no path to gate symbols |
| roam duplicates [--threshold T] [--min-lines N] | Semantic duplicate detector: functionally equivalent code clusters with divergent edge-case handling |

Exploration

| Command | Description |
| --- | --- |
| roam module <path> | Directory contents: exports, signatures, dependencies, cohesion |
| roam sketch <dir> [--full] | Compact structural skeleton of a directory |
| roam uses <name> | All consumers: callers, importers, inheritors |
| roam owner <path> | Code ownership: who owns a file or directory |
| roam coupling [-n N] [--set] | Temporal coupling: file pairs that change together (NPMI + lift) |
| roam fn-coupling | Function-level temporal coupling across files |
| roam bus-factor [--brain-methods] | Knowledge loss risk per module |
| roam doc-staleness | Detect stale docstrings |
| roam docs-coverage | Public-symbol doc coverage + stale docs + PageRank-ranked missing-doc hotlist |
| roam suggest-refactoring [--limit N] [--min-score N] | Proactive refactoring recommendations ranked by complexity, coupling, churn, smells, coverage gaps, and debt |
| roam plan-refactor <symbol> [--operation auto\|extract\|move] | Ordered refactor plan with blast radius, test gaps, layer risk, and simulation-based strategy preview |
| roam conventions | Auto-detect naming styles, import preferences. Flags outliers |
| roam breaking [REV_RANGE] | Breaking change detection: removed exports, signature changes |
| roam affected-tests <symbol\|file> | Trace reverse call graph to test files |
| roam relate <sym1> <sym2> | Show relationship between two symbols: shared callers, shortest path, common ancestors |
| roam endpoints [--routes] [--api] | Enumerate all HTTP/API endpoint definitions and surface them for review or cross-repo matching |
| roam metrics <file\|symbol> | Unified vital signs: complexity, fan-in/out, PageRank, churn, test coverage, dead code risk -- all in one call |
| roam search-semantic <query> | Hybrid semantic search: BM25 + TF-IDF + optional local ONNX vectors (select via --backend) with framework/library packs |
| roam intent [--staged] [--range R] | Doc-to-code linking: match documentation to symbols, detect drift |
| roam schema [--diff] [--version V] | JSON envelope schema versioning: view, diff, and validate output schemas |
| roam x-lang [--bridges] [--edges] | Cross-language edge browser: inspect bridge-resolved connections |
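The roam coupling row names two measures, NPMI and lift, without defining them. The sketch below computes both from a list of commits using the standard textbook definitions. How Roam windows, filters, or thresholds commits is not stated in this README, so the function temporal_coupling and its input shape are assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def temporal_coupling(commits):
    """Score file pairs by how often they change in the same commit.

    commits: list of lists of file paths, one inner list per commit.
    Returns (file_a, file_b, npmi, lift) tuples, strongest coupling first.
    """
    n = len(commits)
    file_count = Counter()
    pair_count = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_count.update(unique)
        pair_count.update(combinations(unique, 2))

    results = []
    for (a, b), both in pair_count.items():
        p_ab = both / n
        p_a, p_b = file_count[a] / n, file_count[b] / n
        # lift > 1 means the pair co-changes more often than chance predicts.
        lift = p_ab / (p_a * p_b)
        # NPMI rescales log(lift) into [-1, 1]; 1 means "always together".
        npmi = 1.0 if p_ab == 1.0 else math.log(lift) / -math.log(p_ab)
        results.append((a, b, round(npmi, 3), round(lift, 3)))
    return sorted(results, key=lambda row: row[2], reverse=True)


commits = [
    ["app.py", "routes.py"],
    ["app.py", "routes.py"],
    ["app.py", "routes.py", "README.md"],
    ["README.md"],
]
ranking = temporal_coupling(commits)
for a, b, npmi, lift in ranking:
    print(f"{a} <-> {b}  npmi={npmi}  lift={lift}")
```

In this toy history app.py and routes.py never change apart, so they score the maximum NPMI of 1.0, while README.md pairs fall below zero because they co-change less often than chance would predict.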

Reports & CI

| Command | Description |
| --- | --- |
| roam report [--list] [--config FILE] [PRESET] | Compound presets: first-contact, security, pre-pr, refactor, guardian |
| roam describe --write | Generate agent config (auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.) |
| roam auth-gaps [--routes-only] [--controllers-only] [--min-confidence C] | Find endpoints missing authentication or authorization: routes outside auth middleware groups, CRUD methods without $this->authorize() / Gate::allows() checks. String-aware PHP brace parsing |
| roam orphan-routes [-n N] [--confidence C] | Detect backend routes with no frontend consumer: parses route definitions, searches frontend for API call references, reports controller methods with no route mapping |
| roam migration-safety [-n N] [--include-archive] | Detect non-idempotent migrations: missing hasTable/hasColumn guards, raw SQL without IF NOT EXISTS, index operations without existence checks |
| roam api-drift [--model M] [--confidence C] | Detect mismatches between PHP model $fillable/$appends fields and TypeScript interface properties. Auto-converts snake_case/camelCase for comparison. Single-repo; cross-repo planned for roam ws api-drift |
| roam codeowners [--unowned] [--owner NAME] | CODEOWNERS coverage analysis: owned/unowned files, top owners, and ownership risk |
| roam drift [--threshold N] | Ownership drift detection: declared ownership vs observed maintenance activity |
| roam suggest-reviewers [REV_RANGE] | Reviewer recommendation via ownership, recency, breadth, and impact signals |
| roam simulate-departure <developer> | Knowledge-loss simulation: what breaks if a key contributor leaves |
| roam dev-profile [--developer NAME] [--since N] | Developer productivity profile: commit patterns, specialization, impact, and knowledge concentration per contributor |
| roam secrets [--fail-on-found] [--include-tests] | Secret scanning with masking, entropy detection, env-var suppression, remediation suggestions, and optional CI gate failure |
| roam vulns [--import-file F] [--reachable-only] | Vulnerability scanning: ingest npm/pip/trivy/osv reports, auto-detect format, reachability filtering, SARIF output |
| roam path-coverage [--from P] [--to P] [--max-depth N] | Find critical call paths (entry -> sink) with zero test protection. Suggests optimal test insertion points |
| roam capsule [--redact-paths] [--no-signatures] [--output F] | Export sanitized structural graph (no code bodies) for external architectural review |
| roam rules [--init] [--ci] [--rules-dir D] | Plugin DSL for governance: user-defined path/symbol/AST rules via .roam/rules/ YAML ($METAVAR captures supported) |
| roam check-rules [--severity S] [--fix] | Evaluate built-in and user-defined governance rules (10 built-in: no-circular-imports, max-fan-out, etc.) |
| roam vuln-map --generic\|--npm-audit\|--trivy F | Ingest vulnerability reports and match to codebase symbols |
| roam vuln-reach [--cve C] [--from E] | Vulnerability reachability: exact paths from entry points to vulnerable calls |
| roam supply-chain [--top N] | Dependency risk dashboard: pin coverage, risk scoring, supply-chain health |
| roam invariants [--staged] [--range R] | Discover architectural contracts (invariants) from the codebase structure |

Multi-Repo Workspace

| Command | Description |
| --- | --- |
| roam ws init <repo1> <repo2> [--name NAME] | Initialize a workspace from sibling repos. Auto-detects frontend/backend roles |
| roam ws status | Show workspace repos, index ages, cross-repo edge count |
| roam ws resolve | Scan for REST API endpoints and match frontend calls to backend routes |
| roam ws understand | Unified workspace overview: per-repo stats + cross-repo connections |
| roam ws health | Workspace-wide health report with cross-repo coupling assessment |
| roam ws context <symbol> | Cross-repo augmented context: find a symbol across repos + show API callers |
| roam ws trace <source> <target> | Trace cross-repo paths via API edges |
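
To make the idea behind roam ws resolve concrete, here is a minimal Python sketch of matching frontend calls to backend routes. It is an illustration only: route_to_regex, match_calls, and the three placeholder syntaxes are assumptions, not Roam's implementation.

```python
import re

# Hypothetical placeholder styles: <id>, {id}, and :id all match one path segment.
_PLACEHOLDER = re.compile(r"<[^>]+>|\{[^}]+\}|:[A-Za-z_]\w*")


def route_to_regex(route):
    """Compile a backend route pattern such as '/api/users/<id>' into a regex."""
    parts = []
    for segment in route.strip("/").split("/"):
        if _PLACEHOLDER.fullmatch(segment):
            parts.append(r"[^/]+")          # dynamic segment
        else:
            parts.append(re.escape(segment))  # literal segment
    return re.compile("^/" + "/".join(parts) + "/?$")


def match_calls(frontend_calls, backend_routes):
    """Pair each (method, url) frontend call with the first matching backend route."""
    compiled = [(method, route, route_to_regex(route)) for method, route in backend_routes]
    edges = []
    for method, url in frontend_calls:
        path = url.split("?", 1)[0]  # ignore the query string
        for route_method, route, pattern in compiled:
            if method == route_method and pattern.match(path):
                edges.append(((method, url), (route_method, route)))
                break
    return edges
```

Each returned pair plays the role of one cross-repo API edge; calls with no matching route simply produce no edge.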

Global Options

| Option | Description |
| --- | --- |
| roam --json <command> | Structured JSON output with consistent envelope |
| roam --compact <command> | Token-efficient output: TSV tables, minimal JSON envelope |
| roam --sarif <command> | SARIF 2.1.0 output for dead, health, complexity, rules, secrets, and algo (GitHub/CI integration) |
| roam <command> --gate EXPR | CI quality gate (e.g., --gate "score>=70"; quote the expression so the shell does not treat > as a redirect). Exit code 1 on failure |
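
A gate expression such as score>=70 compares one metric against a threshold and turns the result into an exit code. The Python sketch below shows one way that evaluation could work. The function names and the set of supported operators are assumptions for illustration, not Roam's actual parser.

```python
import operator
import re

# Assumed operator set; the source only shows ">=".
_OPS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}
_GATE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def check_gate(expr, metrics):
    """Return True if the metric named in expr satisfies the comparison."""
    match = _GATE.match(expr)
    if match is None:
        raise ValueError(f"unsupported gate expression: {expr!r}")
    name, op, threshold = match.groups()
    return _OPS[op](metrics[name], float(threshold))


def exit_code(expr, metrics):
    """Map the gate result to a CI exit code: 0 on pass, 1 on failure."""
    return 0 if check_gate(expr, metrics) else 1
```

On a shell command line, the expression should be quoted (for example "score>=70"), because an unquoted > is parsed as output redirection in POSIX shells.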

Walkthrough: Investigating a Codebase

10-step walkthrough using Flask as an example

Here's how you'd use Roam to understand a project you've never seen before. Using Flask as an example:

Step 1: Onboard and get the full picture

$ roam init
Created .roam/fitness.yaml (6 starter rules)
Created .github/workflows/roam.yml
Done. 226 files, 1132 symbols, 233 edges.
Health: 78/100

$ roam understand
Tech stack: Python (flask, jinja2, werkzeug)
Architecture: Monolithic — 3 layers, 5 clusters
Key abstractions: Flask, Blueprint, Request, Response
Health: 78/100 — 1 god component (Flask)
Entry points: src/flask/__init__.py, src/flask/cli.py
Conventions: snake_case functions, PascalCase classes, relative imports
Complexity: avg 4.2, 3 high (>15), 0 critical (>25)

Step 2: Drill into a key file

$ roam file src/flask/app.py
src/flask/app.py  (python, 963 lines)

  cls  Flask(App)                                   :76-963
    meth  __init__(self, import_name, ...)           :152
    meth  route(self, rule, **options)               :411
    meth  register_blueprint(self, blueprint, ...)   :580
    meth  make_response(self, rv)                    :742
    ...12 more methods

Step 3: Who depends on this?

$ roam deps src/flask/app.py
Imported by:
file                        symbols
--------------------------  -------
src/flask/__init__.py       3
src/flask/testing.py        2
tests/test_basic.py         1
...18 files total

Step 4: Find the hotspots

$ roam weather
=== Hotspots (churn x complexity) ===
Score  Churn  Complexity  Path                    Lang
-----  -----  ----------  ----------------------  ------
18420  460    40.0        src/flask/app.py        python
12180  348    35.0        src/flask/blueprints.py python
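
The hotspot table is labeled churn x complexity. A minimal Python sketch of that ranking follows. FileStats and rank_hotspots are hypothetical names, not Roam's API, and Roam's own scoring may round or weight values differently: the sample row for app.py shows 18420 rather than 460 x 40.0 = 18400, which suggests the displayed complexity is rounded.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    churn: int         # e.g. number of changes in git history
    complexity: float  # e.g. aggregated complexity of the file


def rank_hotspots(files, top=10):
    """Return (score, stats) pairs sorted by churn * complexity, highest first."""
    scored = [(f.churn * f.complexity, f) for f in files]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]
```

Multiplying the two signals means a file only ranks high when it is both frequently changed and hard to reason about.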

Step 5: Check architecture health

$ roam health
Health: 78/100
  Tangle: 0.0% (0/1132 symbols in cycles)
  1 god component (Flask, degree 47, actionable)
  0 bottlenecks, 0 layer violations

=== God Components (degree > 20) ===
Sev      Name   Kind  Degree  Cat  File
-------  -----  ----  ------  ---  ------------------
WARNING  Flask  cls   47      act  src/flask/app.py

Step 6: Get AI-ready context for a symbol

$ roam context Flask
Files to read:
  src/flask/app.py:76-963              # definition
  src/flask/__init__.py:1-15           # re-export
  src/flask/testing.py:22-45           # caller: FlaskClient.__init__
  tests/test_basic.py:12-30            # caller: test_app_factory
  ...12 more files

Callers: 47  Callees: 3

Step 7: Pre-change safety check

$ roam preflight Flask
=== Preflight: Flask ===
Blast radius: 47 callers, 89 transitive
Affected tests: 31 (DIRECT: 12, TRANSITIVE: 19)
Complexity: cc=40 (critical), nesting=6
Coupling: 3 hidden co-change partners
Fitness: 1 violation (max-complexity exceeded)
Verdict: HIGH RISK — consider splitting before modifying
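
The blast radius figures distinguish direct callers from transitive ones. As a rough sketch of that idea, the Python function below walks a reverse call graph breadth-first. The callers_of mapping and the choice to count direct callers inside the transitive set are assumptions; Roam's exact definition is not stated here.

```python
from collections import deque


def blast_radius(callers_of, symbol):
    """Return (direct, transitive) caller sets for symbol.

    callers_of maps each symbol to the set of symbols that call it.
    The transitive set here includes the direct callers.
    """
    direct = set(callers_of.get(symbol, ()))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        current = queue.popleft()
        for caller in callers_of.get(current, ()):
            if caller not in seen and caller != symbol:
                seen.add(caller)
                queue.append(caller)
    return direct, seen
```

The seen set doubles as cycle protection, so mutually recursive callers are visited once.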

Step 8: Decompose a large file

$ roam split src/flask/app.py
=== Split analysis: src/flask/app.py ===
  87 symbols, 42 internal edges, 95 external edges
  Cross-group coupling: 18%

  Group 1 (routing) — 12 symbols, isolation: 83% [extractable]
    meth  route              L411  PR=0.0088
    meth  add_url_rule       L450  PR=0.0045
    ...

=== Extraction Suggestions ===
  Extract 'routing' group: route, add_url_rule, endpoint (+9 more)
    83% isolated, only 3 edges to other groups

Step 9: Understand why a symbol matters

$ roam why Flask url_for Blueprint
Symbol     Role          Fan         Reach     Risk      Verdict
---------  ------------  ----------  --------  --------  --------------------------------------------------
Flask      Hub           fan-in:47   reach:89  CRITICAL  God symbol (47 in, 12 out). Consider splitting.
url_for    Core utility  fan-in:31   reach:45  HIGH      Widely used utility (31 callers). Stable interface.
Blueprint  Bridge        fan-in:18   reach:34  moderate  Coupling point between clusters.

Step 10: Generate docs and set up CI

$ roam describe --write
Wrote CLAUDE.md (98 lines)  # auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.

$ roam health --gate "score>=70"
Health: 78/100 — PASS

Ten steps. Complete picture: structure, dependencies, hotspots, health, context, safety checks, decomposition, and CI gates.

Integration with AI Coding Tools

Roam is designed to be called by coding agents via shell commands. Instead of repeatedly grepping and reading files, the agent runs one roam command and gets structured output.

Decision order for agents:

| Situation | Command |
| --- | --- |
| First time in a repo | roam understand then roam tour |
| Need to modify a symbol | roam preflight <name> (blast radius + tests + fitness) |
| Debugging a failure | roam diagnose <name> (root cause ranking) |
| Need files to read | roam context <name> (files + line ranges) |
| Need to find a symbol | roam search <pattern> |
| Need file structure | roam file <path> |
| Pre-PR check | roam pr-risk HEAD~3..HEAD |
| What breaks if I change X? | roam impact <symbol> |
| Check for N+1 queries | roam n1 (implicit lazy-load detection) |
| Check auth coverage | roam auth-gaps (routes + controllers) |
| Check migration safety | roam migration-safety (idempotency guards) |

Fastest setup:

roam describe --write               # auto-detects your agent's config file
roam describe --write -o AGENTS.md  # or specify an explicit path
roam describe --agent-prompt        # compact ~500-token prompt (append to any config)
roam minimap --update               # inject/refresh annotated codebase minimap in CLAUDE.md

Agent not using Roam correctly? If your agent is ignoring Roam and falling back to grep/read exploration, it likely doesn't have the instructions. Run:

roam describe --write          # writes instructions to your agent's config (CLAUDE.md, AGENTS.md, etc.)

If you already have a config file and don't want to overwrite it:

roam describe --agent-prompt   # prints a compact prompt — copy-paste into your existing config
roam minimap --update          # injects an annotated codebase snapshot into CLAUDE.md (won't touch other content)

This teaches the agent which Roam command to use for each situation (e.g., roam preflight before changes, roam context for files to read, roam diagnose for debugging).

Copy-paste agent instructions
## Codebase navigation

This project uses `roam` for codebase comprehension. Always prefer roam over Glob/Grep/Read exploration.

Before modifying any code:
1. First time in the repo: `roam understand` then `roam tour`
2. Find a symbol: `roam search <pattern>`
3. Before changing a symbol: `roam preflight <name>` (blast radius + tests + fitness)
4. Need files to read: `roam context <name>` (files + line ranges, prioritized)
5. Debugging a failure: `roam diagnose <name>` (root cause ranking)
6. After making changes: `roam diff` (blast radius of uncommitted changes)

Additional: `roam health` (0-100 score), `roam impact <name>` (what breaks),
`roam pr-risk` (PR risk), `roam file <path>` (file skeleton).

Run `roam --help` for all commands. Use `roam --json <cmd>` for structured output.

Where to put this for each tool

| Tool | Config file |
| --- | --- |
| Claude Code | CLAUDE.md in your project root |
| OpenAI Codex CLI | AGENTS.md in your project root |
| Gemini CLI | GEMINI.md in your project root |
| Cursor | .cursor/rules/roam.mdc (add alwaysApply: true frontmatter) |
| Windsurf | .windsurf/rules/roam.md (add trigger: always_on frontmatter) |
| GitHub Copilot | .github/copilot-instructions.md |
| Aider | CONVENTIONS.md |
| Continue.dev | config.yaml rules |
| Cline | .clinerules/ directory |

Roam vs native tools

| Task | Use Roam | Use native tools |
| --- | --- | --- |
| "What calls this function?" | roam symbol <name> | LSP / Grep |
| "What files do I need to read?" | roam context <name> | Manual tracing (5+ calls) |
| "Is it safe to change X?" | roam preflight <name> | Multiple manual checks |
| "Show me this file's structure" | roam file <path> | Read the file directly |
| "Understand project architecture" | roam understand | Manual exploration |
| "What breaks if I change X?" | roam impact <symbol> | No direct equivalent |
| "What tests to run?" | roam affected-tests <name> | Grep for imports (misses indirect) |
| "What's causing this bug?" | roam diagnose <name> | Manual call-chain tracing |
| "Codebase health score for CI" | roam health --gate "score>=70" | No equivalent |

MCP Server

Roam includes a Model Context Protocol server for direct integration with tools that support MCP.

pip install "roam-code[mcp]"
roam mcp

101 tools, 10 resources, and 5 prompts are available in the full preset. Most tools are read-only index queries; side-effect tools are explicitly annotated.

MCP v2 highlights (v11):

  • In-process MCP execution (no subprocess shell-out per call)
  • Preset-based tool surfacing (core, review, refactor, debug, architecture, full)
  • Compound tools that collapse multi-step exploration/review flows into one call
  • Structured output schemas + tool annotations for safer planner behavior

Default preset: core (24 tools: 23 core + roam_expand_toolset meta-tool).

# Default
roam mcp

# Full toolset
ROAM_MCP_PRESET=full roam mcp

# Legacy compatibility (same as full preset)
ROAM_MCP_LITE=0 roam mcp

Core preset tools: roam_affected_tests, roam_batch_get, roam_batch_search, roam_complexity_report, roam_context, roam_dead_code, roam_deps, roam_diagnose, roam_diagnose_issue, roam_diff, roam_expand_toolset, roam_explore, roam_file_info, roam_health, roam_impact, roam_pr_risk, roam_preflight, roam_prepare_change, roam_review_change, roam_search_symbol, roam_syntax_check, roam_trace, roam_understand, roam_uses.

MCP tool list (all 101)

| Tool | Description |
| --- | --- |
| roam_understand | Full codebase briefing |
| roam_health | Health score (0-100) + issues |
| roam_preflight | Pre-change safety check |
| roam_search_symbol | Find symbols by name |
| roam_context | Files-to-read for modifying a symbol |
| roam_trace | Dependency path between two symbols |
| roam_impact | Blast radius of changing a symbol |
| roam_file_info | File skeleton with all definitions |
| roam_pr_risk | Risk score for pending changes |
| roam_breaking_changes | Detect breaking changes between refs |
| roam_affected_tests | Find tests affected by a change |
| roam_dead_code | List unreferenced exports |
| roam_complexity_report | Per-symbol cognitive complexity |
| roam_repo_map | Project skeleton with key symbols |
| roam_tour | Auto-generated onboarding guide |
| roam_diagnose | Root cause analysis for debugging |
| roam_visualize | Generate Mermaid or DOT architecture diagrams |
| roam_algo | Algorithm anti-pattern detection with language-aware tips |
| roam_ws_understand | Unified multi-repo workspace overview |
| roam_ws_context | Cross-repo augmented symbol context |
| roam_pr_diff | Structural PR diff: metric deltas, edge analysis, symbol changes |
| roam_budget_check | Check changes against architectural budgets |
| roam_effects | Side-effect classification (DB writes, network, filesystem) |
| roam_attest | Proof-carrying PR attestation with all evidence bundled |
| roam_capsule_export | Export sanitized structural graph (no code bodies) |
| roam_path_coverage | Find critical untested call paths (entry -> sink) |
| roam_forecast | Predict when metrics will exceed thresholds |
| roam_simulate | Counterfactual architecture simulator |
| roam_orchestrate | Multi-agent swarm partitioning |
| roam_fingerprint | Topology fingerprint comparison |
| roam_mutate | Graph-level code editing (move/rename/extract) |
| roam_dark_matter | Hidden co-change coupling detection |
| roam_closure | Minimal-change synthesis for rename/delete |
| roam_adversarial_review | Adversarial architecture review |
| roam_generate_plan | Agent work planner |
| roam_get_invariants | Architectural invariant discovery |
| roam_bisect_blame | Architectural git bisect |
| roam_doc_intent | Doc-to-code linking |
| roam_cut_analysis | Minimum graph cut analysis |
| roam_annotate_symbol | Attach persistent notes to symbols |
| roam_get_annotations | View stored annotations |
| roam_relate | Show relationship between two symbols |
| roam_search_semantic | Semantic search by meaning |
| roam_rules_check | Plugin DSL governance rules |
| roam_check_rules | Built-in + user-defined governance rule evaluation with autofix templates |
| roam_supply_chain | Dependency risk dashboard: pin coverage and supply-chain health |
| roam_spectral | Spectral bisection: Fiedler vector partition tree and modularity gap |
| roam_vuln_map | Vulnerability report ingestion |
| roam_vuln_reach | Vulnerability reachability paths |
| roam_ingest_trace | Ingest runtime trace data |
| roam_runtime_hotspots | Runtime hotspot analysis |
| roam_diff | Blast radius of uncommitted/committed changes |
| roam_symbol | Symbol definition, callers, callees, metrics |
| roam_deps | File-level import/imported-by relationships |
| roam_uses | All consumers of a symbol by edge type |
| roam_weather | Code hotspots: churn x complexity ranking |
| roam_debt | Hotspot-weighted technical debt prioritization with optional ROI estimate |
| roam_docs_coverage | Doc coverage and stale-doc drift with PageRank-ranked missing docs |
| roam_suggest_refactoring | Rank proactive refactoring candidates using complexity, coupling, churn, smells, and coverage gaps |
| roam_plan_refactor | Build an ordered refactor plan for one symbol with risk/test/simulation context |
| roam_n1 | Detect N+1 I/O patterns in ORM code |
| roam_auth_gaps | Find endpoints missing auth |
| roam_over_fetch | Detect models serializing too many fields |
| roam_missing_index | Find queries on non-indexed columns |
| roam_orphan_routes | Detect dead backend routes |
| roam_migration_safety | Detect non-idempotent migrations |
| roam_api_drift | Backend/frontend model mismatch detection |
| roam_expand_toolset | Discover presets, active toolset, and switch instructions |
| roam_explore | Compound first-contact exploration bundle for fast repo orientation |
| roam_prepare_change | Compound pre-change bundle: context, blast radius, risk, and tests |
| roam_review_change | Compound review bundle for changed code and architecture checks |
| roam_diagnose_issue | Compound debugging bundle with ranked suspects and dependency context |
| roam_onboard | Structured onboarding brief for new contributors/agents |
| roam_syntax_check | Tree-sitter syntax integrity validation for changed paths |
| roam_agent_export | Generate multi-agent instruction bundles (AGENTS.md + overlays) |
| roam_vibe_check | AI-rot auditor with 8-pattern taxonomy and composite score |
| roam_ai_readiness | AI-agent effectiveness readiness scoring and recommendations |
| roam_dashboard | Unified status snapshot across health, risk, churn, and quality |
| roam_codeowners | CODEOWNERS coverage analysis and unowned file discovery |
| roam_drift | Ownership drift detection from declared vs observed ownership |
| roam_suggest_reviewers | Reviewer recommendations with multi-signal scoring |
| roam_simulate_departure | Knowledge-loss simulation for contributor departure scenarios |
| roam_verify | Pre-commit consistency verification and policy checks |
| roam_api_changes | API signature change classification and severity labeling |
| roam_test_gaps | Changed-symbol test gap analysis |
| roam_ai_ratio | Estimated AI-generated code ratio from repository signals |
| roam_duplicates | Semantic duplicate detection across structurally similar functions |
| roam_partition | Multi-agent partition manifest with conflict and complexity scores |
| roam_affected | Monorepo/package affected-set analysis for diffs |
| roam_semantic_diff | Structural diff of symbol/edge changes |
| roam_trends | Historical metric trend retrieval with sparkline output |
| roam_secrets | Secret scanning with masking and CI-friendly fail behavior |
| roam_endpoints | Enumerate HTTP/API endpoint definitions across the codebase |
| roam_doctor | Diagnose installation and environment health |
| roam_init | Initialize roam workspace state and build the first index |
| roam_reindex | Refresh or force-rebuild the index with task-mode support |
| roam_reset | Reset the roam index and cached data |
| roam_clean | Remove stale or orphaned index entries |
| roam_batch_search | Batch symbol search: run multiple pattern queries in a single call |
| roam_batch_get | Batch context retrieval: fetch multiple symbols/files in a single call |
| roam_dev_profile | Developer productivity profile: commit patterns, specialization, and impact |

Resources include roam://health (current health score) and roam://summary (project overview).

Claude Code
claude mcp add roam-code -- roam mcp

Or add to .mcp.json in your project root:

{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
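
If a project already has a .mcp.json with other servers, the entry can be merged in rather than pasted by hand. The Python sketch below assumes the file layout shown above; add_roam_server is a hypothetical helper, not part of Roam.

```python
import json
from pathlib import Path


def add_roam_server(config_path):
    """Add or overwrite the roam-code entry in an MCP config file, keeping other servers."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["roam-code"] = {"command": "roam", "args": ["mcp"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_roam_server(".mcp.json") from the project root creates the file if it is missing and leaves any existing server entries untouched.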
Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"],
      "cwd": "/path/to/your/project"
    }
  }
}
Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
VS Code + Copilot

Add to .vscode/mcp.json:

{
  "servers": {
    "roam-code": {
      "type": "stdio",
      "command": "roam",
      "args": ["mcp"]
    }
  }
}

CI/CD Integration

All you need is Python 3.9+ and pip install roam-code.

GitHub Actions

# .github/workflows/roam.yml
name: Roam Analysis
on: [pull_request]

jobs:
  roam:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: Cranot/roam-code@main
        with:
          command: health --gate score>=70
          comment: true
          fail-on-violation: true

Use roam init to auto-generate this workflow.

| Input | Default | Description |
| --- | --- | --- |
| command | health | Roam command to run |
| python-version | 3.12 | Python version |
| comment | false | Post results as PR comment |
| fail-on-violation | false | Fail the job on violations |
| roam-version | (latest) | Pin to a specific version |

GitLab CI

roam-analysis:
  stage: test
  image: python:3.12-slim
  before_script:
    - pip install roam-code
  script:
    - roam index
    - roam health --gate "score>=70"
    - roam --json pr-risk origin/main..HEAD > roam-report.json
  artifacts:
    paths:
      - roam-report.json
  rules:
    - if: $CI_MERGE_REQUEST_IID
Azure DevOps / any CI

Universal pattern:

pip install roam-code
roam index
roam health --gate "score>=70"  # exit 1 on failure
roam --json health > report.json

SARIF Output

Roam exports analysis results in SARIF 2.1.0 format for GitHub Code Scanning.

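from roam.output.sarif import health_to_sarif, write_sarif

# health_data holds the health results to convert; building it is not shown in this snippet.
sarif = health_to_sarif(health_data)
write_sarif(sarif, "roam-health.sarif")

Then upload the file from a GitHub Actions workflow step:

- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: roam-health.sarif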
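
For readers unfamiliar with the format, the Python sketch below builds a minimal SARIF 2.1.0 document by hand. It is not Roam's health_to_sarif; the to_sarif function, the finding dictionaries, and the tool name are assumptions chosen to show the shape of the output.

```python
import json


def to_sarif(findings, tool_name="example-analyzer"):
    """Build a minimal SARIF 2.1.0 log from dicts with rule_id, message, path, line."""
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = [
        {
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name, "rules": [{"id": r} for r in rule_ids]}},
            "results": results,
        }],
    }


if __name__ == "__main__":
    demo = [{"rule_id": "god-component", "message": "Flask has degree 47",
             "path": "src/flask/app.py", "line": 76}]
    print(json.dumps(to_sarif(demo), indent=2))
```

Each result points at a file and start line, which is what lets a code-scanning UI annotate the finding in place.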

For Teams

Zero infrastructure, zero vendor lock-in, zero data leaving your network.

| Tool | Annual cost (20-dev team) | Infrastructure | Setup time |
| --- | --- | --- | --- |
| SonarQube Server | $15,000-$45,000 | Self-hosted server | Days |
| CodeScene | $20,000-$60,000 | SaaS or on-prem | Hours |
| Code Climate | $12,000-$36,000 | SaaS | Hours |
| Roam | $0 (MIT license) | None (local) | 5 minutes |

Team rollout guide

Week 1-2 (pilot): 1-2 developers run roam init on one repo. Use roam preflight before changes, roam pr-risk before PRs.

Week 3-4 (expand): Add roam health --gate "score>=60" to CI as a non-blocking check.

Month 2+ (standardize): Tighten to --gate "score>=70". Expand to additional repos. Track trajectory with roam trend.

Complements your existing stack

| If you use... | Roam adds... |
| --- | --- |
| SonarQube | Architecture-level analysis: dependency cycles, god components, blast radius, health scoring |
| CodeScene | Free, local alternative for health scoring and hotspot analysis |
| ESLint / Pylint | Cross-language architecture checks. Linters enforce style per file; Roam enforces architecture across the codebase |
| LSP | AI-agent-optimized queries. roam context answers "what calls this?" with PageRank-ranked results in one call |

Language Support

Tier 1 -- Full extraction (dedicated parsers)

| Language | Extensions | Symbols | References | Inheritance |
| --- | --- | --- | --- | --- |
| Python | .py .pyi | classes, functions, methods, decorators, variables | imports, calls, inheritance | extends, __all__ exports |
| JavaScript | .js .jsx .mjs .cjs | classes, functions, arrow functions, CJS exports | imports, require(), calls | extends |
| TypeScript | .ts .tsx .mts .cts | interfaces, type aliases, enums + all JS | imports, calls, type refs | extends, implements |
| Java | .java | classes, interfaces, enums, constructors, fields | imports, calls | extends, implements |
| Go | .go | structs, interfaces, functions, methods, fields | imports, calls | embedded structs |
| Rust | .rs | structs, traits, impls, enums, functions | use, calls | impl Trait for Struct |
| C / C++ | .c .h .cpp .hpp .cc | structs, classes, functions, namespaces, templates | includes, calls | extends |
| C# | .cs | classes, interfaces, structs, enums, records, methods, constructors, properties, delegates, events, fields | using directives, calls, new, attributes | extends, implements |
| PHP | .php | classes, interfaces, traits, enums, methods, properties | namespace use, calls, static calls, new | extends, implements, use (traits) |
| Visual FoxPro | .prg | functions, procedures, classes, methods, properties, constants | DO, SET PROCEDURE/CLASSLIB, CREATEOBJECT, =func(), obj.method() | DEFINE CLASS ... AS |
| YAML (CI/CD) | .yml .yaml | GitLab CI: jobs, template anchors, stages. GitHub Actions: workflow name, jobs, reusable workflows. Generic: top-level keys | extends:, needs:, !reference, uses: | |
| HCL / Terraform | .tf .tfvars .hcl | resource, data, variable, output, module, provider, locals entries | var.*, module.*, data.*, local.*, resource cross-refs | |
| Vue | .vue | via <script> block extraction (TS/JS) | imports, calls, type refs | extends, implements |
| Svelte | .svelte | via <script> block extraction (TS/JS) | imports, calls, type refs | extends, implements |

Salesforce ecosystem (Tier 1)

| Language | Extensions | Symbols | References |
| --- | --- | --- | --- |
| Apex | .cls .trigger | classes, triggers, SOQL, annotations | imports, calls, System.Label, generic type refs |
| Aura | .cmp .app .evt .intf .design | components, attributes, methods, events | controller refs, component refs |
| LWC (JavaScript) | .js (in LWC dirs) | anonymous class from filename | @salesforce/apex/, @salesforce/schema/, @salesforce/label/ |
| Visualforce | .page .component | pages, components | controller/extensions, merge fields, includes |
| SF Metadata XML | *-meta.xml | objects, fields, rules, layouts | Apex class refs, formula field refs, Flow actionCalls |

Cross-language edges mean roam impact AccountService shows blast radius across Apex, LWC, Aura, Visualforce, and Flows.

| Language | Extensions | Symbols | References | Inheritance |
| --- | --- | --- | --- | --- |
| Ruby | .rb | classes, modules, methods, singleton methods, constants | require, require_relative, include/extend, calls, ClassName.new | class inheritance |
| JSONC | .jsonc | via JSON grammar | -- | -- |
| MDX | .mdx | via Markdown grammar | -- | -- |

Tier 2 -- Generic extraction

Kotlin (.kt .kts), Swift (.swift), Scala (.scala .sc)

Tier 2 languages get symbol extraction and basic inheritance via a generic tree-sitter walker.

Performance

| Metric | Value |
| --- | --- |
| Index 200 files | ~3-5s |
| Index 3,000 files | ~2 min |
| Incremental (no changes) | <1s |
| Any query command | <0.5s |

Detailed benchmarks

Indexing Speed

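| Project | Language | Files | Symbols | Edges | Index Time | Rate |
| --- | --- | --- | --- | --- | --- | --- |
| Express | JS | 211 | 624 | 804 | 3s | 70 files/s |
| Axios | JS | 237 | 1,065 | 868 | 6s | 41 files/s |
| Vue | TS | 697 | 5,335 | 8,984 | 25s | 28 files/s |
| Laravel | PHP | 3,058 | 39,097 | 38,045 | 1m46s | 29 files/s |
| Svelte | TS | 8,445 | 16,445 | 19,618 | 2m40s | 52 files/s |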

Quality Benchmark

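| Repo | Language | Score | Coverage | Edge Density | Commands |
| --- | --- | --- | --- | --- | --- |
| Laravel | PHP | 9.55 | 91.2% | 0.97 | 29/29 |
| Vue | TS | 9.27 | 85.8% | 1.68 | 29/29 |
| Svelte | TS | 9.04 | 94.7% | 1.19 | 29/29 |
| Axios | JS | 8.98 | 85.9% | 0.82 | 29/29 |
| Express | JS | 8.46 | 96.0% | 1.29 | 29/29 |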

Token Efficiency

| Metric | Value |
| --- | --- |
| 1,600-line file → roam file | ~5,000 chars (~70:1 compression) |
| Full project map | ~4,000 chars |
| --compact mode | 40-50% additional token reduction |
| roam preflight replaces | 5-7 separate agent tool calls |

Agent-efficiency benchmark write-up: reports/09_agent_efficiency_benchmarks.md.

How It Works

Codebase
    |
[1] Discovery ──── git ls-files (respects .gitignore + .roamignore)
    |
[2] Parse ──────── tree-sitter AST per file (26 languages)
    |
[3] Extract ────── symbols + references (calls, imports, inheritance)
    |
[4] Resolve ────── match references to definitions → edges
    |
[5] Metrics ────── adaptive PageRank, betweenness, cognitive complexity, Halstead
    |
[6] Algorithms ── 23-pattern anti-pattern catalog (O(n^2) loops, N+1, recursion)
    |
[7] Git ────────── churn, co-change matrix, authorship, Renyi entropy
    |
[8] Clusters ───── Louvain community detection
    |
[9] Health ─────── per-file scores (7-factor) + composite score (0-100)
    |
[10] Store ─────── .roam/index.db (SQLite, WAL mode)

After the first full index, roam index only re-processes changed files (mtime + SHA-256 hash). Incremental updates are near-instant.
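
The mtime + SHA-256 check can be sketched in a few lines of Python. This is an illustration of the general technique under stated assumptions, not Roam's incremental.py: the in-memory index dict and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths, index):
    """Return the paths whose content changed; index maps path -> (mtime_ns, sha256)."""
    changed = []
    for path in paths:
        key = str(path)
        mtime = Path(path).stat().st_mtime_ns
        previous = index.get(key)
        if previous is not None and previous[0] == mtime:
            continue  # cheap check: same mtime, assume unchanged
        digest = sha256_of(path)
        if previous is None or previous[1] != digest:
            changed.append(path)  # new file or different content
        index[key] = (mtime, digest)  # remember the latest mtime and hash
    return changed
```

The mtime comparison avoids hashing files that were not touched, and the hash avoids re-processing files that were touched without a content change, such as after a checkout that rewrites timestamps.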

Graph algorithms
  • Adaptive PageRank -- damping factor auto-tunes based on cycle density (0.82-0.92); identifies the most important symbols (used by map, search, context)
  • Personalized PageRank -- distance-weighted blast radius for impact (Gleich, 2015)
  • Adaptive betweenness centrality -- exact for small graphs, sqrt-scaled sampling for large (Brandes & Pich, 2007); finds bottleneck symbols
  • Edge betweenness centrality -- identifies critical cycle-breaking edges in SCCs (Brandes, 2001)
  • Tarjan's SCC -- detects dependency cycles with tangle ratio
  • Propagation Cost -- fraction of system affected by any change, via transitive closure (MacCormack, Rusnak & Baldwin, 2006)
  • Algebraic connectivity (Fiedler value) -- second-smallest Laplacian eigenvalue; measures architectural robustness (Fiedler, 1973)
  • Louvain community detection -- groups related symbols into clusters
  • Modularity Q-score -- measures if cluster boundaries match natural community structure (Newman, 2004)
  • Conductance -- per-cluster boundary tightness: cut(S, S_bar) / min(vol(S), vol(S_bar)) (Yang & Leskovec)
  • Topological sort -- computes dependency layers, Gini coefficient for layer balance (Gini, 1912), weighted violation severity
  • k-shortest simple paths -- traces dependency paths with coupling strength
  • Renyi entropy (order 2) -- measures co-change distribution; more robust to outliers than Shannon (Renyi, 1961)
  • Mann-Kendall trend test -- non-parametric degradation detection, robust to noise (Mann, 1945; Kendall, 1975)
  • Sen's slope estimator -- robust trend magnitude, resistant to outliers (Sen, 1968)
  • NPMI -- Normalized Pointwise Mutual Information for coupling strength (Bouma, 2009)
  • Lift -- association rule mining metric for co-change statistical significance (Agrawal & Srikant, 1994)
  • Halstead metrics -- volume, difficulty, effort, and predicted bugs from operator/operand counts (Halstead, 1977)
  • SQALE remediation cost -- time-to-fix estimates per issue type for tech debt prioritization (Letouzey, 2012)
  • Algorithm anti-pattern catalog -- 23 patterns detecting suboptimal algorithms (quadratic loops, N+1 queries, quadratic string building, branching recursion, manual top-k, loop-invariant calls) with confidence calibration via caller-count and bounded-loop analysis
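
As one worked example from the list above, the Python sketch below runs Tarjan's SCC algorithm on a small dependency graph and derives a tangle ratio as the share of nodes that sit in a cycle of two or more symbols. The function names and the exclusion of self-loops are assumptions for illustration, not Roam's code; the recursive form also suits small graphs only.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps node -> iterable of successor nodes."""
    index_of, lowlink = {}, {}
    stack, on_stack, components = [], set(), []
    counter = 0

    def visit(node):
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index_of:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index_of[succ])
        if lowlink[node] == index_of[node]:  # node is the root of a component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    for node in sorted(nodes):
        if node not in index_of:
            visit(node)
    return components


def tangle_ratio(graph):
    """Fraction of nodes that belong to a multi-node dependency cycle."""
    components = strongly_connected_components(graph)
    total = sum(len(c) for c in components)
    in_cycles = sum(len(c) for c in components if len(c) > 1)
    return in_cycles / total if total else 0.0
```

For a graph where a, b, and c import each other in a loop and d and e do not, three of five nodes are tangled, giving a ratio of 0.6.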
Health scoring

Composite health score (0-100) using a weighted geometric mean of sigmoid health factors. Non-compensatory: a zero in any dimension cannot be masked by high scores in others.

| Factor | Weight | What it measures |
| --- | --- | --- |
| Tangle ratio | 30% | % of symbols in dependency cycles |
| God components | 20% | Symbols with extreme fan-in/fan-out |
| Bottlenecks | 15% | High-betweenness chokepoints |
| Layer violations | 15% | Upward dependency violations (severity-weighted by layer distance) |
| Per-file health | 20% | Average of 7-factor file health scores |

Each factor uses what Roam calls sigmoid health, an exponential decay h = e^(-signal/scale) (1 = pristine, approaching 0 = worst). The composite is Score = 100 * product(h_i ^ w_i), where w_i are the weights in the table above. Roam also reports propagation cost (MacCormack 2006) and algebraic connectivity (Fiedler 1973). Per-file health (1-10) combines seven factors: cognitive complexity (triangular nesting penalty per Sweller's Cognitive Load Theory), indentation complexity, cycle membership, god component membership, dead export ratio, co-change entropy, and churn amplification.
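
The composite formula can be checked numerically with a short Python sketch. The weights come from the factor table; the signal and scale values passed in are hypothetical, since the source does not give the scales Roam uses.

```python
import math

# Weights from the factor table (they sum to 1.0).
WEIGHTS = {
    "tangle": 0.30,
    "god_components": 0.20,
    "bottlenecks": 0.15,
    "layer_violations": 0.15,
    "file_health": 0.20,
}


def factor_health(signal, scale):
    """h = e^(-signal/scale): 1.0 when the signal is zero, approaching 0 as it grows."""
    return math.exp(-signal / scale)


def composite_score(signals, scales, weights=WEIGHTS):
    """Score = 100 * product(h_i ^ w_i), a weighted geometric mean of the factors."""
    product = 1.0
    for name, weight in weights.items():
        product *= factor_health(signals[name], scales[name]) ** weight
    return 100.0 * product
```

Because the factors are multiplied rather than summed, a single factor near zero drags the whole score toward zero no matter how healthy the others are, which is the non-compensatory property described above.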

How Roam Compares

roam-code is the only tool that combines graph algorithms (PageRank, Tarjan SCC, Louvain clustering), git archaeology, architecture simulation, and multi-agent partitioning in a single local CLI with zero API keys.

Interactive docs site (GitHub Pages artifact, published by .github/workflows/pages.yml):

  • docs/site/index.html (landing)
  • docs/site/getting-started.html (tutorial)
  • docs/site/command-reference.html (examples)
  • docs/site/architecture.html (diagram + internals)
  • docs/site/landscape.html (competitor matrix)

| Capability | roam-code | AI IDEs (Cursor, Windsurf) | AI Agents (Claude Code, Codex) | SAST (SonarQube, CodeQL) |
| --- | --- | --- | --- | --- |
| Persistent local index | SQLite | Cloud embeddings | None | Per-scan |
| Call graph analysis | Yes | No | No | Yes (CodeQL) |
| PageRank / centrality | Yes | No | No | No |
| Cycle detection (Tarjan) | Yes | No | No | Deprecated (SonarQube) |
| Community detection (Louvain) | Yes | No | No | No |
| Git churn / co-change | Yes | No | No | No |
| Architecture simulation | Yes | No | No | No |
| Multi-agent partitioning | Yes | No | No | No |
| MCP tools for agents | 101 (24 in default core preset) | Client only | Client only | 34 (SonarQube) |
| Languages | 26 | 70+ | 50+ | 12-42 |
| 100% local, zero API keys | Yes | No | No | Partial |
| Open source | MIT | No | Partial | Partial |

Key Differentiators

  • vs AI IDEs (Cursor, Windsurf, Augment): roam-code provides deterministic structural analysis. AI IDEs use probabilistic embeddings that can't guarantee reproducible results.
  • vs AI Agents (Claude Code, Codex CLI, Gemini CLI): These agents read files one at a time. roam-code pre-computes relationships so agents get instant answers about architecture, blast radius, and dependencies.
  • vs SAST Tools (SonarQube, CodeQL, Semgrep): SAST tools find bugs and vulnerabilities. roam-code understands architecture -- how code is structured, where it's coupled, and what breaks when you change it. Complementary, not competitive.
  • vs Code Search (Sourcegraph/Amp, Greptile): Text search finds where code is. roam-code understands why code matters -- which functions are central, which modules are tangled, which files are high-risk.

FAQ

Does Roam send any data externally? No. Zero network calls. No telemetry, no analytics, no update checks.

Can Roam run in air-gapped environments? Yes. Once installed, no internet access is required.

Does Roam modify my source code? Read-only by default. Creates .roam/ with an index database. The roam mutate command can apply code changes (move/rename/extract) but defaults to --dry-run mode — you must explicitly pass --apply to write changes.

How does Roam handle monorepos? Indexes from the root. Batched SQL handles 100k+ symbols. Incremental updates stay fast.

How does Roam handle multi-repo projects (e.g., frontend + backend)? Use roam ws init <repo1> <repo2> to create a workspace. Each repo keeps its own index; a workspace overlay DB stores cross-repo API edges. roam ws resolve scans for REST endpoints and matches frontend calls to backend routes. Then roam ws context, roam ws trace, etc. work across repos.

Is Roam compatible with SonarQube / CodeScene? Yes. Roam complements existing tools. Both can run in the same CI pipeline. SARIF output integrates with GitHub Code Scanning.

Limitations

Static analysis trade-offs:

  • Static analysis primarily -- can't trace dynamic dispatch, reflection, or eval'd code. Runtime trace ingestion (roam ingest-trace) adds production data but requires external trace export
  • Import resolution is heuristic -- complex re-exports or conditional imports may not resolve
  • Limited cross-language edges -- Salesforce, Protobuf, REST API, and multi-repo edges are supported, but not arbitrary FFI
  • Tier 2 languages (Kotlin, Swift, Scala) get basic symbol extraction only
  • Large monorepos (100k+ files) may have slow initial indexing

Troubleshooting

| Problem | Solution |
| --- | --- |
| roam: command not found | Ensure install location is on PATH. For uv: uv tool update-shell |
| Another indexing process is running | Delete .roam/index.lock and retry |
| database is locked | roam index --force to rebuild |
| Unicode errors on Windows | chcp 65001 for UTF-8 |
| Symbol resolves to wrong file | Use file:symbol syntax: roam symbol myfile:MyFunction |
| Health score seems wrong | roam --json health for factor breakdown |
| Index stale after git pull | roam index (incremental). After major refactors: roam index --force |

Update / Uninstall

# Update
pipx upgrade roam-code
uv tool upgrade roam-code
pip install --upgrade roam-code

# Uninstall
pipx uninstall roam-code
uv tool uninstall roam-code
pip uninstall roam-code

Delete .roam/ from your project root to clean up local data.

Development

git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e ".[dev]"   # includes pytest, ruff
pytest tests/              # 2656 tests, Python 3.9-3.13

# Or use Make targets:
make dev      # install with dev extras
make test     # run tests
make lint     # ruff check

Project structure

roam-code/
├── pyproject.toml
├── action.yml                         # Reusable GitHub Action
├── src/roam/
│   ├── __init__.py                    # Version (from pyproject.toml)
│   ├── cli.py                         # Click CLI (136 canonical commands + 1 legacy alias)
│   ├── mcp_server.py                  # MCP server (101 tools, 10 resources, 5 prompts)
│   ├── db/
│   │   ├── connection.py              # SQLite (WAL, pragmas, batched IN)
│   │   ├── schema.py                  # Tables, indexes, migrations
│   │   └── queries.py                 # Named SQL constants
│   ├── index/
│   │   ├── indexer.py                 # Orchestrates full pipeline
│   │   ├── discovery.py               # git ls-files, .gitignore
│   │   ├── parser.py                  # Tree-sitter parsing
│   │   ├── symbols.py                 # Symbol + reference extraction
│   │   ├── relations.py               # Reference resolution -> edges
│   │   ├── complexity.py              # Cognitive complexity (SonarSource) + Halstead metrics
│   │   ├── git_stats.py               # Churn, co-change, blame, Renyi entropy
│   │   ├── incremental.py             # mtime + hash change detection
│   │   ├── file_roles.py              # Smart file role classifier
│   │   └── test_conventions.py        # Pluggable test naming adapters
│   ├── languages/
│   │   ├── base.py                    # Abstract LanguageExtractor
│   │   ├── registry.py                # Language detection + aliasing
│   │   ├── *_lang.py                  # One file per language (17 Tier 1)
│   │   └── generic_lang.py            # Tier 2 fallback
│   ├── bridges/
│   │   ├── base.py, registry.py       # Cross-language bridge framework
│   │   ├── bridge_salesforce.py       # Apex <-> Aura/LWC/Visualforce
│   │   └── bridge_protobuf.py         # .proto -> Go/Java/Python stubs
│   ├── catalog/
│   │   ├── tasks.py                   # Universal algorithm catalog (23 patterns)
│   │   └── detectors.py               # Anti-pattern detectors with confidence calibration
│   ├── workspace/
│   │   ├── config.py                  # .roam-workspace.json
│   │   ├── db.py                      # Workspace overlay DB
│   │   ├── api_scanner.py             # REST API endpoint detection
│   │   └── aggregator.py              # Cross-repo aggregation
│   ├── graph/
│   │   ├── builder.py, pagerank.py    # DB -> NetworkX, PageRank
│   │   ├── cycles.py, clusters.py     # Tarjan SCC, propagation cost, Louvain, modularity Q
│   │   ├── layers.py, pathfinding.py  # Topo layers, k-shortest paths
│   │   ├── split.py, why.py           # Decomposition, role classification
│   │   └── anomaly.py                 # Statistical anomaly detection
│   ├── commands/
│   │   ├── resolve.py                 # Shared symbol resolution
│   │   ├── graph_helpers.py           # Shared graph utilities (adj builders, BFS)
│   │   ├── context_helpers.py         # Data-gathering helpers for context command
│   │   ├── gate_presets.py            # Framework-specific gate rules
│   │   └── cmd_*.py                   # One module per command
│   ├── analysis/
│   │   └── effects.py                 # Side-effect classification engine
│   ├── refactor/
│   │   ├── codegen.py                 # Import generation (Python/JS/Go)
│   │   └── transforms.py              # move/rename/add-call/extract transforms
│   ├── rules/
│   │   └── engine.py                  # YAML rule parser + graph query evaluator
│   ├── runtime/
│   │   ├── trace_ingest.py            # OpenTelemetry/Jaeger/Zipkin ingestion
│   │   └── hotspots.py                # Runtime hotspot analysis
│   ├── search/
│   │   ├── tfidf.py                   # TF-IDF semantic search engine
│   │   ├── index_embeddings.py        # Embedding index builder
│   │   └── onnx_embeddings.py         # Optional local ONNX semantic backend
│   ├── security/
│   │   ├── vuln_store.py              # CVE/vulnerability storage
│   │   └── vuln_reach.py              # Vulnerability reachability paths
│   └── output/
│       ├── formatter.py               # Token-efficient formatting
│       ├── sarif.py                   # SARIF 2.1.0 output
│       └── schema_registry.py         # JSON envelope schema versioning
└── tests/                             # Test suite across 70 test files
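
The tree above describes incremental.py as "mtime + hash change detection". A minimal sketch of that general technique follows; it is an assumption about the approach, not a copy of Roam's implementation, and the function names and record layout are hypothetical. The cheap timestamp comparison runs first, and the content hash is computed only when the timestamp has moved.

```python
import hashlib
import os
from typing import Dict, Tuple

Record = Tuple[float, str]  # (mtime, sha256 hex digest) saved by the previous index run


def file_hash(path: str) -> str:
    """Hash a file in fixed-size blocks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def has_changed(path: str, previous: Dict[str, Record]) -> bool:
    """Decide whether `path` needs re-indexing."""
    record = previous.get(path)
    if record is None:
        return True   # never indexed before
    old_mtime, old_hash = record
    if os.path.getmtime(path) == old_mtime:
        return False  # timestamp untouched: skip the expensive hash
    # Timestamp moved (for example after a checkout or touch);
    # the hash decides whether the content really differs.
    return file_hash(path) != old_hash
```

With this ordering, a git checkout that rewrites timestamps without changing content costs one hash per touched file rather than a full re-parse.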

Dependencies

| Package | Purpose |
| --- | --- |
| click >= 8.0 | CLI framework |
| tree-sitter >= 0.23 | AST parsing |
| tree-sitter-language-pack >= 0.6 | 165+ grammars |
| networkx >= 3.0 | Graph algorithms |

Optional: fastmcp >= 2.0 (MCP server — install with pip install "roam-code[mcp]")

Optional: Local semantic ONNX stack (numpy, onnxruntime, tokenizers) via pip install "roam-code[semantic]"

Roadmap

Shipped

  • MCP v2 agent surface: in-process execution, compound operations, presets, schemas, annotations, and compatibility profiles.
  • Full command and MCP inventory parity in docs: 136 canonical CLI commands (+1 alias) and 101 MCP tools.
  • CI hardening: composite action, changed-only mode, trend-aware gates, sticky PR updater, and SARIF guardrails.
  • Performance foundation: FTS5/BM25 search, O(changed) incremental indexing, DB/index optimizations.
  • Agent governance suite: vibe-check, ai-readiness, verify, ai-ratio, duplicates, advanced algo scoring/SARIF.
  • Ownership/review intelligence: codeowners, drift, simulate-departure, suggest-reviewers, api-changes, test-gaps, semantic-diff, secrets.
  • Multi-agent operations: partition, affected, syntax-check, workspace-aware context and traces.
  • Budget-aware context delivery: --budget (partial rollout), PageRank-weighted truncation, conversation-aware ranking.
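
The last item, rank-weighted truncation under a budget, can be pictured with a short sketch. Everything here is an assumption for illustration: the scores are taken as precomputed (for instance from PageRank), tokens are estimated by whitespace splitting, and the greedy selection is one possible strategy rather than a description of how --budget works in Roam.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def truncate_to_budget(snippets: Dict[str, str],
                       scores: Dict[str, float],
                       budget: int) -> List[str]:
    """Keep the highest-scoring snippets that still fit in the token budget."""
    kept: List[str] = []
    remaining = budget
    # Visit symbols from most to least important.
    for name in sorted(snippets, key=lambda n: scores.get(n, 0.0), reverse=True):
        cost = estimate_tokens(snippets[name])
        if cost <= remaining:
            kept.append(name)
            remaining -= cost
    return kept


if __name__ == "__main__":
    snippets = {"core": "a b c d", "util": "e f g", "leaf": "h"}
    scores = {"core": 0.6, "util": 0.3, "leaf": 0.1}
    print(truncate_to_budget(snippets, scores, budget=5))
```

In this example the mid-ranked snippet is skipped because it no longer fits, while a smaller, lower-ranked one still does; a stricter variant could stop at the first snippet that overflows.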

Next (v11 Closeout + immediate follow-up)

  • Terminal demo GIF in README (#26).
  • GitHub repo topics (#24).
  • GitHub Discussions enabled (#29).
  • MCP directory + awesome-list submissions (#31).
  • Optional community launch note (#32, post-v11, non-blocking).
  • Keep roadmap synchronized with shipped state (#112 recurring hygiene).

Contributing

git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e ".[dev]"   # dev extras include pytest and ruff
pytest tests/   # All 2656 tests must pass

Good first contributions: add a Tier 1 language (see go_lang.py or php_lang.py as templates), improve reference resolution, add benchmark repos, extend SARIF converters, add MCP tools.

Please open an issue first to discuss larger changes.

License

MIT
