MCP Hub

open-ontologies

AI-native ontology engineering MCP server for OWL/RDF/SPARQL. Validate, query, diff, lint, version, and govern knowledge graphs via Oxigraph triple store.

Stars: 17 · Forks: 3 · Updated: Mar 13, 2026 · Validated: Mar 14, 2026

A Terraforming MCP for Knowledge Graphs: validate, classify, and govern AI-generated ontologies.


Open Ontologies is a standalone MCP server and CLI for AI-native ontology engineering. It exposes 42 tools and 5 workflow prompts that let Claude validate, query, diff, lint, version, and persist RDF/OWL ontologies using an in-memory Oxigraph triple store — plus plan changes, detect drift, enforce design patterns, monitor health, align ontologies, track lineage, and learn from user feedback.

Written in Rust, ships as a single binary. No JVM, no Protege, no GUI.

Quick Start

1. Install

Pre-built binaries

Download from GitHub Releases:

# macOS (Apple Silicon)
curl -LO https://github.com/fabio-rovai/open-ontologies/releases/latest/download/open-ontologies-aarch64-apple-darwin
chmod +x open-ontologies-aarch64-apple-darwin && mv open-ontologies-aarch64-apple-darwin /usr/local/bin/open-ontologies

# macOS (Intel)
curl -LO https://github.com/fabio-rovai/open-ontologies/releases/latest/download/open-ontologies-x86_64-apple-darwin
chmod +x open-ontologies-x86_64-apple-darwin && mv open-ontologies-x86_64-apple-darwin /usr/local/bin/open-ontologies

# Linux (x86_64)
curl -LO https://github.com/fabio-rovai/open-ontologies/releases/latest/download/open-ontologies-x86_64-unknown-linux-gnu
chmod +x open-ontologies-x86_64-unknown-linux-gnu && mv open-ontologies-x86_64-unknown-linux-gnu /usr/local/bin/open-ontologies

Docker

docker pull ghcr.io/fabio-rovai/open-ontologies:latest
docker run -i ghcr.io/fabio-rovai/open-ontologies serve

From source (Rust 1.85+)

git clone https://github.com/fabio-rovai/open-ontologies.git
cd open-ontologies
cargo build --release
./target/release/open-ontologies init

2. Connect to your MCP client

Claude Code

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "open-ontologies": {
      "command": "/path/to/open-ontologies/target/release/open-ontologies",
      "args": ["serve"]
    }
  }
}

Restart Claude Code. The onto_* tools are now available.

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "open-ontologies": {
      "command": "/path/to/open-ontologies/target/release/open-ontologies",
      "args": ["serve"]
    }
  }
}

Cursor / Windsurf / any MCP-compatible IDE

Add to your MCP settings (usually .cursor/mcp.json or equivalent):

{
  "mcpServers": {
    "open-ontologies": {
      "command": "/path/to/open-ontologies/target/release/open-ontologies",
      "args": ["serve"]
    }
  }
}

Docker

{
  "mcpServers": {
    "open-ontologies": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "ghcr.io/fabio-rovai/open-ontologies", "serve"]
    }
  }
}

3. Build your first ontology

Build me a Pizza ontology following the Manchester University tutorial.
Include all 49 toppings, 22 named pizzas, spiciness value partition,
and defined classes (VegetarianPizza, MeatyPizza, SpicyPizza).
Validate it, load it, and show me the stats.

Claude generates Turtle, then automatically calls onto_validate → onto_load → onto_stats → onto_lint → onto_query, fixing errors along the way.

Demo: Database → Ontology in 3 commands

# Import a PostgreSQL schema as OWL
open-ontologies import-schema postgres://demo:demo@localhost/shop

# Classify with native OWL2-DL reasoner
open-ontologies reason --profile owl-dl

# Query the result
open-ontologies query "SELECT ?c ?label WHERE { ?c a owl:Class . ?c rdfs:label ?label }"

Why This Exists

You can ask Claude to generate an ontology in a single prompt — and it will. But single-shot generation has real problems:

| Problem | What goes wrong |
| --- | --- |
| No validation | Invalid Turtle — wrong prefixes, unclosed brackets, bad URIs |
| No verification | No way to check counts or structure without SPARQL |
| No iteration | Can't diff versions, lint for missing labels, or run competency questions |
| No persistence | Ontology lives only in chat context — no versioning, no rollback |
| No scale | Context window holds ~2,000 triples; real ontologies need a triple store |
| No integration | Can't push to SPARQL endpoints or resolve owl:imports chains |

Open Ontologies solves all of these. It's a proper RDF/SPARQL engine (Oxigraph) exposed as MCP tools that Claude calls automatically.

| Layer | Tool | What it does |
| --- | --- | --- |
| Generation | Claude / GPT / LLaMA | Generates OWL/RDF from natural language |
| Validation | Open Ontologies | Validates, classifies, enforces, monitors |
| Storage | SPARQL endpoint / triplestore | Persists the production ontology |
| Consumption | Your app / API / pipeline | Queries the knowledge graph |

How It Works

You provide domain requirements in natural language. Claude generates Turtle/OWL, then dynamically decides which MCP tools to call based on what each tool returns — validating, fixing, re-loading, querying, iterating until the ontology is correct.

flowchart TD
    You["You — 'Build me a Pizza ontology'"]
    Claude["Claude generates Turtle"]
    Validate["onto_validate"]
    Fix["Claude fixes errors"]
    Load["onto_load"]
    Stats["onto_stats"]
    Lint["onto_lint"]
    Query["onto_query — SPARQL"]
    Save["onto_save"]
    Version["onto_version"]

    You --> Claude
    Claude --> Validate
    Validate -->|"syntax errors"| Fix
    Fix --> Validate
    Validate -->|"ok"| Load
    Load --> Stats
    Stats -->|"wrong counts"| Claude
    Stats -->|"ok"| Lint
    Lint -->|"issues"| Fix
    Lint -->|"clean"| Query
    Query -->|"gaps found"| Claude
    Query -->|"all correct"| Version
    Version --> Save

This is not a fixed pipeline. Claude is the orchestrator — it decides what to call next based on results.

Tools

42 tools organized by function:

| Category | Tools | Purpose |
| --- | --- | --- |
| Core | validate, load, save, clear, stats, query, diff, lint, convert, status | RDF/OWL validation, querying, and management |
| Remote | pull, push, import-owl | Fetch/push ontologies, resolve owl:imports |
| Schema | import-schema | PostgreSQL → OWL conversion |
| Data | map, ingest, shacl, reason, extend | Structured data → RDF pipeline |
| Versioning | version, history, rollback | Named snapshots and rollback |
| Lifecycle | plan, apply, lock, drift, enforce, monitor, monitor-clear, lineage | Terraform-style change management |
| Alignment | align, align-feedback | Cross-ontology class matching with self-calibrating confidence |
| Clinical | crosswalk, enrich, validate-clinical | ICD-10 / SNOMED / MeSH crosswalks |
| Feedback | lint-feedback, enforce-feedback | Self-calibrating suppression — teach lint/enforce to stop repeating dismissed warnings |
| Embeddings | embed, search, similarity | Dual-space semantic search (text + Poincaré structural) |
| Reasoning | reason (rdfs, owl-rl, owl-rl-ext, owl-dl), dl_explain, dl_check | Native SHOIQ tableaux reasoner |

All tools are available both as MCP tools (prefixed onto_) and as CLI subcommands.

open-ontologies <command> [args] [--pretty] [--data-dir ~/.open-ontologies]

Data Pipeline

Take any structured data — CSV, JSON, Parquet, XLSX, XML, YAML — and terraform it into a validated, reasoned knowledge graph.

flowchart LR
    Data["CSV / JSON / XLSX / ..."]
    Map["onto_map — generate mapping"]
    Ingest["onto_ingest — parse to RDF"]
    Validate["onto_shacl — check constraints"]
    Reason["onto_reason — infer new facts"]
    Query["onto_query — ask questions"]

    Data --> Map
    Map --> Ingest
    Ingest --> Validate
    Validate -->|violations| Map
    Validate -->|ok| Reason
    Reason --> Query

| Manual process | Open Ontologies equivalent |
| --- | --- |
| Domain expert defines classes by hand | import-schema or Claude generates OWL |
| Analyst maps spreadsheet columns to ontology | map auto-generates mapping config |
| Data engineer writes ETL to RDF | ingest parses CSV/JSON/Parquet/XLSX → RDF |
| Ontologist validates data constraints | shacl checks cardinality, datatypes, classes |
| Reasoner classifies instances (Protege + HermiT) | reason runs native OWL2-DL classification |
| Quality reviewer checks consistency | enforce + lint + monitor |

Supported formats

| Format | Extension |
| --- | --- |
| CSV | .csv |
| JSON | .json |
| NDJSON | .ndjson |
| XML | .xml |
| YAML | .yaml |
| Excel | .xlsx |
| Parquet | .parquet |

Mapping config

The mapping bridges tabular data and RDF:

{
  "base_iri": "http://www.co-ode.org/ontologies/pizza/pizza.owl#",
  "id_field": "name",
  "class": "http://www.co-ode.org/ontologies/pizza/pizza.owl#NamedPizza",
  "mappings": [
    { "field": "base", "predicate": "pizza:hasBase", "lookup": true },
    { "field": "topping1", "predicate": "pizza:hasTopping", "lookup": true },
    { "field": "price", "predicate": "pizza:hasPrice", "datatype": "xsd:decimal" }
  ]
}
  • lookup: true — IRI reference (links to another entity)
  • datatype — typed literal (decimal, integer, date)
  • Neither — plain string literal
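As a rough illustration of how such a config could drive row-to-triple conversion, here is a minimal Python sketch. This is illustrative only — the real ingest tool is implemented in Rust; the `expand` helper and `PREFIXES` table are assumptions for the example.

```python
# Sketch: apply a mapping config like the one above to a single row,
# emitting N-Triples-style strings. Illustrative only; not the tool's
# actual implementation.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PREFIXES = {
    "pizza": "http://www.co-ode.org/ontologies/pizza/pizza.owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

def expand(curie: str) -> str:
    """Expand a prefixed name like 'xsd:decimal' to a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local

def apply_mapping(row: dict, config: dict) -> list[str]:
    subject = f"<{config['base_iri']}{row[config['id_field']].replace(' ', '')}>"
    triples = [f"{subject} <{RDF_TYPE}> <{config['class']}> ."]
    for m in config["mappings"]:
        value = row.get(m["field"])
        if value is None:
            continue
        predicate = f"<{expand(m['predicate'])}>"
        if m.get("lookup"):                  # IRI reference to another entity
            obj = f"<{config['base_iri']}{value.replace(' ', '')}>"
        elif "datatype" in m:                # typed literal
            obj = f'"{value}"^^<{expand(m["datatype"])}>'
        else:                                # plain string literal
            obj = f'"{value}"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples
```

Each row yields one rdf:type triple plus one triple per mapped field, with the three cases (lookup, datatype, plain literal) matching the bullet list above.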

Ontology Lifecycle

Production ontologies change over time. Open Ontologies provides Terraform-style lifecycle management.

flowchart LR
    Plan["onto_plan"]
    Enforce["onto_enforce"]
    Apply["onto_apply"]
    Monitor["onto_monitor"]
    Drift["onto_drift"]

    Plan -->|"risk score"| Enforce
    Enforce -->|"compliance"| Apply
    Apply -->|"safe / migrate"| Monitor
    Monitor -->|"watchers"| Drift
    Drift -->|"velocity"| Plan

Plan — Diffs current vs proposed ontology. Reports added/removed classes, blast radius, risk score (low/medium/high). Locked IRIs (onto_lock) prevent accidental removal.

Enforce — Design pattern checks. Built-in packs: generic (orphan classes, missing labels), boro (IES4/BORO compliance), value_partition (disjointness). Custom SPARQL rules supported.

Apply — Two modes: safe (clear + reload) or migrate (add owl:equivalentClass/Property bridges for consumers).

Monitor — SPARQL watchers with threshold alerts. Actions: notify, block_next_apply, auto_rollback, log.

Drift — Compares versions, detects renames via Jaro-Winkler similarity, computes drift velocity. Self-calibrating confidence via SQLite feedback loop.
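Drift's rename detection is built on Jaro-Winkler string similarity. A self-contained sketch of the standard algorithm (the server's Rust implementation may differ in normalization details):

```python
# Standard Jaro-Winkler similarity, as used for rename detection.
def jaro(s: str, t: str) -> float:
    if s == t:
        return 1.0
    ls, lt = len(s), len(t)
    window = max(ls, lt) // 2 - 1          # max index distance for a "match"
    s_hit, t_hit = [False] * ls, [False] * lt
    matches = 0
    for i, c in enumerate(s):
        for j in range(max(0, i - window), min(lt, i + window + 1)):
            if not t_hit[j] and t[j] == c:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    transpositions, k = 0, 0
    for i in range(ls):
        if s_hit[i]:
            while not t_hit[k]:
                k += 1
            if s[i] != t[k]:
                transpositions += 1
            k += 1
    m = matches
    return (m / ls + m / lt + (m - transpositions // 2) / m) / 3

def jaro_winkler(s: str, t: str, p: float = 0.1) -> float:
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s, t):                 # common prefix, capped at 4 chars
        if a != b or prefix == 4:
            break
        prefix += 1
    return j + prefix * p * (1 - j)
```

A removed IRI and an added IRI whose local names score above a threshold get paired as a rename candidate, e.g. `jaro_winkler("hasTopping", "hasToppings")` is well above 0.95 while unrelated names score near zero.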

Lineage — Append-only audit trail of all lifecycle operations.

Feedback — Lint and enforce learn from your decisions. Dismiss a warning 3 times and it's suppressed; accept it once and it sticks. Same self-calibrating pattern used by align and drift.
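The suppression rule just described (three dismissals suppress, one accept pins) can be sketched with an in-memory stand-in for the SQLite-backed store — illustrative only; the class name and threshold constant are assumptions:

```python
# Sketch of the feedback rule: three dismissals suppress a warning;
# a single accept pins it permanently. In-memory stand-in for the
# real SQLite-backed feedback store.
DISMISS_THRESHOLD = 3  # assumption: matches the "3 times" rule above

class FeedbackStore:
    def __init__(self) -> None:
        self.dismissals: dict[str, int] = {}
        self.accepted: set[str] = set()

    def dismiss(self, warning_id: str) -> None:
        if warning_id not in self.accepted:       # accepted warnings stick
            self.dismissals[warning_id] = self.dismissals.get(warning_id, 0) + 1

    def accept(self, warning_id: str) -> None:
        self.accepted.add(warning_id)
        self.dismissals.pop(warning_id, None)

    def is_suppressed(self, warning_id: str) -> bool:
        return (warning_id not in self.accepted
                and self.dismissals.get(warning_id, 0) >= DISMISS_THRESHOLD)
```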

Semantic Embeddings (Poincaré Vector Store)

Open Ontologies includes a built-in dual-space vector store for semantic search and alignment:

  • Text embeddings via ONNX model (bge-small-en-v1.5) — captures label/definition similarity
  • Structural embeddings via Poincaré ball — captures hierarchy position (root classes near center, leaves near boundary)
  • Product search — combines both spaces for best results
onto_load → onto_embed → onto_search "domestic animal"

The embedding model (~33MB) is downloaded on open-ontologies init. All inference runs locally via tract (pure Rust ONNX runtime) — no API keys or external services needed.
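The structural-space claim above (roots near the center, leaves near the boundary) follows from the Poincaré ball metric. A sketch of the standard distance formula, not the store's actual code:

```python
import math

def poincare_distance(u: list[float], v: list[float]) -> float:
    """Hyperbolic distance in the Poincare ball (all points have norm < 1)."""
    sq_norm = lambda x: sum(xi * xi for xi in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1 - sq_norm(u)) * (1 - sq_norm(v))
    return math.acosh(1 + 2 * diff / denom)
```

Distances blow up near the boundary, so a small Euclidean step between two leaf embeddings is hyperbolically much longer than the same step near the center; that is the property that lets the ball encode deep hierarchies in few dimensions.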

| Tool | Purpose |
| --- | --- |
| onto_embed | Generate embeddings for all classes in the loaded ontology |
| onto_search | Semantic search by natural language query |
| onto_similarity | Compare two IRIs by embedding similarity |

Schema Alignment

Detect owl:equivalentClass, skos:exactMatch, rdfs:subClassOf candidates between two ontologies using 7 weighted signals:

| Signal | Weight | What it measures |
| --- | --- | --- |
| Label similarity | 0.20 | Jaro-Winkler on normalized labels (camelCase split, lowercased) |
| Property overlap | 0.15 | Jaccard on domain property + range signatures |
| Parent overlap | 0.12 | Jaccard on rdfs:subClassOf parent local names |
| Instance overlap | 0.12 | Jaccard on shared individuals by local name |
| Restriction similarity | 0.12 | Jaccard on OWL restriction signatures (property→filler) |
| Neighborhood similarity | 0.09 | Jaccard on 2-hop property neighborhood |
| Embedding similarity | 0.20 | Cosine similarity on text embeddings (requires onto_embed) |

When compiled without the embeddings feature, alignment uses the first 6 signals with the original weights (0.25, 0.20, 0.15, 0.15, 0.15, 0.10). Candidates above the confidence threshold are auto-applied to the main graph. Use onto_align_feedback to accept/reject candidates — feedback is stored in SQLite and used to self-calibrate signal weights over time.
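The combination itself is a straightforward weighted sum over the 7-signal table above. A sketch of just the arithmetic (signal extraction from the graphs is elided; names are illustrative):

```python
# Weighted combination of alignment signals, using the weights from
# the table above. Signal values are each assumed to lie in [0, 1].
WEIGHTS = {
    "label": 0.20, "property": 0.15, "parent": 0.12, "instance": 0.12,
    "restriction": 0.12, "neighborhood": 0.09, "embedding": 0.20,
}

def jaccard(a: set, b: set) -> float:
    """Jaccard overlap, the basis of several structural signals."""
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_confidence(signals: dict[str, float]) -> float:
    return sum(WEIGHTS[name] * value for name, value in signals.items())
```

The weights sum to 1.0, so a candidate with perfect scores on every signal reaches confidence 1.0, and the self-calibration step only has to re-balance the same simplex.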

# Compare two ontology files (dry run)
open-ontologies align source.ttl target.ttl --min-confidence 0.7 --dry-run

# Accept a candidate
open-ontologies align-feedback --source http://ex.org/Dog --target http://other.org/Canine --accept

Clinical Crosswalks

For healthcare ontologies, three tools bridge clinical coding systems:

  • onto_crosswalk — Look up mappings between ICD-10 (diagnoses), SNOMED CT (clinical terms), and MeSH (medical literature) from a Parquet-backed crosswalk file
  • onto_enrich — Insert skos:exactMatch triples linking ontology classes to clinical codes
  • onto_validate_clinical — Check that class labels align with standard clinical terminology

OWL2-DL Reasoning

Native Rust SHOIQ tableaux reasoner — no JVM required.

| DL Feature | Symbol | OWL Construct |
| --- | --- | --- |
| Atomic negation | ¬A | complementOf |
| Conjunction | C ⊓ D | intersectionOf |
| Disjunction | C ⊔ D | unionOf |
| Existential | ∃R.C | someValuesFrom |
| Universal | ∀R.C | allValuesFrom |
| Min cardinality | ≥n R.C | minQualifiedCardinality |
| Max cardinality | ≤n R.C | maxQualifiedCardinality |
| Role hierarchy | R ⊑ S | subPropertyOf |
| Transitive roles | Trans(R) | TransitiveProperty |
| Inverse roles | R⁻ | inverseOf |
| Symmetric roles | Sym(R) | SymmetricProperty |
| Functional | Fun(R) | FunctionalProperty |
| ABox reasoning | a:C | NamedIndividual |

Agent-based parallel classification:

  1. Satisfiability Agent — Tests each class in parallel using rayon
  2. Subsumption Agent — Pairwise subsumption tests, pruned by told-subsumer closure
  3. Explanation Agent — Traces clash derivations for unsatisfiable classes
  4. ABox Agent — Individual consistency and type inference

| Reasoner | Language | JVM | Parallel | SHOIQ |
| --- | --- | --- | --- | --- |
| Open Ontologies | Rust | No | Yes (rayon) | Yes |
| HermiT | Java | Yes | No | Yes |
| Pellet | Java | Yes | No | Yes |

Benchmarks

Ontology Generation

Pizza Ontology — Manchester Tutorial

The Manchester Pizza Tutorial is the most widely used OWL teaching material. Students build a Pizza ontology in Protege over ~4 hours.

Input: One sentence — "Build a Pizza ontology following the Manchester tutorial specification."

| Metric | Reference (Protege) | AI-Generated | Coverage |
| --- | --- | --- | --- |
| Classes | 99 | 95 | 96% |
| Properties | 8 | 8 | 100% |
| Toppings | 49 | 49 | 100% |
| Named Pizzas | 24 | 24 | 100% |
| Time | ~4 hours | ~5 minutes | |

The 4 missing classes are teaching artifacts (e.g., UnclosedPizza) that exist only to demonstrate OWL syntax variants. Files: benchmark/

IES4 Building Domain — BORO/4D

The IES4 standard is the UK government's Information Exchange Standard for defence/intelligence. Built on BORO methodology and 4D perdurantist modeling.

Input: Three context documents — BORO/4D methodology, structural requirements, and a domain brief with 9 competency questions.

| Metric | Value |
| --- | --- |
| Compliance checks | 86/86 passed (100%) |
| Triples | 318 |
| Classes | 36 |
| Properties | 12 |
| Generation | One pass — valid Turtle directly |

Files: benchmark/

Ontology Extension — Pizza Menu Mapping

Given the Manchester Pizza OWL (expert-crafted reference) and a 13-row restaurant CSV, map the data into the ontology.

| Metric | Value |
| --- | --- |
| Topping coverage vs reference | 94% (62/66 matched) |
| IRI accuracy (naive) | 5% |
| IRI accuracy (Claude-refined) | 94–100% |
| Vegetarian classification | 92% (100% with refined mapping) |

The 4 mismatches are naming convention gaps (e.g., "Anchovy" vs AnchoviesTopping). Files: benchmark/

Mushroom Classification — OWL Reasoning vs Expert Labels

Dataset: UCI Mushroom Dataset — 8,124 specimens classified by mycology experts.

| Metric | Value |
| --- | --- |
| Accuracy | 98.33% |
| Recall (poisonous) | 100% — zero toxic mushrooms missed |
| False positives | 136 (1.67%) — conservative by design |
| False negatives | 0 |
| Classification rules | 6 OWL axioms |

The reasoner is conservative — it flags safe mushrooms as suspicious before ever classifying a toxic one as edible. Files: benchmark/mushroom/
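The headline figures follow directly from the raw confusion-matrix counts reported above; a quick arithmetic check:

```python
# 8,124 specimens; 136 safe mushrooms flagged as suspicious (false
# positives); 0 toxic mushrooms classified as edible (false negatives).
total, false_pos, false_neg = 8124, 136, 0

accuracy = (total - false_pos - false_neg) / total   # only FP + FN are errors
fp_rate = false_pos / total
recall_poisonous = 1.0 if false_neg == 0 else float("nan")  # zero FN: perfect recall
```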

Vision Benchmark — Image to Knowledge Graph

Dataset: 10 real photographs with manually annotated ground truth. RDF pipeline data extracted directly from TTL files; triple counts from onto_validate.

| Metric | Manual | Pure Claude | RDF Pipeline |
| --- | --- | --- | --- |
| Object Recall | 100% | 89% | 95% |
| Category Recall | 100% | 79% | 32% |
| Total RDF Triples | 0 | 0 | 2,540 |
| skos:altLabel Synonyms | 0 | 0 | 612 |
| SPARQL Queryable | No | No | Yes |
| Confidence Scores | No | No | Yes |
| Effort per image | ~2 min | ~8 sec | ~8 sec |

Category recall is lower because TTL files use fine-grained categories ("animal body part", "vehicle part") while ground truth uses broad labels ("animal", "vehicle"). The value is in queryability — you can't ask "find all images containing animals near water" with flat text labels.

The benchmark runs the full MCP pipeline: onto_clear → onto_validate (×10) → onto_load (×10) → onto_stats → onto_lint (×10) → onto_query (×6) via the real MCP server using the official MCP Python SDK over JSON-RPC 2.0 stdio. Files: benchmark/vision/

OntoAxiom Benchmark — Three Approaches to Axiom Identification

OntoAxiom tests LLM axiom identification across 9 ontologies and 3,042 ground truth axioms. We test three approaches:

| Approach | Input | F1 | vs o1 |
| --- | --- | --- | --- |
| o1 (paper's best) | Name lists only | 0.197 | |
| Bare Claude Opus | Name lists only | 0.431 | +119% |
| MCP extraction | Full OWL files | 0.717 | +264% |

MCP extraction per axiom type:

| Axiom Type | MCP Extraction | o1 (paper) | Improvement |
| --- | --- | --- | --- |
| subClassOf | 0.835 | 0.359 | +133% |
| disjointWith | 0.976 | 0.095 | +927% |
| domain | 0.662 | 0.038 | +1642% |
| range | 0.565 | 0.030 | +1783% |
| subPropertyOf | 0.617 | 0.106 | +482% |
| OVERALL | 0.717 | 0.197 | +264% |

13 individual results scored PERFECT (F1 = 1.000). Full writeup: benchmark/ontoaxiom/ONTOAXIOM_SHOWDOWN.md
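The relative-improvement percentages quoted above follow directly from the F1 scores; a quick check of the arithmetic:

```python
def relative_improvement(ours: float, baseline: float) -> float:
    """Percentage gain over a baseline score."""
    return (ours - baseline) / baseline * 100
```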

Reasoning Performance — HermiT vs Open Ontologies

Java 25, HermiT 1.4.3.456, OWL API 4.5.29.

Pizza Ontology (4,179 triples)

| Tool | Time | Result |
| --- | --- | --- |
| HermiT | 213ms | 312 subsumptions |
| Open Ontologies (OWL-RL) | 43ms | Load + rule-based inference |
| Open Ontologies (OWL-DL) | 19ms | Consistency check, SHOIQ tableaux |

LUBM Scaling (load + reason cycle)

| Axioms | Open Ontologies | HermiT | Speedup |
| --- | --- | --- | --- |
| 1,000 | 15ms | 112ms | 7.5x |
| 5,000 | 14ms | 410ms | 29x |
| 10,000 | 14ms | 1,200ms | 86x |
| 50,000 | 15ms | 24,490ms | 1,633x |

Open Ontologies' time stays flat — OWL-RL is SPARQL-based rule application, not tableaux expansion. HermiT's tableaux algorithm grows super-linearly with ontology size. Both produce correct results; they are different reasoning strategies for different use cases.
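The flat-vs-growing behavior can be seen in miniature: rule-based (OWL-RL style) reasoning materializes consequences by re-running a fixed rule set until nothing new appears. A toy fixpoint over two RDFS rules — illustrative only; the real engine runs such rules as SPARQL over Oxigraph:

```python
# Toy fixpoint illustrating rule-based (OWL-RL-style) materialization:
#   rdfs11: (C sub D) and (D sub E)  =>  (C sub E)
#   rdfs9 : (x type C) and (C sub D) =>  (x type D)
SUB, TYPE = "rdfs:subClassOf", "rdf:type"

def owl_rl_fixpoint(triples: set) -> set:
    inferred = set(triples)
    while True:
        subs = {(s, o) for s, p, o in inferred if p == SUB}
        new = set()
        for c, d in subs:                       # rdfs11: transitivity
            for d2, e in subs:
                if d2 == d:
                    new.add((c, SUB, e))
        for s, p, o in inferred:                # rdfs9: type propagation
            if p == TYPE:
                for c, d in subs:
                    if c == o:
                        new.add((s, TYPE, d))
        if new <= inferred:                     # fixpoint reached
            return inferred
        inferred |= new
```

Each pass is bounded set algebra over the whole store, so runtime tracks the number of rule firings rather than the branching of a tableaux model search.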

Scripts and results: benchmark/reasoner/

Architecture

flowchart TD
    Claude["🤖 Claude / LLM"]
    MCP["Open Ontologies MCP Server"]

    subgraph Core["Core Engine"]
        GraphStore["Oxigraph Triple Store"]
        StateDb["SQLite State"]
    end

    subgraph Tools["42 Tools + 5 Prompts"]
        direction LR
        Ontology["validate · load · query\nsave · diff · lint · convert"]
        Data["map · ingest · shacl\nreason · extend"]
        Lifecycle["plan · apply · lock\nenforce · monitor · drift"]
        Advanced["align · crosswalk · enrich\nlineage · embed · search"]
    end

    Claude -->|"MCP stdio"| MCP
    MCP --> Tools
    Tools --> Core

Stack

  • Rust (edition 2024) — single binary, no JVM
  • Oxigraph 0.4 — pure Rust RDF/SPARQL engine
  • rmcp — MCP protocol implementation
  • SQLite (rusqlite) — state, versions, lineage, monitor, enforcer rules, drift feedback, embeddings
  • Apache Arrow/Parquet — clinical crosswalk file format
  • tract-onnx — pure Rust ONNX runtime for text embeddings (optional, embeddings feature)
  • tokenizers — HuggingFace tokenizer for bge-small-en-v1.5 (optional, embeddings feature)

Optional companion: OpenCheir adds workflow enforcement, audit trails, and multi-agent orchestration.

License

MIT
