1000 Genomes Project Dataset MCP Server
Real-time access to the 1000 Genomes Project dataset, hosted online in the Dnaerys variant store.
Sequenced & aligned by the New York Genome Center (GRCh38). 3202 samples: 2504 unrelated samples from the phase three panel + 698 samples from 602 family trios (dataset details)
Key Features
- real-time access to 138 044 723 unique variants and ~442 billion individual genotypes
- variant, sample and genotype selection based on coordinates, annotations, zygosity
- filtering by VEP (impact, biotype, feature type, variant class, consequences), ClinVar Clinical Significance (202502), gnomADe + gnomADg 4.1, AlphaMissense Score & AlphaMissense Class annotations
- filtering by inheritance model (de novo, heterozygous dominant, homozygous recessive)
- returned variants annotated with HGVSp, gnomADe + gnomADg, AlphaMissense score + cohort-wide statistics
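As a rough illustration of what the inheritance-model filters select, the sketch below classifies a single autosomal variant in a trio from genotypes alone. The genotype encoding (number of alternate alleles), the function name classify_trio and the simplified textbook rules are assumptions made for illustration; the server's actual definitions are not documented here and may differ, for example by taking affected status or sex chromosomes into account.

```python
from typing import Optional

# Genotypes are encoded as the number of alternate alleles:
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate.
# The encoding and the rules below are illustrative assumptions,
# not the server's implementation.


def classify_trio(child: int, mother: int, father: int) -> Optional[str]:
    """Return a simplified inheritance label for one autosomal variant."""
    for genotype in (child, mother, father):
        if genotype not in (0, 1, 2):
            raise ValueError(f"unexpected genotype: {genotype}")

    # A child without the alternate allele matches none of the models.
    if child == 0:
        return None

    # De novo: the child carries the allele, neither parent does.
    if mother == 0 and father == 0:
        return "de novo"

    # Homozygous recessive: child is hom-alt, both parents are het carriers.
    if child == 2 and mother == 1 and father == 1:
        return "homozygous recessive"

    # Heterozygous dominant: child is het; since the de novo case was
    # handled above, at least one parent carries the allele here.
    if child == 1:
        return "heterozygous dominant"

    return None
```

For example, classify_trio(1, 0, 0) yields "de novo", while classify_trio(2, 1, 1) yields "homozygous recessive".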
Online Service
Remote MCP service via Streamable HTTP:
Architecture
Implemented as a Java EE service that accesses the KGP dataset via gRPC calls to the public Dnaerys variant store service.
- provides MCP over Streamable HTTP, HTTP/SSE and STDIO transports
- service implementation is based on Quarkus MCP Server framework
Available Tools
MCP tools and parameters are here
Installation
Project can be run locally with MCP over stdio and/or http transports
Option A - build & run locally
- build the project and package it as a single über-jar:
./mvnw package -DskipTests -Dquarkus.package.jar.type=uber-jar
- jar is located in target/onekgpd-mcp-runner.jar and includes all dependencies
- run it locally with dev profile
- both stdio and http transports are enabled
- http transport listens on the port set by quarkus.http.port
- project expects JRE 21 to be available at runtime
java -Dquarkus.profile=dev -jar <full path>/onekgpd-mcp-runner.jar
Option B - build & run in docker
- in order to run in docker, stdio transport needs to be disabled to prevent application from stopping itself due to closed stdio in containers
- it's already configured in prod profile
- it's the default configuration overall
- build with prod profile
docker build -f Dockerfile -t onekgpd-mcp .
- run as you prefer, e.g.
docker run -p 9000:9000 --name onekgpd-mcp --rm onekgpd-mcp
Option C - pull from Docker Hub
- pull prebuilt image; stdio transport disabled, http transport on port 9000
docker pull dnaerys/onekgpd-mcp:latest
- run
docker run -p 9000:9000 --name onekgpd-mcp --rm dnaerys/onekgpd-mcp:latest
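Before pointing a client at a local container, it can help to confirm that the published port accepts connections. This is a minimal sketch, assuming the container is mapped to localhost:9000 as in the run command above. The helper name wait_for_port is hypothetical, and a successful TCP connect only shows that the port is open, not that the MCP endpoint itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed
            # or the attempt times out.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("localhost", 9000) after docker run returns True once the port is reachable and False if it stays closed for the whole timeout.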
Connecting with MCP clients
- to connect via http transport, remote or local, simply direct the client to a destination, e.g. http://localhost:9000/mcp or https://db.dnaerys.org:443/mcp
- to connect via stdio transport, MCP client should start application with dev profile and with a full path to the jar file
- e.g. add to Claude config files (e.g. claude_desktop_config.json):
{
  "mcpServers": {
    "OneKGPd": {
      "command": "java",
      "args": ["-Dquarkus.profile=dev", "-jar", "/full/path/onekgpd-mcp-runner.jar"]
    }
  }
}
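For clients that are wired up programmatically rather than through a config file, the first message on the http transport is a JSON-RPC initialize request posted to the /mcp endpoint. The sketch below builds that request with the Python standard library, following the Streamable HTTP transport described in the MCP specification. The protocol version string, the client name and the helper names are assumptions for illustration, not values taken from this project.

```python
import json
import urllib.request

MCP_URL = "http://localhost:9000/mcp"  # local endpoint mentioned above


def build_initialize_request(url: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may negotiate another.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a server running, send(build_initialize_request(MCP_URL)) returns the raw body, which may be plain JSON or an SSE stream depending on how the server chooses to reply.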
Verification
How many variants exist in the 1000 Genomes Project?
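A lower-level check that does not depend on a chat client is to ask the server which tools it exposes, using the standard MCP tools/list method. The sketch below only builds the request payload and parses a response body; it assumes each SSE event carries one JSON-RPC message on a single data line, and the tool name in the usage note is a placeholder rather than one of this server's tools.

```python
import json
from typing import Any, Dict, List


def tools_list_payload(request_id: int = 2) -> Dict[str, Any]:
    """JSON-RPC payload for the standard MCP tools/list method."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def parse_messages(body: str) -> List[Dict[str, Any]]:
    """Parse a response body that is either plain JSON or an SSE stream."""
    text = body.strip()
    if text.startswith("{"):
        return [json.loads(text)]
    messages = []
    for line in text.splitlines():
        # Assumption: one complete JSON message per "data:" line.
        if line.startswith("data:"):
            messages.append(json.loads(line[len("data:"):].strip()))
    return messages


def tool_names(messages: List[Dict[str, Any]]) -> List[str]:
    """Collect tool names from any tools/list results in the messages."""
    names = []
    for message in messages:
        for tool in message.get("result", {}).get("tools", []):
            names.append(tool["name"])
    return names
```

Given a body such as data: {"jsonrpc":"2.0","id":2,"result":{"tools":[{"name":"example_tool"}]}}, tool_names(parse_messages(body)) returns the list of names found in it. A non-empty list from the real server indicates that the MCP endpoint is answering.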
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.