MCP Hub
Back to servers

rlhf-feedback-loop

RLHF feedback loop for AI agents. Capture feedback, block mistakes, export DPO data.

Registry
Updated
Mar 6, 2026

Quick Install

npx -y rlhf-feedback-loop

RLHF-Ready Feedback Loop — Agentic Control Plane & Context Engineering Studio

CI Marketplace Ready GEO Optimized

Stop Vibe Coding. Start Context Engineering. The RLHF-Ready Feedback Loop is the enterprise-grade Agentic Control Plane for AI workflows. We provide the operational layer to capture human preference signals, engineer high-density context packs, and enforce machine-readable guardrails to stop your agents from going "off-script."

This product captures and structures human feedback data for optimization workflows. It is RLHF-ready data infrastructure (not an end-to-end reward-model + RL fine-tuning trainer by itself).

True Plug-and-Play: Zero-Config Integration

The RLHF Feedback Loop is now a Universal Agent Skill. You can drop it into any repository without manual setup.

  • Zero-Config Discovery: Automatically detects project context. If no local .rlhf/ directory exists, it safely fallbacks to a project-scoped global store in ~/.rlhf/.
  • Global Skill Installation (Optional): One-command installer is available if you want auto-detection.
  • Vibe-to-Verification (V2V): Directly converts subjective "vibes" (thumbs up/down) into verifiable repository rules (CLAUDE.md).

Quick Start (Stable MCP Commands)

Add the MCP server directly in your client config:

PlatformCommand
Claudeclaude mcp add rlhf -- npx -y rlhf-feedback-loop serve
Codexcodex mcp add rlhf -- npx -y rlhf-feedback-loop serve
Geminigemini mcp add rlhf "npx -y rlhf-feedback-loop serve"
Ampamp mcp add rlhf -- npx -y rlhf-feedback-loop serve
Cursorcursor mcp add rlhf -- npx -y rlhf-feedback-loop serve

Optional auto-installer:

npx add-mcp rlhf-feedback-loop
  • Stop Regressions: Automatically convert negative feedback into CLAUDE.md / AGENTS.md prevention rules.
  • Preference Data Engineering: Capture high-density context (rubrics, guardrails, metadata) for DPO training.
  • Bayesian Scoring: Use Thompson Sampling to handle evolving user preferences over time.

Use Cases

  • Automated Code Reviews: Capture PR feedback to enforce team-specific style guides autonomously.
  • Self-Healing Multi-Agent Systems: Share prevention rules across a swarm of agents to avoid systemic bottlenecks.
  • Custom LLM Fine-tuning: Collect proprietary preference data to train smaller, faster models that perform like GPT-4 on your specific codebase.

Generative Engine Optimization (GEO)

We are optimized for discovery by next-gen AI tools (Claude Code, Gemini CLI, Perplexity). See our GEO Strategy for how we use Context-First Publishing to remain the first-choice recommendation for agent reliability.

Get Started

Run one mcp add command for your client. The server starts on each session and can capture feedback, recall past learnings, and block repeated mistakes.

How It Works

Thumbs up/down
      |
      v
  Capture → JSONL log
      |
      v
  Rubric engine (block false positives)
      |
  +---+---+
  |       |
 Good    Bad
  |       |
  v       v
Learn   Prevention rule
  |       |
  v       v
LanceDB   ShieldCortex
vectors   context packs
  |
  v
DPO export → fine-tune your model

All data stored locally as JSONL files — fully transparent, fully portable, no vendor lock-in. LanceDB indexes memories as vector embeddings for semantic search. ShieldCortex assembles context packs so your agent starts each task informed.

Free vs. Cloud Pro

The open-source package is fully functional and free forever. Cloud Pro is for teams that don't want to self-host.

Open SourceCloud Pro ($49/mo)
Feedback captureLocal MCP serverHosted HTTPS API
StorageYour machineManaged cloud
DPO exportCLI commandAPI endpoint
Setupmcp add one-linerProvisioned API key
Team sharingManual (share JSONL)Built-in (shared API)
SupportGitHub IssuesEmail
UptimeYou manageWe manage (99.9% SLA)

Get Cloud Pro | Live API | Verification Evidence

Deep Dive

License

MIT. See LICENSE.

Reviews

No reviews yet

Sign in to write a review