
praxis-mcp

MCP server for the Praxis AI development methodology — 13 structured tools for session management, work orders, context chain, validation, and scaffolding.

Updated: Feb 26, 2026

Quick Install

npx -y praxis-mcp

Praxis (πρᾶξις)

The practice of doing — a filesystem-based methodology for agentic development.

From the Greek πράσσω (prássō) — "to do, to act, to practice."


Zero dependencies. Just folders, markdown, and native AI tools.


In philosophy, Aristotle coined the modern usage of praxis to mean the process by which theory becomes practice. It is the bridge between knowing and doing — you have theory (theōría / θεωρία) on one side, and praxis on the other, where knowledge is enacted through deliberate action.

That is exactly what this methodology does. It bridges the gap between what AI agents know (their training, their context window, their capabilities) and what they do (writing code, researching, auditing, reporting) — through structured context, persistent memory, and traceable work.


The Problem

AI agents are powerful but forgetful. Every new session starts from zero. The context window is a blank slate — yesterday's decisions, last week's architecture choices, the reason you picked PostgreSQL over MySQL — all gone unless someone writes it down.

Most people solve this by writing longer prompts. They paste project context, repeat instructions, hope the AI remembers what matters. This works for small tasks. It collapses for anything real.

The problems with prompt-driven development:

  • Ephemeral — prompts disappear when the session ends. No audit trail, no history.
  • Unstructured — instructions are scattered across chat messages. Nothing is canonical.
  • Untrackable — there's no concept of "done." Did the AI complete the task? Partially? Who checks?
  • Single-agent — prompts assume one AI. When multiple agents collaborate, there's no routing, no ownership, no handoff protocol.
  • Human memory-dependent — the developer must remember what happened last session and re-explain it. Both humans and AI have short-term memory limitations.

Praxis solves all of this with a filesystem. No database. No SaaS platform. Just folders and markdown.


Work Orders: The Core Innovation

The most important concept in Praxis is the work order — and it comes from an unexpected place.

Origin: Construction & Manufacturing

In construction, a work order is a formal document that authorizes and describes a specific piece of work. It has a scope, acceptance criteria, an assigned worker, and a clear definition of "done." When the electrician finishes wiring the second floor, the work order moves from "pending" to "complete." There's a paper trail. There's accountability. There's no ambiguity about what was asked or what was delivered.

Software engineering adopted a similar concept with tickets and issues — Jira, GitHub Issues, Linear. But these tools assume a human developer who reads the ticket, carries context in their head across sessions, and reports back.

AI agents don't work like that. They start fresh every session. They can't check Jira. They don't remember yesterday.

Work Orders for AI Agents

Praxis brings the work order pattern into AI development:

# Work Order: Implement Authentication Middleware

- **WO#:** 3
- **Date Created:** 2026-02-20
- **Status:** Pending
- **Assigned To:** Claude
- **Priority:** High

## Description
Add JWT-based authentication middleware to all /api routes.

## Acceptance Criteria
- [ ] Middleware validates JWT tokens on every /api/* route
- [ ] Invalid tokens return 401 with consistent error format
- [ ] Token refresh endpoint exists at /api/auth/refresh

This file lives in dev/work-orders/. The AI reads it at session start. The AI works against the acceptance criteria. When done, the work order moves to executed/. There is no ambiguity.
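Under the hood, that status change is nothing more than a file move. A minimal sketch for a flat work-orders/ queue (the complete_wo helper is illustrative, not part of Praxis):

```shell
# Hypothetical helper: completing a work order is just a move into executed/.
complete_wo() {
  local wo="$1"              # e.g. dev/work-orders/3_2026-02-20_AUTH_MIDDLEWARE.md
  local dir
  dir="$(dirname "$wo")"
  mkdir -p "$dir/executed"   # ensure the completed queue exists
  mv "$wo" "$dir/executed/"  # the move itself is the status change
}
```

The pending queue is whatever sits at the folder root; the audit trail is whatever sits in executed/.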

Why Work Orders Beat Prompts

| | Prompts | Work Orders |
|---|---|---|
| Persistence | Die with the session | Live as files — survive forever |
| Scope | Vague, conversational | Defined acceptance criteria |
| Tracking | "Did I ask for that?" | Pending → Executed pipeline |
| Routing | One agent, one prompt | Routable to specific agents |
| Audit trail | None | The file IS the trail |
| Decomposition | Mega-prompts that grow forever | Master plan → incremental WOs |
| Multi-session | Re-explain everything each time | AI reads the WO fresh — no drift |

The work order is to AI development what the shipping container was to global trade — a standardized unit that any agent can pick up, process, and deliver.


The Development Lifecycle

Praxis organizes all work into a four-stage pipeline:

graph LR
    R["Research<br/><i>(gather)</i>"] --> P["Planning<br/><i>(decide)</i>"]
    P --> E["Execution<br/><i>(build)</i>"]
    E --> Re["Reports<br/><i>(communicate)</i>"]

    style R fill:#667eea,stroke:#667eea,color:#fff
    style P fill:#764ba2,stroke:#764ba2,color:#fff
    style E fill:#9b59b6,stroke:#9b59b6,color:#fff
    style Re fill:#f093fb,stroke:#f093fb,color:#000

| Stage | Folder | What Happens Here |
|---|---|---|
| Research | dev/research/ | Gather information before making decisions. Compare options, benchmark alternatives, read documentation. |
| Planning | dev/planning/ | Make decisions. Write master plans (draft → approved). Architectural choices live here. |
| Execution | dev/work-orders/, dev/commands/ | Build. Work orders track tasks. Command docs deliver operator scripts. |
| Reports | dev/reports/ | Communicate results to stakeholders. Draft → published pipeline. |

Every folder in dev/ maps to one of these stages. When you open a Praxis project, you immediately know where everything is and why it's there.

Cross-Cutting Concerns

| Folder | Purpose |
|---|---|
| dev/audit/ | Quality trail — architecture audits, conformance checks, drift reports |
| dev/design/ | Design assets — tokens, brand guidelines, visual audit captures |
| dev/archive/ | Historical records — retired documents with manifests |

Research: Gather Before Deciding

Research is Stage 1 — it flows upstream into planning. Everything in dev/research/ exists to inform a decision that hasn't been made yet.

dev/research/
├── active/       # Research for current, open decisions
└── archive/      # Decision made — kept for reference

The flow: When you need to choose between PostgreSQL and MySQL, or evaluate three hosting providers, or compare authentication libraries — that investigation lives in active/. Once the decision is made and recorded in source_of_truth.md, the research moves to archive/. It's never deleted — it's the receipts for why you chose what you chose.

Common research types: pricing comparisons, dependency audits, technology evaluations, architecture analysis, security advisory reviews, competitive benchmarks.

Research is not reporting. This distinction matters. Research gathers information before a decision (upstream). Reports communicate results after work is done (downstream). A technology comparison that helps you pick a database? Research. A progress update for a stakeholder? Report. They live in different folders because they serve different stages of the pipeline.

Planning: Decide Before Building

Planning is Stage 2 — where research findings become decisions and decisions become actionable plans.

dev/planning/
└── master-plan/
    ├── draft/      # Working plans (AI writes here)
    └── approved/   # Finalized plans (admin promotes)

The flow: The AI writes master plans to draft/. The admin reviews and promotes to approved/. The AI never writes directly to approved/ — this gate ensures a human reviews every strategic decision before work begins.

Master plan → Work order decomposition: The master plan captures the full project roadmap, organized into batches:

| Batch | Scope | When to Create WOs |
|---|---|---|
| 0: Critical | Security vulnerabilities, broken builds, data loss risks | Immediately during init |
| 1: Foundation | Scaffolding, structural improvements, tooling setup | After Batch 0 is complete |
| 2: Core | Feature work, architecture implementation | After Batch 1 is complete |
| 3: Quality | Testing, documentation, polish | After Batch 2 is complete |

Work orders are decomposed from the master plan incrementally — not all at once. This prevents scope overload and keeps the active queue focused.

Execution: Build with Traceability

Execution is Stage 3 — where plans become reality. This stage has two artifact types:

Work orders are the primary execution unit. They're covered in detail in the Work Orders section above — scoped tasks with acceptance criteria, assigned agents, and a pending → executed lifecycle.

Commands handle a specific execution problem: when the AI needs multi-step shell commands run on a server or workstation, it can't just paste them in chat. Instead:

dev/commands/
├── active/
│   └── 3_2026-02-20_SSL_SETUP/    # Topic subfolder with step-by-step commands
│       ├── 01_GENERATE_CERTS.md
│       └── 02_CONFIGURE_NGINX.md
└── executed/                        # Completed command sets

The AI writes commands to active/{topic}/ and references the doc path and step number: "Run Step 1 in dev/commands/active/3_2026-02-20_SSL_SETUP/01_GENERATE_CERTS.md." The admin reviews and executes. Completed sets move to executed/.

Why files instead of chat? Three reasons: (1) prevents copy-paste errors on complex multi-line commands, (2) creates an audit trail of every command run on the system, (3) allows the admin to review commands before execution — especially important for destructive operations.

Reports: Communicate Results

Reports are Stage 4 — the final stage of the pipeline. Everything upstream (research, planning, execution) has produced results. Reports communicate those results to stakeholders.

dev/reports/
├── draft/
│   ├── html/       # Visual reports (interactive, styled)
│   └── written/    # Written analysis (markdown)
└── published/
    ├── html/       # Final HTML (admin promotes here)
    └── written/    # Final written (admin promotes here)

The draft/published wall: AI writes to draft/ only. The admin reviews, redacts any sensitive information (internal IPs, credentials, PII), and promotes to published/. The AI never reads from or writes to published/. This wall exists because published reports go to external stakeholders — they must be reviewed by a human before leaving the project.

Two formats: HTML reports are visual and interactive — benchmark dashboards, progress emails, styled presentations. Written reports are markdown — technical analysis, architecture reviews, decision documents. Both follow the same draft → published flow.

Audit: Track Quality

Audit is a cross-cutting concern — it doesn't belong to a single pipeline stage. Audits can happen during planning (discovery), execution (completion), or maintenance (drift detection).

dev/audit/
├── current/    # Active audit entries
└── legacy/     # Archived by admin

Audit types:

| Type | When | What It Checks |
|---|---|---|
| Discovery audit | First encounter with a codebase | Tech stack, architecture, risks, dependencies, test coverage |
| Completion audit | After a WO is marked done | Acceptance criteria met, code quality, no regressions |
| Drift report | Periodic or on-demand | Source of truth claims vs. actual codebase state |
| Conformance check | Session start or CI | Folder structure, naming conventions, file freshness |

The linter (praxis-lint.sh) automates conformance checks. Discovery and completion audits are performed by the Manager agent in Triangle mode, or by any agent in Solo mode. Drift reports are typically the Researcher's responsibility — comparing what the documentation claims against what the code actually does.


The Context Chain

Praxis solves AI amnesia with three living documents that persist across every session:

graph LR
    SOT["source_of_truth.md<br/><i>canonical rules</i>"] --> CC["context_capsule.md<br/><i>session handoff</i>"]
    CC --> CP["checkpoint.md<br/><i>milestones</i>"]
    CP --> WO["Latest Work Order<br/><i>current task</i>"]

    style SOT fill:#667eea,stroke:#667eea,color:#fff
    style CC fill:#764ba2,stroke:#764ba2,color:#fff
    style CP fill:#9b59b6,stroke:#9b59b6,color:#fff
    style WO fill:#f093fb,stroke:#f093fb,color:#000

| Document | What It Contains | Updated When |
|---|---|---|
| source_of_truth.md | Project rules, decisions log, tech stack, folder structure. The canonical record. If anything conflicts, this file wins. | When decisions are made |
| context_capsule.md | Last session's summary: what was done, what's next, active task status. This is the "handoff note" between sessions. | Every session end |
| checkpoint.md | Completed milestones with dates. The progress record. | When work is completed |

The read order at every session start:

  1. source_of_truth.md — What are the rules?
  2. context_capsule.md — What happened last time?
  3. checkpoint.md — What's been accomplished?
  4. Latest work order — What should I work on now?

The write order at every session end:

  1. Update source_of_truth.md — Any new decisions?
  2. Update context_capsule.md — What did I do? What's next?
  3. Update checkpoint.md — Any milestones completed?

This is the heartbeat of Praxis. It turns stateless AI sessions into a continuous, traceable development process.
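The freshness side of this heartbeat can be checked mechanically by comparing file modification times. A rough sketch of that idea, assuming GNU or BSD stat is available (check_context_updated is an illustrative helper, not a Praxis command):

```shell
# Sketch: flag context docs not modified since a given session-start timestamp.
check_context_updated() {
  local session_start="$1"; shift
  local stale=0
  for doc in "$@"; do
    local mtime
    # GNU stat uses -c %Y; BSD stat uses -f %m
    mtime=$(stat -c %Y "$doc" 2>/dev/null || stat -f %m "$doc")
    if [ "$mtime" -lt "$session_start" ]; then
      echo "STALE: $doc"
      stale=1
    fi
  done
  return "$stale"
}
```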


The Triangle Pattern

Praxis supports two operational modes:

Solo Mode (Default)

One AI agent operates independently. Work orders are a flat queue:

work-orders/
├── 1_2026-02-20_AUTH_MIDDLEWARE.md     (pending)
├── 2_2026-02-20_API_VALIDATION.md      (pending)
└── _executed/
    └── 0_2026-02-19_PROJECT_SETUP.md  (done)

Triangle Mode (Multi-Agent)

Three specialized AI agents collaborate, each with a distinct role:

graph TD
    M["<b>Manager Agent</b><br/><i>audits, plans, reviews, creates WOs</i>"]
    I["<b>Implementer Agent</b><br/><i>implements code, deploys, tests</i>"]
    R["<b>Research Agent</b><br/><i>deep research, SOT verification</i>"]

    M -->|"work orders"| I
    M -->|"research WOs"| R
    I -->|"plans & results"| M
    R -->|"findings & reports"| M

    style M fill:#667eea,stroke:#667eea,color:#fff
    style I fill:#764ba2,stroke:#764ba2,color:#fff
    style R fill:#f093fb,stroke:#f093fb,color:#000

| Role | Responsibility | Reads From | Writes To |
|---|---|---|---|
| Manager | Audits, plans, reviews, creates WOs | Full project | work-orders/wo_{agent}/, audit/ |
| Implementer | Implements code, deploys, tests | Its assigned WOs | Source code, commands/, completed WOs |
| Researcher | Deep research, SOT verification, codebase indexing | Its assigned WOs | research/active/, audit/ (drift reports) |

Example assignment: Codex CLI as Manager, Claude Code as Implementer, Gemini CLI as Researcher. But any AI that can read and write files can fill any role.

Important: Triangle is a role topology, not a provider lock-in.

  • You can run Triangle with three different providers (for example Codex + Claude + Gemini).
  • You can run Triangle with the same provider in three parallel sessions (for example Claude session A/B/C, each with a different role).
  • You can run Triangle with hybrid private/local nodes (for example OpenCode or other self-hosted agents) as long as each agent follows the same filesystem contract.

Work orders are routed to agent-specific folders:

work-orders/
├── wo_implementer/
│   ├── 3_2026-02-20_AUTH_MIDDLEWARE.md
│   └── executed/
├── wo_manager/
│   └── executed/
└── wo_researcher/
    ├── 1_2026-02-20_JWT_LIBRARY_RESEARCH.md
    └── executed/

The Reflection Pattern — the core loop in Triangle mode
graph TD
    A["Manager creates WO"] --> B["Implementer writes plan"]
    B --> C["Manager reviews plan"]
    C -->|"Approved"| D["Implementer builds"]
    C -->|"Changes requested"| B
    D --> E["Manager audits result"]
    E -->|"Pass"| F["WO moves to executed/"]
    E -->|"Fail"| B

    style A fill:#667eea,stroke:#667eea,color:#fff
    style B fill:#764ba2,stroke:#764ba2,color:#fff
    style C fill:#667eea,stroke:#667eea,color:#fff
    style D fill:#764ba2,stroke:#764ba2,color:#fff
    style E fill:#667eea,stroke:#667eea,color:#fff
    style F fill:#2ecc71,stroke:#2ecc71,color:#fff

Why this works: The Manager sees the full picture (discovery audit + all WOs + all plans). The Implementer sees only its current WO. This separation prevents scope creep and ensures every implementation aligns with the overall project plan.

Detection: Triangle mode activates when multiple provider init files exist in dev/init/ (e.g., CODEX_INIT.md, GEMINI_INIT.md alongside CLAUDE_INIT.md). Otherwise, Solo mode is the default.
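That detection rule can be approximated with a simple file count. A sketch, assuming the *_INIT.md naming shown above (detect_mode is illustrative, not the server's actual code):

```shell
# Sketch: infer solo vs triangle mode from provider init files in dev/init/.
# PRAXIS_INIT.md is provider-agnostic, so it doesn't count toward the total.
detect_mode() {
  local init_dir="$1"
  local n
  n=$(find "$init_dir" -maxdepth 1 -name '*_INIT.md' ! -name 'PRAXIS_INIT.md' 2>/dev/null | wc -l | tr -d ' ')
  if [ "$n" -ge 2 ]; then echo triangle; else echo solo; fi
}
```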

Beyond Triangle: Extensible Topologies

Triangle is the recommended starting pattern because it is simple and predictable. Praxis itself is not limited to three agents.

If your project needs more parallelism, you can scale to N-agent graphs (mixed providers, same-provider parallel sessions, and private/self-hosted nodes), while keeping the same core contract:

  1. Role ownership stays explicit.
  2. Work order routing stays deterministic.
  3. Validation and phase gates stay enforced.

Praxis governs coordination and context continuity across agents. It does not constrain which provider or model you use.


The dev/ Folder

Full folder structure
dev/
├── source_of_truth.md              # Canonical rules and decisions
├── context_capsule.md              # Session handoff
├── checkpoint.md                   # Progress milestones
│
├── init/                           # Methodology reference docs
│   ├── PRAXIS_INIT.md             # Provider-agnostic init
│   ├── CLAUDE_INIT.md              # Claude Code init
│   ├── CODEX_INIT.md               # Codex manager init (Triangle)
│   └── GEMINI_INIT.md              # Gemini researcher init (Triangle)
│
├── research/                       # Stage 1: GATHER
│   ├── active/                     # Research for current decisions
│   └── archive/                    # Decisions made, kept for reference
│
├── planning/                       # Stage 2: DECIDE
│   └── master-plan/
│       ├── draft/                  # Working plans (AI writes here)
│       └── approved/               # Finalized plans (admin promotes)
│
├── work-orders/                    # Stage 3: EXECUTE
│   └── executed/                   # Completed work orders
│
├── commands/                       # Operator command delivery
│   ├── active/                     # Command sets in topic subfolders
│   └── executed/                   # Completed command sets
│
├── audit/                          # Quality + conformance trail
│   ├── current/                    # Active audit entries
│   └── legacy/                     # Archived entries
│
├── reports/                        # Stage 4: COMMUNICATE
│   ├── draft/
│   │   ├── html/                   # Draft HTML reports
│   │   └── written/                # Draft written reports
│   └── published/
│       ├── html/                   # Final HTML (admin promotes)
│       └── written/                # Final written (admin promotes)
│
├── design/                         # Design assets
│   ├── audit/screenshots/          # Visual captures
│   ├── language/                   # Design tokens + methodology docs
│   └── resources/                  # Icons, fonts, logos
│
├── private/                        # Sensitive docs (GITIGNORED)
│
└── archive/                        # Historical records
    └── {date}_{description}/       # Dated batches with manifests

Provider Integration

Praxis is provider-agnostic. It works with any AI assistant that can read and write files.

Providers and roles are decoupled:

  • Roles are operational (manager, implementer, researcher, or custom role sets).
  • Providers are implementation choices (Claude, Codex, Gemini, OpenCode, private/local LLMs, etc.).
  • The same provider can fill multiple roles via separate sessions if role boundaries are preserved.

The methodology does NOT control how provider config files are created. Each provider creates their config per their own conventions:

| Provider | Config File | Init File |
|---|---|---|
| Claude Code | CLAUDE.md | dev/init/CLAUDE_INIT.md |
| Codex CLI | AGENTS.md | dev/init/CODEX_INIT.md |
| Gemini CLI | GEMINI.md | dev/init/GEMINI_INIT.md |
| Any other | Whatever the provider uses | dev/init/PRAXIS_INIT.md |

Two-step init flow (important):

  1. Native init first — Let the AI create its own config file in a dedicated session (e.g., Claude creates CLAUDE.md, Codex creates AGENTS.md). The AI gives its native setup full attention.
  2. Praxis init second — Run the Praxis init (paste or reference dev/init/*_INIT.md). Praxis injects a small context handoff block into the provider's existing config — augmenting it, never replacing it. If the provider config doesn't exist, Praxis will stop and ask you to run step 1 first.

This ensures the AI knows where to find the context chain on every new session, without Praxis overriding the provider's native conventions.


Quick Start

Option A: CLI Init (Recommended)

npx praxis-mcp init                                     # starter tier, solo mode
npx praxis-mcp init --tier full --mode triangle          # full tier, multi-agent
npx praxis-mcp init --tier standard --path ./my-project  # custom path

This creates the dev/ folder structure, context documents, .praxis/praxis-lint.sh, and (in triangle mode) agent folders with _executed/ directories. One command, fully scaffolded.

Option B: Manual Setup

Starter (context chain + work orders only):

mkdir -p dev/work-orders/_executed

Then create dev/source_of_truth.md, dev/context_capsule.md, and dev/checkpoint.md.

Full (complete governance layer):

mkdir -p dev/{init,research/{active,archive},planning/master-plan/{draft,approved},work-orders/_executed,commands/{active,executed},audit/{current,legacy},reports/{draft/{html,written},published/{html,written}},design/{audit/screenshots,language,resources},archive,private}

Configure your provider

Copy the relevant init file from dev/init/ into your project. For Claude Code:

cp dev/init/CLAUDE_INIT.md your-project/dev/init/

Paste the contents of your provider's init file into a new session. The AI will:

  • Read your codebase
  • Populate the context documents
  • Inject the context handoff into your provider config
  • Perform an architecture audit (if existing code)
  • Create Batch 0 work orders (critical issues only)

You're now running Praxis.


Operating Rules

  1. Non-destructive — AI never SSHes to production. Local copies only.
  2. Self-contained — Every project gets its own dev/ folder. Deployable as-is.
  3. No workspace root files — All output goes into project folders or the dev/ structure.
  4. Draft/published wall — AI writes to draft/. Admin promotes to published/.
  5. Executed means done — Items stay pending until fully complete. No premature moves.
  6. Naming convention — {number}_{YYYY-MM-DD}_{DESCRIPTION}.{ext}. Number 0 = READMEs.
  7. Commands in files, not chat — AI never pastes multiline commands in conversation. Write to commands/active/ and reference the path.
  8. Context updated every session — Source of truth (decisions), capsule (summary), checkpoint (milestones).
  9. No secrets in dev/ — Never store API keys, passwords, tokens, or credentials in the dev/ folder. Use .env files (gitignored) for secrets. Redact sensitive data in reports before promotion.

WO Lane System

Lanes organize work orders into subproject scopes within an agent folder. They're optional — projects without lanes work identically to v1.2.

Lane Naming

{nn}_{type}_{scope}

  • nn — Two-digit prefix for ordering (10, 20, 30...)
  • type — One of: delivery, program, lab, ops
  • scope — snake_case description (e.g., academy, site_core)

Example: 10_delivery_academy, 70_program_methodology_rewrite, 80_lab_experimental_design

Lane Types

| Type | Purpose | Validation |
|---|---|---|
| delivery | Shippable product work | Full: Acceptance Criteria + Status required |
| program | Planning and methodology | Relaxed: criteria and status optional |
| lab | Experimental and research | Relaxed: criteria and status optional |
| ops | Operational and infrastructure | Full: Acceptance Criteria + Status required |

Centralized Completion

When a WO in a lane is completed, it moves to a centralized _executed/ directory:

wo_claude/
├── 10_delivery_academy/           # Active WOs
├── 20_delivery_site_core/         # Active WOs
└── _executed/
    ├── 10_delivery_academy/       # Completed WOs from this lane
    └── 20_delivery_site_core/     # Completed WOs from this lane

This keeps the active queue clean while preserving a lane-organized audit trail.
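Completing a lane WO differs from the flat queue only in the destination path, which mirrors the lane under _executed/. A sketch (complete_lane_wo is illustrative, not part of Praxis):

```shell
# Hypothetical helper: complete a lane WO while preserving lane organization.
complete_lane_wo() {
  local agent_dir="$1" lane="$2" wo="$3"
  mkdir -p "$agent_dir/_executed/$lane"                    # mirror the lane
  mv "$agent_dir/$lane/$wo" "$agent_dir/_executed/$lane/"  # lane-scoped move
}
```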


Patch Work Orders

Patch WOs extend a completed parent WO to address follow-up issues. They use the _P{NN} suffix convention:

5_2026-02-22_ORIGINAL_TASK.md          # Parent (in _executed/)
5_2026-02-22_FIX_HEADER_BUG_P01.md     # Patch 1
5_2026-02-23_ADD_MOBILE_SUPPORT_P02.md  # Patch 2

Required Metadata

Every patch WO includes parent tracking fields:

- **Parent WO:** 5
- **Patch:** P01
- **Sequence Key:** 5.01

The sequence key ({parent}.{patch}) enables chronological ordering across parent + patches.
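With a two-field numeric sort on the dot separator, sequence keys order parents and their patches chronologically (5 < 5.01 < 5.02 < 6). A sketch of that sort:

```shell
# Sketch: order sequence keys ({parent}.{patch}) numerically on both fields.
# A parent with no patch suffix ("5") sorts before its patches ("5.01").
sort_sequence_keys() {
  sort -t. -k1,1n -k2,2n
}
```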


N/A Criteria

When an acceptance criterion becomes inapplicable after the WO was scoped, mark it as N/A:

- [ ] ~~Criterion text~~ N/A — reason the criterion doesn't apply

The checkbox stays [ ], the text is wrapped in strikethrough (~~), and a reason follows the em dash.

Guardrails

| Rule | Scope | Severity |
|---|---|---|
| Reason required | All WOs | N/A without a reason doesn't match — counted as unchecked |
| Max 3 per WO | Executed WOs | >3 N/A = FAIL (the WO is poorly scoped) |
| Prefer rewrite | Active WOs | N/A in an active WO = WARN (rewrite the criterion instead) |
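The "max 3" guardrail is straightforward to automate by counting the documented marker format. A sketch, assuming that exact format (na_guardrail is illustrative, not part of praxis-lint):

```shell
# Sketch: count N/A criteria in a WO file; fail if more than 3 (executed WOs).
# Assumes the "- [ ] ~~...~~ N/A — reason" marker format described above.
na_guardrail() {
  local count
  count=$(grep -c -- '^- \[ \] ~~.*~~ N/A' "$1" || true)
  [ "$count" -le 3 ]
}
```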

Security & Sensitive Data

Praxis is designed to live in Git repositories. These rules prevent accidental exposure:

  • Never commit secrets. API keys, passwords, tokens, and credentials belong in .env files, not in dev/ documents.
  • Redact before publishing. Reports in draft/ may reference internal IPs, usernames, or infrastructure details. Redact before promoting to published/.
  • The .gitignore matters. Praxis ships with a .gitignore that excludes common secret patterns. Extend it for your project.
  • Sensitive artifacts go in dev/private/. Use it for contracts, credential references, internal notes with PII, or any document that should exist in the project context but never in version control. Add dev/private/ to your project's .gitignore. Reference private docs from the source of truth by path (e.g., "credentials in dev/private/server_creds.md").
  • Command documents deserve extra scrutiny. Command docs in commands/active/ may contain connection strings, server addresses, or credentials. Review before committing to git.

For the MCP server security model (path safety, concurrency, known risks), see SECURITY.md.


Adoption Tiers

You don't have to use everything on day one. Start small and add structure as complexity grows.

Starter — Context Chain + Work Orders

The minimum viable Praxis. Just 3 files and 1 folder:

dev/
├── source_of_truth.md
├── context_capsule.md
├── checkpoint.md
└── work-orders/
    └── executed/

Best for: Solo developers, small projects, quick experiments. You get session continuity and task tracking with near-zero overhead.

Standard — Add Research & Planning Pipeline

The full development lifecycle without the audit/report infrastructure:

dev/
├── source_of_truth.md, context_capsule.md, checkpoint.md
├── research/{active, archive}/
├── planning/master-plan/{draft, approved}/
├── work-orders/executed/
└── commands/{active, executed}/

Best for: Medium projects, multi-session work, projects that need planning before building.

Full — Complete Governance Layer

Everything. Audit trail, report pipeline, design assets, archive:

dev/
├── (all Standard folders)
├── audit/{current, legacy}/
├── reports/draft/{html, written}/, published/{html, written}/
├── design/{audit/screenshots, language, resources}/
└── archive/

Best for: Multi-agent workflows, enterprise projects, long-running builds, projects with stakeholder reporting.


File Naming Convention

All files follow: {number}_{YYYY-MM-DD}_{DESCRIPTION}.{ext}

  • Number — Sequential, chronological (0, 1, 2, ...)
  • Date — Creation date in ISO format
  • Description — UPPERCASE, underscore-separated
  • Number 0 is reserved for READMEs and examples

1_2026-02-20_AUTH_MIDDLEWARE.md
2_2026-02-20_API_VALIDATION.md
0_2026-02-20_README.md
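The convention is mechanically checkable with a single pattern. A sketch of such a check — the regex is an approximation, not necessarily the one praxis-lint applies:

```shell
# Sketch: validate the {number}_{YYYY-MM-DD}_{DESCRIPTION}.{ext} convention.
# Assumes UPPERCASE descriptions (underscores and digits allowed).
valid_name() {
  echo "$1" | grep -Eq '^[0-9]+_[0-9]{4}-[0-9]{2}-[0-9]{2}_[A-Z0-9_]+\.[a-z]+$'
}
```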

Validation (praxis-lint)

Praxis includes an automated validation tool that checks whether your dev/ folder conforms to the methodology. It transforms Praxis from convention-based (rules you follow voluntarily) to convention-enforced (rules that are verified automatically).

Quick Start

bash .praxis/praxis-lint.sh              # Lint current project
bash .praxis/praxis-lint.sh --fix        # Auto-create missing directories
bash .praxis/praxis-lint.sh --json       # JSON output for hooks/CI
bash .praxis/praxis-lint.sh --strict     # Warnings become failures
bash .praxis/praxis-lint.sh --help       # Full usage information

What It Checks (7 Categories, 50 Checks)

| Category | What | Key Checks |
|---|---|---|
| Structure | Required folders exist for your tier | dev/, core docs, work-orders/, research/, etc. |
| Context Freshness | Handoff docs aren't stale | capsule < 7 days, checkpoint < 30 days |
| Work Orders | Executed WOs are truly complete | No unchecked - [ ] boxes in executed/ |
| Naming | Files follow the convention | {number}_{YYYY-MM-DD}_{DESC}.ext |
| Security | No secrets in tracked files | Private keys, AWS keys, connection strings |
| SOT Consistency | Source of Truth matches reality | Referenced folders exist, decisions logged |
| Orphans | No files in wrong locations | No loose files at dev/ root |
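The Work Orders check, for instance, reduces to a recursive grep for unchecked boxes under executed/. A sketch (check_executed_complete is illustrative, not the linter's actual implementation):

```shell
# Sketch: executed WOs must contain no unchecked "- [ ]" boxes.
# Prints offending files and returns nonzero if any are found.
check_executed_complete() {
  ! grep -rl -- '- \[ \]' "$1" 2>/dev/null
}
```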

Exit Codes

| Code | Meaning | CI/CD Effect |
|---|---|---|
| 0 | All pass (or INFO-only) | Pipeline passes |
| 1 | Warnings found (drifting) | Pipeline passes (or fails with --strict) |
| 2 | Failures found (broken) | Pipeline fails |
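A CI script can branch on the three codes explicitly. A hedged sketch (run_lint_gate is illustrative; pass it the linter command of your choice):

```shell
# Sketch: wrap a linter invocation and map its exit code to a CI decision.
run_lint_gate() {
  local status=0
  "$@" || status=$?       # capture the exit code without aborting
  case "$status" in
    0) echo clean ;;
    1) echo warnings ;;   # pipeline still passes unless --strict is in use
    *) echo broken; return 1 ;;
  esac
}
```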

Integration

praxis-lint integrates with both CI/CD pipelines and AI coding assistants:

| Integration | How | Setup |
|---|---|---|
| Claude Code | SessionStart hook — runs automatically, feeds findings to AI | See .praxis/examples/settings-hook.json |
| GitHub Actions | CI workflow — blocks PRs with failures | See .praxis/examples/github-action.yml |
| Pre-commit hook | Git hook — validates before every commit | Copy hook script to .git/hooks/pre-commit |
| Any AI agent | Init file instruction — AI runs linter as first action | Referenced in dev/init/*_INIT.md |
| Manual | Run from terminal anytime | bash .praxis/praxis-lint.sh |

Works with gitignored dev/: The linter reads the local filesystem, not git. If dev/ is gitignored, local modes (manual, hooks, AI) all work. CI gracefully skips.

Zero dependencies. One file. Works on any Unix system.


MCP Server: Methodology as a Service

Everything above this line — the context chain, work orders, validation, the entire dev/ folder structure — works through files. The AI reads init docs, follows instructions, and manually reads and writes markdown. It works. It's been battle-tested across thousands of sessions.

But there's a better way.

The MCP server turns Praxis from rules you follow into tools you use. Instead of the AI parsing instructions from CLAUDE_INIT.md and manually opening files in the right order, it calls session_start and gets the entire project state in a single structured response. Instead of manually constructing work order markdown and remembering the naming convention, it calls create_work_order and the file appears — correctly numbered, correctly dated, correctly formatted, in the correct folder.

This is what the Model Context Protocol was designed for: giving AI agents structured access to external systems. The Praxis MCP server wraps the entire methodology into 13 native tools that any MCP-compatible AI can call automatically.

Before & After

| Without MCP (Manual) | With MCP (Native Tools) |
| --- | --- |
| AI reads init doc instructions | AI calls session_start |
| AI opens source_of_truth.md, then context_capsule.md, then checkpoint.md, then scans for WOs | One call returns all context + pending WOs + health assessment |
| AI constructs WO markdown by hand, guesses the next number | create_work_order auto-numbers, auto-dates, enforces the template |
| AI writes capsule sections hoping the format is correct | update_capsule replaces sections by header — preserves everything else |
| Admin asks "is the dev/ folder structured correctly?" | lint runs 50 checks and returns structured JSON |
| New project needs the folder structure | scaffold creates it by tier and mode in one call |
| AI forgets to update docs at session end | session_end checks file modification times and flags non-compliance |
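To make the contrast concrete, here is roughly what a session_start result could carry in one response. The interface below is an illustrative sketch; the field names are assumptions, not the server's actual schema (see praxis-mcp/README.md for the real one).

```typescript
// Illustrative sketch of a session_start result. Field names are
// hypothetical; the real schema lives in praxis-mcp/README.md.
interface SessionStartResult {
  tier: "starter" | "standard" | "full";
  mode: "solo" | "triangle";
  context: {
    sourceOfTruth: string;   // contents of source_of_truth.md
    capsule: string;         // contents of context_capsule.md
    checkpoint: string;      // contents of checkpoint.md
  };
  pendingWorkOrders: { number: number; title: string; status: string }[];
  health: { ok: boolean; warnings: string[] };
}

// A single call replaces four manual file reads plus a folder scan:
const example: SessionStartResult = {
  tier: "standard",
  mode: "solo",
  context: { sourceOfTruth: "...", capsule: "...", checkpoint: "..." },
  pendingWorkOrders: [{ number: 14, title: "Add auth middleware", status: "Pending" }],
  health: { ok: true, warnings: [] },
};
```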

The AI doesn't need to be told how to follow Praxis. It calls the tools, and the tools enforce the methodology.

The Tool Inventory (13 Tools, 5 Categories)

Session Lifecycle — start, end, and detect

| Tool | What It Does |
| --- | --- |
| session_start | Reads the full context chain (SOT → capsule → checkpoint), lists all pending work orders, detects tier/mode/providers, and returns a structured health assessment — all in one call. This replaces the manual "read these files in order" instruction from the init docs. |
| session_end | Checks whether context documents were updated during the session by comparing file modification times. Returns a compliance report with warnings for any documents that weren't touched. Optionally runs the linter as a final validation. |
| detect_project | Pure detection — determines tier (starter/standard/full), mode (solo/triangle), active providers, and structural completeness. No side effects. Useful for tools and scripts that need to adapt behavior to the project type. |
Context Chain — read, update capsule, update checkpoint

| Tool | What It Does |
| --- | --- |
| read_context | Reads one or all context documents with rich metadata: file size, age in days, and parsed structural sections (decisions count, milestone list, active task). The AI gets both raw content and structured data. |
| update_capsule | Section-aware update for context_capsule.md. Provide new content for specific sections (Active Task, In-Progress Notes, Last Session Summary) and the tool replaces only those sections — preserving everything else. No more accidental overwrites. |
| update_checkpoint | Appends a new milestone to checkpoint.md. Auto-numbers the next row, enforces the table format, and optionally updates the Current Phase. The AI never has to parse the milestone table by hand. |
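The section-replacement idea behind update_capsule can be sketched in a few lines. This is an illustration, not the server's actual code, and it assumes capsule sections are delimited by `## ` headers:

```typescript
// Sketch: replace one "## Header" section of a markdown document while
// leaving every other section untouched (the update_capsule idea).
// Assumes sections are delimited by second-level headers.
function replaceSection(doc: string, header: string, body: string): string {
  const lines = doc.split("\n");
  const start = lines.findIndex((l) => l.trim() === `## ${header}`);
  if (start === -1) return doc; // unknown section: change nothing
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (lines[i].startsWith("## ")) { end = i; break; } // next section begins
  }
  return [...lines.slice(0, start + 1), body, ...lines.slice(end)].join("\n");
}
```

Only the targeted section's body changes; the header and every other section survive byte-for-byte, which is what makes the operation safe to call repeatedly.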
Work Orders — list, read, create, complete, patch

| Tool | What It Does |
| --- | --- |
| list_work_orders | Lists all work orders with parsed metadata (number, title, status, priority, assigned agent, lane). Handles Solo, Triangle, and lane-based folder structures. Supports filtering by status, agent, and lane. |
| read_work_order | Reads a specific work order by number or filename. Returns parsed header fields, criteria completion state, N/A criteria count, and patch metadata. Searches across lanes and executed directories. |
| create_work_order | Creates a new work order with full naming convention enforcement. Auto-numbers, auto-dates, renders the standard WO template, and routes to the correct folder — including lane subfolders. |
| complete_work_order | Validates that all acceptance criteria are checked or marked N/A, then updates status to "Complete" and moves the file to the correct _executed/ path (centralized for lanes, flat for top-level). N/A criteria are treated as resolved. |
| create_patch_work_order | Creates a patch WO extending an existing parent. Auto-assigns the next _P{NN} suffix, includes parent metadata (Parent WO, Patch, Sequence Key), and routes to the correct lane. |
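Auto-numbering is simple enough to sketch. The helpers below are hypothetical (the real logic lives in src/lib/naming.ts) and assume filenames begin with a `WO-{number}` prefix and that patches carry the `_P{NN}` suffix described above:

```typescript
// Hypothetical auto-numbering sketch; assumes WO filenames begin "WO-<n>".
function nextWoNumber(filenames: string[]): number {
  let max = 0;
  for (const f of filenames) {
    const m = f.match(/^WO-(\d+)/);       // leading work-order number
    if (m) max = Math.max(max, parseInt(m[1], 10));
  }
  return max + 1;                          // next free number
}

// Patch WOs extend a parent with the next _P{NN} suffix (e.g. WO-012_P02).
// Counts existing patches for the parent; assumes contiguous numbering.
function nextPatchSuffix(parent: string, existing: string[]): string {
  const n = existing.filter((f) => f.startsWith(`${parent}_P`)).length + 1;
  return `${parent}_P${String(n).padStart(2, "0")}`;
}
```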
Validation — lint

| Tool | What It Does |
| --- | --- |
| lint | Spawns praxis-lint.sh and returns structured JSON findings across all 7 categories (structure, freshness, work orders, naming, security, SOT consistency, orphans). Supports --strict, --fix, and selective category skipping. Same 50 checks as the command line, but the AI gets machine-readable results. |
Scaffolding — scaffold

| Tool | What It Does |
| --- | --- |
| scaffold | Creates the complete dev/ folder structure based on tier (starter/standard/full), mode (solo/triangle), agent list, and optional lane definitions. Creates centralized _executed/ directories and template context documents. Safe to run multiple times — reports what was created vs. what already existed. |

How It Works in Practice

The MCP server uses stdio transport — it's a process that communicates over stdin/stdout using the JSON-RPC 2.0 protocol. You register it in your AI tool's config file, and the tools appear automatically. The AI calls them like native functions.
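On the wire, a tool invocation is one newline-delimited JSON message written to the server's stdin, answered by a response with the matching id on stdout. The sketch below uses the standard MCP tools/call framing; the empty arguments object relies on the server's PRAXIS_PROJECT_DIR default:

```typescript
// One JSON-RPC 2.0 request as a client would write it to the server's
// stdin. Standard MCP "tools/call" framing; arguments is left empty so
// the server falls back to its PRAXIS_PROJECT_DIR default.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "session_start", arguments: {} },
};

// stdio transport: one JSON message per line.
const wire = JSON.stringify(request) + "\n";
```

In practice you never construct this by hand; the AI client's MCP runtime does the framing, and the tools simply appear as callable functions.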

You don't call the tools manually. The AI calls them. When Claude Code starts a session and sees the Praxis MCP tools available, it calls session_start instead of manually reading files. When it creates a work order, it calls create_work_order instead of constructing markdown. The tools are invoked automatically by the AI as part of its normal workflow.

The server is stateless — no in-memory state between calls. Every tool reads from the filesystem and writes to the filesystem. The filesystem IS the state. This matches the core Praxis philosophy: everything is files, everything is transparent, everything is auditable.

Setup

Install from npm:

```shell
npm install praxis-mcp
```

That's it. The server is ready to use.

Register for Claude Code (.mcp.json at your project root):

```json
{
  "mcpServers": {
    "praxis": {
      "command": "npx",
      "args": ["praxis-mcp"],
      "env": { "PRAXIS_PROJECT_DIR": "/path/to/your/project" }
    }
  }
}
```

Register for Codex CLI (~/.codex/config.toml):

```toml
[mcp_servers.praxis]
command = "npx"
args = ["praxis-mcp"]

[mcp_servers.praxis.env]
PRAXIS_PROJECT_DIR = "/path/to/your/project"
```

Build from source (contributors only):

```shell
git clone https://github.com/LuisFaxas/praxis.git
cd praxis/praxis-mcp && npm install && npm run build
```

Tools appear as mcp__praxis__session_start, mcp__praxis__create_work_order, mcp__praxis__lint, etc. The PRAXIS_PROJECT_DIR environment variable tells the server which project to operate on — tools default to this path so the AI doesn't have to pass it on every call.

Architecture

```
praxis-mcp/
├── src/
│   ├── index.ts              # CLI routing + McpServer + stdio transport
│   ├── cli-init.ts           # npx praxis-mcp init command
│   ├── tools/                # One file per category
│   │   ├── session.ts        # session_start, session_end, detect_project
│   │   ├── context.ts        # read_context, update_capsule, update_checkpoint
│   │   ├── work-orders.ts    # list, read, create, complete, create_patch
│   │   ├── lint.ts           # Spawns praxis-lint.sh
│   │   └── scaffold.ts       # TypeScript mkdir by tier/mode/lanes
│   └── lib/                  # Shared utilities
│       ├── constants.ts      # Tier maps, WO/patch templates, lane/naming regex
│       ├── fs-helpers.ts     # Safe file I/O, lane discovery, executed resolution
│       ├── detection.ts      # Tier, mode, and provider detection
│       ├── parsers.ts        # WO (with N/A + patch), capsule, checkpoint, SOT
│       └── naming.ts         # Auto-numbering, patch suffixes, filename formatting
├── templates/                # Bundled for CLI init
│   └── praxis-lint.sh        # Linter v1.3.1
└── build/                    # Compiled JS (gitignored)
```

Zero external dependencies beyond the MCP SDK and Zod for schema validation. TypeScript with strict mode. Compiles to ESM.

See praxis-mcp/README.md for the complete tool reference with input schemas and example responses.


The Foundation: Why Filesystem?

The MCP server is how Praxis scales. But the filesystem is how it survives.

Every methodology choice above — the context chain, the work orders, the four-stage lifecycle — is built on a deliberate foundation: the file system. Not a database. Not an API. Not a SaaS platform. Files and folders.

  • Zero dependencies — Works anywhere there's a file system. No installs, no accounts, no subscriptions.
  • Git-friendly — The dev/ folder can be tracked (or gitignored for private projects). Full version history for free.
  • AI-native — Every AI agent can read and write files. Not every AI agent can call APIs or query databases.
  • Human-readable — Open any file in any text editor. No special tools needed to understand the project state.
  • Portable — Copy the dev/ folder to a new machine, a new project, a new team. It just works.
  • Transparent — No hidden state. Everything is visible, auditable, and diffable.

This matters because it means Praxis works without the MCP server. Any AI that can read a file can follow the methodology. The init docs contain everything — the rules, the folder structure, the session protocol. An AI with no MCP support can still read CLAUDE_INIT.md, follow the instructions, and operate a fully governed Praxis workflow.

The MCP server doesn't replace this foundation. It accelerates it. Files are the state. Tools are the interface. You can run Praxis at any level:

| Level | What You Need | What You Get |
| --- | --- | --- |
| Files only | Any AI + init docs | Full methodology — context chain, work orders, audit trail |
| Files + Lint | Any Unix system | Automated validation — 50 checks, CI/CD integration |
| Files + MCP | MCP-compatible AI | Native tools — one-call session start, auto-numbered WOs, enforced quality gates |

Each layer adds automation. None of them add lock-in.


Origin Story

Praxis is the culmination of thousands of hours pushing the most capable agentic LLMs to their limits on real projects. Not toy demos. Not tutorial apps. Real infrastructure builds, real web applications, real multi-agent workflows where mistakes cost hours and context loss costs days.

But the methodology didn't come from AI alone. It came from an unexpected place: property management.

Years of managing construction projects, coordinating contractors, tracking work orders across multiple sites, and maintaining audit trails for compliance — that operational experience is baked into every part of Praxis. The work order pattern? That's how construction has tracked tasks for decades. The draft/published wall? That's how property managers handle lease documents — drafts are internal, published documents go to tenants. The context chain? That's the handoff note you leave for the next shift manager so nothing falls through the cracks.

The insight was simple: AI agents have the same coordination problems as human teams. They forget context between sessions. They don't know what other agents are working on. They lack a single source of truth. They can't verify whether a task was actually completed. These are solved problems in operations management — they just hadn't been applied to AI development yet.

Praxis bridges two worlds:

  • The organizational discipline of real-world project management — work orders, audit trails, handoff protocols, quality gates
  • The technical capabilities of modern AI agents — code generation, research, architecture analysis, multi-agent orchestration

The result is a methodology where humans and AI agents collaborate as equals, each compensating for the other's limitations. AI has unlimited patience and processing power but no persistent memory. Humans have institutional knowledge and decision authority but limited bandwidth. Praxis gives both sides a shared workspace where context persists, work is tracked, and nothing gets lost.

The evolution tells the story: v1.1 gave AI agents structured folders and markdown documents to follow. v1.2 added praxis-lint — 50 automated checks that enforce the rules. v1.3 added the MCP server — native tools that turn the methodology into something the AI doesn't just follow, but calls. v1.3.1 hardened the entire stack: lane-based subproject organization, patch work orders, N/A criteria recognition, a CLI installer, and a security model — all battle-tested on a real multi-agent project before being upstreamed.

The filesystem is the foundation. The linter is the guardrails. The MCP server is the interface. Together, they make Praxis the first AI development methodology that governs itself.

Every rule in this methodology exists because its absence caused a real problem on a real project. Nothing is theoretical. Everything is praxis.


License

MIT License. See LICENSE for details.

The MIT License means you can freely use, modify, and distribute Praxis — including in commercial projects. The only requirement is including the copyright notice. This is the same license used by React, Next.js, and most major open-source developer tools.


Created by Luis Faxas, 2026.

faxas.net/methodology — full methodology explanation, examples, and resources.

"The process by which theory becomes practice." — Aristotle, on πρᾶξις
