Praxis (πρᾶξις)
The practice of doing — a filesystem-based methodology for agentic development.
From the Greek πράσσω (prássō) — "to do, to act, to practice."
Zero dependencies. Just folders, markdown, and native AI tools.
In philosophy, praxis goes back to Aristotle, who gave the term its enduring sense: the process by which theory becomes practice. It is the bridge between knowing and doing: you have theory (theōría / θεωρία) on one side, and praxis on the other, where knowledge is enacted through deliberate action.
That is exactly what this methodology does. It bridges the gap between what AI agents know (their training, their context window, their capabilities) and what they do (writing code, researching, auditing, reporting) — through structured context, persistent memory, and traceable work.
The Problem
AI agents are powerful but forgetful. Every new session starts from zero. The context window is a blank slate — yesterday's decisions, last week's architecture choices, the reason you picked PostgreSQL over MySQL — all gone unless someone writes it down.
Most people solve this by writing longer prompts. They paste project context, repeat instructions, hope the AI remembers what matters. This works for small tasks. It collapses for anything real.
The problems with prompt-driven development:
- Ephemeral — prompts disappear when the session ends. No audit trail, no history.
- Unstructured — instructions are scattered across chat messages. Nothing is canonical.
- Untrackable — there's no concept of "done." Did the AI complete the task? Partially? Who checks?
- Single-agent — prompts assume one AI. When multiple agents collaborate, there's no routing, no ownership, no handoff protocol.
- Human memory-dependent — the developer must remember what happened last session and re-explain it, stacking fallible human recall on top of the AI's own limits.
Praxis solves all of this with a filesystem. No database. No SaaS platform. Just folders and markdown.
Work Orders: The Core Innovation
The most important concept in Praxis is the work order — and it comes from an unexpected place.
Origin: Construction & Manufacturing
In construction, a work order is a formal document that authorizes and describes a specific piece of work. It has a scope, acceptance criteria, an assigned worker, and a clear definition of "done." When the electrician finishes wiring the second floor, the work order moves from "pending" to "complete." There's a paper trail. There's accountability. There's no ambiguity about what was asked or what was delivered.
Software engineering adopted a similar concept with tickets and issues — Jira, GitHub Issues, Linear. But these tools assume a human developer who reads the ticket, carries context in their head across sessions, and reports back.
AI agents don't work like that. They start fresh every session. They can't check Jira. They don't remember yesterday.
Work Orders for AI Agents
Praxis brings the work order pattern into AI development:
# Work Order: Implement Authentication Middleware
- **WO#:** 3
- **Date Created:** 2026-02-20
- **Status:** Pending
- **Assigned To:** Claude
- **Priority:** High
## Description
Add JWT-based authentication middleware to all /api routes.
## Acceptance Criteria
- [ ] Middleware validates JWT tokens on every /api/* route
- [ ] Invalid tokens return 401 with consistent error format
- [ ] Token refresh endpoint exists at /api/auth/refresh
This file lives in dev/work-orders/. The AI reads it at session start. The AI works against the acceptance criteria. When done, the work order moves to executed/. There is no ambiguity.
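Because a work order is just markdown with a fixed shape, any agent or script can parse it. A minimal sketch in Python, assuming the header and checkbox formats shown above (`parse_work_order` is an illustrative helper, not part of Praxis):

```python
import re

def parse_work_order(text):
    # Header fields: "- **Key:** value" lines.
    fields = dict(re.findall(r"- \*\*(.+?):\*\* (.+)", text))
    # Criteria: "- [ ]" / "- [x]" checkbox lines.
    criteria = re.findall(r"- \[([ x])\] (.+)", text)
    complete = bool(criteria) and all(mark == "x" for mark, _ in criteria)
    return {"fields": fields, "criteria": criteria, "complete": complete}

wo = """# Work Order: Implement Authentication Middleware
- **WO#:** 3
- **Status:** Pending
## Acceptance Criteria
- [x] Middleware validates JWT tokens on every /api/* route
- [ ] Token refresh endpoint exists at /api/auth/refresh
"""
parsed = parse_work_order(wo)
```

One unchecked box means the WO stays in pending; nothing is ambiguous about "done."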
Why Work Orders Beat Prompts
| | Prompts | Work Orders |
|---|---|---|
| Persistence | Die with the session | Live as files — survive forever |
| Scope | Vague, conversational | Defined acceptance criteria |
| Tracking | "Did I ask for that?" | Pending → Executed pipeline |
| Routing | One agent, one prompt | Routable to specific agents |
| Audit trail | None | The file IS the trail |
| Decomposition | Mega-prompts that grow forever | Master plan → incremental WOs |
| Multi-session | Re-explain everything each time | AI reads the WO fresh — no drift |
The work order is to AI development what the shipping container was to global trade — a standardized unit that any agent can pick up, process, and deliver.
The Development Lifecycle
Praxis organizes all work into a four-stage pipeline:
graph LR
R["Research<br/><i>(gather)</i>"] --> P["Planning<br/><i>(decide)</i>"]
P --> E["Execution<br/><i>(build)</i>"]
E --> Re["Reports<br/><i>(communicate)</i>"]
style R fill:#667eea,stroke:#667eea,color:#fff
style P fill:#764ba2,stroke:#764ba2,color:#fff
style E fill:#9b59b6,stroke:#9b59b6,color:#fff
style Re fill:#f093fb,stroke:#f093fb,color:#000
| Stage | Folder | What Happens Here |
|---|---|---|
| Research | dev/research/ | Gather information before making decisions. Compare options, benchmark alternatives, read documentation. |
| Planning | dev/planning/ | Make decisions. Write master plans (draft → approved). Architectural choices live here. |
| Execution | dev/work-orders/, dev/commands/ | Build. Work orders track tasks. Command docs deliver operator scripts. |
| Reports | dev/reports/ | Communicate results to stakeholders. Draft → published pipeline. |
Every folder in dev/ maps to one of these stages. When you open a Praxis project, you immediately know where everything is and why it's there.
Cross-Cutting Concerns
| Folder | Purpose |
|---|---|
| `dev/audit/` | Quality trail — architecture audits, conformance checks, drift reports |
| `dev/design/` | Design assets — tokens, brand guidelines, visual audit captures |
| `dev/archive/` | Historical records — retired documents with manifests |
Research: Gather Before Deciding
Research is Stage 1 — it flows upstream into planning. Everything in dev/research/ exists to inform a decision that hasn't been made yet.
dev/research/
├── active/ # Research for current, open decisions
└── archive/ # Decision made — kept for reference
The flow: When you need to choose between PostgreSQL and MySQL, or evaluate three hosting providers, or compare authentication libraries — that investigation lives in active/. Once the decision is made and recorded in source_of_truth.md, the research moves to archive/. It's never deleted — it's the receipts for why you chose what you chose.
Common research types: pricing comparisons, dependency audits, technology evaluations, architecture analysis, security advisory reviews, competitive benchmarks.
Research is not reporting. This distinction matters. Research gathers information before a decision (upstream). Reports communicate results after work is done (downstream). A technology comparison that helps you pick a database? Research. A progress update for a stakeholder? Report. They live in different folders because they serve different stages of the pipeline.
Planning: Decide Before Building
Planning is Stage 2 — where research findings become decisions and decisions become actionable plans.
dev/planning/
└── master-plan/
├── draft/ # Working plans (AI writes here)
└── approved/ # Finalized plans (admin promotes)
The flow: The AI writes master plans to draft/. The admin reviews and promotes to approved/. The AI never writes directly to approved/ — this gate ensures a human reviews every strategic decision before work begins.
Master plan → Work order decomposition: The master plan captures the full project roadmap, organized into batches:
| Batch | Scope | When to Create WOs |
|---|---|---|
| 0: Critical | Security vulnerabilities, broken builds, data loss risks | Immediately during init |
| 1: Foundation | Scaffolding, structural improvements, tooling setup | After Batch 0 is complete |
| 2: Core | Feature work, architecture implementation | After Batch 1 is complete |
| 3: Quality | Testing, documentation, polish | After Batch 2 is complete |
Work orders are decomposed from the master plan incrementally — not all at once. This prevents scope overload and keeps the active queue focused.
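That gating rule can be sketched as a tiny scheduler check (a hypothetical helper; WO statuses are simplified to strings):

```python
def next_batch_to_decompose(batches):
    # Walk batches in order; the first one with unfinished (or not-yet-created)
    # WOs is the only batch whose work orders should exist right now.
    for number in sorted(batches):
        statuses = batches[number]
        if not statuses or any(s != "executed" for s in statuses):
            return number
    return None

# Batch 0 is done, batch 1 still has a pending WO, batch 2 is untouched.
current = next_batch_to_decompose({
    0: ["executed"],
    1: ["executed", "pending"],
    2: [],
})
```

Until batch 1 finishes, batch 2 contributes zero files to the active queue.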
Execution: Build with Traceability
Execution is Stage 3 — where plans become reality. This stage has two artifact types:
Work orders are the primary execution unit. They're covered in detail in the Work Orders section above — scoped tasks with acceptance criteria, assigned agents, and a pending → executed lifecycle.
Commands handle a specific execution problem: when the AI needs multi-step shell commands run on a server or workstation, it can't just paste them in chat. Instead:
dev/commands/
├── active/
│ └── 3_2026-02-20_SSL_SETUP/ # Topic subfolder with step-by-step commands
│ ├── 01_GENERATE_CERTS.md
│ └── 02_CONFIGURE_NGINX.md
└── executed/ # Completed command sets
The AI writes commands to active/{topic}/ and references the doc path and step number: "Run Step 1 in dev/commands/active/3_2026-02-20_SSL_SETUP/01_GENERATE_CERTS.md." The admin reviews and executes. Completed sets move to executed/.
Why files instead of chat? Three reasons: (1) prevents copy-paste errors on complex multi-line commands, (2) creates an audit trail of every command run on the system, (3) allows the admin to review commands before execution — especially important for destructive operations.
Reports: Communicate Results
Reports are Stage 4 — the final stage of the pipeline. Everything upstream (research, planning, execution) has produced results. Reports communicate those results to stakeholders.
dev/reports/
├── draft/
│ ├── html/ # Visual reports (interactive, styled)
│ └── written/ # Written analysis (markdown)
└── published/
├── html/ # Final HTML (admin promotes here)
└── written/ # Final written (admin promotes here)
The draft/published wall: AI writes to draft/ only. The admin reviews, redacts any sensitive information (internal IPs, credentials, PII), and promotes to published/. The AI never reads from or writes to published/. This wall exists because published reports go to external stakeholders — they must be reviewed by a human before leaving the project.
Two formats: HTML reports are visual and interactive — benchmark dashboards, progress emails, styled presentations. Written reports are markdown — technical analysis, architecture reviews, decision documents. Both follow the same draft → published flow.
Audit: Track Quality
Audit is a cross-cutting concern — it doesn't belong to a single pipeline stage. Audits can happen during planning (discovery), execution (completion), or maintenance (drift detection).
dev/audit/
├── current/ # Active audit entries
└── legacy/ # Archived by admin
Audit types:
| Type | When | What It Checks |
|---|---|---|
| Discovery audit | First encounter with a codebase | Tech stack, architecture, risks, dependencies, test coverage |
| Completion audit | After a WO is marked done | Acceptance criteria met, code quality, no regressions |
| Drift report | Periodic or on-demand | Source of truth claims vs. actual codebase state |
| Conformance check | Session start or CI | Folder structure, naming conventions, file freshness |
The linter (praxis-lint.sh) automates conformance checks. Discovery and completion audits are performed by the Manager agent in Triangle mode, or by any agent in Solo mode. Drift reports are typically the Researcher's responsibility — comparing what the documentation claims against what the code actually does.
The Context Chain
Praxis solves AI amnesia with three living documents that persist across every session:
graph LR
SOT["source_of_truth.md<br/><i>canonical rules</i>"] --> CC["context_capsule.md<br/><i>session handoff</i>"]
CC --> CP["checkpoint.md<br/><i>milestones</i>"]
CP --> WO["Latest Work Order<br/><i>current task</i>"]
style SOT fill:#667eea,stroke:#667eea,color:#fff
style CC fill:#764ba2,stroke:#764ba2,color:#fff
style CP fill:#9b59b6,stroke:#9b59b6,color:#fff
style WO fill:#f093fb,stroke:#f093fb,color:#000
| Document | What It Contains | Updated When |
|---|---|---|
| source_of_truth.md | Project rules, decisions log, tech stack, folder structure. The canonical record. If anything conflicts, this file wins. | When decisions are made |
| context_capsule.md | Last session's summary: what was done, what's next, active task status. This is the "handoff note" between sessions. | Every session end |
| checkpoint.md | Completed milestones with dates. The progress record. | When work is completed |
The read order at every session start:
1. `source_of_truth.md` — What are the rules?
2. `context_capsule.md` — What happened last time?
3. `checkpoint.md` — What's been accomplished?
4. Latest work order — What should I work on now?
The write order at every session end:
1. Update `source_of_truth.md` — Any new decisions?
2. Update `context_capsule.md` — What did I do? What's next?
3. Update `checkpoint.md` — Any milestones completed?
This is the heartbeat of Praxis. It turns stateless AI sessions into a continuous, traceable development process.
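The session-start read order can be sketched as a small helper. The file names come from the methodology; the function itself is a hypothetical illustration, not the MCP tool:

```python
import tempfile
from pathlib import Path

READ_ORDER = ["source_of_truth.md", "context_capsule.md", "checkpoint.md"]

def session_start(dev):
    # Read the context chain in canonical order; missing docs come back as None.
    context = {name: (dev / name).read_text() if (dev / name).exists() else None
               for name in READ_ORDER}
    # Latest pending WO = highest leading number still outside executed/.
    pending = list((dev / "work-orders").glob("*.md"))
    latest = max(pending, key=lambda p: int(p.name.split("_", 1)[0]), default=None)
    context["latest_wo"] = latest.name if latest else None
    return context

# Demo against a throwaway scaffold.
dev = Path(tempfile.mkdtemp()) / "dev"
(dev / "work-orders").mkdir(parents=True)
(dev / "source_of_truth.md").write_text("# Rules")
(dev / "work-orders" / "1_2026-02-20_AUTH_MIDDLEWARE.md").write_text("wo")
state = session_start(dev)
```

A missing capsule simply reads as `None`, which is itself a useful signal: this project has no handoff yet.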
The Triangle Pattern
Praxis supports two operational modes:
Solo Mode (Default)
One AI agent operates independently. Work orders are a flat queue:
work-orders/
├── 1_2026-02-20_AUTH_MIDDLEWARE.md (pending)
├── 2_2026-02-20_API_VALIDATION.md (pending)
└── _executed/
└── 0_2026-02-19_PROJECT_SETUP.md (done)
Triangle Mode (Multi-Agent)
Three specialized AI agents collaborate, each with a distinct role:
graph TD
M["<b>Manager Agent</b><br/><i>audits, plans, reviews, creates WOs</i>"]
I["<b>Implementer Agent</b><br/><i>implements code, deploys, tests</i>"]
R["<b>Research Agent</b><br/><i>deep research, SOT verification</i>"]
M -->|"work orders"| I
M -->|"research WOs"| R
I -->|"plans & results"| M
R -->|"findings & reports"| M
style M fill:#667eea,stroke:#667eea,color:#fff
style I fill:#764ba2,stroke:#764ba2,color:#fff
style R fill:#f093fb,stroke:#f093fb,color:#000
| Role | Responsibility | Reads From | Writes To |
|---|---|---|---|
| Manager | Audits, plans, reviews, creates WOs | Full project | work-orders/wo_{agent}/, audit/ |
| Implementer | Implements code, deploys, tests | Its assigned WOs | Source code, commands/, completed WOs |
| Researcher | Deep research, SOT verification, codebase indexing | Its assigned WOs | research/active/, audit/ (drift reports) |
Example assignment: Codex CLI as Manager, Claude Code as Implementer, Gemini CLI as Researcher. But any AI that can read and write files can fill any role.
Important: Triangle is a role topology, not a provider lock-in.
- You can run Triangle with three different providers (for example Codex + Claude + Gemini).
- You can run Triangle with the same provider in three parallel sessions (for example Claude session A/B/C, each with a different role).
- You can run Triangle with hybrid private/local nodes (for example OpenCode or other self-hosted agents) as long as each agent follows the same filesystem contract.
Work orders are routed to agent-specific folders:
work-orders/
├── wo_implementer/
│ ├── 3_2026-02-20_AUTH_MIDDLEWARE.md
│ └── executed/
├── wo_manager/
│ └── executed/
└── wo_researcher/
├── 1_2026-02-20_JWT_LIBRARY_RESEARCH.md
└── executed/
The Reflection Pattern — the core loop in Triangle mode (click to expand)
graph TD
A["Manager creates WO"] --> B["Implementer writes plan"]
B --> C["Manager reviews plan"]
C -->|"Approved"| D["Implementer builds"]
C -->|"Changes requested"| B
D --> E["Manager audits result"]
E -->|"Pass"| F["WO moves to executed/"]
E -->|"Fail"| B
style A fill:#667eea,stroke:#667eea,color:#fff
style B fill:#764ba2,stroke:#764ba2,color:#fff
style C fill:#667eea,stroke:#667eea,color:#fff
style D fill:#764ba2,stroke:#764ba2,color:#fff
style E fill:#667eea,stroke:#667eea,color:#fff
style F fill:#2ecc71,stroke:#2ecc71,color:#fff
Why this works: The Manager sees the full picture (discovery audit + all WOs + all plans). The Implementer sees only its current WO. This separation prevents scope creep and ensures every implementation aligns with the overall project plan.
Detection: Triangle mode activates when multiple provider init files exist in dev/init/ (e.g., CODEX_INIT.md, GEMINI_INIT.md alongside CLAUDE_INIT.md). Otherwise, Solo mode is the default.
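Detection is simple enough to sketch in a few lines, assuming the init file names listed in the dev/ folder section (a hypothetical helper, not the methodology's actual detector):

```python
import tempfile
from pathlib import Path

def detect_mode(init_dir):
    # Provider init files are anything matching *_INIT.md except the
    # provider-agnostic PRAXIS_INIT.md.
    providers = {p.name for p in init_dir.glob("*_INIT.md")} - {"PRAXIS_INIT.md"}
    return "triangle" if len(providers) > 1 else "solo"

# Demo: two provider init files alongside the agnostic one -> Triangle.
init_dir = Path(tempfile.mkdtemp())
for name in ("PRAXIS_INIT.md", "CLAUDE_INIT.md", "CODEX_INIT.md"):
    (init_dir / name).write_text("")
mode = detect_mode(init_dir)
```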
Beyond Triangle: Extensible Topologies
Triangle is the recommended starting pattern because it is simple and predictable. Praxis itself is not limited to three agents.
If your project needs more parallelism, you can scale to N-agent graphs (mixed providers, same-provider parallel sessions, and private/self-hosted nodes), while keeping the same core contract:
- Role ownership stays explicit.
- Work order routing stays deterministic.
- Validation and phase gates stay enforced.
Praxis governs coordination and context continuity across agents. It does not constrain which provider or model you use.
The dev/ Folder
Full folder structure (click to expand)
dev/
├── source_of_truth.md # Canonical rules and decisions
├── context_capsule.md # Session handoff
├── checkpoint.md # Progress milestones
│
├── init/ # Methodology reference docs
│ ├── PRAXIS_INIT.md # Provider-agnostic init
│ ├── CLAUDE_INIT.md # Claude Code init
│ ├── CODEX_INIT.md # Codex manager init (Triangle)
│ └── GEMINI_INIT.md # Gemini researcher init (Triangle)
│
├── research/ # Stage 1: GATHER
│ ├── active/ # Research for current decisions
│ └── archive/ # Decisions made, kept for reference
│
├── planning/ # Stage 2: DECIDE
│ └── master-plan/
│ ├── draft/ # Working plans (AI writes here)
│ └── approved/ # Finalized plans (admin promotes)
│
├── work-orders/ # Stage 3: EXECUTE
│ └── executed/ # Completed work orders
│
├── commands/ # Operator command delivery
│ ├── active/ # Command sets in topic subfolders
│ └── executed/ # Completed command sets
│
├── audit/ # Quality + conformance trail
│ ├── current/ # Active audit entries
│ └── legacy/ # Archived entries
│
├── reports/ # Stage 4: COMMUNICATE
│ ├── draft/
│ │ ├── html/ # Draft HTML reports
│ │ └── written/ # Draft written reports
│ └── published/
│ ├── html/ # Final HTML (admin promotes)
│ └── written/ # Final written (admin promotes)
│
├── design/ # Design assets
│ ├── audit/screenshots/ # Visual captures
│ ├── language/ # Design tokens + methodology docs
│ └── resources/ # Icons, fonts, logos
│
├── private/ # Sensitive docs (GITIGNORED)
│
└── archive/ # Historical records
└── {date}_{description}/ # Dated batches with manifests
Provider Integration
Praxis is provider-agnostic. It works with any AI assistant that can read and write files.
Providers and roles are decoupled:
- Roles are operational (`manager`, `implementer`, `researcher`, or custom role sets).
- Providers are implementation choices (Claude, Codex, Gemini, OpenCode, private/local LLMs, etc.).
- The same provider can fill multiple roles via separate sessions if role boundaries are preserved.
The methodology does NOT control how provider config files are created. Each provider creates their config per their own conventions:
| Provider | Config File | Init File |
|---|---|---|
| Claude Code | CLAUDE.md | dev/init/CLAUDE_INIT.md |
| Codex CLI | AGENTS.md | dev/init/CODEX_INIT.md |
| Gemini CLI | GEMINI.md | dev/init/GEMINI_INIT.md |
| Any other | Whatever the provider uses | dev/init/PRAXIS_INIT.md |
Two-step init flow (important):
1. Native init first — Let the AI create its own config file in a dedicated session (e.g., Claude creates `CLAUDE.md`, Codex creates `AGENTS.md`). The AI gives its native setup full attention.
2. Praxis init second — Run the Praxis init (paste or reference `dev/init/*_INIT.md`). Praxis injects a small context handoff block into the provider's existing config — augmenting it, never replacing it. If the provider config doesn't exist, Praxis will stop and ask you to run step 1 first.
This ensures the AI knows where to find the context chain on every new session, without Praxis overriding the provider's native conventions.
Quick Start
Option A: CLI Init (Recommended)
npx praxis-mcp init # starter tier, solo mode
npx praxis-mcp init --tier full --mode triangle # full tier, multi-agent
npx praxis-mcp init --tier standard --path ./my-project # custom path
This creates the dev/ folder structure, context documents, .praxis/praxis-lint.sh, and (in triangle mode) agent folders with _executed/ directories. One command, fully scaffolded.
Option B: Manual Setup
Starter (context chain + work orders only):
mkdir -p dev/work-orders/_executed
Then create dev/source_of_truth.md, dev/context_capsule.md, and dev/checkpoint.md.
Full (complete governance layer):
mkdir -p dev/{init,research/{active,archive},planning/master-plan/{draft,approved},work-orders/_executed,commands/{active,executed},audit/{current,legacy},reports/{draft/{html,written},published/{html,written}},design/{audit/screenshots,language,resources},archive,private}
Configure your provider
Copy the relevant init file from dev/init/ into your project. For Claude Code:
cp dev/init/CLAUDE_INIT.md your-project/dev/init/
Paste the contents of your provider's init file into a new session. The AI will:
- Read your codebase
- Populate the context documents
- Inject the context handoff into your provider config
- Perform an architecture audit (if existing code)
- Create Batch 0 work orders (critical issues only)
You're now running Praxis.
Operating Rules
- Non-destructive — The AI never SSHes into production. Local copies only.
- Self-contained — Every project gets its own `dev/` folder. Deployable as-is.
- No workspace root files — All output goes into project folders or the `dev/` structure.
- Draft/published wall — The AI writes to `draft/`. The admin promotes to `published/`.
- Executed means done — Items stay pending until fully complete. No premature moves.
- Naming convention — `{number}_{YYYY-MM-DD}_{DESCRIPTION}.{ext}`. Number 0 = READMEs.
- Commands in files, not chat — The AI never pastes multiline commands in conversation. Write to `commands/active/` and reference the path.
- Context updated every session — Source of truth (decisions), capsule (summary), checkpoint (milestones).
- No secrets in dev/ — Never store API keys, passwords, tokens, or credentials in the `dev/` folder. Use `.env` files (gitignored) for secrets. Redact sensitive data in reports before promotion.
WO Lane System
Lanes organize work orders into subproject scopes within an agent folder. They're optional — projects without lanes work identically to v1.2.
Lane Naming
`{nn}_{type}_{scope}`

- nn — Two-digit prefix for ordering (10, 20, 30...)
- type — One of: `delivery`, `program`, `lab`, `ops`
- scope — snake_case description (e.g., `academy`, `site_core`)
Example: 10_delivery_academy, 70_program_methodology_rewrite, 80_lab_experimental_design
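A sketch of the naming rule as a validator (the regex is an interpretation of the convention above, not the linter's actual pattern):

```python
import re

LANE = re.compile(r"^(\d{2})_(delivery|program|lab|ops)_([a-z][a-z0-9_]*)$")

def parse_lane(name):
    # Returns (order, type, scope) for a valid lane name, else None.
    m = LANE.match(name)
    return (int(m.group(1)), m.group(2), m.group(3)) if m else None
```

Anything that fails to parse (missing prefix, unknown type) is simply not a lane, which keeps downstream routing deterministic.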
Lane Types
| Type | Purpose | Validation |
|---|---|---|
| `delivery` | Shippable product work | Full: Acceptance Criteria + Status required |
| `program` | Planning and methodology | Relaxed: criteria and status optional |
| `lab` | Experimental and research | Relaxed: criteria and status optional |
| `ops` | Operational and infrastructure | Full: Acceptance Criteria + Status required |
Centralized Completion
When a WO in a lane is completed, it moves to a centralized _executed/ directory:
wo_claude/
├── 10_delivery_academy/ # Active WOs
├── 20_delivery_site_core/ # Active WOs
└── _executed/
├── 10_delivery_academy/ # Completed WOs from this lane
└── 20_delivery_site_core/ # Completed WOs from this lane
This keeps the active queue clean while preserving a lane-organized audit trail.
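The destination is fully determined by agent folder, lane, and filename. A sketch (the filenames are invented for illustration):

```python
from pathlib import PurePosixPath

def executed_path(agent_dir, filename, lane=None):
    # Lane WOs mirror their lane under the centralized _executed/;
    # top-level WOs go into _executed/ directly.
    parts = [agent_dir, "_executed"] + ([lane] if lane else []) + [filename]
    return str(PurePosixPath(*parts))

dest = executed_path("wo_claude", "3_2026-02-21_COURSE_PAGE.md",
                     lane="10_delivery_academy")
```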
Patch Work Orders
Patch WOs extend a completed parent WO to address follow-up issues. They use the _P{NN} suffix convention:
5_2026-02-22_ORIGINAL_TASK.md # Parent (in _executed/)
5_2026-02-22_FIX_HEADER_BUG_P01.md # Patch 1
5_2026-02-23_ADD_MOBILE_SUPPORT_P02.md # Patch 2
Required Metadata
Every patch WO includes parent tracking fields:
- **Parent WO:** 5
- **Patch:** P01
- **Sequence Key:** 5.01
The sequence key ({parent}.{patch}) enables chronological ordering across parent + patches.
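The sequence key makes ordering a one-liner. A sketch that recovers it from filenames like those above (the regex is an assumption about the exact filename shape):

```python
import re

def sequence_key(filename):
    # ({parent}, {patch}) sort key; patch 0 is the parent WO itself.
    m = re.match(r"(\d+)_\d{4}-\d{2}-\d{2}_.+?(?:_P(\d{2}))?\.md$", filename)
    parent = int(m.group(1))
    patch = int(m.group(2)) if m.group(2) else 0
    return (parent, patch)

files = [
    "5_2026-02-23_ADD_MOBILE_SUPPORT_P02.md",
    "5_2026-02-22_ORIGINAL_TASK.md",
    "5_2026-02-22_FIX_HEADER_BUG_P01.md",
]
ordered = sorted(files, key=sequence_key)
```

Sorting by the tuple puts the parent first, then patches in numeric order, regardless of creation dates.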
N/A Criteria
When an acceptance criterion becomes inapplicable after the WO was scoped, mark it as N/A:
- [ ] ~~Criterion text~~ N/A — reason the criterion doesn't apply
The checkbox stays [ ], the text is wrapped in strikethrough (~~), and a reason follows the em dash.
Guardrails
| Rule | Scope | Severity |
|---|---|---|
| Reason required | All WOs | N/A without a reason doesn't match the pattern and counts as unchecked |
| Max 3 per WO | Executed WOs | >3 N/A = FAIL (WO is poorly scoped) |
| Prefer rewrite | Active WOs | N/A in active WO = WARN (rewrite the criterion instead) |
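These guardrails are mechanical enough to check with a couple of regexes. A sketch, assuming the exact strikethrough format shown above (the helper is illustrative, not the linter's code):

```python
import re

CRITERION = re.compile(r"^- \[( |x)\] ")
NA = re.compile(r"^- \[ \] ~~.+~~ N/A — \S")

def audit_criteria(lines):
    # An N/A needs strikethrough text AND a non-empty reason after the em
    # dash; any other unchecked box counts against completeness.
    na = sum(1 for line in lines if NA.match(line))
    unchecked = sum(1 for line in lines
                    if CRITERION.match(line) and line.startswith("- [ ]")
                    and not NA.match(line))
    return {"na": na, "unchecked": unchecked,
            "executed_ok": unchecked == 0 and na <= 3}

report = audit_criteria([
    "- [x] Middleware validates JWT tokens",
    "- [ ] ~~Support legacy tokens~~ N/A — legacy auth removed in WO 4",
    "- [ ] Token refresh endpoint exists",
])
```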
Security & Sensitive Data
Praxis is designed to live in Git repositories. These rules prevent accidental exposure:
- Never commit secrets. API keys, passwords, tokens, and credentials belong in `.env` files, not in `dev/` documents.
- Redact before publishing. Reports in `draft/` may reference internal IPs, usernames, or infrastructure details. Redact before promoting to `published/`.
- The `.gitignore` matters. Praxis ships with a `.gitignore` that excludes common secret patterns. Extend it for your project.
- Sensitive artifacts go in `dev/private/`. Use it for contracts, credential references, internal notes with PII, or any document that should exist in the project context but never in version control. Add `dev/private/` to your project's `.gitignore`. Reference private docs from the source of truth by path (e.g., "credentials in `dev/private/server_creds.md`").
- Command documents deserve extra scrutiny. Command docs in `commands/active/` may contain connection strings, server addresses, or credentials. Review before committing to git.
For the MCP server security model (path safety, concurrency, known risks), see SECURITY.md.
Adoption Tiers
You don't have to use everything on day one. Start small and add structure as complexity grows.
Starter — Context Chain + Work Orders
The minimum viable Praxis. Just 3 files and 1 folder:
dev/
├── source_of_truth.md
├── context_capsule.md
├── checkpoint.md
└── work-orders/
└── executed/
Best for: Solo developers, small projects, quick experiments. You get session continuity and task tracking with near-zero overhead.
Standard — Add Research & Planning Pipeline
The full development lifecycle without the audit/report infrastructure:
dev/
├── source_of_truth.md, context_capsule.md, checkpoint.md
├── research/{active, archive}/
├── planning/master-plan/{draft, approved}/
├── work-orders/executed/
└── commands/{active, executed}/
Best for: Medium projects, multi-session work, projects that need planning before building.
Full — Complete Governance Layer
Everything. Audit trail, report pipeline, design assets, archive:
dev/
├── (all Standard folders)
├── audit/{current, legacy}/
├── reports/draft/{html, written}/, published/{html, written}/
├── design/{audit/screenshots, language, resources}/
└── archive/
Best for: Multi-agent workflows, enterprise projects, long-running builds, projects with stakeholder reporting.
File Naming Convention
All files follow: {number}_{YYYY-MM-DD}_{DESCRIPTION}.{ext}
- Number — Sequential, chronological (0, 1, 2, ...)
- Date — Creation date in ISO format
- Description — UPPERCASE, underscore-separated
- Number 0 is reserved for READMEs and examples
1_2026-02-20_AUTH_MIDDLEWARE.md
2_2026-02-20_API_VALIDATION.md
0_2026-02-20_README.md
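The convention is regular enough to validate and auto-number. A sketch (the pattern is an interpretation of the rule above, not the linter's actual regex):

```python
import re

NAME = re.compile(r"^(\d+)_\d{4}-\d{2}-\d{2}_[A-Z0-9_]+\.[A-Za-z0-9]+$")

def next_number(existing):
    # Highest existing number + 1; number 0 (READMEs) never advances the counter.
    numbers = [int(m.group(1)) for f in existing if (m := NAME.match(f))]
    return max((n for n in numbers if n > 0), default=0) + 1

queue = ["0_2026-02-20_README.md",
         "1_2026-02-20_AUTH_MIDDLEWARE.md",
         "2_2026-02-20_API_VALIDATION.md"]
```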
Validation (praxis-lint)
Praxis includes an automated validation tool that checks whether your dev/ folder conforms to the methodology. It transforms Praxis from convention-based (rules you follow voluntarily) to convention-enforced (rules that are verified automatically).
Quick Start
bash .praxis/praxis-lint.sh # Lint current project
bash .praxis/praxis-lint.sh --fix # Auto-create missing directories
bash .praxis/praxis-lint.sh --json # JSON output for hooks/CI
bash .praxis/praxis-lint.sh --strict # Warnings become failures
bash .praxis/praxis-lint.sh --help # Full usage information
What It Checks (7 Categories, 50 Checks)
| Category | What | Key Checks |
|---|---|---|
| Structure | Required folders exist for your tier | dev/, core docs, work-orders/, research/, etc. |
| Context Freshness | Handoff docs aren't stale | capsule < 7 days, checkpoint < 30 days |
| Work Orders | Executed WOs are truly complete | No unchecked - [ ] boxes in executed/ |
| Naming | Files follow the convention | {number}_{YYYY-MM-DD}_{DESC}.ext |
| Security | No secrets in tracked files | Private keys, AWS keys, connection strings |
| SOT Consistency | Source of Truth matches reality | Referenced folders exist, decisions logged |
| Orphans | No files in wrong locations | No loose files at dev/ root |
Exit Codes
| Code | Meaning | CI/CD Effect |
|---|---|---|
| `0` | All pass (or INFO-only) | Pipeline passes |
| `1` | Warnings found (drifting) | Pipeline passes (or fails with `--strict`) |
| `2` | Failures found (broken) | Pipeline fails |
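In a CI wrapper, that contract reduces to a tiny predicate (a sketch of the exit-code semantics, not part of praxis-lint):

```python
def pipeline_passes(exit_code, strict=False):
    # 0 = pass; 1 = warnings (pass unless strict); 2 = failures (always fail).
    if exit_code == 0:
        return True
    if exit_code == 1:
        return not strict
    return False
```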
Integration
praxis-lint integrates with both CI/CD pipelines and AI coding assistants:
| Integration | How | Setup |
|---|---|---|
| Claude Code | SessionStart hook — runs automatically, feeds findings to AI | See .praxis/examples/settings-hook.json |
| GitHub Actions | CI workflow — blocks PRs with failures | See .praxis/examples/github-action.yml |
| Pre-commit hook | Git hook — validates before every commit | Copy hook script to .git/hooks/pre-commit |
| Any AI agent | Init file instruction — AI runs linter as first action | Referenced in dev/init/*_INIT.md |
| Manual | Run from terminal anytime | bash .praxis/praxis-lint.sh |
Works with gitignored dev/: The linter reads the local filesystem, not git. If dev/ is gitignored, local modes (manual, hooks, AI) all work. CI gracefully skips.
Zero dependencies. One file. Works on any Unix system.
MCP Server: Methodology as a Service
Everything above this line — the context chain, work orders, validation, the entire dev/ folder structure — works through files. The AI reads init docs, follows instructions, and manually reads and writes markdown. It works. It's been battle-tested across thousands of sessions.
But there's a better way.
The MCP server turns Praxis from rules you follow into tools you use. Instead of the AI parsing instructions from CLAUDE_INIT.md and manually opening files in the right order, it calls session_start and gets the entire project state in a single structured response. Instead of manually constructing work order markdown and remembering the naming convention, it calls create_work_order and the file appears — correctly numbered, correctly dated, correctly formatted, in the correct folder.
This is what the Model Context Protocol was designed for: giving AI agents structured access to external systems. The Praxis MCP server wraps the entire methodology into 13 native tools that any MCP-compatible AI can call automatically.
Before & After
| Without MCP (Manual) | With MCP (Native Tools) |
|---|---|
| AI reads init doc instructions | AI calls session_start |
| AI opens `source_of_truth.md`, then `context_capsule.md`, then `checkpoint.md`, then scans for WOs | One call returns all context + pending WOs + health assessment |
| AI constructs WO markdown by hand, guesses the next number | create_work_order auto-numbers, auto-dates, enforces the template |
| AI writes capsule sections hoping the format is correct | update_capsule replaces sections by header — preserves everything else |
| Admin asks "is the dev/ folder structured correctly?" | lint runs 50 checks and returns structured JSON |
| New project needs the folder structure | scaffold creates it by tier and mode in one call |
| AI forgets to update docs at session end | session_end checks file modification times and flags non-compliance |
The AI doesn't need to be told how to follow Praxis. It calls the tools, and the tools enforce the methodology.
The Tool Inventory (13 Tools, 5 Categories)
Session Lifecycle — start, end, and detect (click to expand)
| Tool | What It Does |
|---|---|
| session_start | Reads the full context chain (SOT → capsule → checkpoint), lists all pending work orders, detects tier/mode/providers, and returns a structured health assessment — all in one call. This replaces the manual "read these files in order" instruction from the init docs. |
| session_end | Checks whether context documents were updated during the session by comparing file modification times. Returns a compliance report with warnings for any documents that weren't touched. Optionally runs the linter as a final validation. |
| detect_project | Pure detection — determines tier (starter/standard/full), mode (solo/triangle), active providers, and structural completeness. No side effects. Useful for tools and scripts that need to adapt behavior to the project type. |
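The mtime comparison behind session_end can be sketched in a few lines. This is an illustrative sketch, not the server's actual implementation: the document names (from the context chain) are real, but the `dev/` paths and the return shape are assumptions.

```typescript
import { statSync, existsSync } from "node:fs";
import { join } from "node:path";

// Flag context documents that were not modified after the session began.
// sessionStart is when session_start was called; the three documents are
// the Praxis context chain. Paths under dev/ are an assumption here.
function checkCompliance(projectDir: string, sessionStart: Date): string[] {
  const docs = ["source_of_truth.md", "context_capsule.md", "checkpoint.md"];
  const stale: string[] = [];
  for (const doc of docs) {
    const path = join(projectDir, "dev", doc);
    if (!existsSync(path) || statSync(path).mtime < sessionStart) {
      stale.push(doc); // missing or untouched this session -> warning
    }
  }
  return stale;
}
```

Anything returned in `stale` would surface in the compliance report as a document the AI forgot to update.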
Context Chain — read, update capsule, update checkpoint
| Tool | What It Does |
|---|---|
| read_context | Reads one or all context documents with rich metadata: file size, age in days, and parsed structural sections (decisions count, milestone list, active task). The AI gets both raw content and structured data. |
| update_capsule | Section-aware update for context_capsule.md. Provide new content for specific sections (Active Task, In-Progress Notes, Last Session Summary) and the tool replaces only those sections — preserving everything else. No more accidental overwrites. |
| update_checkpoint | Appends a new milestone to checkpoint.md. Auto-numbers the next row, enforces the table format, and optionally updates the Current Phase. The AI never has to parse the milestone table by hand. |
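Section-aware replacement of the kind update_capsule performs can be sketched as a scan over markdown headers. This is a simplified sketch; the server's real parsing lives in lib/parsers.ts and may differ.

```typescript
// Replace the body of one "## Section" in a markdown document,
// leaving every other section untouched.
function replaceSection(doc: string, header: string, body: string): string {
  const lines = doc.split("\n");
  const start = lines.findIndex((l) => l.trim() === `## ${header}`);
  if (start === -1) return doc; // section absent: leave the capsule as-is
  let end = lines.findIndex((l, i) => i > start && l.startsWith("## "));
  if (end === -1) end = lines.length; // last section runs to end of file
  return [...lines.slice(0, start + 1), body, ...lines.slice(end)].join("\n");
}
```

Because only the span between the matched header and the next `## ` header is rewritten, an update to Active Task can never clobber In-Progress Notes — the property the table above calls "no more accidental overwrites."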
Work Orders — list, read, create, complete, patch
| Tool | What It Does |
|---|---|
| list_work_orders | Lists all work orders with parsed metadata (number, title, status, priority, assigned agent, lane). Handles Solo, Triangle, and lane-based folder structures. Supports filtering by status, agent, and lane. |
| read_work_order | Reads a specific work order by number or filename. Returns parsed header fields, criteria completion state, N/A criteria count, and patch metadata. Searches across lanes and executed directories. |
| create_work_order | Creates a new work order with full naming convention enforcement. Auto-numbers, auto-dates, renders the standard WO template, and routes to the correct folder — including lane subfolders. |
| complete_work_order | Validates that all acceptance criteria are checked or marked N/A, then updates status to "Complete" and moves to the correct _executed/ path (centralized for lanes, flat for top-level). N/A criteria are treated as resolved. |
| create_patch_work_order | Creates a patch WO extending an existing parent. Auto-assigns the next _P{NN} suffix, includes parent metadata (Parent WO, Patch, Sequence Key), and routes to the correct lane. |
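The auto-numbering these tools perform amounts to a scan over existing filenames. In this sketch the `WO-NNN` filename prefix is an assumption for illustration (the real convention is defined in lib/naming.ts); the `_P{NN}` patch suffix comes from the table above.

```typescript
// Next work order number: scan filenames for a zero-padded "WO-NNN"
// prefix (assumed pattern), take the max, and increment.
function nextWorkOrderNumber(filenames: string[]): string {
  let max = 0;
  for (const name of filenames) {
    const m = name.match(/^WO-(\d+)/);
    if (m) max = Math.max(max, parseInt(m[1], 10));
  }
  return `WO-${String(max + 1).padStart(3, "0")}`;
}

// Next patch suffix for a given parent WO: _P01, _P02, ...
// following the _P{NN} scheme used by create_patch_work_order.
function nextPatchSuffix(parent: string, filenames: string[]): string {
  const re = new RegExp(`^${parent}_P(\\d+)`);
  let max = 0;
  for (const name of filenames) {
    const m = name.match(re);
    if (m) max = Math.max(max, parseInt(m[1], 10));
  }
  return `_P${String(max + 1).padStart(2, "0")}`;
}
```

Because the filesystem is the only state, the "next number" is always derived from what is actually on disk — there is no counter to drift out of sync.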
Validation — lint
| Tool | What It Does |
|---|---|
| lint | Spawns praxis-lint.sh and returns structured JSON findings across all 7 categories (structure, freshness, work orders, naming, security, SOT consistency, orphans). Supports --strict, --fix, and selective category skipping. Same 50 checks as the command line, but the AI gets machine-readable results. |
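A single structured finding might look like the following. This shape is hypothetical, written only to make "machine-readable results" concrete — the actual schema is documented in praxis-mcp/README.md.

```typescript
// Hypothetical shape of one lint finding (illustrative only;
// category names are the 7 real categories from the lint tool).
interface LintFinding {
  category: "structure" | "freshness" | "work_orders" | "naming"
    | "security" | "sot_consistency" | "orphans";
  severity: "error" | "warning";
  file: string;
  message: string;
}

const example: LintFinding = {
  category: "freshness",
  severity: "warning",
  file: "dev/context_capsule.md",
  message: "context_capsule.md has not been updated recently",
};
```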
Scaffolding — scaffold
| Tool | What It Does |
|---|---|
| scaffold | Creates the complete dev/ folder structure based on tier (starter/standard/full), mode (solo/triangle), agent list, and optional lane definitions. Creates centralized _executed/ directories and template context documents. Safe to run multiple times — reports what was created vs. what already existed. |
How It Works in Practice
The MCP server uses stdio transport — it's a process that communicates over stdin/stdout using the JSON-RPC 2.0 protocol. You register it in your AI tool's config file, and the tools appear automatically. The AI calls them like native functions.
You don't call the tools manually. The AI calls them. When Claude Code starts a session and sees the Praxis MCP tools available, it calls session_start instead of manually reading files. When it creates a work order, it calls create_work_order instead of constructing markdown. The tools are invoked automatically by the AI as part of its normal workflow.
The server is stateless — no in-memory state between calls. Every tool reads from the filesystem and writes to the filesystem. The filesystem IS the state. This matches the core Praxis philosophy: everything is files, everything is transparent, everything is auditable.
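Concretely, under stdio transport a tool invocation is a JSON-RPC 2.0 message written to the server's stdin, one JSON object per line. The `tools/call` method, `name`, and `arguments` fields follow the MCP specification; the argument name shown here is illustrative.

```typescript
// A JSON-RPC 2.0 request as an MCP client would write it to the
// server's stdin (newline-delimited under stdio transport).
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "session_start",
    arguments: { projectDir: "/path/to/your/project" }, // illustrative arg
  },
};
process.stdout.write(JSON.stringify(request) + "\n");
```

The response comes back the same way on stdout: a JSON-RPC result carrying the tool's structured output. No sockets, no HTTP, no state held between calls.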
Setup
Install from npm:
```shell
npm install praxis-mcp
```
That's it. The server is ready to use.
Register for Claude Code (.mcp.json at your project root):
```json
{
  "mcpServers": {
    "praxis": {
      "command": "npx",
      "args": ["praxis-mcp"],
      "env": { "PRAXIS_PROJECT_DIR": "/path/to/your/project" }
    }
  }
}
```
Register for Codex CLI (~/.codex/config.toml):
```toml
[mcp_servers.praxis]
command = "npx"
args = ["praxis-mcp"]

[mcp_servers.praxis.env]
PRAXIS_PROJECT_DIR = "/path/to/your/project"
```
Build from source (contributors only):
```shell
git clone https://github.com/LuisFaxas/praxis.git
cd praxis/praxis-mcp && npm install && npm run build
```
Tools appear as mcp__praxis__session_start, mcp__praxis__create_work_order, mcp__praxis__lint, etc. The PRAXIS_PROJECT_DIR environment variable tells the server which project to operate on — tools default to this path so the AI doesn't have to pass it on every call.
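Inside the server, that default-path behavior amounts to an environment-variable fallback. A minimal sketch, assuming an explicit argument takes precedence; the real resolution logic may differ:

```typescript
// Resolve the project directory: an explicit tool argument wins,
// then PRAXIS_PROJECT_DIR from the MCP config env block,
// then the server's working directory as a last resort.
function resolveProjectDir(explicit?: string): string {
  return explicit ?? process.env.PRAXIS_PROJECT_DIR ?? process.cwd();
}
```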
Architecture
```text
praxis-mcp/
├── src/
│   ├── index.ts           # CLI routing + McpServer + stdio transport
│   ├── cli-init.ts        # npx praxis-mcp init command
│   ├── tools/             # One file per category
│   │   ├── session.ts     # session_start, session_end, detect_project
│   │   ├── context.ts     # read_context, update_capsule, update_checkpoint
│   │   ├── work-orders.ts # list, read, create, complete, create_patch
│   │   ├── lint.ts        # Spawns praxis-lint.sh
│   │   └── scaffold.ts    # TypeScript mkdir by tier/mode/lanes
│   └── lib/               # Shared utilities
│       ├── constants.ts   # Tier maps, WO/patch templates, lane/naming regex
│       ├── fs-helpers.ts  # Safe file I/O, lane discovery, executed resolution
│       ├── detection.ts   # Tier, mode, and provider detection
│       ├── parsers.ts     # WO (with N/A + patch), capsule, checkpoint, SOT
│       └── naming.ts      # Auto-numbering, patch suffixes, filename formatting
├── templates/             # Bundled for CLI init
│   └── praxis-lint.sh     # Linter v1.3.1
└── build/                 # Compiled JS (gitignored)
```
Zero external dependencies beyond the MCP SDK and Zod for schema validation. TypeScript with strict mode. Compiles to ESM.
See praxis-mcp/README.md for the complete tool reference with input schemas and example responses.
The Foundation: Why Filesystem?
The MCP server is how Praxis scales. But the filesystem is how it survives.
Every methodology choice above — the context chain, the work orders, the four-stage lifecycle — is built on a deliberate foundation: the file system. Not a database. Not an API. Not a SaaS platform. Files and folders.
- Zero dependencies — Works anywhere there's a file system. No installs, no accounts, no subscriptions.
- Git-friendly — The dev/ folder can be tracked (or gitignored for private projects). Full version history for free.
- AI-native — Every AI agent can read and write files. Not every AI agent can call APIs or query databases.
- Human-readable — Open any file in any text editor. No special tools needed to understand the project state.
- Portable — Copy the dev/ folder to a new machine, a new project, a new team. It just works.
- Transparent — No hidden state. Everything is visible, auditable, and diffable.
This matters because it means Praxis works without the MCP server. Any AI that can read a file can follow the methodology. The init docs contain everything — the rules, the folder structure, the session protocol. An AI with no MCP support can still read CLAUDE_INIT.md, follow the instructions, and operate a fully governed Praxis workflow.
The MCP server doesn't replace this foundation. It accelerates it. Files are the state. Tools are the interface. You can run Praxis at any level:
| Level | What You Need | What You Get |
|---|---|---|
| Files only | Any AI + init docs | Full methodology — context chain, work orders, audit trail |
| Files + Lint | Any Unix system | Automated validation — 50 checks, CI/CD integration |
| Files + MCP | MCP-compatible AI | Native tools — one-call session start, auto-numbered WOs, enforced quality gates |
Each layer adds automation. None of them add lock-in.
Origin Story
Praxis is the culmination of thousands of hours pushing the most capable agentic LLMs to their limits on real projects. Not toy demos. Not tutorial apps. Real infrastructure builds, real web applications, real multi-agent workflows where mistakes cost hours and context loss costs days.
But the methodology didn't come from AI alone. It came from an unexpected place: property management.
Years of managing construction projects, coordinating contractors, tracking work orders across multiple sites, and maintaining audit trails for compliance — that operational experience is baked into every part of Praxis. The work order pattern? That's how construction has tracked tasks for decades. The draft/published wall? That's how property managers handle lease documents — drafts are internal, published documents go to tenants. The context chain? That's the handoff note you leave for the next shift manager so nothing falls through the cracks.
The insight was simple: AI agents have the same coordination problems as human teams. They forget context between sessions. They don't know what other agents are working on. They lack a single source of truth. They can't verify whether a task was actually completed. These are solved problems in operations management — they just hadn't been applied to AI development yet.
Praxis bridges two worlds:
- The organizational discipline of real-world project management — work orders, audit trails, handoff protocols, quality gates
- The technical capabilities of modern AI agents — code generation, research, architecture analysis, multi-agent orchestration
The result is a methodology where humans and AI agents collaborate as equals, each compensating for the other's limitations. AI has unlimited patience and processing power but no persistent memory. Humans have institutional knowledge and decision authority but limited bandwidth. Praxis gives both sides a shared workspace where context persists, work is tracked, and nothing gets lost.
The evolution tells the story: v1.1 gave AI agents structured folders and markdown documents to follow. v1.2 added praxis-lint — 50 automated checks that enforce the rules. v1.3 added the MCP server — native tools that turn the methodology into something the AI doesn't just follow, but calls. v1.3.1 hardened the entire stack: lane-based subproject organization, patch work orders, N/A criteria recognition, a CLI installer, and a security model — all battle-tested on a real multi-agent project before being upstreamed.
The filesystem is the foundation. The linter is the guardrails. The MCP server is the interface. Together, they make Praxis the first AI development methodology that governs itself.
Every rule in this methodology exists because its absence caused a real problem on a real project. Nothing is theoretical. Everything is praxis.
License
MIT License. See LICENSE for details.
The MIT License means you can freely use, modify, and distribute Praxis — including in commercial projects. The only requirement is including the copyright notice. This is the same license used by React, Next.js, and most major open-source developer tools.
Created by Luis Faxas, 2026.
faxas.net/methodology — full methodology explanation, examples, and resources.
"The process by which theory becomes practice." — Aristotle, on πρᾶξις