Command Line Interface
Codecrate provides a small CLI with subcommands.
Configuration file
Codecrate reads configuration from the repository root. It will look for:
.codecrate.toml (preferred, if present)
codecrate.toml (fallback)
pyproject.toml under [tool.codecrate] (fallback if no codecrate TOML file exists)
Precedence (highest first):
CLI flags
.codecrate.toml / codecrate.toml
pyproject.toml [tool.codecrate]
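The lookup order and precedence described above can be sketched as follows. This is a hypothetical illustration, not Codecrate's implementation: pick_config_source and resolve_option are made-up names, and the sketch only checks that pyproject.toml exists rather than confirming it contains a [tool.codecrate] table.

```python
from pathlib import Path
from typing import Optional

# Candidate files in the order the text lists them (first match wins).
CANDIDATES = (".codecrate.toml", "codecrate.toml", "pyproject.toml")


def pick_config_source(root: Path) -> Optional[Path]:
    """Return the first candidate file that exists under root, or None.

    Assumption: existence alone selects the file; a real implementation
    would also confirm that pyproject.toml has a [tool.codecrate] table.
    """
    for name in CANDIDATES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


def resolve_option(key: str, cli_flags: dict, file_config: dict, defaults: dict):
    """CLI flags override file config, which overrides built-in defaults."""
    if cli_flags.get(key) is not None:
        return cli_flags[key]
    if key in file_config:
        return file_config[key]
    return defaults[key]
```

Here a CLI value of None stands for "flag not given", so the file value (or the default) is used instead.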
See Configuration Reference for the exhaustive generated config reference, including keys such as:
[codecrate]
output = "context.md"
include_preset = "python+docs"
profile = "human"
index_json_mode = "normalized"
index_json_enabled = true
index_json_output = ""
standalone_unpacker_output = ""
Overview
codecrate --version
codecrate pack [ROOT ...] [--repo REPO ...] [options]
codecrate unpack PACK.md -o OUT_DIR [--check-machine-header] [--strict]
codecrate patch OLD_PACK.md ROOT [-o patch.md]
codecrate apply PATCH.md ROOT [--check-baseline|--ignore-baseline]
codecrate validate-pack PACK.md [--root ROOT] [--strict] [policy flags]
codecrate doctor [ROOT]
codecrate config show [ROOT] [--effective] [--json]
codecrate config schema [--json]
pack
Create a packed Markdown context file from one or more repositories.
codecrate pack . -o context
codecrate pack /path/to/repo1 /path/to/repo2 -o multi.md
codecrate pack --repo /path/to/repo1 --repo /path/to/repo2 -o multi.md
Use either positional ROOT arguments or repeated --repo arguments for
multi-repo packs. Mixing the two styles is an error.
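The "either positional ROOT or repeated --repo" rule can be modelled with argparse as below. This is a sketch under stated assumptions: the parser name pack-sketch, the helper names, and the error wording are hypothetical and not taken from Codecrate itself.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors only the ROOT / --repo arguments."""
    parser = argparse.ArgumentParser(prog="pack-sketch")
    parser.add_argument("roots", nargs="*", metavar="ROOT")
    parser.add_argument("--repo", action="append", default=[], metavar="REPO")
    return parser


def resolve_roots(args: argparse.Namespace) -> list[str]:
    """Return the repository roots, rejecting a mix of both styles."""
    if args.roots and args.repo:
        raise ValueError("use either positional ROOT arguments or --repo, not both")
    return list(args.roots) or list(args.repo)
```

Both accepted styles yield the same list of roots, so later stages need not know which style was used.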
Useful flags:
--dedupe / --no-dedupe: enable or disable deduplication
--profile human|agent|lean-agent|hybrid|portable|portable-agent: choose output defaults profile
--layout auto|stubs|full: choose layout (auto selects best token efficiency)
--nav-mode auto|compact|full: navigation density; auto uses compact for unsplit output and full when split outputs are requested
--symbol-backend auto|python|tree-sitter|none: optional non-Python symbol extraction backend (Python files always use AST)
--keep-docstrings / --no-keep-docstrings: keep docstrings in stubbed views
--manifest / --no-manifest: include or omit the Manifest section
--respect-gitignore / --no-respect-gitignore: include ignored files or not
--security-check / --no-security-check: enable or disable sensitive-file safety filtering
--security-content-sniff / --no-security-content-sniff: optionally scan file content for key/token patterns
--security-redaction / --no-security-redaction: redact flagged files instead of skipping them
--safety-report / --no-safety-report: include Safety Report section in output
--security-path-pattern GLOB (repeatable): override sensitive path rule set
--security-path-pattern-add GLOB (repeatable): append sensitive path rules
--security-path-pattern-remove GLOB (repeatable): remove sensitive path rules
--security-content-pattern RULE (repeatable): override sensitive content rule set (name=regex or regex)
.codecrateignore: gitignore-style ignore file in repo root (always respected)
--include GLOB (repeatable): include patterns
--include-preset python-only|python+docs|everything: include preset
--exclude GLOB (repeatable): exclude patterns
--stdin: read file paths from stdin (one per line) instead of scanning
--stdin0: read file paths from stdin as NUL-separated entries
--print-files: debug-print selected files after filtering
--print-skipped: debug-print skipped files and reasons
--print-rules: debug-print effective include/exclude/ignore/safety rules
--split-max-chars N: additionally emit .index.md and .partN.md files for LLMs
--split-strict / --no-split-strict: fail instead of writing oversize logical blocks
--split-allow-cut-files / --no-split-allow-cut-files: explicitly cut oversized file blocks across multiple part files
--token-count-tree [threshold]: show file tree with token counts; optional threshold shows only files with >=N tokens (for example, --token-count-tree 100)
--top-files-len N: show N largest files by token count in stderr report
--token-count-encoding NAME: tokenizer encoding (for tiktoken backend)
--file-summary / --no-file-summary: enable or disable pack summary output
--max-file-bytes N: skip files larger than N bytes
--max-total-bytes N: fail if included files exceed N bytes
--max-file-tokens N: skip files above N tokens
--max-total-tokens N: fail if included files exceed N tokens
--max-workers N: cap thread pool size for IO/parsing/token counting
--manifest-json [PATH]: write manifest JSON for tooling (default: <output>.manifest.json)
--index-json [PATH]: write index JSON for agent/tooling lookup (default: <output>.index.json; explicit --index-json preserves profile/config mode defaults unless --index-json-mode overrides them)
--index-json-mode full|compact|minimal|normalized: choose sidecar mode and enable index-json output (agent and portable-agent default to normalized; hybrid defaults to full)
--index-json-lookup / --no-index-json-lookup: include or trim lookup maps in compact/minimal v2 sidecars
--index-json-symbol-index-lines / --no-index-json-symbol-index-lines: include or trim compact v2 symbol index line ranges
--index-json-symbol-locators / --no-index-json-symbol-locators: include or trim symbol locator payloads
--index-json-symbol-references / --no-index-json-symbol-references: include or trim conservative symbol reference and call-like metadata
--index-json-graph / --no-index-json-graph, --index-json-test-links / --no-index-json-test-links, --index-json-guide / --no-index-json-guide, --index-json-file-imports / --no-index-json-file-imports, --index-json-classes / --no-index-json-classes, --index-json-exports / --no-index-json-exports, --index-json-module-docstrings / --no-index-json-module-docstrings: independently trim analysis sections
--no-index-json: disable index JSON output, including profile-implied defaults
--emit-standalone-unpacker: write <output>.unpack.py for zero-install reconstruction of manifest-enabled packs
--locator-space auto|markdown|reconstructed|dual: choose whether sidecar locators target the markdown pack, the reconstructed file tree, or both; auto resolves to reconstructed when --emit-standalone-unpacker is enabled and otherwise to markdown
--encoding-errors replace|strict: UTF-8 decode policy when reading files
-o/--output PATH: output markdown path (defaults to config output or context.md)
Profile defaults:
human: current markdown-first behavior
agent: compact navigation plus normalized v3 index-json output
lean-agent: smaller normalized v3 sidecars with lean analysis defaults
hybrid: current markdown behavior plus full index-json output
portable: manifest-enabled full layout intended for standalone unpack
portable-agent: full layout, standalone unpacker, normalized sidecar, and dual locators by default
Portable reconstruction example:
codecrate pack . -o context.md --profile portable --emit-standalone-unpacker
python3 -S context.unpack.py context.md -o reconstructed/ --check-machine-header --strict --fail-on-warning
The emitted script uses only the Python standard library. It supports both
full and stubs layouts; portable remains the recommended profile
when you want a reconstruction-first full pack.
On Windows, use py -3 -S context.unpack.py context.md -o reconstructed
--check-machine-header --strict --fail-on-warning.
If you also emit index-json, the default locator_space = "auto"
switches the sidecar to reconstructed locators so tools can target the unpacked
tree directly.
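The auto rule for locator space can be written as a small function. The function and constant names are hypothetical. Only the four values and the auto behavior come from the text above.

```python
VALID_LOCATOR_SPACES = ("auto", "markdown", "reconstructed", "dual")


def resolve_locator_space(locator_space: str, emit_standalone_unpacker: bool) -> str:
    """Resolve 'auto' as documented; pass explicit choices through unchanged."""
    if locator_space not in VALID_LOCATOR_SPACES:
        raise ValueError(f"unknown locator space: {locator_space!r}")
    if locator_space != "auto":
        return locator_space
    # auto: reconstructed when a standalone unpacker is emitted, else markdown.
    return "reconstructed" if emit_standalone_unpacker else "markdown"
```

An explicit value such as dual is never rewritten; only auto depends on whether the standalone unpacker is enabled.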
When --emit-standalone-unpacker is used together with --split-max-chars,
Codecrate still writes the unsplit markdown to the main output path because that
unsplit pack remains the authoritative machine-readable reconstruction source.
--stdin / --stdin0 notes:
--stdin accepts one path per line from stdin.
--stdin0 accepts NUL-separated paths from stdin.
--stdin ignores blank lines and lines starting with #.
Requires a single ROOT (cannot be combined with --repo).
Include globs are not applied to explicit stdin files.
Exclude rules and ignore files still apply.
Outside-root and missing explicit paths are skipped.
With --print-skipped, explicit file filtering reports reasons like not-a-file, outside-root, duplicate, ignored, and excluded.
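A rough model of the explicit-path handling described in these notes is sketched below. The helper names are hypothetical, and several details are assumptions rather than documented behavior: lines are whitespace-stripped before the blank/comment check, empty NUL-separated entries are dropped, and the ignored and excluded reasons are left out because they depend on rule matching that is not modelled here.

```python
from pathlib import Path


def parse_stdin_paths(data: str, nul_separated: bool = False) -> list[str]:
    """Split stdin text into path strings for --stdin or --stdin0 style input."""
    if nul_separated:
        # Assumption: empty entries (e.g. after a trailing NUL) are dropped.
        return [entry for entry in data.split("\0") if entry]
    paths = []
    for line in data.splitlines():
        stripped = line.strip()  # assumption: surrounding whitespace is ignored
        if not stripped or stripped.startswith("#"):
            continue
        paths.append(stripped)
    return paths


def classify_explicit_paths(paths: list[str], root: Path):
    """Return (selected, skipped), where skipped holds (path, reason) pairs."""
    root = root.resolve()
    selected, skipped, seen = [], [], set()
    for raw in paths:
        candidate = Path(raw)
        if not candidate.is_absolute():
            candidate = root / candidate
        resolved = candidate.resolve()
        if not resolved.is_relative_to(root):  # requires Python 3.9+
            skipped.append((raw, "outside-root"))
        elif not resolved.is_file():
            skipped.append((raw, "not-a-file"))
        elif resolved in seen:
            skipped.append((raw, "duplicate"))
        else:
            seen.add(resolved)
            selected.append(resolved)
    return selected, skipped
```

The reason strings match those listed for --print-skipped, so the skipped list reads like that flag's report for the cases covered.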
Include precedence:
explicit --include
explicit --include-preset
config include
config include_preset
built-in default preset (python+docs)
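The include precedence above amounts to "first configured source wins". A minimal sketch follows, assuming that an empty list or None means "not set". The function name and the (kind, value) return shape are hypothetical.

```python
from typing import Optional, Sequence

DEFAULT_PRESET = "python+docs"  # built-in default named in the list above


def resolve_include(
    cli_include: Optional[Sequence[str]] = None,
    cli_preset: Optional[str] = None,
    config_include: Optional[Sequence[str]] = None,
    config_preset: Optional[str] = None,
) -> tuple[str, object]:
    """Return ('globs', [...]) or ('preset', name) from the highest-priority source."""
    if cli_include:
        return ("globs", list(cli_include))
    if cli_preset:
        return ("preset", cli_preset)
    if config_include:
        return ("globs", list(config_include))
    if config_preset:
        return ("preset", config_preset)
    return ("preset", DEFAULT_PRESET)
```

Note that an explicit --include-preset on the command line outranks include globs from the config file.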
Token diagnostics notes:
Token diagnostics are CLI-only and do not modify generated markdown.
If tiktoken is not installed, counting falls back to an approximate method.
If tokenizer initialization fails, codecrate still reports top-N largest files using heuristic counts.
Safety scanning uses conservative defaults; you can override both path and content rule sets.
With redaction enabled, flagged files remain in output with masked content.
A compact Pack Summary (files/tokens/chars/output path) is printed by default and can be disabled with --no-file-summary or file_summary = false in config.
File code fences are automatically widened when file content contains backticks, so generated markdown remains parsable.
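One common way to widen a fence is to make it one backtick longer than the longest backtick run in the content. The sketch below shows that technique. Codecrate's exact rule is not stated here, so treat the function names and the "longest run plus one" choice as assumptions.

```python
import re


def fence_for(content: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run inside content."""
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    return "`" * max(minimum, longest + 1)


def wrap_in_fence(content: str, info: str = "") -> str:
    """Wrap content in a fence that the content itself cannot close early."""
    fence = fence_for(content)
    body = content if content.endswith("\n") else content + "\n"
    return f"{fence}{info}\n{body}{fence}\n"
```

Content with no backticks gets the usual three-backtick fence, while content that itself contains a three-backtick fence is wrapped in four.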
unpack
Reconstruct files into an output directory:
codecrate unpack context.md -o /tmp/out --check-machine-header --strict --fail-on-warning
Use --check-machine-header to verify the machine-header manifest checksum
before writing files, --strict to fail on missing/broken part mappings, and
--fail-on-warning to make warning conditions exit non-zero.
If the input pack omits the Manifest section (for example from
codecrate pack --no-manifest), unpack fails with a clear hint to re-pack with
manifest enabled.
patch
Generate a diff-only Markdown patch between an old pack and the current repo:
codecrate patch old_context.md . -o patch.md
The output is Markdown containing one or more ```diff fences.
Patch requires a pack with Manifest; --no-manifest packs are rejected with a
clear hint.
Patch output includes a codecrate-patch-meta fence with baseline hashes.
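Because the patch is plain Markdown, its fenced blocks can be pulled out by info string. The regex-based helper below is a hypothetical sketch, not Codecrate's parser. It assumes each closing fence has the same length as its opening fence, and it returns the raw text of a codecrate-patch-meta block without interpreting it, since that block's internal format is not described here.

```python
import re

# Opening fence of 3+ backticks, an info string, a lazy body, then a closing
# fence of the same length on its own line.
FENCE_RE = re.compile(
    r"^(?P<fence>`{3,})(?P<info>[^\n`]*)\n(?P<body>.*?)^(?P=fence)[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_fences(markdown: str, info: str) -> list[str]:
    """Return the bodies of all fenced blocks whose info string equals info."""
    return [
        match.group("body")
        for match in FENCE_RE.finditer(markdown)
        if match.group("info").strip() == info
    ]
```

Calling extract_fences(text, "diff") yields the diff hunks in document order, and extract_fences(text, "codecrate-patch-meta") yields the metadata block text.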
apply
Apply a patch Markdown to a repo root:
codecrate apply patch.md .
codecrate apply patch.md . --dry-run
codecrate apply patch.md . --check-baseline
codecrate apply patch.md . --ignore-baseline
Use --dry-run to parse and validate hunks without writing files.
Baseline policy:
default: verify baseline hashes when metadata is present
--check-baseline: require metadata and verify
--ignore-baseline: skip baseline verification
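The three baseline policies reduce to a small decision table, sketched below. The function name and the verify / skip / error labels are hypothetical. Treating the two flags as mutually exclusive is inferred from the [--check-baseline|--ignore-baseline] form in the overview.

```python
def baseline_action(
    has_metadata: bool,
    check_baseline: bool = False,
    ignore_baseline: bool = False,
) -> str:
    """Return 'verify', 'skip', or 'error' for a given patch and flag set."""
    if check_baseline and ignore_baseline:
        raise ValueError("--check-baseline and --ignore-baseline are mutually exclusive")
    if ignore_baseline:
        return "skip"
    if check_baseline:
        # Metadata is required: a patch without it cannot be checked.
        return "verify" if has_metadata else "error"
    # Default: verify only when the patch carries baseline metadata.
    return "verify" if has_metadata else "skip"
```

The only difference between the default and --check-baseline is what happens when metadata is missing: the default proceeds without verification, while --check-baseline treats it as a failure.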
validate-pack
Validate pack internals (sha/markers/canonical consistency). Optionally compare with files on disk:
codecrate validate-pack context.md
codecrate validate-pack context.md --root .
Use --strict to treat unresolved marker mapping as validation errors.
Use --fail-on-warning to turn any warning into a non-zero exit.
Use --fail-on-root-drift with --root to fail when disk content differs from the pack.
Use --fail-on-redaction or --fail-on-safety-skip for stricter safety policy enforcement.
Validation output groups issues by repository section and includes short hints.
Packs created with --no-manifest are rejected with a consistent error message.
Use --json for machine-readable report output.
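For scripted use, for example in CI, the flags above can be assembled programmatically. The helper names below are hypothetical, and requiring --root whenever --fail-on-root-drift is set is this sketch's own guard, based on the note that the flag is used with --root. The report format is not parsed, only the exit code is returned.

```python
import subprocess
from typing import Optional


def validate_pack_argv(
    pack: str,
    root: Optional[str] = None,
    strict: bool = False,
    fail_on_warning: bool = False,
    fail_on_root_drift: bool = False,
    json_report: bool = False,
) -> list[str]:
    """Build a validate-pack command line from the flags documented above."""
    if fail_on_root_drift and root is None:
        raise ValueError("--fail-on-root-drift is only meaningful with --root")
    argv = ["codecrate", "validate-pack", pack]
    if root is not None:
        argv += ["--root", root]
    if strict:
        argv.append("--strict")
    if fail_on_warning:
        argv.append("--fail-on-warning")
    if fail_on_root_drift:
        argv.append("--fail-on-root-drift")
    if json_report:
        argv.append("--json")
    return argv


def run_validate(argv: list[str]) -> int:
    """Run the command and return its exit code (non-zero means failure)."""
    return subprocess.run(argv, check=False).returncode
```

A CI step could call run_validate(validate_pack_argv("context.md", root=".", strict=True, fail_on_warning=True)) and fail the job on a non-zero result.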
For an end-to-end agent-oriented usage guide, see Agent Workflows.
doctor
Inspect configuration and runtime capabilities:
codecrate doctor .
Doctor reports:
config discovery and precedence
selected config source (if any)
ignore file detection (
.gitignore,.codecrateignore)token backend availability
optional parsing backend availability (tree-sitter)
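Two of these checks can be approximated from Python without running the CLI. The sketch below is not how doctor is implemented. It assumes the optional backends are importable as the tiktoken and tree_sitter modules, and it only tests whether the ignore files exist in the given root.

```python
from importlib.util import find_spec
from pathlib import Path

# Assumption: these import names correspond to the optional backends.
OPTIONAL_BACKENDS = {
    "token backend (tiktoken)": "tiktoken",
    "parsing backend (tree-sitter)": "tree_sitter",
}


def detect_optional_backends() -> dict[str, bool]:
    """Report which optional modules are importable in this environment."""
    return {label: find_spec(module) is not None for label, module in OPTIONAL_BACKENDS.items()}


def detect_ignore_files(root: Path) -> dict[str, bool]:
    """Report which ignore files exist directly under root."""
    return {name: (root / name).is_file() for name in (".gitignore", ".codecrateignore")}
```

The results vary by environment, so the sketch returns plain booleans rather than asserting any particular setup.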
config show
Inspect the resolved configuration for a repository root:
codecrate config show . --effective
codecrate config show . --effective --json
The command reports:
selected config source (or defaults-only)
effective values after precedence resolution
full resolved security_path_patterns list (after add/remove)
configured security_content_patterns list
per-field provenance, including config aliases such as include_manifest
config schema
Inspect the authoritative config metadata generated from code:
codecrate config schema
codecrate config schema --json