Agent Workflows
Codecrate can be used as a human-readable markdown exporter, but the current pack and sidecar output is also intended to support coding-agent workflows.
This page focuses on how to use Codecrate when an agent needs to read, retrieve, validate, diff, and apply changes safely.
Choose A Profile
Codecrate supports six output profiles:
human: keep the current markdown-first behavior
agent: compact navigation plus normalized codecrate.index-json.v3
lean-agent: normalized codecrate.index-json.v3 with lean analysis and markdown defaults
hybrid: current markdown richness plus full codecrate.index-json.v1
portable: manifest-enabled full layout for standalone reconstruction
portable-agent: reconstructable full layout plus normalized retrieval metadata and dual locators
Example:
codecrate pack . -o context.md --profile agent
codecrate pack . -o context.md --profile lean-agent
codecrate pack . -o context.md --profile portable-agent
If you relied on compact-only lookup convenience fields, request them explicitly:
codecrate pack . -o context.md --profile agent --index-json-mode compact
Explicit CLI flags still override profile defaults. For example, this keeps the
agent profile but turns file-level navigation back on and disables the
index sidecar:
codecrate pack . -o context.md --profile agent --nav-mode full --no-index-json
Use portable when reconstruction is the priority rather than retrieval:
codecrate pack . -o context.md --profile portable --emit-standalone-unpacker
If you also want an index-json sidecar that points at the reconstructed
tree, keep the default --locator-space auto and add a sidecar mode:
codecrate pack . -o context.md --profile portable \
--emit-standalone-unpacker --index-json-mode normalized
Use portable-agent when you want those reconstruction and retrieval defaults
in one preset:
codecrate pack . -o context.md --profile portable-agent
portable-agent keeps the reconstructable full markdown layout, but now
minifies the normalized sidecar and trims graph/reference-heavy payloads by
default. Re-enable those sections explicitly when you need them.
Use The Index Sidecar
Codecrate now exposes four sidecar modes:
full: current v1-compatible retrieval surface
compact: machine-first v2 retrieval surface
minimal: smallest v2-compatible retrieval surface
normalized: table-interned v3 retrieval surface with the best default token efficiency for agent workflows
See Index JSON Sidecar for the contract and field guide.
Generate it directly:
codecrate pack . -o context.md --index-json
codecrate pack . -o context.md --index-json-mode minimal
codecrate pack . -o context.md --index-json-mode normalized
--index-json alone preserves the profile/config sidecar default. For plain
markdown-first runs without a sidecar profile, that falls back to the full
v1-compatible sidecar. Use --index-json-mode compact|minimal|normalized
when you want a leaner machine-first sidecar surface.
Use --profile lean-agent when you want those lean defaults without spelling
out the individual toggles.
If you need to trim the v2 payload further, you can also disable the lookup maps or compact-only symbol index line ranges:
codecrate pack . -o context.md --profile agent --no-index-json-lookup
codecrate pack . -o context.md --index-json-mode compact --no-index-json-symbol-index-lines
Or let the profile imply it:
codecrate pack . -o context.md --profile hybrid
codecrate pack . -o context.md --profile agent
The sidecar includes:
per-repository metadata
split part metadata
file-to-part lookup
symbol-to-file lookup
symbol-to-canonical-body lookup in stub layout
direct href-style navigation fields
reverse lookup indexes appropriate to the chosen mode
unsplit markdown line ranges for review-oriented jumps
locator-space metadata describing whether the primary machine-facing locators point to markdown, reconstructed files, or both
safety findings
language and backend reporting
short display IDs and stronger machine IDs
To compare profile/mode size tradeoffs on a real repository, use the bundled benchmark helper:
python scripts/benchmark_packs.py --profile portable-agent --index-json-mode normalized
Useful fields to inspect first:
pack.output_files
repositories[].parts
repositories[].files
repositories[].symbols
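A minimal sketch of a first pass over those fields, assuming the sidecar has already been loaded into a dict (for example with json.load) and that parts, files, and symbols are plain arrays. The normalized mode is table-interned and may be shaped differently, so check the sidecar contract first. The sample data is hypothetical; only the key names come from the list above.

```python
def summarize_index(index: dict) -> dict:
    """Count parts, files, and symbols per repository in a loaded sidecar."""
    return {
        "output_files": list(index.get("pack", {}).get("output_files", [])),
        "repositories": [
            {
                "parts": len(repo.get("parts", [])),
                "files": len(repo.get("files", [])),
                "symbols": len(repo.get("symbols", [])),
            }
            for repo in index.get("repositories", [])
        ],
    }


# Hypothetical sample; a real sidecar carries many more fields.
sample = {
    "pack": {"output_files": ["context.md"]},
    "repositories": [
        {
            "parts": [{"kind": "index"}],
            "files": [{"path": "a.py"}, {"path": "b.py"}],
            "symbols": [{"local_id": "S1"}],
        }
    ],
}
summary = summarize_index(sample)
```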
Locate Files And Symbols
The index sidecar is designed so a tool does not need to scrape markdown to answer common lookup questions.
Locate a file and its markdown part:
codecrate pack . -o context.md --index-json
Then inspect repositories[].files[] for:
path
part_path
hrefs.index / hrefs.source
markdown_lines on unsplit packs
locators.markdown and/or locators.reconstructed
In full/v1 mode you also get anchors and richer size/hash metadata.
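The file lookup can be sketched as a plain scan over repositories[].files[]. This assumes a sidecar already loaded into a dict with files as a plain array; find_file is a hypothetical helper name and the sample values are made up for illustration.

```python
from typing import Optional


def find_file(index: dict, path: str) -> Optional[dict]:
    """Return the first file entry whose path matches, or None."""
    for repo in index.get("repositories", []):
        for entry in repo.get("files", []):
            if entry.get("path") == path:
                return entry
    return None


# Hypothetical sample using only field names listed above.
sample = {
    "repositories": [
        {
            "files": [
                {"path": "src/app.py", "part_path": "context.part1.md"},
                {"path": "src/util.py", "part_path": "context.part2.md"},
            ]
        }
    ]
}
hit = find_file(sample, "src/util.py")
miss = find_file(sample, "src/missing.py")
```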
Locate a symbol and its canonical body in stub layout:
codecrate pack . -o context.md --layout stubs --index-json
Then inspect repositories[].symbols[] for:
local_id
canonical_id when stub/dedupe behavior requires it
file_part
file_href
canonical_part
canonical_href
index_markdown_lines on unsplit packs
canonical_markdown_lines on unsplit stub packs
locators.markdown and/or locators.reconstructed
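A small sketch of resolving where a symbol's body lives from those fields. Falling back from the canonical fields to the file-level fields when the canonical ones are absent is an assumption of this example, not documented Codecrate behavior; symbol_body_location is a hypothetical helper and the sample values are invented.

```python
def symbol_body_location(symbol: dict) -> tuple:
    """Return (part, href) for a symbol body, preferring canonical fields."""
    # Assumption: canonical_* may be missing when no stub/dedupe applies.
    part = symbol.get("canonical_part") or symbol.get("file_part")
    href = symbol.get("canonical_href") or symbol.get("file_href")
    return part, href


# Hypothetical entries: one with canonical fields, one without.
stubbed = {
    "local_id": "S1",
    "canonical_id": "C1",
    "file_part": "context.part1.md",
    "file_href": "context.part1.md#file-a",
    "canonical_part": "context.part3.md",
    "canonical_href": "context.part3.md#func-c1",
}
plain = {
    "local_id": "S2",
    "file_part": "context.part1.md",
    "file_href": "context.part1.md#file-b",
}
stubbed_location = symbol_body_location(stubbed)
plain_location = symbol_body_location(plain)
```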
By default, review-oriented packs keep markdown locators. When
--emit-standalone-unpacker is enabled, --locator-space auto switches the
primary sidecar locator space to reconstructed files instead.
If you need explicit reverse indexes instead of scanning arrays, inspect
repositories[].lookup for:
file_by_path
file_by_symbol
part_by_file
symbol_by_local_id
minimal mode trims that further to file_by_path and
symbol_by_local_id only.
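Because the set of reverse indexes depends on the sidecar mode, a tool may want to check what is present before choosing between a map lookup and an array scan. The sketch below only inspects which keys exist under repositories[].lookup and makes no assumption about the shape of their values; the sample data is hypothetical.

```python
def available_lookups(repo: dict) -> list:
    """Return the names of the reverse indexes present for a repository."""
    return sorted(repo.get("lookup", {}).keys())


def needs_symbol_scan(repo: dict) -> bool:
    """True when symbol-to-file resolution must scan the symbols array."""
    return "file_by_symbol" not in repo.get("lookup", {})


# Hypothetical repository entry shaped like a minimal-mode sidecar.
minimal_repo = {"lookup": {"file_by_path": {}, "symbol_by_local_id": {}}}
present = available_lookups(minimal_repo)
scan = needs_symbol_scan(minimal_repo)
```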
Understand Split Output
When --split-max-chars is used, Codecrate can emit:
context.index.md
context.partN.md files
Split output is intended for reading and retrieval, while the unsplit markdown remains the machine-readable source for unpack, validate, and standalone reconstruction flows.
Split policy is explicit:
default preserve behavior: keep an oversize logical block intact in an oversize part
--split-strict: fail if a logical block exceeds the limit
--split-allow-cut-files: explicitly cut oversize file blocks across parts
The sidecar records both the effective split policy and whether a specific part is oversize.
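To make the three policies concrete, here is an illustrative sketch of how they differ on the same input. It is not Codecrate's implementation: split_blocks, its policy names, and the greedy packing are assumptions made for this example.

```python
def split_blocks(blocks, max_chars, policy="preserve"):
    """Group (name, text) blocks into parts of at most max_chars characters.

    policy "preserve": keep an oversize block whole in its own oversize part.
    policy "strict":   raise ValueError when a block exceeds max_chars.
    policy "cut":      slice an oversize block across several parts.
    """
    parts, current, used = [], [], 0

    def flush():
        # Close the part being filled, if it holds anything.
        nonlocal current, used
        if current:
            parts.append(current)
        current, used = [], 0

    for name, text in blocks:
        if len(text) > max_chars:
            if policy == "strict":
                raise ValueError(f"{name} exceeds {max_chars} characters")
            flush()
            if policy == "cut":
                for start in range(0, len(text), max_chars):
                    parts.append([(name, text[start:start + max_chars])])
            else:
                parts.append([(name, text)])
            continue
        if used + len(text) > max_chars:
            flush()
        current.append((name, text))
        used += len(text)
    flush()
    return parts


blocks = [("a", "x" * 4), ("b", "y" * 10), ("c", "z" * 3)]
preserved = split_blocks(blocks, 5)
cut = split_blocks(blocks, 5, "cut")
```

With a limit of 5, the preserve policy yields three parts (one of them oversize), while the cut policy yields four parts that all fit the limit.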
Example:
codecrate pack . -o context.md --split-max-chars 20000 --split-allow-cut-files --index-json
Then inspect repositories[].parts[] for:
kind
char_count
token_estimate
is_oversized
contains.files
contains.canonical_ids
For review-oriented tooling, per-file entries also include packed size metadata:
sizes.original
sizes.effective
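The part fields can feed a simple context-budget check. This sketch assumes repositories[].parts[] is a plain array of dicts with numeric token_estimate values and boolean is_oversized flags; part_budget is a hypothetical helper and the numbers are invented.

```python
def part_budget(repo: dict) -> dict:
    """Sum token estimates and list the indexes of oversize parts."""
    parts = repo.get("parts", [])
    return {
        "total_tokens": sum(p.get("token_estimate", 0) for p in parts),
        "oversized": [i for i, p in enumerate(parts) if p.get("is_oversized")],
    }


# Hypothetical parts for one repository.
repo = {
    "parts": [
        {"kind": "index", "char_count": 1200, "token_estimate": 300, "is_oversized": False},
        {"kind": "part", "char_count": 19000, "token_estimate": 4800, "is_oversized": False},
        {"kind": "part", "char_count": 41000, "token_estimate": 10200, "is_oversized": True},
    ]
}
budget = part_budget(repo)
```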
Check Safety And Trust Signals
Agent workflows often need to know whether a pack is safe to use for automated editing.
The sidecar reports:
repositories[].safety.skipped_count
repositories[].safety.redacted_count
repositories[].safety.findings
per-file is_redacted / is_binary_skipped / is_safety_skipped
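One way an agent might gate automated edits on these signals is sketched below. The policy itself (refuse when anything was skipped or redacted) is an assumption of this example, as are the helper names and the sample data; only the field names come from the list above.

```python
FILE_FLAGS = ("is_redacted", "is_binary_skipped", "is_safety_skipped")


def flagged_files(repo: dict) -> list:
    """Return paths of files carrying any per-file safety flag."""
    return [
        f.get("path")
        for f in repo.get("files", [])
        if any(f.get(flag) for flag in FILE_FLAGS)
    ]


def safe_for_editing(repo: dict) -> bool:
    """Conservative check: no skipped, redacted, or flagged content."""
    safety = repo.get("safety", {})
    return (
        safety.get("skipped_count", 0) == 0
        and safety.get("redacted_count", 0) == 0
        and not flagged_files(repo)
    )


# Hypothetical repository entry with one redacted file.
repo = {
    "safety": {"skipped_count": 0, "redacted_count": 1, "findings": []},
    "files": [
        {"path": "app.py"},
        {"path": ".env", "is_redacted": True},
    ],
}
flagged = flagged_files(repo)
ok = safe_for_editing(repo)
```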
If you want stricter packaging behavior, combine redaction at pack time with validation policy flags such as --fail-on-redaction and --fail-on-safety-skip.
Validate Before Acting
Use validate-pack before unpacking or applying edits in CI or autonomous
loops.
Examples:
codecrate validate-pack context.md
codecrate validate-pack context.md --root .
codecrate validate-pack context.md --strict
codecrate validate-pack context.md --root . --fail-on-root-drift
codecrate validate-pack context.md --fail-on-warning
codecrate validate-pack context.md --fail-on-redaction
codecrate validate-pack context.md --fail-on-safety-skip
codecrate validate-pack context.md --json
Recommended CI-style validation for agent loops:
codecrate validate-pack context.md --root . --strict --fail-on-warning --fail-on-root-drift --json
JSON validation output includes:
error_count
warning_count
policy_error_count
root_drift_count
redacted_count
safety_skip_count
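A loop can turn those counters into a go/no-go decision. The sketch below assumes the counters are top-level integer fields of the JSON report and that the report text has already been captured; blocking_counts is a hypothetical helper and the sample report is invented.

```python
import json

COUNT_FIELDS = (
    "error_count",
    "warning_count",
    "policy_error_count",
    "root_drift_count",
    "redacted_count",
    "safety_skip_count",
)


def blocking_counts(report_json: str) -> dict:
    """Return only the counters that are greater than zero."""
    report = json.loads(report_json)
    return {k: report[k] for k in COUNT_FIELDS if report.get(k, 0) > 0}


# Hypothetical report with one warning and one drifted file.
sample_report = json.dumps(
    {
        "error_count": 0,
        "warning_count": 1,
        "policy_error_count": 0,
        "root_drift_count": 1,
        "redacted_count": 0,
        "safety_skip_count": 0,
    }
)
blockers = blocking_counts(sample_report)
may_proceed = not blockers
```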
Use Patch And Apply Loops
For iterative edit workflows, patch/apply is often safer than directly editing packed markdown.
Baseline pack:
codecrate pack . -o baseline.md --profile hybrid
Generate a patch after local changes:
codecrate patch baseline.md . -o changes.md
Validate or apply the patch:
codecrate apply changes.md . --dry-run
codecrate apply changes.md . --check-baseline
This keeps the change representation in unified diff form and lets Codecrate verify baseline hashes before applying edits.
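The loop above can be scripted. This sketch only assembles the command lines shown in this section and offers a thin runner around subprocess.run; the function names and default file names are choices made for this example, and the edits themselves are expected to happen between the pack and patch steps.

```python
import subprocess


def edit_loop_commands(root=".", baseline="baseline.md", patch="changes.md"):
    """Build the argv lists for each stage of a patch/apply loop."""
    return {
        "pack": ["codecrate", "pack", root, "-o", baseline, "--profile", "hybrid"],
        "patch": ["codecrate", "patch", baseline, root, "-o", patch],
        "dry_run": ["codecrate", "apply", patch, root, "--dry-run"],
        "apply": ["codecrate", "apply", patch, root, "--check-baseline"],
    }


def run_step(argv):
    """Run one stage and raise CalledProcessError on a non-zero exit."""
    subprocess.run(argv, check=True)


commands = edit_loop_commands()
# Typical order: run_step(commands["pack"]), make edits, then
# run_step(commands["patch"]), run_step(commands["dry_run"]),
# and finally run_step(commands["apply"]).
```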
Mixed-Language Repositories
For non-Python files, the index sidecar makes backend reporting explicit.
Per-file fields include:
language_detected
symbol_backend_requested
symbol_backend_used
symbol_extraction_status
This helps an agent distinguish between:
a file type that is unsupported
a backend that was disabled
a backend that was unavailable
a parse that succeeded but yielded no symbols
Example:
codecrate pack . -o context.md --include "*.java" --symbol-backend tree-sitter --index-json
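To see how symbol extraction went across a mixed repository, an agent could tally files by the per-file backend fields. The sketch assumes repositories[].files[] is a plain array; the status and backend values in the sample are hypothetical placeholders, since the actual vocabulary is defined by the sidecar contract.

```python
from collections import Counter


def backend_report(repo: dict) -> Counter:
    """Count files per (language, backend used, extraction status)."""
    return Counter(
        (
            f.get("language_detected"),
            f.get("symbol_backend_used"),
            f.get("symbol_extraction_status"),
        )
        for f in repo.get("files", [])
    )


# Hypothetical file entries; the string values are placeholders.
repo = {
    "files": [
        {"path": "A.java", "language_detected": "java",
         "symbol_backend_used": "tree-sitter", "symbol_extraction_status": "ok"},
        {"path": "B.java", "language_detected": "java",
         "symbol_backend_used": "tree-sitter", "symbol_extraction_status": "ok"},
        {"path": "notes.txt", "language_detected": "text",
         "symbol_backend_used": None, "symbol_extraction_status": "unsupported"},
    ]
}
report = backend_report(repo)
```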
Recommended Defaults
For human review:
codecrate pack . -o context.md --profile human
For autonomous retrieval/edit loops:
codecrate pack . -o context.md --profile agent
codecrate validate-pack context.md --root . --strict --fail-on-warning --fail-on-root-drift
For mixed workflows where humans and agents both read the result:
codecrate pack . -o context.md --profile hybrid