Approach -- Cronk AI Agents

// before we touch code

Intake before guesswork.

Every engagement starts with the ten-minute intake form and a written briefing. We refuse to scope anything from a thirty-minute discovery call -- too much guesswork, not enough specifics.

The briefing names the agents we'd build, what they touch, what they don't, the risks, and the rough cost band. You read it before signing anything. If it doesn't match what you wanted, we revise it or you walk. No pitch deck.

If we can't write a useful briefing from your intake, we're not the right shop.

// safety gates by default

Humans approve things that matter.

Default posture: every agent action that touches money, customers, or external state goes through a human approval gate. Drafts in Slack. Suggests in the helpdesk. Queues a PO instead of placing it.

Removing gates is a deliberate decision per agent, made together, with logging. We've seen what fully-autonomous looks like when nobody's watching. The dollar value of "almost right" is often negative.

› agent.action · draft_reorder_po → human_approve › agent.action · place_order → on_approval › agent.action · read_inventory → auto

// fixed scope · fixed price

You know the number before signing.

First builds are fixed-scope, fixed-price engagements. The briefing names the deliverables; the SOW names the dollars; the timeline names the weeks. If we underestimate, that's our problem. If you change scope mid-flight, we re-baseline together and you decide whether it's worth it.

We don't bill hourly for first builds. Hourly billing is what consulting shops do to make sure their incentives don't align with yours. Ours do.

// rhythm · ship every two weeks

Working software every two weeks.

Every fortnight you get something running in your environment -- even if it's just one agent doing one thing. We don't disappear for six weeks to "build the architecture." There is no architecture worth six weeks of silence.

Each sprint ends with a 15-minute walkthrough, a one-page progress note, and the next sprint's scope written down. If something's off, you find out at the end of week two, not month three.

// ownership · day one

The code lives in your repos.

From the first commit. Source, prompts, evals, runbooks -- all in your GitHub or GitLab org under your accounts. We push branches; your team merges. If we vanished tomorrow, your agents keep running and any engineer could pick up where we left off.

This is non-negotiable and it costs us business with companies that want us to lock them into our platform. We're fine with that.

YOU OWNSource code · prompts · eval suites · runbooks · cron schedules · API integrations
YOU OWNModel API accounts · billing is on your card, not ours
YOU OWNData · we don't take production data home; staging snapshots are wiped at engagement end

// evals · not vibes

Every agent ships with a test suite.

We don't trust agent quality to gut-checks. Every agent we build ships with an eval suite -- a set of input cases with expected output shape, scored automatically on every prompt change. Regressions get caught before they hit production.

This is the difference between "the LLM seems to be doing the thing" and "the LLM is doing the thing, here is the evidence." It's also how we tune up systems on the monthly retainer without breaking them.

$ cronk eval run agent_02 ↳ 47 cases · 3 categories · 2 model variants ↳ scoring … PASS accuracy 94.7% (target ≥ 92%) PASS tone_match 0.89 (target ≥ 0.80) PASS latency_p95 2.1s (target ≤ 3.0s) ✓ deploy authorized

A typical build, week by week.

Most first engagements run six weeks. Smaller scopes finish in four, larger ones land in eight. Here's the standard six.

Intake + briefing

You fill the form. We deliver a written briefing in 48 hours with proposed scope.

W1–2

Scope + scaffold

SOW signed. Eval suite drafted. First agent shells running in your staging env.

W3–4

Agent 01 live

Smallest, highest-leverage agent ships to production behind a human approval gate.

Agents 02 / 03

Remaining agents ship. Inter-agent routing wired. Eval coverage above 90%.

Hand-off

Walkthrough with your team. Runbooks delivered. Retainer starts if you opted in.

// next step

The boring methodology that actually ships.

Ten minutes to tell us what you'd want us to build. Forty-eight hours to a written briefing. Cancel anywhere along the way.

Start the intake →

How we actually build.