Approach — Governance by process, six phases, no surprises

Phase 01 / Week 1

Assess

On site with the team. Watch the work. Decide what is worth building.

We do not start in a workshop. We start at a desk, on a couch, on a job site, watching the actual person do the actual task. Half the projects we kill at this stage. The other half come out with a sharper scope than the client walked in with.

Assess includes an Algorithmic Impact Assessment (AIA) screening on the government’s framework (the Directive on Automated Decision-Making, impact Levels I–IV). Before any code, we classify what each use case touches and where human oversight is non-negotiable — risk scoped at the on-ramp, not found in production. The deliverable is structured to be grant-ready: ROI projections and risk-mitigation outlines formatted to support BDC and Global Innovation Cluster applications, subject to current program terms.

What you give us

Two hours with leadership
Access to one team for shadowing
Sample documents or call recordings (anonymized fine)

What you get back

A scored opportunity list, with an AIA impact level per use case
A written, grant-ready 90-day plan
An honest go / no-go recommendation

Maps to Algorithmic Impact Assessment (AIA) · the Voluntary Code of Conduct (Accountability)

Phase 02 / Week 2

Shape

Write the evals before the agent. Define what right looks like.

We pull 100 to 300 real historical examples and label them. This is the unglamorous week that decides whether the project succeeds. Without an eval set, you do not have a project, you have a vibe.

Shape is where we set the autonomy ceiling. Per the federal Guide on the Use of Agentic AI, we scope to Level 1–2 (assistive, semi-autonomous), read-only by default: the AI reads, extracts, and drafts, but is architecturally blocked from writing to systems or sending external messages without a human key. Every step that touches a person or a record gets a human-in-the-loop approval gate — a design decision, not a later patch.

If you can’t describe success on a spreadsheet, you can’t build for it. Week 2 is the spreadsheet week.

Maps to Guide on the Use of Agentic AI (Bounded Autonomy, Levels 1–2) · the Voluntary Code of Conduct (Human Oversight)

Phase 03 / Week 3

Build

First working pass. Crude but end-to-end.

By the end of week three you have a working agent on a staging URL. It is not polished. It is end-to-end, and we can run it against the evals and see exactly where it fails. That failure list is the build plan for week four.

The bounds from Shape are now wired in. The human-in-the-loop gate becomes a real checkpoint — a Slack approval, a dashboard sign-off — between the AI’s output and anything leaving the system. Prompt-injection defense ships here too: external text (documents, emails, web content) is treated as data, not instructions — the model reads it but cannot obey it. The AI proposes; your team executes.

What ships in this phase

The agent itself, wired to your stack — read-only by default
The human approval gate on every write or external action
An eval dashboard, updated nightly
A short Loom walking through it

Maps to Guide on the Use of Agentic AI (prompt-injection defense) · the Voluntary Code of Conduct (Human Oversight, Safety)

Phase 04 / Week 4

Harden

Tool calls, retries, cost ceilings, observability.

Most agency projects quietly stop at the happy-path. Week four is the part nobody photographs: rate limits, fallback prompts, partial-failure handling, daily cost ceilings, drift alerting. It’s what makes the system survive a real Tuesday afternoon.

It’s also where recoverability gets built and tested: immutable, human-readable audit logs the AI can’t alter, plus a kill switch that severs AI access instantly and reverts to manual control. Before go-live we red-team for prompt injection, data exfiltration, and unsafe outputs — evidenced, not assumed. Same architecture as the Sovereign Security Stack: recoverability and bounded autonomy as structural constraints, not settings.

Maps to Guide on the Use of Agentic AI (Recoverability, automation drift) · the Voluntary Code of Conduct (Validity & Robustness, Safety)

Phase 05 / Week 5

Pilot

Live with two or three real people. Measure friction.

We pick two or three actual end users (not managers) and put the system in front of them for a week. We watch what they do, where they go around it, where they distrust it. The thing we ship in week 6 is shaped by what week 5 tells us, not by what looked good on the demo.

Phase 06 / Week 6 + 30 days

Adopt

Handoff, training, 30 days of running support.

We write a runbook your team can actually read. We train a named owner. We sit in your Slack for 30 days, on the house, watching the metrics and patching the small things that surface in real use. Then we get out of the way. The point of the engagement is that you do not need us afterwards.

Six phases. Every project, every time.

Assess

Shape

Build

Harden

Pilot

Adopt

How we operate, regardless of the project.

Evals before code.

Ship to one user first.

Default to your stack.

Canadian data residency by design.

Pro-worker by default. You own it.

Automation-drift monitoring is ongoing.

A calendar before a contract.