Imagine the following scene: It’s Monday morning, you have several projects running in parallel on your desk – and in every status call the same question hangs in the air: “How much longer will this take?” This is exactly where Mima is right now.
Mima leads KKO – a team that manages several projects for different commissioning companies. Some of these projects are KKO’s own initiatives; some are even personal passion projects. The core of the delivery work rests with Anaya, head of an Indian development company and thus Mima’s most important contractor. On Mima’s side there is also Gandalf, CTO and technical pace-setter: he owns the guardrails for quality and process, and the question of how “code” reliably becomes “delivery”.
And now comes the real tension: Mima wants commitment from Anaya – preferably in the form of days. However, Gandalf sees that they are currently playing a completely different game: AI‑First.
⸻
Executive Summary (overall)
• The conflict is not purely a timing problem, but a definition problem: Is what’s being delivered the process or the result?
• “Days” are the wrong control variable in a transformation. AI‑First is a restructuring of process, tooling, and quality standards.
• Prototype speed is not product maturity. The last 10% (integration, tests, operations) dominate the real delivery time.
• AI amplifies seniority. Without standards, reviews, and enablement, the effect doesn’t scale – it just gets distributed unevenly.
• The solution is a new working model with Anaya: rules of the game, phase plan, a few hard metrics, operating model, and trust architecture.
⸻
Part 1: Context and stakeholders
The initial dilemma: “How do I tell Anaya – and what do we base it on?”
Mima comes to Gandalf because she senses: If she just says to Anaya “Please be faster”, nothing will improve sustainably. At the same time, she can’t afford projects to become “diffuse”. She needs predictability, because KKO serves several clients in parallel – and because “private” in her world doesn’t mean “unimportant”, but often “especially important”.
Anaya is on the other side of the table: She runs a development company that forms the core of KKO. In such relationships, it is obvious that the client side talks about time (“How many days?”), while the contractor side talks about approach (“We first need to set up the process”).
Mima is therefore caught between two needs: She needs results and she needs an upgrade of the system, because “more pressure” does not improve the system – it only stresses it. This is exactly where Gandalf’s advice comes in.
⸻
Part 2: Technical classification and control logic
Executive Summary (Gandalf’s advice, ultra-compact)
Redefine the delivery object: not “days”, but a capable delivery system. Control via outcome, quality, and maturity level (process + tooling + standards). Expect transformation time; strictly separate prototype speed from product maturity; explicitly plan for integration, testing, and operational reality. And: build senior guidance and enablement as a feature of the transition – not as a “nice to have”.
⸻
1) Basic diagnosis: expectation mismatch (process vs. result)
Gandalf doesn’t start with tool questions, but with a simple sentence: “Right now you are selling and buying different products.”
• Anaya emphasizes an approach: better collaboration, cleaner implementation, AI‑First methods.
• Mima needs results: features, releases, visible progress.
Both are legitimate. The problem is: If you don’t clearly state what the product of your collaboration is, you are constantly negotiating at the wrong lever.
Practical example:
Anaya says: “We’ve completed the refactoring, the architecture is now clean.”
Mima hears: “The feature is still not live.”
Consequence: First you need a new delivery definition – not just a new sprint plan.
Concrete artifact (small but effective): a one-page delivery definition per project:
• Outcome (what is valuable for the client?)
• Quality level (e.g. test depth, security, observability)
• Acceptance criteria (when is it considered delivered?)
• Non-goals (what is explicitly not part of the delivery object?)
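To make the one-pager concrete, here is a minimal sketch of the same four fields as a data structure – useful if the delivery definitions should live next to the code instead of in a slide deck. All field names and the example content are illustrative, not prescribed by the text:

```python
from dataclasses import dataclass

@dataclass
class DeliveryDefinition:
    """One-page delivery definition per project (field names are illustrative)."""
    outcome: str                    # what is valuable for the client?
    quality_level: list[str]        # e.g. test depth, security, observability
    acceptance_criteria: list[str]  # when is it considered delivered?
    non_goals: list[str]            # explicitly not part of the delivery object

# Hypothetical example for one project:
invoice_payment = DeliveryDefinition(
    outcome="Customers can pay by invoice at checkout",
    quality_level=["unit tests on payment logic", "audit logging"],
    acceptance_criteria=["invoice PDF is generated", "payment appears in the ledger"],
    non_goals=["refunds", "multi-currency support"],
)
```

The point is not the tooling but the forcing function: a field left empty is a conversation that has not happened yet.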
Mnemonic: If you argue about time, you are often arguing about the wrong product.
⸻
2) Wrong control variable: “number of days”
“How many days?” sounds like control, but is often an illusion – especially during transitions. Gandalf calls “days” an output estimate that says too little in a transformation phase, because it does not reflect the crucial uncertainties: toolchain, review quality, test strategy, integration hurdles, maturity in dealing with AI.
Time is not irrelevant. But: time only becomes reliable when the system is stable. In a transformation you first measure the build-up of capabilities.
Alternative control: maturity level instead of days
Gandalf thinks in stages, not calendars. For example, per area:
• Specification (0: verbal request … 3: clear specs + acceptance criteria)
• Tests (0: hardly any … 3: automated + meaningful coverage)
• CI/CD (0: manual … 3: automated + quality gates)
• Observability (0: “it runs” … 3: logs/tracing/alerts that show problems early)
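A minimal sketch of how such a maturity profile could be tracked. The “weakest area wins” aggregation is an assumption added here for illustration – it encodes the idea that strong CI/CD cannot compensate for purely verbal specs:

```python
# Hypothetical tracker for the 0-3 maturity scale sketched above.
AREAS = ("specification", "tests", "ci_cd", "observability")

def overall_maturity(scores: dict[str, int]) -> int:
    """A delivery system is only as mature as its least mature area."""
    for area in AREAS:
        level = scores.get(area, 0)
        if not 0 <= level <= 3:
            raise ValueError(f"{area}: level must be between 0 and 3")
    return min(scores.get(area, 0) for area in AREAS)

# Example: level-3 CI/CD with level-0 specs still yields overall level 0.
profile = {"specification": 0, "tests": 2, "ci_cd": 3, "observability": 1}
```

Whether you aggregate by minimum, average, or per-area reporting is a team decision – what matters is that “where are we?” gets a number instead of a feeling.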
Mnemonic: In a transformation, “time” is the result – not the starting point.
⸻
3) Reality check: AI‑First is not a quick flip of a switch
AI‑First is often sold like a booster: “With AI everything goes faster.” Gandalf clarifies: AI accelerates code – not automatically quality, understanding, or operations.
Many teams experience a typical curve:
1. Euphoria: Code is created faster, demos look good.
2. Slump: Integration, bugs, disagreement about standards.
3. Stabilization: Playbooks, templates, review routines, tooling – then it really gets faster.
If Mima demands “days” in phase 1, she forces the system into phase 2 – and interprets the slump as a performance problem, although it is a maturity problem.
Practical decision: AI‑First does not start everywhere, but in a pilot area: one module, one service, one feature cluster. Standards are built there that are later scaled.
Mnemonic: AI‑First is a program – not a switch.
⸻
4) Scope shift: not just code, but the entire development process including tooling
The most common mistake with AI‑First: reducing it to “we use an AI tool when coding”. Gandalf turns it into an operating system upgrade:
• Specifications become more structured (so that AI works in a targeted way).
• Reviews become more important (because AI produces a lot, but doesn’t “know” what is right for you).
• Tests have to come earlier (so that faster output doesn’t become faster chaos).
• CI/CD and quality gates have to become stricter (so that speed doesn’t tip into instability).
• Operations/observability has to grow with it (otherwise you notice problems too late).
A helpful formulation for Mima in the conversation with Anaya:
“I don’t just want faster code. I want a system that gets us to releases faster – including tests, reviews, deployment, and operations.”
Mnemonic: AI‑First is not a tool. AI‑First is a toolchain + a process.
⸻
5) Delimitation: prototype ≠ product (maturity & expectation management)
AI can build something in hours that used to take days. This creates the dangerous reflex: “Then it must also go into production quickly.” Gandalf draws a hard line:
• Prototype: shows direction, is exploratory, may be shaky, optimized for learning.
• Product: must be maintainable, testable, secure, integrable, supportable.
If Mima expects “product” but accepts “prototype”, frustration arises on both sides: Anaya feels unfairly judged, Mima feels strung along.
Concrete artifact: a “production readiness checklist” (max. 10 points), e.g.:
• Acceptance criteria fulfilled
• Tests in place (unit + critical integration)
• Logging/monitoring in place
• Rollback plan/feature flag
• Documentation/runbook
• Security basics checked
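The checklist above can double as an automated release gate. A minimal sketch, assuming the items are tracked as plain strings (in practice they might be ticket labels or CI check names):

```python
# Hypothetical release gate: every checklist item must be ticked
# before something counts as a product release, not just a demo.
CHECKLIST = (
    "acceptance criteria fulfilled",
    "tests in place (unit + critical integration)",
    "logging/monitoring in place",
    "rollback plan/feature flag",
    "documentation/runbook",
    "security basics checked",
)

def ready_for_release(ticked: set[str]) -> tuple[bool, list[str]]:
    """Return (ready, missing items) - a demo is not a release."""
    missing = [item for item in CHECKLIST if item not in ticked]
    return (not missing, missing)

ready, missing = ready_for_release({"acceptance criteria fulfilled",
                                    "documentation/runbook"})
# Two of six items ticked: not ready, four items still open.
```

The value is less in the automation than in the conversation it forces: “missing” is a concrete list, not a vague feeling of “not quite done”.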
Mnemonic: A demo is not a release.
⸻
6) Time trap: the 90‑90 rule also applies to AI‑First
Gandalf brings up the 90‑90 rule because it hurts so well:
The first 90% take 90% of the time – the last 10% take the other 90%.
These last 10% are rarely “code”. They are integration into real systems, edge cases, stability, deployments, migrations, UX polish, and operations (alerts, dashboards, on-call reality). AI helps – but it does not eliminate this work. Those who ignore it only produce the first 90% faster.
Practical mechanism: plan “hardening” explicitly as stabilization loops, not as leftover disposal at the end.
Mnemonic: The last mile is a project of its own.
⸻
7) Team capability as bottleneck: juniors benefit less from AI than seniors
AI does not automatically make teams equally strong. It amplifies judgment. Seniors use AI to think faster, check variants, see risks. Juniors get text faster – but often without the ability to reliably evaluate it.
Gandalf formulates this as a design constraint: If you are serious about AI‑First, you must consider leadership, reviews, and enablement as part of delivery.
What this means in practice:
• Pairing (junior + senior) on critical tasks
• Review gates (as a safety net, not as harassment)
• Playbooks: “How do we write specs?”, “How do we test?”, “How do we use AI?”
• Learning paths and “golden examples” in the repo
Mnemonic: AI amplifies seniority – and makes leadership more important, not less.
⸻
Part 3: Implementation and collaboration with Anaya
Executive Summary (Part 3)
Part 3 translates Gandalf’s classification into a workable approach with Anaya: new delivery rules of the game, a phase-based transition plan, a lean measurement and reporting system, a binding operating model for KKO, and a trust architecture based on transparency, clarity of expectations, and escalation paths.
⸻
1) Agreement with Anaya – new rules of the game for delivery
The most important step is not a new tool, but a new agreement. Mima needs a language that creates commitment without creating false certainty.
A sentence that surprisingly eases a lot of tension:
“We don’t commit to days, but to a result with a clear definition of done and quality gates.”
What should be in this agreement?
• Delivery object per iteration (outcome, not activity)
• Mandatory artifacts (specs, tests, release notes, runbooks)
• Quality gates (CI, reviews, minimum test requirements)
• Handling uncertainty (flag risks early, name mitigations)
This is not a legal document. It is a shared operating system for collaboration.
⸻
2) The transition plan – from today to “AI‑First in production”
So that “transformation” doesn’t sound like “forever”, you need phases. Not as rigid Gantt thinking, but as orientation. A pragmatic four-phase plan:
Phase 1: Pilot
• Goal: A defined area is implemented as AI‑First.
• Deliverable: First release + documented learnings.
• Exit: Draft playbook + first quality gates running.
Phase 2: Standards
• Goal: Stabilize templates, review routines, test strategy.
• Deliverable: Spec template, PR checklist, CI gates.
• Exit: Repeatable delivery in the pilot area.
Phase 3: Rollout
• Goal: Extend standards to further projects/modules.
• Deliverable: Migration plan, onboarding, trainings.
• Exit: Multiple streams deliver according to the same rules.
Phase 4: Stabilization
• Goal: Operations, observability, tech debt management.
• Deliverable: SLOs/SLIs, alerts, runbooks, hardening cycles.
• Exit: AI‑First is “normal operations”, not a special project.
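The exit criteria are what make the phases more than labels. A minimal sketch of how a phase with explicit exit criteria could be represented; the class and the example criteria are illustrative, not part of the plan itself:

```python
from dataclasses import dataclass

@dataclass
class Phase:
    """One phase of the transition plan (names and criteria are illustrative)."""
    name: str
    goal: str
    exit_criteria: list[str]

    def can_exit(self, done: set[str]) -> bool:
        # A phase is left only when every exit criterion is met -
        # no sliding into the next phase on gut feeling.
        return all(criterion in done for criterion in self.exit_criteria)

pilot = Phase(
    name="Pilot",
    goal="A defined area is implemented as AI-First",
    exit_criteria=["draft playbook exists", "first quality gates running"],
)
```

The design choice worth noting: exit is a boolean over named artifacts, so “are we done with the pilot?” is answered by checking a list, not by negotiating.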
Important: In phase 1 Mima needs a visible result, otherwise trust erodes. The trick is to choose the pilot so that it is valuable enough without risking the entire system.
⸻
3) Measurement system instead of gut feeling – KPIs, artifacts, cadence
If “days” are removed, you need orientation. Gandalf would say: “Measurement system instead of gut feeling.” The art is to choose a few metrics that improve behavior without gamifying it. A minimal set:
• Lead time (from “ready” to “released”)
• Release frequency (how often does something actually go live?)
• Change failure rate (how often do releases cause problems?)
• Mean time to recovery (how quickly are you stable again?)
• Review rate (how much is actually peer-reviewed?)
• Test signal (critical paths are tested)
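Two of these metrics can be computed directly from release records. A minimal sketch, assuming each release carries a “ready” timestamp, a “released” timestamp, and an incident flag (the record shape is an assumption for illustration):

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median

@dataclass
class Release:
    ready_at: datetime       # work item marked "ready"
    released_at: datetime    # actually live
    caused_incident: bool = False

def lead_time_days(releases: list[Release]) -> float:
    """Median days from 'ready' to 'released'."""
    return median((r.released_at - r.ready_at).days for r in releases)

def change_failure_rate(releases: list[Release]) -> float:
    """Share of releases that caused problems in production."""
    return sum(r.caused_incident for r in releases) / len(releases)

# Hypothetical history: lead times of 3, 7, and 3 days, one incident.
history = [
    Release(datetime(2024, 5, 1), datetime(2024, 5, 4)),
    Release(datetime(2024, 5, 6), datetime(2024, 5, 13), caused_incident=True),
    Release(datetime(2024, 5, 14), datetime(2024, 5, 17)),
]
```

Median rather than mean is a deliberate choice here: one outlier release should not dominate the trend the weekly delivery review looks at.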
Plus a cadence:
• Weekly delivery review: What is live? What is blocked? What new risks exist?
• Monthly process review: Which rule helps, which annoys, what is missing?
This makes commitment visible – not through promises, but through repeatable delivery.
⸻
4) Operating model KKO – governance, roles, reviews
Without an operating model, AI‑First quickly becomes “everyone does it differently”. For KKO, which serves several projects in parallel, that is poison.
A lean operating model answers five questions:
1. Who prioritizes? (Mima)
2. Who sets technical guardrails? (Gandalf, with delegation rules)
3. Who delivers? (Anaya’s team)
4. Who approves releases? (clearly defined)
5. How is quality enforced? (gates, not hope)
In practice this means: PR checklists, mandatory reviews for critical areas, CI gates, definition of done. Not as bureaucracy, but as rails so that speed doesn’t derail.
⸻
5) Trust architecture – expectations, transparency, commitment
In the end, much of this is a trust problem – or more precisely: a problem of missing mechanisms that create trust.
A “trust architecture” consists of three elements:
Clarity of expectations
What is “good enough”? What is a product release, what is a demo?
Transparency via artifacts
Not “we’re at 80%”, but: spec in place, tests running, release candidate deployed, monitoring active.
Escalation and decision paths
If something is stuck: Who decides? By when? What happens without a decision?
A simple tool is a traffic light status per stream:
• Green: delivery running, risks small
• Yellow: risks real, mitigation planned
• Red: blocker, decision/escalation needed
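To keep the traffic light objective rather than a mood indicator, the color can be derived from observable facts. A minimal sketch, assuming the two inputs “is the stream blocked?” and “how many open risks?” – the derivation rule is an assumption, not part of the original scheme:

```python
from enum import Enum

class StreamStatus(Enum):
    GREEN = "delivery running, risks small"
    YELLOW = "risks real, mitigation planned"
    RED = "blocker, decision/escalation needed"

def stream_status(blocked: bool, open_risks: int) -> StreamStatus:
    """Derive the traffic light from facts, not from gut feeling."""
    if blocked:
        return StreamStatus.RED
    if open_risks > 0:
        return StreamStatus.YELLOW
    return StreamStatus.GREEN
```

The important property: two people looking at the same stream arrive at the same color, which is exactly what makes the conversations more objective.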
This makes problems visible earlier – and conversations become more objective.
⸻
Bringing it together and ultra-compact concentrate
Detailed summary
The conflict between Mima (KKO) and Anaya is at first glance a time conflict – but at second glance a conflict about the delivery object. Gandalf helps by shifting the conversation away from “How many days?” toward “What are we actually delivering – process or result?” In an AI‑First transition, “days” is a poor control variable because you are not only building features, but capabilities: process, tooling, standards, reviews, tests, operations. This transformation has a learning curve: first everything seems faster (prototypes), then reality hits (integration, quality, operations), and only with stable standards does speed become sustainable.
This is precisely why the separation of prototype and product is central. AI can quickly generate working code – but “working” is not automatically maintainable, secure, or supportable. The 90‑90 rule is a reminder that the last 10% (integration, edge cases, stability, deployments, observability) dominate the time – and that AI does not eliminate this work. In addition, there is a team bottleneck: AI amplifies seniority. Without senior guidance, review gates, and enablement, the team does not become uniformly faster; it becomes unevenly faster.
The practical answer is a new cooperation model with Anaya: new delivery rules of the game (outcome/quality/maturity instead of days), a phase-based transition plan up to “AI‑First in production”, a lean measurement and reporting system, an operating model for KKO (roles, governance, quality gates), and a trust architecture that organizes transparency and escalation. In this way, commitment arises not through estimates, but through visible artifacts, clear standards, and a shared definition of “done”.
Ultra-compact concentrate (1 paragraph)
Not “How many days?”, but “What delivery system are we delivering?”: AI‑First is a transformation of process, tooling, and quality, in which prototype speed is not product maturity and the last 10% (integration/tests/operations) dominate the time; it only becomes controllable through new rules of the game with Anaya, a phase plan, a few hard metrics, an operating model with review/quality gates, and a trust architecture based on transparency, clarity of expectations, and clear escalation paths.