This package is part of a coordinated family, not a standalone artifact. Start with the relationship map, use the shared core for common primitives, then read each applied package at the right altitude.
This package shows how the shared discipline becomes a governed assistant architecture: source authority, evidence states, non-inference, control libraries, eval fixtures, human review, repo bootstrap, and controlled build-preparation handoff.
Start with the decision you need to make.
This package is deliberately deep. The improved baseline gives leaders, builders, governance owners, and source reviewers a faster way into the material without weakening the underlying evidence and handoff discipline.
Enterprise Architecture Review Assistant
Comprehensive Alpha Intake, Engineering Orientation, Platform Alignment, Resourcing, Value Blueprint, Agent-Ready Repo Bootstrap, Copilot Alignment, Manual Harness, Navigation-Restored HTML, and Controlled Build-Preparation. Public-site edition v1.15 with historical handoff notes.
Purpose
This package explains what we are trying to build, why it matters, how it should be governed, what needs to be defined before engineering starts, how the emerging enterprise AI platform patterns should influence the delivery path, and what roles, effort, and run economics must be understood before this moves from idea to build.
It is written for four audiences at once:
- David and the EA/governance team, who own the review judgment, standards, criteria, and final decision model.
- Tony and product/architecture collaborators, who need to convert the idea into a buildable product shape.
- Engineering or platform teams, including Rob C's team or equivalent implementation resources, who may help wire the capability into approved platforms and systems.
- Future submitters, because this package should become the alpha example of what good AI or agentic intake should look like before anyone starts building.
The short version: this is not a proposal to replace architects with an agent, and it is not a proposal to build a clever demo that cannot be sustained. It is a proposal to build an internally owned architecture intelligence layer that helps architects do first-pass review faster, more consistently, and with stronger evidence. The agent is only the interface and assistance layer. The hard part is the review brain.
Executive thesis
The central recommendation is direct: own the brain, wire the interface.
The enterprise should internally own the product intent, review criteria, standards interpretation, source authority, data classification, reference patterns, semantic/control model, exception logic, evidence model, prioritization rules, and decision memory. A supplier or technical team can help with implementation wiring, workflow, integrations, Copilot/Copilot Studio mechanics, Azure/AWS orchestration, telemetry, and deployment plumbing. But the architecture intelligence layer should not be outsourced.
That distinction matters because the value of this product is not a chatbot, portal, or model call. The value is a governed, versioned, reusable architecture review capability that can explain why a submission passes, fails, needs more evidence, requires escalation, or should be routed to a different platform or governance path.
A vendor can build a portal. A vendor can configure an agent. A vendor can wire APIs. A vendor cannot responsibly define how this enterprise judges architecture fitness, risk, source authority, platform alignment, GxP posture, or exception acceptability. If we hand that layer away, we will rent back our own architecture judgment through maintenance fees and change orders, which is apparently how civilization chose to monetize avoidable dependency.
1. The problem we are solving
Architecture review demand is increasing, especially for AI, GenAI, cloud, SaaS, integration-heavy, and emerging technology submissions. The architecture team is expected to process more reviews without a proportional increase in headcount. The current operating model depends heavily on manual review, scattered standards, inconsistent artifacts, and the institutional memory of individual architects.
The business problem is not simply that reviews take too long. The deeper problem is that the enterprise lacks a scalable, consistent, explainable, and reusable way to evaluate architecture submissions against current standards, approved patterns, data requirements, security expectations, platform constraints, existing capabilities, and prior decisions.
Today, a high-quality review often requires an architect to know where standards live, which version applies, what technologies are approved or restricted, whether an existing capability already solves the problem, how similar decisions were handled before, which governance route applies, and whether an exception is justified. Much of that knowledge is distributed across people, repositories, portals, decks, tools, and memory. That does not scale.
Current pain points
| Pain point | Why it matters |
|---|---|
| Fragmented intake | Reviews arrive through multiple channels, with inconsistent artifacts and context. |
| No common checklist | Review quality depends on reviewer style and available memory. |
| Scattered standards | Reviewers may rely on different or outdated source material. |
| Inconsistent outputs | Findings, risks, and decisions are not captured uniformly. |
| Weak decision memory | Prior approvals, exceptions, and rationale are hard to reuse. |
| Duplicate solutions | Existing capabilities are not always surfaced before new builds proceed. |
| Fast-path ambiguity | Some submissions may self-certify as low risk without enough architectural evidence. |
| Governance pressure | Exceptions may be approved due to delivery timelines rather than clear risk acceptance. |
What a bad review can cause
A weak or inconsistent review can approve non-standard technology, miss a duplicative capability, allow an unsupported or non-scalable pattern, overlook security or data issues, accept an exception without clear rationale, or create a decision that cannot be defended later. In a regulated healthcare, pharma, and medical device environment, those failures are not academic. Some may become operational, compliance, audit, quality, or patient-impacting problems.
This is why the assistant must improve rigor, not merely speed.
2. The vision
The vision is an Enterprise Architecture Review Assistant that helps architects perform first-pass evaluation of architecture submissions using a governed, evidence-backed, internally owned review model.
The assistant should be able to:
- Access or receive submitted architecture artifacts.
- Extract solution intent, business context, technologies, integrations, data flows, security model, operating model, and missing evidence.
- Classify the submission by domain, risk, data sensitivity, GxP potential, business impact, platform path, and governance route.
- Determine whether AI is actually appropriate, or whether the use case is better served by deterministic workflow, rules, dashboarding, search, or a data product.
- Evaluate the submission against approved standards, reference patterns, tool catalogs, control libraries, and architecture principles.
- Generate standardized review outputs: summary, scorecard, findings, evidence, missing information, risks, recommendations, and decision posture.
- Keep final approval with human architects.
- Capture final decision rationale and exceptions as reusable institutional memory.
The long-term capability should make reviews faster, more consistent, more explainable, and more reusable. It should also become a pattern for how future AI/agentic use cases enter architecture review.
The product philosophy
| Principle | Meaning |
|---|---|
| Assist, do not replace | The assistant performs first-pass work. Architects make final decisions. |
| Evidence before opinion | Findings must cite submitted evidence or mark what is missing. |
| Deterministic where possible | Source authority, routing, classification, and control logic should be governed rules where possible. |
| AI where ambiguity exists | Use AI for messy document interpretation, summarization, extraction, and recommendation drafting. |
| Own the intelligence layer | Criteria, controls, standards, patterns, and decision memory must remain internal assets. |
| Platform-fit before build | The delivery path must align to enterprise-approved platform, governance, and integration patterns. |
3. What we are building
We are building a decision-support capability for enterprise architecture review. It has three primary layers.
3.1 Intake and governance layer
This layer clarifies the request before review begins. It should determine what the submission is, who owns it, what business outcome it supports, whether it is an AI use case, what governance path applies, what platform route is likely, and what evidence is required before a meaningful review can occur.
This is where the future assistant should ask the questions we are asking ourselves now: What problem are we solving? Why does this need AI? What data is required? What sources are authoritative? What evidence proves the architecture is ready? What review path applies?
3.2 Architecture intelligence layer
This is the internal asset. It contains source authority, standards, review criteria, control definitions, reference patterns, anti-patterns, approved technologies, data classification, evidence requirements, exception logic, prior decisions, and review-output schemas.
This layer should be versioned, governed, auditable, and reusable. It should not live as undocumented prompt text. It should be represented through structured artifacts such as JSON, YAML, Markdown, schemas, control libraries, and reviewer guidance.
3.3 Experience and implementation layer
This is the interface and runtime layer. It may involve Copilot, Copilot Studio, Teams, SharePoint, the AI Governance Portal, Azure services, AWS services, APIs, MCP, A2A, REST, event patterns, observability, and workflow automation.
This layer can change over time. The architecture intelligence layer should survive changes in UI, model provider, cloud runtime, licensing, and token economics.
4. What we are not building yet
The first version should not become a full enterprise governance platform. It should not automate final approval. It should not replace the Architecture Review Board. It should not become the single system for legal, privacy, security, quality, procurement, and architecture workflows. It should not create a duplicate intake portal if an existing AI Governance Portal or submission system already serves as the system of record.
The first version should prove the disciplined review pattern:
- Can we define the minimum review criteria?
- Can we classify submissions correctly?
- Can we identify missing evidence?
- Can we evaluate against trusted sources and reference patterns?
- Can we produce useful findings that architects trust?
- Can we capture final decisions for reuse?
If the answer to any of these questions is no, adding a nicer interface just gives us a more attractive way to be wrong.
5. Why this should be the golden alpha intake model
This use case should model the behavior we eventually expect from other AI and agentic submissions. It is both a product concept and a process example.
Before future teams request or build AI-enabled solutions, they should be able to explain:
- What problem they are solving.
- Who has the problem.
- What happens if nothing changes.
- Why AI is appropriate.
- What deterministic alternatives were considered.
- What data is required.
- Who owns the data.
- What is canonical versus reference-only.
- What governance review is required.
- What platform path is preferred.
- What security, privacy, GxP, quality, audit, or business-risk obligations apply.
- What output must be produced.
- Who owns sustainment.
- What it costs to build, run, improve, and govern.
- Whether the value justifies the engineering and operating cost.
This package should become the first working example because it exposes the truth that many AI projects try to avoid: the hard part is usually not model access. The hard part is clarifying intent, governing data, defining criteria, mapping source authority, managing evidence, and deciding who owns the decision.
6. Methodology: from idea to build-ready
The methodology should balance upfront discipline with practical delivery. We need enough definition to avoid building the wrong thing, but not so much documentation that the project suffocates in its own ceremony.
Stage 1: Intent clarification
Define the problem, target users, first review type, value hypothesis, non-goals, and decision boundaries.
Required outputs:
- Problem statement
- Vision statement
- First use case
- Target users
- Value hypothesis
- Non-goals
- Success and failure criteria
Stage 2: Governance routing
Classify the use case across business function, enterprise platform governance, security, privacy, quality, GxP potential, and AI governance.
Required outputs:
- Governance route
- Required approvers
- Fast-path versus standard-review eligibility
- Security and privacy triggers
- GxP and quality triggers
- Auditability requirements
Stage 3: AI appropriateness gate
Determine whether the problem actually needs generative AI or agentic AI.
Required outputs:
- AI appropriateness statement
- Deterministic alternative assessment
- Human-in-the-loop requirement
- Decision impact classification
- Required explainability level
Stage 4: Data and source authority model
Identify all relevant sources, classify them, determine authority, and define what can be used as evidence.
Required outputs:
- Source inventory
- Source authority map
- Data classification model
- Evidence model
- Access model
- Retention model
- Refresh model
Stage 5: Review criteria and control design
Turn architecture principles into reviewable controls. This is where vague principles become operational checks.
Required outputs:
- Review rubric
- Control library
- Reference pattern library
- Anti-pattern library
- Technology catalog
- Exception rules
- Severity model
Stage 6: Platform-fit routing
Map the use case to the right implementation path.
Required outputs:
- Front door recommendation
- Runtime recommendation
- Integration pattern
- Model/tool access path
- Plan B for licensing or tokenomics changes
- Supplier role boundary
Stage 7: Resourcing and value gate
Before build, define the minimum team, delivery path, expected effort, technology cost exposure, run economics, and value hypothesis. This is not procurement theatre. It is a reality check. A use case that cannot explain who will build it, who will run it, what it costs, and why it is worth doing should not be approved just because someone used the word agent.
Required outputs:
- Delivery role map
- Internal versus augmented skills plan
- Rough order-of-magnitude effort estimate
- Build cost estimate
- Run cost and tokenomics model
- Sustainment owner
- Value hypothesis and stop criteria
Stage 8: Pilot gate
Before build, define the pilot set and evaluation method.
Required outputs:
- Pilot submissions
- Gold-standard human review baseline
- Accuracy and usefulness metrics
- Override tracking
- Cycle-time baseline
- Pilot success criteria
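
The override-tracking and accuracy outputs above reduce to simple ratios once the pilot data exists. The sketch below is one possible way to compute them, not a defined metric set: the action labels, the function names `override_rate` and `agreement_rate`, and the idea of comparing per-control states against the human baseline are all assumptions made for illustration.

```python
from collections import Counter


def override_rate(actions: list[str]) -> float:
    """Share of assistant findings that the architect overrode.

    `actions` holds one reviewer action per finding. The labels
    ("accept", "reject", "edit", "override") are assumed here.
    """
    if not actions:
        return 0.0
    return Counter(actions)["override"] / len(actions)


def agreement_rate(assistant: dict[str, str], baseline: dict[str, str]) -> float:
    """Share of controls where the assistant matches the human baseline.

    Both arguments map a control ID to a resolved state. Only controls
    present in both reviews are compared.
    """
    shared = assistant.keys() & baseline.keys()
    if not shared:
        return 0.0
    matches = sum(1 for control_id in shared if assistant[control_id] == baseline[control_id])
    return matches / len(shared)


# Hypothetical pilot data for illustration only.
actions = ["accept", "override", "edit", "accept"]
assistant_states = {"C1": "Pass", "C2": "Gap"}
baseline_states = {"C1": "Pass", "C2": "Not evidenced"}

print(override_rate(actions))
print(agreement_rate(assistant_states, baseline_states))
```

A rising override rate across pilot submissions would be one signal that the control library or the source package needs attention before the pilot gate is passed.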
7. How an engineer should read this package
This section is for implementation teams who have not lived through the discovery conversations.
An engineer should not read this package as a final architecture design. It is a build-orientation package. It explains the product intent, source materials, configuration assets, likely platform paths, and open decisions that must be resolved before detailed design.
Recommended reading order
- Read the problem statement and vision.
- Read the alpha intake methodology.
- Read the data architecture and source authority sections.
- Review the platform-fit routing model.
- Review the configuration and data asset map.
- Review the engineering build flow.
- Treat the source library as provenance and evidence, not as official policy unless validated.
- Treat JSON/YAML/schema/control files as seed configuration, not final enterprise standards.
Engineering north star
Build an assistive EA review system that ingests architecture submissions, extracts structured facts, evaluates them against internally owned controls and source-authority rules, produces evidence-backed findings, and routes final decisions to human architects.
What engineers should not do
- Do not hard-code review criteria into prompts.
- Do not make the model the final approver.
- Do not build a duplicate intake portal unless integration is blocked.
- Do not treat company-agnostic templates as official policy without validation.
- Do not make source authority subjective or prompt-driven.
- Do not bury critical review logic in a black-box model response.
8. Engineering build orientation
The engineering model should separate configuration, deterministic logic, AI assistance, workflow, and human review.
8.1 Conceptual component model
| Component | Role | Build posture |
|---|---|---|
| Intake adapter | Reads or receives submissions from the approved intake source. | Integrate with system of record where possible. |
| Artifact processor | Extracts text, tables, diagrams, metadata, and file inventory. | Use approved document processing path. |
| Submission normalizer | Converts artifacts into a standard submission schema. | Deterministic schema-first implementation. |
| Classification engine | Applies data, risk, GxP, platform, and governance classification. | Rule-driven with AI suggestions only where needed. |
| Source authority resolver | Determines which standards and sources can support findings. | Deterministic and governed. |
| Control evaluation engine | Evaluates submission facts against controls and criteria. | Hybrid deterministic + AI-assisted evidence matching. |
| AI assistance layer | Summarizes, extracts, maps evidence, drafts findings. | Bounded by instructions, schemas, and citations. |
| Human review workbench | Lets architects accept, reject, edit, override, and finalize findings. | Required before decision closure. |
| Decision memory store | Captures final decisions, rationale, exceptions, and links. | Reusable institutional memory. |
| Observability layer | Tracks usage, accuracy, overrides, failure modes, and cost. | Required for trust and degradation monitoring. |
8.2 Minimum viable build flow
- Select a pilot review type.
- Load a controlled source package.
- Receive or access submitted architecture artifacts.
- Inventory the package and detect missing expected artifacts.
- Extract structured submission facts.
- Apply classification and source authority rules.
- Evaluate against the initial control library.
- Generate evidence-backed findings.
- Route findings to an architect for review.
- Capture accept/reject/edit/override actions.
- Store the final decision record.
- Feed approved decisions and exceptions into the decision memory layer.
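
The inventory step in the flow above can stay fully deterministic. The following is a minimal sketch, assuming a fixed list of expected artifact types for one pilot review type; the artifact names, `EXPECTED_ARTIFACTS`, and `inventory_package` are hypothetical and would come from the governed submission schema in practice.

```python
from dataclasses import dataclass, field

# Hypothetical expected-artifact types for a single pilot review type.
# In practice this list would be read from the submission schema.
EXPECTED_ARTIFACTS = {
    "architecture_diagram",
    "data_flow_description",
    "security_model",
}


@dataclass
class PackageInventory:
    present: set[str] = field(default_factory=set)
    missing: set[str] = field(default_factory=set)
    unexpected: set[str] = field(default_factory=set)


def inventory_package(submitted: dict[str, str]) -> PackageInventory:
    """Compare submitted artifact types against the expected set.

    `submitted` maps an artifact type to the file that provides it.
    """
    submitted_types = set(submitted)
    return PackageInventory(
        present=submitted_types & EXPECTED_ARTIFACTS,
        missing=EXPECTED_ARTIFACTS - submitted_types,
        unexpected=submitted_types - EXPECTED_ARTIFACTS,
    )


inventory = inventory_package(
    {"architecture_diagram": "solution.pdf", "vendor_doc": "vendor.pdf"}
)
print(sorted(inventory.missing))
```

Missing artifacts found here would feed the "Not evidenced" state later in the review rather than being filled in by the model.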
8.3 Deterministic versus AI-assisted responsibilities
| Function | Deterministic | AI-assisted | Notes |
|---|---|---|---|
| Source authority ranking | Yes | No | Must not depend on model opinion. |
| Data classification rules | Yes | Suggestion only | Model can flag suspected data types, not decide policy. |
| Required submission completeness | Yes | Evidence detection | Required artifacts should be schema-driven. |
| Technology catalog lookup | Yes | Alias extraction | Model can help map names to known tools. |
| Evidence extraction from messy documents | No | Yes | Good fit for AI if output cites source evidence. |
| Finding narrative | No | Yes | Narrative must be constrained by evidence. |
| Final approval | Human | No | Never autonomous in this use case. |
| Decision record storage | Yes | No | Structured and auditable. |
| Pattern matching | Hybrid | Yes | Rules define applicability; AI helps with interpretation. |
9. Configuration and data asset map
The “brain” of the system should be a versioned configuration and data layer, not a pile of prompt text.
| Asset | Purpose | Current seed source | Owner to confirm |
|---|---|---|---|
| Source authority map | Defines canonical, reference, derived, and non-authoritative sources. | v1.2 hardening package | EA governance / platform owners |
| Normalized tool catalog | Defines approved, restricted, declining, emerging, exception-required technologies. | v1.2 hardening package | Technology owners |
| Architecture submission schema | Defines what a valid submission package must contain. | v1.2 hardening package | EA governance |
| Review output schema | Defines standard finding, evidence, score, and recommendation output. | v1.2 hardening package | EA governance |
| Machine-executable controls | Defines review checks that can be evaluated. | v1.2 hardening package | EA governance + domain SMEs |
| Control-to-pattern mapping | Links controls to approved reference patterns. | v1.2 hardening package | EA governance |
| Capability ontology | Defines domains, capabilities, and review taxonomy. | v1.1 package | EA governance / enterprise architecture |
| Reference pattern library | Defines approved patterns and where they apply. | v1.1 package | Architecture owners |
| Agent instructions | Defines agent behavior, boundaries, evidence rules, and output contract. | v1.2 hardening package | Product + EA |
| Human reviewer guide | Defines how architects validate, override, and finalize output. | v1.2 hardening package | EA governance |
| Enterprise AI knowledge base | Captures broader platform, governance, stack, and pattern context. | Enterprise stack files | Platform owner validation required |
Configuration principle
Anything that changes because a standard changes should not require application code changes. Standards, controls, source status, platform routing, severity, and evidence requirements should be governed configuration wherever feasible.
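A minimal sketch of the principle, assuming controls are stored as JSON: the severity of a control changes by editing configuration while the evaluation code stays untouched. The control ID `SEC-001`, the field names, and the helper functions are hypothetical and do not reflect the actual seed files.

```python
import json

# Hypothetical versioned control configuration. In practice this would be
# a governed file under version control, not an inline string.
CONFIG_V1 = """
{
  "version": "1.0",
  "controls": [
    {"id": "SEC-001", "severity": "medium", "required_evidence": ["threat_model"]}
  ]
}
"""


def load_controls(config_text: str) -> dict[str, dict]:
    """Parse the configuration and index controls by ID."""
    config = json.loads(config_text)
    return {control["id"]: control for control in config["controls"]}


def missing_evidence(control: dict, evidence_present: set[str]) -> list[str]:
    """List required evidence items that the submission does not provide."""
    return [item for item in control["required_evidence"] if item not in evidence_present]


controls = load_controls(CONFIG_V1)
print(controls["SEC-001"]["severity"])

# A standards change arrives as a new configuration version.
CONFIG_V2 = CONFIG_V1.replace('"1.0"', '"1.1"').replace('"medium"', '"high"')
print(load_controls(CONFIG_V2)["SEC-001"]["severity"])
```

Because the code only reads fields, a severity or evidence change is a reviewed configuration change rather than a deployment.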
10. Enterprise platform alignment
The newly supplied enterprise stack inputs materially improve the package. They show a platform posture that appears to favor Microsoft for front-door/orchestration experiences, AWS as an approved runtime and infrastructure pattern, and open interoperability through MCP, A2A, REST, and event streaming.
These inputs are valuable, but several are company-agnostic templates or model-distilled artifacts. They should be treated as platform intelligence inputs until confirmed by platform owners.
10.1 Microsoft front door
The user experience should likely begin where users already work: Copilot, Teams, SharePoint, or Copilot Studio. For early piloting, Copilot/Agent Builder may help test prompts and outputs. For a governed team assistant, Copilot Studio is more realistic.
10.2 Azure service path
Azure may be useful for AI Foundry, Azure OpenAI, AI Search, Semantic Kernel, API Management, Key Vault, Purview, Logic Apps, Durable Functions, Power Automate, Graph, Log Analytics, and integration into the Microsoft collaboration ecosystem.
10.3 AWS runtime path
If enterprise platform leadership prefers AWS for deeper runtime, orchestration, or Bedrock-native patterns, the product should remain compatible with AWS services such as Bedrock, AgentCore, Step Functions, EventBridge, Lambda, S3, KMS, CloudWatch, and related runtime services.
10.4 Interoperability patterns
MCP, A2A, REST, and event streaming should be treated as candidate integration patterns. They matter because the assistant may eventually need to interact with source systems, registries, enterprise agents, review workflows, and decision stores.
10.5 Plan B
The architecture should avoid tying the intelligence layer to one vendor, one front door, one model, or one licensing model. Copilot token economics may be favorable today. That is not a permanent architecture guarantee. The control library, source authority map, schemas, and decision memory should be portable.
10.6 Copilot Premium and Studio governance alignment
The internal AI Platform team context materially validates the staged path already used by this package. Copilot is the broad enterprise front door, but the governance model becomes stricter as agent audience, data access, integration depth, and operational dependency increase.
The relevant platform signal is not merely that Copilot exists. The relevant signal is that the enterprise already distinguishes lightweight individual experimentation from team, function/sector, and enterprise agent patterns. That means this Enterprise Architecture Review Assistant should not leap from a strong package to broad deployment. It should first prove the evidence loop under the correct tier, with data handling and publishing boundaries explicit.
Copilot maturity roadmap signal
| Stage | Timing | Scope | Platform implication |
|---|---|---|---|
| Web GenAI | 2025 | 227K users, individual | Broad general-purpose Copilot Chat foundation. |
| Work GenAI | Q1 2026 | 130K users, individual | Microsoft 365 Copilot Premium expands work-grounded use. |
| Agents You USE | April 2026 | Copilot Premium users | Pre-built internal agents become available for individual use. |
| Agents You BUILD | July 2026 | Individual, up to 5 users | Agent Builder and SharePoint Agents enable limited creator-led experimentation. |
| Agents You REQUEST | Q3 2026 | Deployable to Copilot Premium users | Copilot Studio, Azure AI Foundry, and M365 Agent Toolkit support higher-governance requested agents. |
Governance tier interpretation
| Tier | Typical tool path | User scope | Data posture | Governance posture | SDLC posture | Publishing boundary |
|---|---|---|---|---|---|---|
| Individual use | Agent Builder, no code | Up to 5 | Data approved for Copilot Premium | No approval required | Not required | Agent creator |
| Team | Agent Builder, low code | Up to 100 | Data approved for Copilot Premium | Approval required | SDLC Lite | Copilot Service team |
| Function / Sector | Agent Builder / Copilot Studio, pro code | Any number | Data approved for the use case | Approval required | SDLC Full | Copilot Service team |
| Enterprise | Copilot Studio / Azure AI Foundry, pro code | Any number | Data approved for the use case | Approval required | SDLC Full | Copilot Service team |
The governance break point is important. Below the Team/Function boundary, Copilot Premium-approved data may be sufficient, although Team use already requires approval and SDLC Lite. At Function/Sector or Enterprise scope, approval remains required, data must be approved for the specific use case, enterprise support becomes mandatory, and full SDLC applies.
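The tier table can be read as a lookup keyed on audience size. The sketch below encodes only the user-scope, approval, SDLC, and publishing columns. It assumes that user count plus an explicit enterprise flag is enough to select a tier, which is a simplification: the source also ties tiers to data access, integration depth, and operational dependency. The names `Tier` and `governance_tier` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    approval_required: bool
    sdlc: str
    publisher: str


# Values copied from the governance tier table.
INDIVIDUAL = Tier("Individual use", False, "Not required", "Agent creator")
TEAM = Tier("Team", True, "SDLC Lite", "Copilot Service team")
FUNCTION_SECTOR = Tier("Function / Sector", True, "SDLC Full", "Copilot Service team")
ENTERPRISE = Tier("Enterprise", True, "SDLC Full", "Copilot Service team")


def governance_tier(user_count: int, enterprise_wide: bool = False) -> Tier:
    """Pick a tier from audience size (simplified, see assumptions above)."""
    if user_count < 1:
        raise ValueError("user_count must be at least 1")
    if enterprise_wide:
        return ENTERPRISE
    if user_count <= 5:
        return INDIVIDUAL
    if user_count <= 100:
        return TEAM
    return FUNCTION_SECTOR


tier = governance_tier(40)
print(tier.name, tier.approval_required, tier.sdlc)
```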
Implication for this Enterprise Architecture Review Assistant
This package remains aligned with the platform model:
- A manual harness or individual experimentation may be used to test the framework against non-sensitive, approved data.
- A team review assistant or shared agent should be treated as a governed request, not casual experimentation.
- Function/Sector or Enterprise use requires full SDLC, approved data handling, enterprise support, Copilot Service team publishing, and validated telemetry.
- Copilot Studio or Azure AI Foundry becomes relevant only after the evidence loop, source authority, data boundaries, and human review workflow are proven.
- The control library, source authority map, schemas, and decision memory remain internal assets regardless of the front door.
The platform team context therefore increases confidence in the staged delivery model. It does not remove the pilot blockers. The blockers become more defensible, which is annoying only if someone hoped governance would politely disappear.
11. Platform-fit routing model
The implementation path should follow the use case profile.
| Use case profile | Likely path | Notes |
|---|---|---|
| Individual productivity, low sensitivity, no business-process impact | Copilot / Agent Builder | Useful for very small experiments. |
| Team assistant over controlled M365 knowledge | Copilot Studio | Good pilot candidate if governance allows. |
| Review assistant integrated with Teams, SharePoint, and workflow | Copilot Studio + Power Automate / Logic Apps | Practical near-term path. |
| Durable business-process assistant with multiple integrations | Enterprise agent framework / Azure / AWS hybrid | More production-suitable. |
| AWS-preferred runtime or Bedrock-native use case | AWS agent framework / Bedrock / Step Functions / Temporal-style orchestration | Viable if platform owner validates path. |
| Regulated, GxP, privacy-sensitive, or decision-impacting workflow | Formal governance first | Do not casually prototype with sensitive data. |
For this use case, the likely path is a staged approach: validate the rubric and output shape quickly, then move toward Copilot Studio or an approved enterprise runtime depending on integration, governance, and source-of-record requirements.
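The routing table can be expressed as ordered rules, which keeps the path decision deterministic and reviewable. This is a sketch under stated assumptions: the profile fields, the rule order (sensitivity first), and the reading of "multiple integrations" as more than one are choices made for illustration, not a validated routing policy.

```python
from dataclasses import dataclass


@dataclass
class UseCaseProfile:
    regulated_or_sensitive: bool = False  # GxP, privacy-sensitive, or decision-impacting
    aws_preferred: bool = False           # platform owner prefers an AWS runtime
    integration_count: int = 0            # number of system integrations
    needs_workflow: bool = False          # Teams, SharePoint, and workflow integration
    team_scope: bool = False              # shared team assistant over M365 knowledge


def route(profile: UseCaseProfile) -> str:
    """Return the likely path from the routing table. First match wins."""
    if profile.regulated_or_sensitive:
        return "Formal governance first"
    if profile.aws_preferred:
        return "AWS agent framework / Bedrock / Step Functions"
    if profile.integration_count > 1:
        return "Enterprise agent framework / Azure / AWS hybrid"
    if profile.needs_workflow:
        return "Copilot Studio + Power Automate / Logic Apps"
    if profile.team_scope:
        return "Copilot Studio"
    return "Copilot / Agent Builder"


print(route(UseCaseProfile(team_scope=True, needs_workflow=True)))
```

Putting the sensitivity rule first reflects the table's note that sensitive data should not be casually prototyped, whatever the other attributes suggest.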
12. Governance and sensitivity
David's concern about sensitivity is valid. People may interpret an EA review assistant as a threat to architect roles. That framing is wrong and risky.
This should be positioned as:
- Reducing repetitive first-pass review work.
- Improving consistency across reviewers.
- Capturing institutional knowledge.
- Making missing evidence visible earlier.
- Giving architects better evidence before decisions.
- Preserving human approval and judgment.
It should not be positioned as:
- Replacing architects.
- Automating ARB approval.
- Removing judgment.
- Treating architecture as a binary checklist.
- Using AI to rubber-stamp submissions.
The message is simple: architects remain accountable. The assistant improves the evidence, preparation, and consistency around the decision.
13. Data architecture and source authority
This is the heaviest lift and the least optional part.
The assistant can only be trusted if it knows which sources are authoritative, which sources are reference-only, which sources are stale, which sources are submission evidence, and which sources are not eligible to support findings.
13.1 Required source classes
| Source class | Examples | Required decision |
|---|---|---|
| Submission artifacts | Architecture diagrams, PPT, Word, PDFs, vendor docs, SAExpress exports | What evidence is submitted for review? |
| Standards | EA principles, AI principles, integration principles, data principles, security models | Which versions are authoritative? |
| Technology catalog | Approved, restricted, declining, emerging tools | Who owns tool status? |
| Reference patterns | Approved architecture patterns and reusable designs | Which patterns apply to which domains? |
| Existing capabilities | Platforms, services, reusable solutions | How do we detect duplication? |
| Prior decisions | Past approvals, exceptions, rejections, remediation | How do prior decisions inform new reviews? |
| Enterprise stack guidance | Copilot, Azure, AWS, MCP/A2A, governance paths | Which inputs are official versus draft/reference? |
13.2 Authority levels
| Authority level | Meaning |
|---|---|
| Canonical | Can be used as source of truth for findings and decisions. |
| Governed reference | Useful but not final authority. |
| Submission evidence | Evidence from the project team for a specific review. |
| Derived analysis | Agent-generated extraction or interpretation; must cite source evidence. |
| Historical context | Prior decisions or lessons learned, useful but version-sensitive. |
| Not authoritative | Demo artifacts, unofficial notes, outdated files, or unvalidated model output. |
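
Because source authority must not depend on model opinion, the levels above lend themselves to an enumeration with explicit eligibility rules. The sketch below is one possible encoding. The two rule functions are assumptions drawn from the table's wording (only canonical sources act as source of truth, derived analysis must cite evidence, non-authoritative material supports nothing) and would need confirmation by EA governance.

```python
from enum import Enum


class Authority(Enum):
    CANONICAL = "Canonical"
    GOVERNED_REFERENCE = "Governed reference"
    SUBMISSION_EVIDENCE = "Submission evidence"
    DERIVED_ANALYSIS = "Derived analysis"
    HISTORICAL_CONTEXT = "Historical context"
    NOT_AUTHORITATIVE = "Not authoritative"


def can_be_source_of_truth(level: Authority) -> bool:
    """Only canonical sources may settle a finding or decision."""
    return level is Authority.CANONICAL


def can_inform_finding(level: Authority, cites_source_evidence: bool = False) -> bool:
    """Whether material at this level may appear in support of a finding."""
    if level is Authority.NOT_AUTHORITATIVE:
        return False
    if level is Authority.DERIVED_ANALYSIS:
        # Agent-generated material counts only when it cites its evidence.
        return cites_source_evidence
    return True


print(can_be_source_of_truth(Authority.GOVERNED_REFERENCE))
print(can_inform_finding(Authority.DERIVED_ANALYSIS, cites_source_evidence=True))
```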
13.3 Evidence states
Every review control should resolve to one of these states:
- Pass
- Gap
- Not evidenced
- Exception required
- Not applicable
- Human review required
“Not evidenced” is critical. It prevents the assistant from inventing completeness when the submission simply lacks enough information.
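A minimal sketch of how a resolver might keep "Not evidenced" separate from "Gap": missing evidence is checked before the rule outcome is considered at all. The function `resolve_state` and its parameters are hypothetical. "Exception required" is defined in the enumeration but not produced here, because the exception logic is not specified in this section.

```python
from enum import Enum
from typing import Optional


class EvidenceState(Enum):
    PASS = "Pass"
    GAP = "Gap"
    NOT_EVIDENCED = "Not evidenced"
    EXCEPTION_REQUIRED = "Exception required"
    NOT_APPLICABLE = "Not applicable"
    HUMAN_REVIEW_REQUIRED = "Human review required"


def resolve_state(
    applicable: bool,
    required_evidence: set[str],
    found_evidence: set[str],
    meets_rule: Optional[bool],
) -> EvidenceState:
    """Resolve one control to a state.

    `meets_rule` is True or False when the rule outcome is clear, and
    None when it is ambiguous and needs an architect's judgment.
    """
    if not applicable:
        return EvidenceState.NOT_APPLICABLE
    if not required_evidence <= found_evidence:
        # Evidence is missing, so neither Pass nor Gap can be claimed.
        return EvidenceState.NOT_EVIDENCED
    if meets_rule is None:
        return EvidenceState.HUMAN_REVIEW_REQUIRED
    return EvidenceState.PASS if meets_rule else EvidenceState.GAP


state = resolve_state(True, {"threat_model"}, set(), meets_rule=True)
print(state.value)
```

Note that the example resolves to "Not evidenced" even though `meets_rule` is True: an asserted outcome without the required evidence is not accepted.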
14. Review criteria and control model
The review model should start narrow, but it must be structurally correct.
14.1 Candidate review dimensions
| Dimension | Purpose |
|---|---|
| Intake completeness | Confirm the submission has enough evidence to review. |
| Business and process context | Understand what problem the solution supports. |
| AI appropriateness | Decide whether AI is justified or deterministic alternatives are better. |
| Platform alignment | Evaluate fit with Copilot, Copilot Studio, Azure, AWS, or hybrid paths. |
| Technology adherence | Check approved, restricted, declining, and exception-required technologies. |
| Architecture fit | Compare against reference patterns and anti-patterns. |
| Integration alignment | Evaluate APIs, events, MCP/A2A, point-to-point risk, ownership, and resilience. |
| Data readiness | Evaluate classification, source authority, lineage, quality, access, retention, and ownership. |
| Security baseline | Evaluate identity, access, encryption, secrets, logging, vulnerability management, and threat model. |
| Operations readiness | Evaluate support model, observability, recovery, lifecycle, change, and ownership. |
| Risk and compliance | Evaluate GxP, privacy, quality, regulatory, audit, and vendor risk triggers. |
| Decision recommendation | Recommend posture, conditions, required remediation, or escalation. |
14.2 Control structure
A control should include:
- Control ID
- Name
- Category
- Rule statement
- Applicability condition
- Required evidence
- Pass condition
- Gap condition
- Not-evidenced condition
- Severity
- Source authority
- Remediation guidance
- Human review trigger
This converts architecture judgment into a reviewable, traceable, human-governed model.
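As an illustration only, the thirteen fields above could be carried in a typed record like the one below. The field names are a direct snake_case rendering of the list. The example control, its ID, and its severity value are placeholders, not entries from any real control library.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Control:
    control_id: str
    name: str
    category: str
    rule_statement: str
    applicability_condition: str
    required_evidence: tuple[str, ...]
    pass_condition: str
    gap_condition: str
    not_evidenced_condition: str
    severity: str
    source_authority: str
    remediation_guidance: str
    human_review_trigger: str


# Hypothetical example; every value here is a placeholder.
example = Control(
    control_id="SEC-001",
    name="Identity model documented",
    category="Security baseline",
    rule_statement="The submission describes how users and services authenticate.",
    applicability_condition="Solution has users or service-to-service calls.",
    required_evidence=("Security model section", "Architecture diagram"),
    pass_condition="Authentication approach is described and cited.",
    gap_condition="Approach is described but conflicts with the standard.",
    not_evidenced_condition="No security model is present in the submission.",
    severity="high",
    source_authority="Canonical security standard (version to be confirmed)",
    remediation_guidance="Add a security model section to the submission.",
    human_review_trigger="Any exception request or conflicting evidence.",
)

print([f.name for f in fields(Control)])
```

A frozen record is used so that a control cannot be changed in place during a review run; changes would go through a new version instead.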
15. Reference implementation tracks
Track A: Copilot / Agent Builder discovery prototype
Use this only to test the interaction model, initial prompts, output shape, and user reaction with a very small group. It should not be treated as the production architecture.
Best for:
- Rapid concept validation
- Small group review
- Prompt/output exploration
- Low integration needs
Risks:
- Limited lifecycle governance
- Limited integration depth
- Not ideal for decision memory or operational controls
Track B: Copilot Studio governed pilot
Use this when the pilot needs controlled access, M365 knowledge grounding, Teams/SharePoint interaction, workflow, analytics, and a more governed agent model.
Best for:
- 7-to-30 user pilot
- Team-level assistant
- M365 collaboration integration
- Basic workflow and review routing
Risks:
- Platform governance path must be confirmed
- Some source-of-record integration may require additional services
- Complex orchestration may exceed low-code comfort
Track C: Enterprise stack implementation
Use this for a durable product that integrates with the AI Governance Portal, architecture repositories, CMDB, catalog tools, decision stores, telemetry, and enterprise orchestration.
Best for:
- Production-grade architecture review capability
- Durable decision memory
- System-of-record integration
- Complex workflow and audit requirements
- Azure/AWS hybrid implementation
Risks:
- More engineering required
- More governance required
- Requires product ownership and sustainment funding
16. Minimum viable product backlog
Epic 1: Intake and source package
- Define first review type.
- Define pilot submission package.
- Identify source-of-record for submissions.
- Build controlled upload or read-only artifact ingestion path.
- Validate supported file types.
Epic 2: Source authority and configuration
- Finalize source authority map.
- Finalize initial tool catalog.
- Finalize initial reference patterns.
- Finalize initial control library.
- Version the configuration package.
Epic 3: Extraction and normalization
- Extract text and metadata from submitted artifacts.
- Normalize extracted facts into submission schema.
- Detect missing evidence.
- Map technologies to catalog entries.
- Flag uncertain extraction for human review.
Epic 4: Control evaluation
- Apply intake completeness checks.
- Apply technology and platform checks.
- Apply data readiness checks.
- Apply security and operational readiness checks.
- Generate evidence-backed findings.
Epic 5: Human reviewer workflow
- Present summary, scorecard, findings, evidence, and missing information.
- Allow accept, reject, edit, override, and rationale capture.
- Capture final decision posture.
- Export review output.
Epic 6: Decision memory and telemetry
- Store final decision record.
- Store standards/control version used.
- Track override rate, false positives, false negatives, cycle-time savings, and adoption.
- Feed approved decisions into reusable memory.
17. Delivery roles, skills, and resourcing model
If this becomes a build initiative, the next question is not only what platform we use. The next question is who actually does the work, who owns the durable assets, and what level of investment is reasonable for the first pilot.
The answer should be deliberately modest for v1. We do not need a huge delivery army to prove the concept. We do need the right split of product ownership, architecture judgment, data/control modeling, and engineering implementation. One highly capable engineer can do a surprising amount if the scope is contained and the platform path is clear. One highly capable engineer cannot also be the EA product owner, data steward, governance approver, security reviewer, prompt/control librarian, and adoption lead. That way lies the traditional enterprise ritual of asking one person to be a department and acting surprised when they become carbon.
17.1 Core team for a controlled alpha
| Role | Approx. FTE for alpha | Primary responsibility | Internal or augment |
|---|---|---|---|
| Product / architecture lead | 0.25-0.50 | Product framing, scope control, decision model, package ownership | Internal |
| EA domain owner | 0.25-0.50 | Review criteria, standards interpretation, approval logic, reviewer adoption | Internal |
| Data/control architect | 0.50 | Source authority, data classification, control model, schemas, evidence model | Internal preferred |
| Platform engineer / full-stack integrator | 0.50-1.00 | Connector wiring, auth/RBAC, workflow, orchestration, deployment mechanics | Internal or Rob C team / augmentation |
| Security / privacy advisor | 0.05-0.10 | Security, access, privacy, logging, data-handling review | Internal review role |
| Quality / GxP advisor | 0.05-0.10 if triggered | GxP/quality posture, validation implications, audit expectations | Internal review role |
| Architect pilot reviewers | 2-4 reviewers, part-time | Validate findings, override model, usefulness, trust, adoption | Internal |
17.2 Skills needed
| Skill area | Why it matters | Risk if missing |
|---|---|---|
| Enterprise architecture judgment | Turns principles into reviewable criteria | Agent produces generic recommendations |
| Data architecture and classification | Defines what can be processed, retained, cited, and trusted | Unsafe or ungoverned data handling |
| Source authority modeling | Separates canonical sources from reference-only material | Conflicting standards and false confidence |
| JSON/YAML/schema/control modeling | Makes the review brain configurable instead of buried in code | Prompt spaghetti and maintenance pain |
| Identity and RBAC | Ensures users see only what they should | Security and privacy exposure |
| Workflow/orchestration | Routes reviews, exceptions, approvals, and evidence loops | Manual glue work survives the automation |
| Observability and telemetry | Measures accuracy, override rates, adoption, cost, and failures | No way to know if the pilot works |
| Change/release governance | Versions controls, standards, prompts, and schemas | Agent drift and stale review logic |
17.3 Practical staffing scenarios
| Scenario | When it fits | Likely staffing | Tradeoff |
|---|---|---|---|
| Minimal internal alpha | Controlled proof using sample artifacts and manual source package | Tony + David + 1 engineer part-time + reviewers | Fastest, but limited integration |
| Productized pilot | Team use with controlled intake, RBAC, workflow, telemetry, and reusable control library | Tony + David + 1 engineer 0.75-1.0 FTE + data/control architect 0.5 FTE + review advisors | Best balance of speed and discipline |
| Enterprise integrated pilot | Pulls from system of record, writes decision memory, integrates with platform governance | Above + platform owner + security/privacy/quality + integration support | Stronger, but slower and governance heavier |
17.4 Supplier or Rob C team role
The clean boundary is this: internal owners define the review brain; technical augmentation helps wire the system together.
Internal ownership should include product intent, review criteria, semantic/control model, source authority, data classification, prioritization, exception rules, output model, and decision memory. Rob C's team or a supplier can help with connectors, identity/RBAC, Copilot/Copilot Studio or enterprise-stack mechanics, workflow/orchestration, telemetry, deployment, and platform compliance.
This is not anti-supplier. It is anti-outsourcing-the-part-only-we-understand. Subtle distinction, often missed by people selling roadmaps.
18. TCO, tokenomics, and value gate
Every AI or agentic use case should be required to pass a basic economic reality check before it moves forward. This project should hold itself to the same standard. If a team cannot explain what it will cost to build, what it will cost to run, what licensing or token exposure exists, who will maintain it, and what value it creates, then the use case is not ready for approval.
That is especially important here because the enterprise may have favorable near-term economics through Microsoft 365 Copilot and Copilot Studio licensing. That helps, but it is not a permanent architectural strategy. Licensing can change. Token policies can change. Model availability can change. Usage patterns can surprise everyone, because nothing says enterprise innovation like discovering the invoice after the demo.
18.1 Cost categories that must be estimated
| Cost area | What to estimate | Notes |
|---|---|---|
| Build labor | Product, EA, data/control, engineering, security/privacy, quality, testing | Include internal labor even if not charged back |
| Platform costs | Copilot/Copilot Studio, Azure, AWS, data services, workflow/runtime | Validate against enterprise licensing and chargeback rules |
| Model/token costs | Prompt/completion tokens, embedding, retrieval, evaluation, batch jobs | May be hidden under seat licensing or exposed under API/runtime path |
| Storage and indexing | Artifacts, extracted text, vector indexes, logs, decision memory | Retention and classification rules drive cost |
| Integration costs | APIs, connectors, MCP/tool gateways, identity, workflow, monitoring | Higher if system-of-record integration is required |
| Governance and validation | Security, privacy, quality, GxP, audit, release/change controls | Not optional in regulated contexts |
| Sustainment | Standards updates, control library maintenance, prompt/schema changes, triage | This is where cheap vendor builds become expensive hobbies |
| Adoption and training | Reviewer enablement, documentation, feedback loops, support | Needed for trust and repeat use |
18.2 Tokenomics and runtime questions
| Question | Why it matters |
|---|---|
| Are we using seat-licensed Copilot capabilities, metered API calls, or both? | Determines whether run cost is predictable or usage-based |
| Which steps require model calls versus deterministic logic? | Prevents token burn on tasks rules can handle |
| Are we embedding source corpora, reviewing submitted artifacts live, or both? | Drives indexing and refresh cost |
| How large are typical submission packages? | Controls ingestion, extraction, context, and model cost |
| Do we need repeated evaluation runs per submission? | Can multiply cost quickly |
| Are we storing extracted facts, evidence spans, and decision records? | Reduces repeated processing but adds storage/governance obligations |
| What happens if Microsoft tokenomics or licensing changes? | Forces a Plan B before dependency becomes expensive |
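For the metered path, the questions above reduce to simple arithmetic. The sketch below is an assumption-laden planning aid, not a pricing model: the function name, parameters, and any prices passed in are hypothetical and must be replaced with actual enterprise rates.

```python
def estimate_monthly_run_cost(
    submissions_per_month: int,
    model_calls_per_submission: int,
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    price_per_1k_input: float,   # placeholder rate, not a real price
    price_per_1k_output: float,  # placeholder rate, not a real price
    eval_runs_per_submission: int = 1,
) -> float:
    """Rough metered model cost per month under the stated assumptions."""
    cost_per_call = (
        (input_tokens_per_call / 1000) * price_per_1k_input
        + (output_tokens_per_call / 1000) * price_per_1k_output
    )
    calls = submissions_per_month * model_calls_per_submission * eval_runs_per_submission
    return calls * cost_per_call


# Illustrative numbers only: repeated evaluation runs multiply cost linearly.
single = estimate_monthly_run_cost(10, 2, 1000, 1000, 1.0, 2.0)
triple = estimate_monthly_run_cost(10, 2, 1000, 1000, 1.0, 2.0, eval_runs_per_submission=3)
print(single, triple)
```

Moving a step from a model call to deterministic logic lowers `model_calls_per_submission`, which is why the model-versus-rules question appears in the table.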
18.3 Value model
Value must be classified before it is judged. A proposal may be useful and still fail the current approval test if it does not match the decision context.
| Value class | What makes it decision-ready |
|---|---|
| Direct savings | Strongest when it lands in the accountable operating budget. |
| Indirect savings | Useful when the financial path is credible but not yet booked. |
| Avoided cost | Valid when the counterfactual cost, renewal, incident, or lifecycle obligation is real and time-bounded. |
| Risk reduction | Valid when operational, security, compliance, or audit exposure is evidenced and the decision owner agrees it matters now. |
| Capacity release | Useful when released time is tied to named work or support demand. |
| Quality or rework reduction | Valid when defect load, rework, or service drag is measurable. |
| Cross-workstream value | Valid when the benefit lands outside the local team, but only if a benefiting owner and sponsor accept it. |
All value classes matter, but not equally in every business climate. When the immediate decision context is an OPEX reduction mandate, direct savings from an accountable operational budget may be the only class that materially changes the decision.
- Decision owner: decides whether the proposal clears the current line.
- Budget owner: owns the operating budget or capacity being affected.
- Benefiting owner: receives the service, cost, risk, or quality benefit if the claim is true.
- Evidence owner: owns the numbers, baseline, or operational proof behind the claim.
The Outcome Acceptance Line is the current threshold for approval. A proposal can show real value and still sit below the line. Below-line items may still warrant sandboxing, time-boxed feasibility, an explicit exception, or a stop decision.
The first pilot should not claim enterprise-wide transformation. It should prove a measurable local value hypothesis.
Pilot measurement examples:
| Value lever | Pilot measurement |
|---|---|
| Review cycle-time reduction | Baseline human review time versus assisted first-pass review time |
| Reviewer effort reduction | Hours saved per submission on intake, evidence review, and output drafting |
| Consistency improvement | Agreement across reviewers using common criteria and output format |
| Missing evidence detection | Percentage of incomplete submissions flagged correctly |
| Reuse of prior decisions | Number of findings or recommendations linked to precedent |
| Risk reduction | Duplicative, unsupported, restricted, or non-standard patterns caught earlier |
| Adoption | Reviewer usage, trust score, override rate, and repeat-use willingness |
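Two of these measurements can be computed directly once a baseline exists. This is a minimal sketch with hypothetical function names; the figures in the usage lines are made up for illustration and are not pilot results.

```python
def cycle_time_reduction(baseline_hours: float, assisted_hours: float) -> float:
    """Fractional reduction in review time relative to the human-only baseline."""
    if baseline_hours <= 0:
        raise ValueError("baseline_hours must be positive")
    return (baseline_hours - assisted_hours) / baseline_hours


def missing_evidence_detection_rate(flagged_correctly: int, total_incomplete: int) -> float:
    """Share of genuinely incomplete submissions the assistant flagged."""
    if total_incomplete <= 0:
        raise ValueError("total_incomplete must be positive")
    return flagged_correctly / total_incomplete


# Illustrative inputs only.
print(cycle_time_reduction(8.0, 5.0))         # 0.375
print(missing_evidence_detection_rate(3, 4))  # 0.75
```

Both functions refuse to run without a positive baseline, which mirrors the point that value claims need an evidence owner and a measured starting point.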
18.4 Alpha-level investment bands
These are planning bands, not budget commitments. They should be replaced with actual internal rates, platform costs, and sourcing assumptions once the delivery path is selected.
| Delivery path | Likely duration | Internal effort | Augmentation need | Cost posture |
|---|---|---|---|---|
| Manual source-package alpha | 2-4 weeks | Low to moderate | Optional part-time engineer | Lowest cost, weakest integration |
| Copilot/Copilot Studio pilot | 4-8 weeks | Moderate | 0.5-1.0 technical FTE | Good proof path if governance permits |
| Enterprise integrated pilot | 8-12+ weeks | Moderate to high | 1.0+ engineering plus platform/security support | Highest fidelity, slower path |
| Vendor-led build | Variable | Still high internal SME burden | Supplier team | Risk of low upfront/high sustainment cost |
18.5 Approval gate for any future AI/agentic submission
A future submission should be considered incomplete if it cannot answer these questions:
- What business problem is being solved?
- What value class is claimed, and what measurable value is expected?
- Who owns the decision, the impacted budget or capacity, the benefit, and the evidence?
- Why is AI or agentic automation appropriate?
- What deterministic alternative was considered?
- What data is required and who owns it?
- What sources are canonical, reference-only, or prohibited?
- What platform path is proposed and why?
- What governance reviews are triggered?
- What is the expected build effort?
- What are the expected run costs, including tokenomics where applicable?
- Which value class is the current business climate weighting most heavily?
- Is the proposal above or below the Outcome Acceptance Line, and what is the below-line path if it does not clear it?
- Who owns sustainment after launch?
- What metrics prove success or failure?
- What is the stop condition if the value does not materialize?
This is the filter that keeps architecture review from approving AI slop with a budget line. Harsh, yes. Cheaper than cleaning it up later.
18.6 Build difficulty and intelligence-layer economics
The hardest work is not syntax. The hardest work is converting scattered enterprise judgment into governed, testable, versioned decision assets. That is what this package is preparing.
For this product, implementation is secondary to knowledge architecture. The repo, prompts, evaluations, schemas, controls, source authority, and non-inference contract are the product. The code is the delivery mechanism.
| Workstream | Relative difficulty | Best ownership posture | Why it matters |
|---|---|---|---|
| Source authority, canonical context, and classification | Very high | Internal ownership required | Requires institutional judgment, source ownership, conflict resolution, and defensibility |
| Review criteria and control model | Very high | Internal ownership required | Converts architecture judgment into repeatable controls without reducing review to dumb checkboxes |
| Non-inference and evidence discipline | Very high | Internal ownership required | Prevents the assistant from treating missing evidence as approval or inventing governance-sensitive facts |
| Evaluation fixtures and expected outputs | High | Internal led, engineering assisted | Forces the team to define what good looks like before demos start lying |
| Repo shape and agent instructions | Medium-high | Internal led, engineering assisted | Makes the work reproducible for humans, agents, suppliers, and future maintainers |
| Integration and workflow plumbing | Medium | Engineering or supplier assisted | Still real work, but a known class of pain once the review brain is clear |
| UI and experience polish | Medium-low | Engineering or supplier assisted | Important for adoption, but not the durable intellectual property |
| Syntax and code generation | Low-medium | Engineering or supplier assisted | Increasingly commoditized when architecture, controls, and evidence contracts are coherent |
The internally owned intelligence-layer workstreams are probably 60-75% of the hard risk-reduction work, but not necessarily 60-75% of total labor hours. Integration with enterprise systems can still be slow and expensive. The difference is that integration pain is estimable and testable. Ambiguity in source authority, evidence, controls, classification, and decision ownership is harder to estimate and more dangerous to outsource.
The practical build-versus-buy rule is direct: outsource wiring, workflow, integrations, orchestration, telemetry plumbing, and deployment mechanics where it makes sense. Do not outsource the architecture intelligence layer unless the enterprise is willing to rent back its own decision logic through future change orders, which would be an unusually expensive way to discover gravity.
Before approving a build or vendor engagement, leadership should ask whether the enterprise is willing and able to own these internal assets:
- Source authority and source conflict rules.
- Review criteria and control library.
- Data classification and evidence requirements.
- Non-inference and escalation behavior.
- Evaluation fixtures and expected outputs.
- Human approval and override workflow.
- Decision memory and exception rationale.
- Sustainment model for standards, controls, prompts, and schemas.
If those assets cannot be owned internally, the right answer may not be to buy a smarter agent. The right answer may be to improve the underlying governance operating model first. Otherwise the enterprise will automate ambiguity, then pay someone to explain why the automation is ambiguous.
A useful leadership framing is this:
The hard part is not building a chatbot. The hard part is codifying how the enterprise makes architecture review decisions: which sources are authoritative, what evidence is required, what controls apply, what must never be inferred, how exceptions are handled, and how human architects approve or override the output. Once that architecture intelligence layer is defined, engineering becomes implementation. Without it, engineering just automates ambiguity.
19. Build Readiness, Repo Bootstrap, and Evaluation Discipline
This section is the final bridge between alignment and development. The intent is not to add more ceremonial documentation. The intent is to define the minimum engineering runway so a builder can start from the package without reinterpreting the product vision, guessing at the data model, or burying governance rules in prompt text.
The package should support four different build depths. A simple Copilot experiment should not carry the same burden as an enterprise integrated product, but every path still needs enough rigor to avoid building from vibes.
19.1 Repo bootstrap readiness model
The HTML package should be treated as the human-readable source of truth. The repo bootstrap pack is the machine-actionable payload that can be extracted from the package once scope and platform path are confirmed.
| Tier | Use case | Minimum expectation |
|---|---|---|
| Tier 0 - Concept / intake | Idea still being shaped | Intent, AI appropriateness, governance triggers, source provenance, open questions |
| Tier 1 - Simple Copilot / Agent Builder POC | Narrow assistant, small audience, limited integration | Instructions, knowledge sources, prompts, tests, known limitations |
| Tier 2 - Copilot Studio governed pilot | Shared agent, connectors, permissions, repeatable process | Architecture, source authority, classification, controls, output schema, RBAC, observability, TCO |
| Tier 3 - Enterprise integrated product | Systems of record, decision memory, audit, quality/security/GxP gates | Full PRD, schemas, control library, decision memory model, validation, runbook, support model |
For this Enterprise Architecture Review Assistant, Tier 1 is useful only as a discovery prototype. If we share it with a reviewer group or connect it to enterprise content, it becomes Tier 2. If it integrates with the AI Governance Portal, architecture repository, decision memory, or regulated review path, it becomes Tier 3.
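The escalation rule in the paragraph above can be written as a small decision function. This is a sketch under stated assumptions: the function and flag names are hypothetical, and "governed systems" stands in for the AI Governance Portal, architecture repository, or decision memory.

```python
def delivery_tier(
    has_prototype: bool,
    shared_with_reviewer_group: bool = False,
    connected_to_enterprise_content: bool = False,
    integrates_with_governed_systems: bool = False,
    on_regulated_review_path: bool = False,
) -> int:
    """Return the readiness tier (0-3) implied by how the assistant is used."""
    # The highest applicable tier wins, so the heaviest triggers are checked first.
    if integrates_with_governed_systems or on_regulated_review_path:
        return 3
    if shared_with_reviewer_group or connected_to_enterprise_content:
        return 2
    return 1 if has_prototype else 0


print(delivery_tier(True))                                   # 1
print(delivery_tier(True, shared_with_reviewer_group=True))  # 2
```

Checking the Tier 3 triggers first means a prototype cannot stay at Tier 1 by omission once it touches a system of record.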
19.2 Requirements-to-artifact traceability
Requirements gathering should not produce meeting notes that die in a folder. Each question should populate an artifact.
| Question area | Output artifact |
|---|---|
| Problem, vision, and value | PRODUCT_BRIEF.md |
| Why AI versus deterministic automation | AI_APPROPRIATENESS.md |
| Governance routing | GOVERNANCE_ROUTING.md |
| Source authority | SOURCE_AUTHORITY_MAP.json |
| Data classification | DATA_CLASSIFICATION.md |
| Review criteria | CONTROL_LIBRARY.seed.json |
| Reference patterns | PATTERN_LIBRARY.seed.json |
| Output expectations | REVIEW_OUTPUT_SCHEMA.json |
| Non-inference behavior | NON_INFERENCE_RULES.md |
| Evidence requirements | EVIDENCE_REQUIREMENTS.md |
| Human review workflow | HUMAN_REVIEW_WORKFLOW.md |
| Evaluation strategy | EVALUATION_STRATEGY.md |
| TCO and tokenomics | TCO_TOKENOMICS.md |
| Sustainment | SUPPORT_MODEL.md |
This traceability model is what turns intake from a conversation into a buildable source package.
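One way to keep the traceability table honest is to check that each named artifact actually exists in the repo. The sketch below uses the file names from the table; the helper name and the choice to search the whole tree are assumptions, since the package does not say where each file lives.

```python
from pathlib import Path

# File names taken from the traceability table above.
REQUIRED_ARTIFACTS = [
    "PRODUCT_BRIEF.md",
    "AI_APPROPRIATENESS.md",
    "GOVERNANCE_ROUTING.md",
    "SOURCE_AUTHORITY_MAP.json",
    "DATA_CLASSIFICATION.md",
    "CONTROL_LIBRARY.seed.json",
    "PATTERN_LIBRARY.seed.json",
    "REVIEW_OUTPUT_SCHEMA.json",
    "NON_INFERENCE_RULES.md",
    "EVIDENCE_REQUIREMENTS.md",
    "HUMAN_REVIEW_WORKFLOW.md",
    "EVALUATION_STRATEGY.md",
    "TCO_TOKENOMICS.md",
    "SUPPORT_MODEL.md",
]


def missing_artifacts(repo_root: Path) -> list[str]:
    """List required artifacts not found anywhere under repo_root."""
    present = {p.name for p in repo_root.rglob("*") if p.is_file()}
    return [name for name in REQUIRED_ARTIFACTS if name not in present]
```

A file that exists but is empty would still pass this check, so it is a floor for completeness, not a substitute for review.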
19.3 Evaluation and test fixtures
The assistant should not be trusted because the output looks polished. It should be trusted only after it performs well against a curated fixture set.
The evaluation model should measure extraction correctness, evidence quality, control correctness, missing evidence handling, recommendation quality, source authority behavior, non-inference behavior, reviewer usefulness, and regression stability.
Minimum fixture types should include:
- Complete good submission.
- Incomplete submission.
- Conflicting evidence.
- Restricted or declining technology.
- Missing security model.
- Missing data classification.
- Fast-path claim with risky architecture.
- Duplicate capability scenario.
- GxP or possible-GxP scenario.
- Non-AI deterministic use case.
- Ambiguous submission.
- Adversarial wording.
Happy-path-only testing is how pilots lie. We should not do that to ourselves, even if it would make the demo feel emotionally safer.
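A fixture only earns its keep if it carries expected outputs that a run can be compared against. The following is a minimal sketch; the `Fixture` shape, the control IDs, and the state strings are hypothetical and would be replaced by the real fixture schema.

```python
from dataclasses import dataclass


@dataclass
class Fixture:
    name: str
    fixture_type: str                # e.g. "Incomplete submission"
    expected_states: dict[str, str]  # control ID -> expected evidence state


def mismatched_controls(fixture: Fixture, actual_states: dict[str, str]) -> list[str]:
    """Return control IDs whose actual state differs from the expected one."""
    return sorted(
        control_id
        for control_id, expected in fixture.expected_states.items()
        if actual_states.get(control_id) != expected
    )


# Hypothetical fixture: a submission with no security model.
fixture = Fixture(
    name="missing-security-model",
    fixture_type="Missing security model",
    expected_states={"SEC-001": "not_evidenced", "INT-001": "pass"},
)
print(mismatched_controls(fixture, {"SEC-001": "pass", "INT-001": "pass"}))
```

In this example the assistant "passed" a control that should have been marked not evidenced, which is exactly the failure the negative fixtures exist to catch.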
19.4 Non-inference and evidence contract
The assistant must not infer approval status, data classification, GxP impact, security control existence, technology approval status, production readiness, ownership, exception approval, business criticality, platform approval, or source authority.
Allowed evidence states should be explicit:
| State | Meaning |
|---|---|
| Supported | Evidence exists in an approved source |
| Not evidenced | Required evidence is missing |
| Conflicting evidence | Sources disagree |
| Requires confirmation | Human or source owner must decide |
| Requires escalation | Governance review is required |
| Not applicable | Control does not apply based on documented facts |
This is the discipline that prevents the assistant from becoming a confident hallucination appliance with a governance logo.
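The non-inference list can be enforced mechanically on structured output. This sketch assumes, hypothetically, that each extracted fact is a dictionary with `value` and `source` keys; the key names and the helper name are illustrative, while the protected fact names follow the list at the top of this subsection.

```python
# Facts the assistant must not infer, per the contract above.
PROTECTED_FACTS = {
    "approval_status",
    "data_classification",
    "gxp_impact",
    "security_control_existence",
    "technology_approval_status",
    "production_readiness",
    "ownership",
    "exception_approval",
    "business_criticality",
    "platform_approval",
    "source_authority",
}


def non_inference_violations(facts: dict[str, dict]) -> list[str]:
    """Names of protected facts that carry a value but cite no source."""
    return sorted(
        name
        for name, fact in facts.items()
        if name in PROTECTED_FACTS
        and fact.get("value") is not None
        and not fact.get("source")
    )


facts = {
    "data_classification": {"value": "internal", "source": None},
    "gxp_impact": {"value": None, "source": None},
    "ownership": {"value": "Team A", "source": "submission.pdf p.2"},
}
print(non_inference_violations(facts))
```

A protected fact with no value is not a violation; it would simply be reported as not evidenced or routed for confirmation.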
19.5 Control authoring standard
Before review criteria become machine-readable controls, each control should define:
- Control ID.
- Name.
- Category.
- Rule statement.
- Applicability trigger.
- Severity.
- Required evidence.
- Pass criteria.
- Fail criteria.
- Not-evidenced criteria.
- Source authority dependency.
- Recommended remediation.
- Human review requirement.
- Linked reference patterns.
- Version and owner.
Controls should be testable, evidence-bound, versioned, and reviewable by humans. Controls should not be hidden inside prompts where governance goes to become archaeology.
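A seed control library can be linted against this authoring standard before any control is evaluated. The sketch assumes the library is a JSON array of objects with snake_case keys; that layout and the key names are assumptions, not the confirmed shape of CONTROL_LIBRARY.seed.json.

```python
import json

# Snake_case rendering of the authoring checklist above (assumed key names).
REQUIRED_FIELDS = (
    "control_id", "name", "category", "rule_statement",
    "applicability_trigger", "severity", "required_evidence",
    "pass_criteria", "fail_criteria", "not_evidenced_criteria",
    "source_authority_dependency", "recommended_remediation",
    "human_review_requirement", "linked_reference_patterns",
    "version", "owner",
)


def authoring_gaps(library_json: str) -> dict[str, list[str]]:
    """Map each incomplete control to the fields it is missing or leaves empty."""
    gaps: dict[str, list[str]] = {}
    for index, control in enumerate(json.loads(library_json)):
        missing = [f for f in REQUIRED_FIELDS if control.get(f) in (None, "", [])]
        if missing:
            gaps[control.get("control_id") or f"<control {index}>"] = missing
    return gaps
```

Running this in the validation step keeps incomplete controls out of the library instead of letting a prompt paper over the gap.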
19.6 Build gates and stop conditions
Build should not begin unless the first review type, minimum rubric, source authority, data classification, pilot submissions, platform path, human review flow, run-cost assumptions, and sustainment owner are at least provisionally defined.
Stop or redirect if:
- The assistant guesses instead of marking missing evidence.
- High-risk false negatives appear.
- Architect override rate is too high.
- Source ownership is unresolved.
- Data classification is unknown.
- Run cost exceeds expected value.
- Sustainment ownership is unclear.
- The use case is better solved with deterministic workflow, rules, search, or dashboarding.
The build should proceed only when the next decision is smaller than the last one. If every meeting reveals a new unknown, the project is still in discovery, no matter how many impressive screenshots exist.
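The entry gate at the top of this subsection can be tracked as a simple status map. This is a sketch only: the gate keys follow the list of items that must be "at least provisionally defined", while the status vocabulary ("defined", "provisional") and the helper name are assumptions.

```python
# Items that must be at least provisionally defined before build begins.
BUILD_GATES = [
    "first_review_type",
    "minimum_rubric",
    "source_authority",
    "data_classification",
    "pilot_submissions",
    "platform_path",
    "human_review_flow",
    "run_cost_assumptions",
    "sustainment_owner",
]

ACCEPTABLE = {"defined", "provisional"}


def open_gates(status: dict[str, str]) -> list[str]:
    """Return gates that are absent from the map or not yet provisionally defined."""
    return [gate for gate in BUILD_GATES if status.get(gate) not in ACCEPTABLE]


status = {gate: "provisional" for gate in BUILD_GATES}
status["platform_path"] = "unknown"
print(open_gates(status))
```

A gate missing from the map counts as open, so silence is never read as readiness.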
19.7 Manual framework harness execution
The framework can be exercised manually before an agent or integrated workflow exists. This is useful for validation, reviewer training, prompt testing, and evidence-loop rehearsal. It must not be treated as architecture approval.
Manual harness execution means a human reviewer uses the HTML, DOCX, Markdown, and repo-bootstrap materials as the review framework, then asks an approved AI assistant or Copilot surface to draft structured analysis against a submitted architecture package.
Manual harness boundary
| Boundary | Requirement |
|---|---|
| Approval | Manual harness output is draft review support only. It cannot approve architecture. |
| Data | Use only data allowed in the selected Copilot or model tier. Do not include PHI, PII, GxP-impacting, confidential vendor, security-sensitive, or restricted material unless the platform/data owner approves the handling path. |
| Evidence | Findings must cite submitted evidence or mark the item as not evidenced. |
| Inference | Do not infer approval, data classification, GxP impact, privacy/PHI/PII posture, source authority, technology approval, exception approval, platform approval, funding, support ownership, or production readiness. |
| Human review | A qualified architecture reviewer must accept, edit, reject, or override all findings before any decision packet is created. |
| Retention | Do not store submitted artifacts, excerpts, or model output beyond the approved retention boundary. |
| Publication | Do not share a manual harness result as an official review record unless it is copied into the approved governance system of record by an authorized reviewer. |
Manual harness execution steps
- Confirm the data handling path and selected Copilot/model tier.
- Remove or redact any material not approved for the chosen tool surface.
- Open the v1.14.1 HTML or DOCX framework and identify the review sections that apply.
- Provide the submitted architecture summary and evidence set to the approved assistant.
- Run the manual harness prompt from repo-bootstrap/prompts/manual-ea-review-harness.prompt.md.
- Require the assistant to classify all uncertain or missing facts as not_evidenced, requires_confirmation, or requires_escalation.
- Review the draft output against the source evidence, control logic, and non-inference rules.
- Record human reviewer actions: accept, edit, reject, override, request evidence, or escalate.
- Capture the final human-reviewed decision packet only in the approved governance path.
Manual harness output shape
The manual output should include:
- submission summary
- assumptions explicitly marked as assumptions
- evidence inventory
- missing evidence list
- likely review dimensions triggered
- source authority concerns
- AI appropriateness assessment
- platform path assessment
- data/privacy/security/GxP triggers requiring confirmation
- control-style findings with evidence state
- escalation recommendations
- human-review checklist
- final note: Draft review support only - no approval authority
Reusable manual harness prompt
Use the prompt stored in repo-bootstrap/prompts/manual-ea-review-harness.prompt.md.
If copied into Copilot or another approved tool, preserve the non-inference rules and the final approval boundary exactly. A prettier prompt that silently weakens governance is still just a decorative liability.
20. From package to repo
The self-contained HTML should be treated as the human-readable source of truth, not as a magic executable specification. The right path is to use the HTML for intent, context, and provenance, then use the repo-bootstrap scaffold and Markdown/JSON/YAML/schema assets as the machine-readable starting point.
The build team should not simply feed the HTML to a model and ask it to "spawn the repo" without constraints. A model can help generate or fill starter files, but only if it is told not to invent missing decisions, not to turn assumptions into defaults, and not to let the agent infer governance-sensitive facts.
Recommended repo-spawn flow
- Unzip the package into a clean workspace.
- Copy the repo-bootstrap/ folder into a new repository or approved enterprise repo template.
- Read the package in order: README, product brief, architecture, engineering orientation, AI appropriateness, governance routing, evaluation strategy, non-inference rules, evidence requirements, and ADRs.
- Select the delivery tier: simple Copilot prototype, Copilot Studio governed pilot, or enterprise integrated product.
- Validate required files using the bootstrap manifest and completeness checklist.
- Replace placeholders with real decisions or mark them as needs_decision.
- Validate schemas, controls, catalogs, and source authority files before coding against them.
- Build the first evidence loop before building a polished UI.
- Create a golden test set with expected outputs.
- Do not start broader development until build gates pass.
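Unresolved decisions are easier to manage when they can be counted. The sketch below scans a repo for the needs_decision marker used in the flow above; the helper name and the set of file suffixes scanned are assumptions.

```python
from pathlib import Path

TEXT_SUFFIXES = {".md", ".json", ".yaml", ".yml"}  # assumed asset types


def unresolved_decisions(repo_root: Path, marker: str = "needs_decision") -> dict[str, int]:
    """Count marker occurrences per file, keyed by path relative to repo_root."""
    counts: dict[str, int] = {}
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            hits = path.read_text(encoding="utf-8", errors="ignore").count(marker)
            if hits:
                counts[path.relative_to(repo_root).as_posix()] = hits
    return counts
```

A non-empty result is not a failure by itself; it is the list of open decisions a human owner still has to close or explicitly accept before build gates pass.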
What the engineer should use
| Package asset | Engineering use |
|---|---|
| HTML blueprint | Human-readable source of truth and context |
| Comprehensive Markdown | Repo documentation source |
| repo-bootstrap/ | Starter repo file structure |
| JSON/YAML/schema assets | Seed configuration and contracts |
| Control/catalog files | Initial review intelligence seed |
| Evaluation fixture model | How to test whether the assistant works |
| Non-inference contract | What the assistant must not guess |
| Build gates | What must be true before build/pilot/productization |
Completion criteria
The repo is ready for first development only when the product brief, platform path, source authority, data classification, minimum controls, output schema, non-inference contract, test fixtures, human review workflow, run-cost assumptions, and sustainment ownership are present or explicitly marked as unresolved decisions.
If that sounds like discipline, good. It is much cheaper than debugging ambiguity after a team has already built the wrong thing.
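The completion rule above can be sketched as a small check. This is illustrative only: the item names mirror the criteria listed above, but the `readiness` dictionary shape and the idea of storing `needs_decision` as a value are assumptions, not the package's actual manifest format.

```python
REQUIRED_ITEMS = [
    "product_brief", "platform_path", "source_authority", "data_classification",
    "minimum_controls", "output_schema", "non_inference_contract", "test_fixtures",
    "human_review_workflow", "run_cost_assumptions", "sustainment_ownership",
]

UNRESOLVED = "needs_decision"  # explicit marker for an open decision


def readiness_gaps(readiness: dict) -> list[str]:
    """Return required items that are neither present nor explicitly unresolved."""
    gaps = []
    for item in REQUIRED_ITEMS:
        value = readiness.get(item)
        # Missing keys and blank strings are silent gaps. An explicit
        # needs_decision marker passes because the gap stays visible.
        if value is None or (isinstance(value, str) and not value.strip()):
            gaps.append(item)
    return gaps


def unresolved_items(readiness: dict) -> list[str]:
    """Return items that are explicitly marked as open decisions."""
    return [item for item in REQUIRED_ITEMS if readiness.get(item) == UNRESOLVED]
```

An empty result from `readiness_gaps` means nothing is silently missing. It does not mean every decision is made, which is why `unresolved_items` is reported separately.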
21. Agent readiness posture v1.14.1
v1.14.1 carries forward the v1.13 agent-readiness work and fixes residual handoff hygiene defects that could make active surfaces look less trustworthy than the underlying package. The target remains narrow: make the package easier for humans to approve, engineers to implement, and agents to consume without guessing.
The current release keeps the operating contract, work modes, manifest, fixture structure, observability expectations, human review workflow, artifact pairing guidance, and prompt/instruction quality standard. It also tightens schema behavior, fixture coverage, reference-pattern/control alignment, and validation.
21.1 Current execution-preparation assets
| Asset | v1.14.1 role |
|---|---|
| AGENTS.md | Root coding-agent operating contract, edit gate, non-inference rules, validation expectations, and stop conditions. |
| .github/copilot-instructions.md | Lightweight Copilot repository instruction pointer. |
| agent-work-modes.json | Machine-readable work modes: analyze_only, draft_artifact, modify_repo, run_tests, prepare_pr, and blocked_requires_human. |
| BOOTSTRAP_MANIFEST.json | Current inventory of every repo-bootstrap file, readiness posture, required decisions, and validation expectations. |
| READINESS_SCORECARD.md | Current human, repo-bootstrap, agent, build-preparation, pilot, and production readiness posture. |
| ARTIFACT_PAIRING_GUIDE.md | Pairing rules for human-readable intent and machine-readable contracts. |
| PROMPT_INSTRUCTION_QUALITY_STANDARD.md | Standard for scoped, evidence-bound, schema-aligned, non-conflicting prompt and instruction text. |
| TEST_FIXTURE_PLAN.md | Fixture taxonomy, expected-output requirements, and regression expectations. |
| OBSERVABILITY.md | Trace, event, metric, cost, and override telemetry expectations. |
| HUMAN_REVIEW_WORKFLOW.md | Human accept/edit/reject/override workflow and approval boundary. |
| SOURCE_PROVENANCE.md | Current-versus-historical source interpretation rule. |
| SOURCE_AUTHORITY_OWNER_MATRIX.md | Owner-confirmation matrix for source authority before pilot findings. |
| PILOT_CHARTER.md | Controlled pilot boundary, non-goals, success metrics, and stop conditions. |
| THREAT_MODEL.md | Threat model for LLM, agent, artifact ingestion, tool access, logging, and governance failure modes. |
| scripts/validate_package.py | Validation harness for file inventory, JSON/schema checks, fixture/output alignment, stale control references, non-inference, and human-review boundary checks. |
21.2 Historical provenance boundary
Earlier source cards and archived package material remain useful provenance. They are not current release metadata and must not be treated as canonical enterprise authority unless a current v1.14.1 artifact explicitly points to them.
The package now separates current execution-preparation assets from historical source-library material more clearly. This matters because stale version labels can cause engineers to trust the wrong artifact, which is how small handoff defects become meetings.
21.3 Agent operating model
Agents start in analyze_only. They may inspect, compare,
validate syntax, and report gaps. They may not edit until the user
explicitly opens the edit gate.
When edits are approved, agents must work in a bounded mode, update the manifest when structure changes, validate modified JSON and schemas, preserve source provenance, and report unresolved decisions.
If a task requires approval status, data classification, GxP impact,
source authority, exception status, technology approval, production
readiness, platform approval, funding, ownership, privacy/PII/PHI
posture, or compliance posture, the agent must stop or mark the item as
not_evidenced, requires_confirmation, or
requires_escalation.
21.4 Current readiness posture
| Area | v1.14.1 score | Meaning |
|---|---|---|
| Human package | 96 / 100 | Strong narrative, synchronized DOCX/Markdown/HTML package surfaces, Safari-solid showpiece baseline, and corrected active release labeling. Still blocked from pilot by unconfirmed enterprise source authority and named reviewers. |
| Repo bootstrap | 94 / 100 | Current file inventory, schemas, controls, fixtures, expected outputs, and validation are coherent enough for controlled evidence-loop build preparation. |
| Agent readiness | 92 / 100 | Strong non-inference and human-review contract. Runtime harness and reviewer workbench still need implementation. |
| Controlled build-preparation readiness | 95 / 100 | Ready for a bounded implementation proof of the evidence loop, with package surfaces synchronized and leadership/blueprint presentation baselines stabilized. |
| Pilot readiness | 55 / 100 | Not ready until pilot sources, data classification, platform path, reviewer group, retention boundary, and sustainment ownership are confirmed. |
| Production readiness | 15 / 100 | Not production-ready. No runtime, no validated integrations, no enterprise source authority approval, and no operational release gate. |
21.5 v1.14.1 stop condition
Do not treat v1.14.1 as autonomous governance, pilot approval, production approval, or vendor-owned review logic. It is a controlled build-preparation runway. Build may proceed only while unresolved enterprise decisions remain visible, named, and blocked from becoming fake defaults.
22. Build versus buy boundary
Own internally
- Product vision
- Review criteria
- Source authority
- Data classification
- Semantic/control model
- Reference patterns
- Exception logic
- Decision memory
- Approval model
- Quality and governance rules
Supplier-assisted
- UI implementation
- Connector configuration
- Copilot Studio build mechanics
- Azure/AWS orchestration implementation
- Workflow automation
- Runtime deployment
- Logging and telemetry plumbing
- Test harness and evaluation tooling
Avoid
- Vendor-owned control library
- Vendor-owned standards interpretation
- Vendor-owned decision memory
- Duplicate intake portal that bypasses the existing system of record
- Black-box scoring that architects cannot challenge
- Perpetual maintenance dependency for changes the enterprise should own
23. Delta needed before build
| Decision | What must be answered |
|---|---|
| First review type | Confirm AI architecture review as the first use case, or choose another. |
| First user group | Identify the first architects/reviewers and what they expect. |
| Intake source | Confirm whether artifacts come from AI Governance Portal, SharePoint, LeanIX, SAExpress, or manual package. |
| Platform path | Confirm Copilot/Copilot Studio versus Azure/AWS/hybrid for the pilot. |
| Governance route | Confirm business, platform, security, privacy, quality, and GxP review requirements. |
| Minimum rubric | Define the smallest useful set of review criteria. |
| Source authority | Identify canonical standards, catalogs, and reference patterns. |
| Data classification | Define what artifact classes are allowed in the pilot. |
| Evidence model | Define how findings must cite source evidence. |
| Output format | Define good output for architects, project teams, leadership, and audit. |
| Pilot set | Select historical submissions to test against. |
| Success metrics | Define accuracy, usefulness, time saved, override rate, and adoption targets. |
| Sustainment owner | Decide who owns standards, controls, and the product after the pilot. |
| TCO and value gate | Estimate build effort, run cost, tokenomics exposure, and measurable value before build approval. |
24. Immediate working plan for David and Tony
David leads
- Architecture review criteria
- EA principles and standards interpretation
- Reference patterns and anti-patterns
- Human review and approval model
- Reviewer expectations
- Leadership positioning and sensitivity handling
Tony leads
- Product framing
- Data/control model structure
- Source authority packaging
- Engineering orientation
- Platform-fit analysis
- Supplier boundary and delivery mechanics
Together
- Confirm first use case.
- Define minimum viable rubric.
- Select pilot submissions.
- Agree on output model.
- Decide which platform path to test first.
- Define what help is needed from Rob C's team or another technical resource.
25. Recommended next move
Schedule a focused working session to lock the build-readiness foundation. The session should not attempt to design the entire platform. It should answer enough to start a controlled evidence-loop implementation proof.
Suggested agenda:
- Confirm first review type.
- Confirm first user group.
- Walk the minimum review rubric.
- Identify authoritative standards and catalogs.
- Confirm source-of-record for submitted artifacts.
- Confirm platform path candidates.
- Identify governance gates.
- Define pilot success metrics.
- Identify what Rob C's team or another technical resource would own.
The output should be a one-page build-preparation charter, a source authority map, a minimum control library, a fixture-backed submission set, and a build decision for the first technical path.
26. v1.14.1 Copilot platform-alignment, manual-harness, and navigation-restoration patch
v1.14.1 is a narrow controlled addendum over the v1.13.1 baseline. It does not change the product thesis, audience model, strategy, review model, build-versus-buy boundary, control scope, pilot boundary, production boundary, or governance authority model.
The patch adds enterprise Copilot platform alignment, adds a manual framework harness execution path, adds repo-bootstrap prompt/template assets for manual review rehearsal, restores the full HTML navigation/source-library surface after the v1.14.1 compression regression, updates readiness posture conservatively, and refreshes validation and hash artifacts.
26.1 What was added or tightened
| Area | v1.14.1 outcome |
|---|---|
| Copilot platform alignment | Adds the internal Copilot maturity and governance model as a current-source input for delivery-tier interpretation. |
| Governance ladder validation | Confirms that broader agent use requires escalating governance, approval, SDLC, data authorization, support, and Copilot Service team publishing. |
| Manual harness execution | Adds a controlled method for using the framework manually with approved Copilot/model surfaces before an agent runtime exists. |
| Manual harness prompt | Adds repo-bootstrap/prompts/manual-ea-review-harness.prompt.md for draft evidence-bound analysis. |
| Manual output template | Adds repo-bootstrap/templates/manual-review-output-template.md to standardize reviewer-facing output. |
| HTML navigation restoration | Restores the full top quickbar and left navigation model from v1.13.1, then extends it for Copilot alignment, manual harness execution, and the v1.14.1 patch section. |
| Content-preservation QA | Confirms the HTML, DOCX, Markdown, repo-bootstrap, source library, and archive/package-history structure were preserved rather than compressed during the small update. |
| Copilot source provenance | Adds the AI Platform team Copilot Premium and Studio expectations as current platform-context provenance. |
| Readiness posture | Human package readiness moves to 95 and controlled build-preparation readiness moves to 94. Pilot readiness remains capped because enterprise owners and runtime decisions are still unresolved. |
| Non-inference boundary | Manual harness instructions preserve the same non-inference contract as the agent package. |
| Data handling boundary | Manual harness use is explicitly limited to data approved for the selected Copilot/model tier and use case. |
| Pilot/production boundary | Broad pilot, production routing, autonomous governance, and vendor-owned review logic remain no-go. |
26.2 What remains intentionally unresolved
The Copilot platform context validates the staged model; it does not implement the agent or approve pilot expansion. These items still require named enterprise owners before pilot expansion:
| Open decision | Required owner decision |
|---|---|
| Canonical source authority | Confirm which standards, catalogs, policies, and repositories are authoritative for pilot findings. |
| Pilot artifact data classification | Confirm which submitted artifacts may be processed, retained, excerpted, indexed, logged, or rejected. |
| Manual harness data boundary | Confirm which artifact classes may be used in Copilot Premium, Agent Builder, Copilot Studio, Azure AI Foundry, or another approved tool surface. |
| First reviewer group | Name the architects and governance reviewers who will test the output. |
| Platform path | Select Copilot Studio, Azure/AWS runtime, or a staged hybrid path for the first implementation proof. |
| Runtime logging and retention | Confirm whether telemetry stores only IDs, hashes, and spans or controlled excerpts. |
| Sustainment ownership | Name the owner for standards, controls, prompts, fixtures, source maps, and release governance. |
| Copilot Service team engagement | Confirm publishing, sharing, support, and governance process for anything beyond individual experimentation. |
26.3 v1.14.1 go/no-go position
v1.14.1 is ready for controlled evidence-loop build preparation and manual framework harness rehearsal using approved data and approved tool surfaces, with the full HTML navigation and source-library handoff surface restored. It is not ready for broad pilot, autonomous approval, production routing, or vendor-owned control logic.
The correct next gate is either:
- a manual harness test using sanitized or approved sample submissions, or
- a small implementation proof that loads fixtures, applies source authority, evaluates controls, produces evidence-bound findings, rejects unsupported inference, routes to human review, captures reviewer actions, and records reviewer overrides.
If the manual harness or implementation proof fails, fix the review brain before building a prettier interface. That sentence has now survived enough deck trauma to qualify as policy.
Source Library and Provenance
These source cards are evidence and provenance, not the main reading path. Versioned historical cards are retained for traceability and are not current release metadata unless a current v1.14.1 artifact explicitly points to them. Open them when you need the underlying source material.
Historical provenance: Build Difficulty and Intelligence-Layer Economics v1.9
Explains where the hard work sits and how that affects build-versus-buy and supplier boundaries.
Build Difficulty and Intelligence-Layer Economics v1.9
Purpose
This artifact explains where the real difficulty sits for the Enterprise Architecture Review Assistant and how that should influence build-versus-buy, vendor engagement, internal ownership, and pilot approval.
The short version: the hardest work is not syntax. The hardest work is converting scattered enterprise judgment into governed, testable, versioned decision assets.
Core thesis
For this product, implementation is secondary to knowledge architecture. The repo, prompts, evaluations, schemas, controls, source authority, and non-inference contract are the product. The code is the delivery mechanism.
A supplier can help wire the system. A supplier should not own the architecture intelligence layer unless the enterprise is comfortable renting back its own review judgment later.
Difficulty map
| Workstream | Relative difficulty | Best ownership posture | Why it matters |
|---|---|---|---|
| Source authority, canonical context, and classification | Very high | Internal ownership required | Requires institutional judgment, source ownership, conflict resolution, and defensibility |
| Review criteria and control model | Very high | Internal ownership required | Converts architecture judgment into repeatable controls without reducing review to dumb checkboxes |
| Non-inference and evidence discipline | Very high | Internal ownership required | Prevents the assistant from treating missing evidence as approval or inventing governance-sensitive facts |
| Evaluation fixtures and expected outputs | High | Internal led, engineering assisted | Forces the team to define what good looks like before demos start lying |
| Repo shape and agent instructions | Medium-high | Internal led, engineering assisted | Makes the work reproducible for humans, agents, suppliers, and future maintainers |
| Integration and workflow plumbing | Medium | Engineering or supplier assisted | Still real work, but a known class of pain once the review brain is clear |
| UI and experience polish | Medium-low | Engineering or supplier assisted | Important for adoption, but not the durable intellectual property |
| Syntax and code generation | Low-medium | Engineering or supplier assisted | Increasingly commoditized when architecture, controls, and evidence contracts are coherent |
Risk reduction versus labor hours
This upfront work is probably 60-75% of the hard risk-reduction work, but not necessarily 60-75% of total labor hours. Integration with enterprise systems can still consume time, budget, and patience, because apparently every enterprise system speaks a different dialect of OAuth and regret.
The difference is that integration pain is estimable and testable. Ambiguity in source authority, evidence, controls, classification, and decision ownership is harder to estimate and more dangerous to outsource.
Build-versus-buy implication
Outsource wiring, workflow, integrations, orchestration, telemetry plumbing, deployment mechanics, and focused engineering acceleration where it makes sense.
Do not outsource these assets as supplier-owned black boxes:
- Source authority and source conflict rules.
- Review criteria and control library.
- Data classification and evidence requirements.
- Non-inference and escalation behavior.
- Evaluation fixtures and expected outputs.
- Human approval and override workflow.
- Decision memory and exception rationale.
- Sustainment model for standards, controls, prompts, and schemas.
Should-build gate
A project is not build-ready merely because it is technically buildable. It is build-ready only when the enterprise can explain why it should exist, who owns the durable decision assets, how success will be measured, and what stop condition will end or redirect the effort.
If the enterprise cannot own the architecture intelligence layer internally, the better answer may be to improve the governance operating model first. Otherwise the team risks automating ambiguity, then paying someone to maintain the ambiguity with a logo on it.
Leadership framing
The hard part is not building a chatbot. The hard part is codifying how the enterprise makes architecture review decisions: which sources are authoritative, what evidence is required, what controls apply, what must never be inferred, how exceptions are handled, and how human architects approve or override the output. Once that architecture intelligence layer is defined, engineering becomes implementation. Without it, engineering just automates ambiguity.
Practical decision rule
If the work changes how the enterprise judges architecture fitness, it must be internally owned.
If the work changes how systems connect, display, route, log, deploy, or package that judgment, it can be supplier-assisted.
Agent Operating Instructions
Root agent operating contract, work modes, non-inference rules, validation expectations, and stop conditions.
AGENTS.md
Purpose
This file defines how coding agents and AI assistants should operate in the Enterprise Architecture Review Assistant repository.
The repository contains governance-sensitive architecture review logic. Agents may help analyze, draft, edit, validate, and prepare pull requests, but they must not infer governance-sensitive facts or convert missing evidence into apparent approval.
Default mode
Start every task in analyze_only mode.
In analyze_only, agents may inspect files, search
content, read documentation, validate JSON syntax, compare manifests,
and produce an assessment. They must not edit files, generate committed
outputs, install dependencies, run migrations, call external services,
or mutate source evidence.
Edit gate
Do not enter an editing mode unless the user explicitly requests implementation, package generation, file modification, or a named artifact update.
Before edits, provide or maintain a concise plan that identifies:
- goal
- decision boundary
- files likely to change
- exact proposed changes
- out-of-scope items
- governance-sensitive assumptions
- validation checks
- stop conditions
Agent work modes
Use the modes defined in agent-work-modes.json.
| Mode | Use | Write access |
|---|---|---|
| analyze_only | Inspect, assess, compare, and recommend | No |
| draft_artifact | Draft new content for human review | No committed repo mutation unless explicitly requested |
| modify_repo | Edit approved files in a bounded scope | Yes, after edit gate |
| run_tests | Run approved validation checks | Only approved/generated test output paths |
| prepare_pr | Prepare branch/commit/PR materials | Yes, after explicit instruction |
| blocked_requires_human | Stop because a human decision is required | No |
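The mode table can be sketched as an enum plus an edit-gate check. The mode names come from the table above. `may_write`, `next_mode`, and the `edit_gate_open` flag are hypothetical helpers, not the contents of agent-work-modes.json. The sketch deliberately simplifies: it does not model the scoped write paths that run_tests and draft_artifact allow.

```python
from enum import Enum


class WorkMode(Enum):
    ANALYZE_ONLY = "analyze_only"
    DRAFT_ARTIFACT = "draft_artifact"
    MODIFY_REPO = "modify_repo"
    RUN_TESTS = "run_tests"
    PREPARE_PR = "prepare_pr"
    BLOCKED_REQUIRES_HUMAN = "blocked_requires_human"


# Modes that mutate the repo, and only after explicit human instruction.
GATED_WRITE_MODES = {WorkMode.MODIFY_REPO, WorkMode.PREPARE_PR}


def may_write(mode: WorkMode, edit_gate_open: bool) -> bool:
    """True only when the mode allows repo writes and a human opened the edit gate."""
    return mode in GATED_WRITE_MODES and edit_gate_open


def next_mode(requested: str, edit_gate_open: bool) -> WorkMode:
    """Resolve a requested mode name, falling back to a safe mode when in doubt."""
    try:
        mode = WorkMode(requested)
    except ValueError:
        # An unknown mode is a human decision, not something to guess.
        return WorkMode.BLOCKED_REQUIRES_HUMAN
    if mode in GATED_WRITE_MODES and not edit_gate_open:
        # Without the edit gate, write modes degrade to the default mode.
        return WorkMode.ANALYZE_ONLY
    return mode
```

Falling back to `analyze_only` rather than raising keeps the agent useful while the gate is closed: it can still inspect and report, it just cannot edit.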
Non-inference contract
Agents must not infer or silently decide any of the following:
- approval status
- data classification
- GxP impact
- security control existence
- technology approval status
- production readiness
- support ownership
- exception approval
- business criticality
- architecture pattern compliance
- source authority
- compliance posture
- privacy, PII, or PHI posture
- operational readiness
- funding approval
- platform approval
If evidence is missing, use one of these states:
not_evidenced, requires_confirmation, or
requires_escalation.
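A minimal sketch of how the three missing-evidence states might be assigned. The state names are the contract's. The `evidenced` state, the function names, and the two boolean flags are assumptions for illustration. Routing conflicts to confirmation follows the fixture expectations later in this file; routing owner decisions to escalation is an assumption.

```python
EVIDENCED = "evidenced"  # hypothetical name for the supported case
NOT_EVIDENCED = "not_evidenced"
REQUIRES_CONFIRMATION = "requires_confirmation"
REQUIRES_ESCALATION = "requires_escalation"


def evidence_state(citations, conflicting=False, needs_owner_decision=False):
    """Pick a state for one governance-sensitive fact without guessing."""
    if not citations:
        return NOT_EVIDENCED
    if conflicting:
        return REQUIRES_CONFIRMATION
    if needs_owner_decision:
        return REQUIRES_ESCALATION
    return EVIDENCED


def record_fact(field, claimed_value, citations, **flags):
    """Keep the claimed value only when it is evidenced; otherwise record the gap."""
    state = evidence_state(citations, **flags)
    return {
        "field": field,
        # A value that is not evidenced is dropped, never carried forward as a default.
        "value": claimed_value if state == EVIDENCED else None,
        "state": state,
        "citations": list(citations),
    }
```

The important behavior is the dropped value: a claimed classification with no citation is recorded as a gap, not as a classification.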
Evidence and receipts
Every material claim should be grounded in one of:
- submitted artifact evidence
- canonical or governed reference source
- schema/control/catalog file
- inspected repository file
- test or validation output
- human reviewer decision
Separate claims into:
- known: directly supported by inspected evidence
- tested: verified by a validation command
- assumed: reasonable but not directly verified
- unknown: not established by available evidence
- blocked: requires human or governance owner decision
Do not claim untested behavior as implemented capability.
Source authority
Source authority is governed by
catalogs/source-authority-map.seed.json and any later
approved source authority map.
Agents may identify conflicts, missing evidence, or stale sources. Agents must not decide which source is canonical unless the authority map or a human owner says so.
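The rule can be sketched as a lookup that never picks a winner on its own. The real structure of catalogs/source-authority-map.seed.json is not shown here, so the flat topic-to-source mapping, the `resolve_source` helper, the `resolved` state, and the example source names are all assumptions.

```python
def resolve_source(topic, available_sources, authority_map):
    """Return the governed canonical source for a topic, or a non-inference state."""
    canonical = authority_map.get(topic)
    if canonical is None:
        # No governed entry: the agent reports the gap instead of choosing.
        return {"topic": topic, "source": None, "state": "requires_confirmation"}
    if canonical not in available_sources:
        # The governed source exists but was not provided for this review.
        return {"topic": topic, "source": None, "state": "not_evidenced"}
    return {"topic": topic, "source": canonical, "state": "resolved"}


# Hypothetical map and sources, for illustration only.
authority_map = {"technology_approval": "tech-catalog"}
sources = ["tech-catalog", "team-wiki"]

print(resolve_source("technology_approval", sources, authority_map)["state"])
print(resolve_source("data_retention", sources, authority_map)["state"])
```

Note that when two sources cover a topic with no map entry, the function does not prefer either one. That decision belongs to the authority map or a human owner.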
Files agents should read first
- README.md
- BOOTSTRAP_MANIFEST.json
- COMPLETENESS_CHECKLIST.md
- NON_INFERENCE_RULES.md
- EVIDENCE_REQUIREMENTS.md
- EVALUATION_STRATEGY.md
- TEST_FIXTURE_PLAN.md
- ARTIFACT_PAIRING_GUIDE.md
- OBSERVABILITY.md
- ADRs under adr/
Diff discipline
Keep changes minimal and surgical.
Do not rewrite whole files when targeted edits are enough. Do not rename schemas, control IDs, catalog paths, source authority levels, evidence states, or decision statuses without an explicit migration plan.
Stop if the task expands beyond the approved scope.
Forbidden without explicit approval
Do not do any of the following without explicit human approval:
- change source authority decisions
- change data classification decisions
- change control pass/fail logic
- change governance routing behavior
- add dependencies
- call external model, cloud, network, or provider services
- mutate submitted evidence
- remove provenance or source-library material
- generate production-ready approval outputs
- implement autonomous approval or auto-escalation closure
- create a duplicate intake portal as an assumed default
Validation expectations
Use the narrowest meaningful validation first.
Recommended checks once implementation exists:
python -m json.tool BOOTSTRAP_MANIFEST.json >/dev/null
python -m json.tool agent-work-modes.json >/dev/null
python -m json.tool schemas/architecture-submission.schema.json >/dev/null
python -m json.tool schemas/review-output.schema.json >/dev/null
python -m json.tool schemas/test-fixture-metadata.schema.json >/dev/null
When test harnesses are implemented, add fixture regression checks that prove:
- missing evidence remains not_evidenced
- conflicting evidence routes to human confirmation
- restricted technology triggers exception review
- fast-path claims with risk do not bypass architecture review
- no assistant output creates approval without human review
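Once a harness exists, checks like these could take roughly the following shape. This is a sketch, not the package's harness: the fixture keys (`id`, `submission`, `expected_states`) and the output keys (`states`, `decision`, `human_decision`) are hypothetical and are not taken from review-output.schema.json.

```python
def check_output(fixture: dict, output: dict) -> list[str]:
    """Compare one review output with a fixture's expectations; return failure messages."""
    failures = []
    for field, expected_state in fixture["expected_states"].items():
        actual = output.get("states", {}).get(field)
        if actual != expected_state:
            failures.append(
                f"{fixture['id']}: {field} expected {expected_state}, got {actual}"
            )
    # Invariant from the list above: no approval without a recorded human decision.
    if output.get("decision") == "approved" and not output.get("human_decision"):
        failures.append(f"{fixture['id']}: approval emitted without human review")
    return failures


def run_regression(fixtures: list[dict], review_fn) -> list[str]:
    """Run review_fn over every fixture submission and collect all failures."""
    failures = []
    for fixture in fixtures:
        failures.extend(check_output(fixture, review_fn(fixture["submission"])))
    return failures
```

Here `review_fn` stands in for whatever produces a draft review from a submission. An empty list means every fixture held. Anything else is a regression to fix before the change is considered done.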
Completion report
After edits, report:
- files changed
- decisions left unresolved
- validation run
- validation not run
- residual risk
- any generated outputs created
Done definition
A change is not done until it:
- preserves the non-inference contract
- preserves source provenance
- keeps governance decisions human-controlled
- validates modified JSON or schemas
- updates the manifest when repo structure changes
- updates the readiness checklist when build readiness changes
- states remaining unknowns instead of hiding them
Artifact Pairing Guide
Guidance for pairing human-readable docs with machine-readable contracts.
Artifact Pairing Guide v1.8
Purpose
This guide defines how human-readable artifacts and agent-readable artifacts should pair inside the Enterprise Architecture Review Assistant package.
The package has four audiences: executives and approvers, EA/governance owners, engineers/builders, and AI agents/models. The same decision must often exist in two forms: one readable by humans and one enforceable or parseable by agents.
Pairing principle
Human-readable artifacts explain intent, rationale, context, and governance meaning. Agent-readable artifacts define structure, allowed states, validation rules, and machine-checkable contracts.
Do not force one artifact to do both jobs badly.
Required pairings
| Human-readable artifact | Agent-readable artifact | Purpose |
|---|---|---|
| PRODUCT_BRIEF.md | BOOTSTRAP_MANIFEST.json | Connect product intent to build-readiness state |
| GOVERNANCE_ROUTING.md | governance fields in schemas and controls | Keep routing explainable and machine-checkable |
| NON_INFERENCE_RULES.md | agent-work-modes.json and schema enums | Prevent agents from guessing governance-sensitive facts |
| EVIDENCE_REQUIREMENTS.md | review-output.schema.json | Ensure every finding has evidence or a missing-evidence state |
| EVALUATION_STRATEGY.md | TEST_FIXTURE_PLAN.md and fixture metadata schema | Convert eval philosophy into regression assets |
| HUMAN_REVIEW_WORKFLOW.md | review output and decision record schema | Preserve human approval and override control |
| OBSERVABILITY.md | trace/event/metric field names | Make runtime behavior auditable |
| TCO_TOKENOMICS.md | cost telemetry fields | Connect economic assumptions to measured run behavior |
| ADRs | manifest decision records | Keep durable decisions visible to humans and agents |
Authoring rules
- Put rationale in Markdown.
- Put validation contracts in JSON Schema or structured JSON.
- Put source authority in governed configuration, not prose alone.
- Put examples in fixtures, not in vague prompt text.
- Put unresolved facts as needs_decision, not_evidenced, requires_confirmation, or requires_escalation.
- Do not encode final approval logic only in a prompt.
Change control
When a human-readable artifact changes, check whether its paired machine-readable artifact also needs an update.
When a machine-readable artifact changes, check whether the human-readable rationale and reviewer workflow still explain the behavior.
A mismatch between the two should block build readiness.
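The change-control rule could be supported by a small check over the files touched in a change. The pairings below are a subset of the table above, using bare file names rather than real repo paths. `PAIRINGS` and `pairing_review_needed` are hypothetical names.

```python
# Human-readable artifact -> paired machine-readable artifacts (subset, for illustration).
PAIRINGS = {
    "PRODUCT_BRIEF.md": ["BOOTSTRAP_MANIFEST.json"],
    "NON_INFERENCE_RULES.md": ["agent-work-modes.json"],
    "EVIDENCE_REQUIREMENTS.md": ["review-output.schema.json"],
    "EVALUATION_STRATEGY.md": ["TEST_FIXTURE_PLAN.md"],
}


def pairing_review_needed(changed_files: set[str]) -> list[tuple[str, str]]:
    """List (changed, partner) pairs where one side changed and its partner did not."""
    needed = []
    for human_doc, partners in PAIRINGS.items():
        for partner in partners:
            if human_doc in changed_files and partner not in changed_files:
                needed.append((human_doc, partner))
            elif partner in changed_files and human_doc not in changed_files:
                needed.append((partner, human_doc))
    return needed
```

A non-empty result is a prompt for review, not an automatic failure. The partner may genuinely need no change, but someone should confirm that rather than assume it.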
Prompt Instruction Quality Standard
Quality standard for scoped, evidence-bound, testable prompts and instructions.
Prompt and Instruction Quality Standard v1.8
Purpose
This standard defines how prompts, agent instructions, reviewer instructions, and model-facing task text should be written for the Enterprise Architecture Review Assistant.
The goal is not longer prompts. The goal is precise, testable, evidence-bound instructions that reduce ambiguity without burying the agent under procedural noise.
Quality principles
| Principle | Requirement |
|---|---|
| Minimal | Include only instructions that materially affect behavior |
| Scoped | State exactly which task, artifact, source, or mode the instruction governs |
| Evidence-bound | Require evidence or a missing-evidence state for material claims |
| Testable | Define expected behavior that can be checked against fixtures |
| Non-conflicting | Do not create competing instructions across files |
| Precedence-aware | Make clear which instruction wins when files conflict |
| Human-controlled | Preserve human approval for governance-sensitive decisions |
| Schema-aligned | Output instructions must match the relevant schema |
Required instruction components
A production prompt or agent instruction should define:
- Role and task boundary.
- Allowed inputs.
- Source authority rules.
- Evidence citation requirement.
- Non-inference rules.
- Output schema or template.
- Human review trigger.
- Stop conditions.
- Validation or fixture expectation.
Anti-patterns
Avoid:
- asking the model to determine approval without human review
- mixing policy, product direction, output format, and runtime instructions in one giant prompt
- using confidence scores as evidence
- telling the model to be exhaustive when the task needs bounded output
- asking the model to fill missing governance facts from common sense
- embedding long standards text directly in prompts when source authority files or controls should carry it
- changing behavior through prompt text when it should be governed configuration
Preferred language
Use:
- Use only approved source evidence.
- If evidence is missing, return not_evidenced.
- If evidence conflicts, return conflicting_evidence and route to human review.
- Do not infer approval, classification, GxP impact, exception status, or source authority.
- Return output that validates against review-output.schema.json.
Release gate
No prompt or instruction set should be promoted into pilot use until it has been tested against fixtures for:
- incomplete submissions
- conflicting evidence
- restricted technology
- fast-path claims with architecture risk
- missing data classification
- possible GxP impact
- adversarial wording
Maintenance rule
If a prompt becomes long because it contains many standards, controls, or catalogs, move those rules into structured configuration and reference them. Prompt text is an operating contract, not a landfill for governance.
Observability and Trace Contract
Trace, event, metric, cost, and override telemetry contract.
Observability and Trace Contract v1.8
Purpose
This artifact defines the minimum observability required before the Enterprise Architecture Review Assistant is piloted or productized.
The assistant must be measurable at the level of intake, extraction, source authority, control evaluation, evidence mapping, recommendation generation, human override, cost, and failure mode.
Observability principles
- Trace the review workflow, not just model calls.
- Record source and control versions used for each run.
- Capture missing evidence and escalation behavior explicitly.
- Capture human accept, reject, edit, and override decisions.
- Separate telemetry from approval authority. Logs prove behavior; they do not approve architecture.
- Do not log restricted content unless retention, classification, and access rules allow it.
Required identifiers
Each run should record:
| Field | Purpose |
|---|---|
| run_id | Unique review execution identifier |
| submission_id | Source submission or package identifier |
| review_type | Pilot review type, such as AI architecture review |
| artifact_set_id | Submitted artifact package identifier |
| source_authority_version | Source authority map version |
| control_library_version | Control set version |
| tool_catalog_version | Technology catalog version |
| prompt_version | Prompt/instruction version |
| model_or_runtime | Model, agent runtime, or platform path used |
| human_reviewer_id | Reviewer finalizing or overriding output, where allowed |
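The identifier set can be carried as a single typed record per run. This is a sketch, not a confirmed schema: the field names follow the table, while the `RunRecord` class, the `missing_identifiers` helper, and the choice to make only `human_reviewer_id` optional are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class RunRecord:
    run_id: str
    submission_id: str
    review_type: str
    artifact_set_id: str
    source_authority_version: str
    control_library_version: str
    tool_catalog_version: str
    prompt_version: str
    model_or_runtime: str
    # Recorded only where retention and access rules allow it.
    human_reviewer_id: Optional[str] = None


def missing_identifiers(record: RunRecord) -> list[str]:
    """List required identifiers that are empty; the reviewer ID is optional."""
    return [
        f.name
        for f in fields(record)
        if f.name != "human_reviewer_id" and not getattr(record, f.name)
    ]
```

A run whose `missing_identifiers` list is non-empty cannot answer the pilot-gate question of which source and control versions were used.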
Required event types
| Event | Meaning |
|---|---|
| review.run.started | Review run started |
| review.artifacts.inventoried | Submitted artifacts detected and classified for processing |
| review.extraction.completed | Structured fact extraction completed |
| review.source_authority.resolved | Source authority map applied |
| review.control_evaluation.completed | Controls evaluated |
| review.evidence_gap.detected | Required evidence missing |
| review.conflict.detected | Evidence conflict detected |
| review.escalation.required | Governance or human escalation required |
| review.output.generated | Draft review output generated |
| review.human.accepted | Human accepted a finding |
| review.human.edited | Human edited a finding |
| review.human.rejected | Human rejected a finding |
| review.human.overrode | Human overrode an assistant finding |
| review.run.completed | Review run completed |
| review.run.blocked | Run blocked pending human decision or missing source |
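A closed event vocabulary is easiest to keep honest when the emitter rejects anything outside it. The sketch below assumes a JSON-lines payload with `event`, `run_id`, `timestamp`, and `attributes` keys. That payload shape and the `make_event` function are hypothetical, while the event names are the ones in the table.

```python
import json
from datetime import datetime, timezone

EVENT_TYPES = frozenset({
    "review.run.started",
    "review.artifacts.inventoried",
    "review.extraction.completed",
    "review.source_authority.resolved",
    "review.control_evaluation.completed",
    "review.evidence_gap.detected",
    "review.conflict.detected",
    "review.escalation.required",
    "review.output.generated",
    "review.human.accepted",
    "review.human.edited",
    "review.human.rejected",
    "review.human.overrode",
    "review.run.completed",
    "review.run.blocked",
})


def make_event(event: str, run_id: str, **attributes: str) -> str:
    """Serialize one trace event as a JSON line; unknown event names are rejected."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event}")
    payload = {
        "event": event,
        "run_id": run_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Attributes should carry IDs and versions, not artifact content.
        "attributes": attributes,
    }
    return json.dumps(payload, sort_keys=True)
```

Note that there is deliberately no approval event in the vocabulary: telemetry records behavior and human actions on findings, not architecture decisions.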
Metrics that matter
| Metric | Why it matters |
|---|---|
| Extraction correction rate | Measures fact extraction quality |
| Evidence citation correctness | Measures defensibility |
| Not-evidenced correctness | Measures non-inference behavior |
| High-risk false negatives | Primary safety and quality risk |
| False positives | Measures noisy findings |
| Reviewer override rate | Shows weak controls, bad extraction, or bad recommendations |
| Cycle-time reduction | Shows operating value |
| Rework rate | Shows downstream usefulness |
| Cost per review | Supports TCO and tokenomics decisions |
| Source freshness age | Detects stale standards and catalogs |
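Several of these metrics can be derived directly from the human-action events. As one example, the sketch below computes reviewer override rate from a list of event names. The definition used here (overrides divided by all human actions on findings) is an assumption that governance owners would need to confirm, and `reviewer_override_rate` is a hypothetical helper.

```python
from collections import Counter


def reviewer_override_rate(events: list[str]) -> float:
    """Share of human-reviewed findings that were overridden (assumed definition)."""
    # Only review.human.* events count as reviewer actions on findings.
    counts = Counter(e for e in events if e.startswith("review.human."))
    reviewed = sum(counts.values())
    if reviewed == 0:
        return 0.0  # no human actions recorded yet
    return counts["review.human.overrode"] / reviewed
```

A rising rate does not say which cause applies. It only signals that overrides need to be examined for weak controls, bad extraction, or bad recommendations.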
Cost telemetry
Where available, record:
- model call count
- input tokens
- output tokens
- embedding/indexing cost
- document extraction cost
- storage/index size
- workflow/orchestration runtime
- human review time
- retries and failed runs
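These fields combine into a cost-per-review figure with simple arithmetic. The sketch below is illustrative only: the `RunCost` shape, the `cost_per_review` function, and every price or rate passed to it are placeholders, not figures from this package or from any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunCost:
    input_tokens: int
    output_tokens: int
    extraction_cost: float        # document extraction, in currency units
    human_review_minutes: float


def cost_per_review(
    run: RunCost,
    price_in_per_1k: float,        # placeholder price per 1,000 input tokens
    price_out_per_1k: float,       # placeholder price per 1,000 output tokens
    reviewer_rate_per_hour: float, # placeholder loaded reviewer rate
) -> float:
    """Sum model, extraction, and human review cost for one run."""
    model = (run.input_tokens / 1000) * price_in_per_1k
    model += (run.output_tokens / 1000) * price_out_per_1k
    human = (run.human_review_minutes / 60) * reviewer_rate_per_hour
    return round(model + run.extraction_cost + human, 2)
```

Keeping human review time in the same figure matters because, under most plausible rates, reviewer time rather than token spend dominates the total.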
Privacy and retention
Do not capture sensitive artifact content in logs unless an approved retention and access model exists.
Prefer references, hashes, source IDs, evidence span IDs, and controlled excerpts over full artifact copies.
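A log-safe evidence reference might look like the following sketch. The `evidence_reference` function and its key names are assumptions. SHA-256 via the standard `hashlib` module is used here only as an example of a content hash.

```python
import hashlib


def evidence_reference(source_id: str, span_id: str, content: str) -> dict[str, str]:
    """Build a log entry from IDs plus a content hash, never the content itself."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return {
        "source_id": source_id,
        "evidence_span_id": span_id,
        "sha256": digest,
    }
```

A hash lets reviewers confirm later that the cited span has not changed without storing the text. A plain hash of very short or predictable text can be guessed by brute force, so whether hashes themselves may be logged remains a retention and classification decision.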
Pilot gate
The pilot should not expand until the team can answer:
- Which controls were evaluated?
- Which source versions were used?
- Which findings were accepted, edited, rejected, or overridden?
- Which missing evidence states were produced?
- Which escalations were triggered?
- What did the review cost to run?
- What changed after human review?
Test Fixture Plan
Fixture taxonomy and expected-output model for regression testing.
Test Fixture Plan v1.8
Purpose
This plan defines the minimum fixture structure needed to test the Enterprise Architecture Review Assistant before trusting it with pilot reviewers.
The assistant is not ready because its output sounds polished. It is ready only when it behaves correctly against known difficult cases.
Directory structure
tests/
  README.md
  fixtures/
    sample/
      fixture-metadata.json
      submitted-artifacts/
      source-notes.md
  expected-outputs/
    sample/
      expected-review-output.json
      expected-findings.md
Required fixture metadata
Each fixture should include:
| Field | Purpose |
|---|---|
| fixture_id | Stable fixture identifier |
| fixture_type | Complete, incomplete, conflicting, restricted technology, etc. |
| review_type | Review type being tested |
| risk_focus | Main behavior under test |
| expected_statuses | Expected pass/fail/not-evidenced/escalation outcomes |
| required_controls | Controls that must be evaluated |
| source_authority_inputs | Sources relevant to the fixture |
| must_not_infer | Governance-sensitive facts the assistant must not guess |
| expected_human_review_trigger | Whether human review is required and why |
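To make the metadata fields concrete, here is a sketch of one fixture record and a completeness check. The field names come from the table. The example values, the value types, and the `missing_fields` helper are hypothetical, and the authoritative shape is whatever schemas/test-fixture-metadata.schema.json defines.

```python
REQUIRED_FIELDS = (
    "fixture_id",
    "fixture_type",
    "review_type",
    "risk_focus",
    "expected_statuses",
    "required_controls",
    "source_authority_inputs",
    "must_not_infer",
    "expected_human_review_trigger",
)

# Hypothetical example values, for illustration only.
sample_fixture = {
    "fixture_id": "FX-0002",
    "fixture_type": "incomplete",
    "review_type": "ai_architecture_review",
    "risk_focus": "non-inference on missing data classification",
    "expected_statuses": ["not_evidenced"],
    "required_controls": ["CTL-001"],
    "source_authority_inputs": ["source-authority-map.seed.json"],
    "must_not_infer": ["data_classification", "gxp_impact"],
    "expected_human_review_trigger": {
        "required": True,
        "reason": "missing data classification",
    },
}


def missing_fields(metadata: dict) -> list[str]:
    """Return required metadata fields that are absent, in table order."""
    return [name for name in REQUIRED_FIELDS if name not in metadata]
```

Listing `must_not_infer` per fixture makes the non-inference expectation testable: a run that outputs a value for one of those facts without cited evidence fails that fixture.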
Minimum fixture set
| Fixture | Required behavior |
|---|---|
| Complete good submission | Produces supported findings without over-escalation |
| Incomplete submission | Marks missing material as not_evidenced |
| Conflicting evidence | Flags conflict and routes to human/source owner confirmation |
| Restricted technology | Triggers exception review and cites catalog/source authority |
| Declining technology | Produces nuanced recommendation and remediation path |
| Missing security model | Produces gap or not-evidenced findings without inventing controls |
| Missing data classification | Requires confirmation or escalation |
| Fast-path claim with risky architecture | Does not accept fast-path claim without evidence |
| Duplicate capability scenario | Flags reuse/duplication concern when supported by sources |
| Possible GxP scenario | Requires escalation when impact cannot be ruled out |
| Non-AI deterministic use case | Challenges AI appropriateness |
| Adversarial wording | Refuses unsupported claims and promotional language |
Expected output requirements
Each expected output should include:
- summary
- finding list
- evidence or missing-evidence statement per finding
- control IDs evaluated
- status per finding
- severity
- recommendation
- human review trigger
- source/control/catalog version references
Regression rule
Rerun the golden fixture set after every material change to prompts, controls, catalogs, schemas, source authority, or the model/runtime path.
Stop release if a high-risk false negative appears or if missing evidence is treated as approval.
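The two stop conditions can be expressed as a comparison between expected and actual statuses per finding. This is a sketch under assumptions: the status strings `pass`, `fail`, `escalation`, and `not_evidenced`, the notion of a set of high-risk finding IDs, and the `stop_release_reasons` function are illustrative names, not the package's schema.

```python
def stop_release_reasons(
    expected: dict[str, str],
    actual: dict[str, str],
    high_risk_findings: set[str],
) -> list[str]:
    """Compare statuses by finding ID and return reasons to stop release."""
    reasons = []
    for finding_id, want in expected.items():
        got = actual.get(finding_id)  # None means the finding was not produced
        if want == "not_evidenced" and got == "pass":
            reasons.append(f"{finding_id}: missing evidence treated as approval")
        elif (
            finding_id in high_risk_findings
            and want in {"fail", "escalation"}
            and got in {"pass", None}
        ):
            # A high-risk issue that was passed or silently dropped.
            reasons.append(f"{finding_id}: high-risk false negative")
    return reasons
```

Any non-empty result blocks the release. Other mismatches, such as noisy false positives, would be tracked as metrics instead of hard stops.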
Human Review Workflow
Human review, override, escalation, and approval boundary.
Human Review Workflow v1.8
Purpose
This workflow defines how architects and governance reviewers validate assistant output before any decision is finalized.
The assistant performs first-pass analysis. Humans retain approval authority.
Roles
| Role | Responsibility |
|---|---|
| Submitter | Provides architecture artifacts and answers follow-up evidence requests |
| Assistant | Extracts facts, evaluates controls, drafts findings, and marks uncertainty |
| Architect reviewer | Accepts, edits, rejects, or overrides findings |
| Governance owner | Confirms routing, escalation, source authority, and exception treatment |
| Domain SME | Confirms domain-specific controls or standards when needed |
| Approver | Makes final approval decision outside the assistant |
Review states
| State | Meaning |
|---|---|
| Draft generated | Assistant output exists but has not been human reviewed |
| In review | Architect is reviewing output |
| More evidence required | Submission lacks evidence needed for a meaningful decision |
| Requires confirmation | Source owner or SME confirmation is required |
| Requires escalation | Governance path must be engaged |
| Reviewed with edits | Human edited assistant output |
| Reviewed with override | Human overrode one or more findings |
| Ready for decision | Human reviewer has prepared decision packet |
| Decision recorded | Final decision and rationale are captured |
Allowed reviewer actions
| Action | Required capture |
|---|---|
| Accept finding | Reviewer, timestamp, finding ID |
| Edit finding | Original text, edited text, rationale |
| Reject finding | Rationale and evidence basis |
| Override finding | Original status, override status, rationale, evidence or judgment basis |
| Request evidence | Missing artifact or field, owner, due date |
| Escalate | Governance path and trigger |
| Finalize packet | Decision posture, conditions, residual risk |
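The capture requirements in this table lend themselves to a simple completeness check before a reviewer action is stored. In the sketch below the action keys and field names are hypothetical snake_case renderings of the table, and `missing_capture` is an illustrative helper.

```python
# Hypothetical field names derived from the "Required capture" column.
REQUIRED_CAPTURE = {
    "accept_finding": {"reviewer", "timestamp", "finding_id"},
    "edit_finding": {"original_text", "edited_text", "rationale"},
    "reject_finding": {"rationale", "evidence_basis"},
    "override_finding": {"original_status", "override_status", "rationale", "basis"},
    "request_evidence": {"missing_item", "owner", "due_date"},
    "escalate": {"governance_path", "trigger"},
    "finalize_packet": {"decision_posture", "conditions", "residual_risk"},
}


def missing_capture(action: str, record: dict) -> set[str]:
    """Fields required for this action that the record lacks or leaves empty."""
    required = REQUIRED_CAPTURE[action]  # KeyError for an action outside the table
    return {name for name in required if record.get(name) in (None, "")}
```

Rejecting an override without a rationale at capture time is what later makes the override feedback loop possible, since each override then arrives with the reasoning needed to classify it.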
Approval boundary
Assistant output may recommend posture, conditions, gaps, escalations, or missing evidence. It must not approve architecture submissions.
Final approval must be performed by authorized humans in the approved governance process or system of record.
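The boundary can be backed by a guard on assistant output in addition to prompt wording. The sketch below assumes each output item carries a `kind` field. That field, the kind labels, and the `assert_no_approval` function are hypothetical and would need to match the real review output schema.

```python
# Hypothetical labels for what the assistant may recommend, per the text above.
ALLOWED_OUTPUT_KINDS = frozenset({
    "posture",
    "condition",
    "gap",
    "escalation",
    "missing_evidence",
})


def assert_no_approval(output_items: list[dict]) -> None:
    """Raise if any assistant output item falls outside the allowed kinds."""
    for item in output_items:
        kind = item.get("kind")
        if kind not in ALLOWED_OUTPUT_KINDS:
            # Anything else, including an approval decision, is rejected.
            raise ValueError(f"assistant output kind not allowed: {kind!r}")
```

An allow-list is used instead of a block-list so that a new, unanticipated output kind fails closed until governance reviews it.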
Override feedback loop
Each override should be reviewed for whether it indicates:
- weak control wording
- missing source authority
- bad extraction
- bad evidence mapping
- incomplete fixture coverage
- outdated catalog status
- true reviewer judgment not suitable for automation
If the override exposes a durable rule, update the control library, fixture set, or source authority map after governance review.
Readiness Scorecard
Human package, repo bootstrap, and agent readiness scorecard.
Readiness Scorecard v1.8
Purpose
This scorecard gives the package a practical readiness posture across human consumption, repo bootstrap, and agent readiness.
Scores are planning indicators, not approval. A package can be structurally strong and still blocked by missing enterprise decisions.
Historical v1.8 posture snapshot
| Area | Score | Meaning |
|---|---|---|
| Human package | 92 / 100 | Strong narrative, audience separation, governance logic, and source provenance. Remaining gap is confirmed enterprise-specific source authority and pilot outputs. |
| Repo bootstrap | 88 / 100 | Strong scaffold, manifest, schemas, controls, catalogs, ADRs, fixture structure, and readiness checklist. Remaining gap is populated pilot fixtures and selected platform path. |
| Agent readiness | 90 / 100 | Strong AGENTS.md, work modes, non-inference, artifact pairing, prompt standard, and validation posture. Remaining gap is executable test harness and runtime trace implementation. |
What blocks a 95+ score
| Blocker | Why it matters |
|---|---|
| First review type not formally locked | Controls and fixtures cannot be final without scope |
| Canonical source authority not confirmed by owners | Findings cannot be treated as governed outputs |
| Data classification for pilot artifacts not confirmed | Processing and retention boundaries remain provisional |
| Pilot fixture set not populated with real/historical examples | Evaluation remains conceptual |
| Platform path not selected | Runtime, RBAC, telemetry, and TCO assumptions remain staged |
| Sustainment owner not named | Standards, controls, prompts, and catalogs will drift |
Build-readiness interpretation
| Score band | Meaning |
|---|---|
| 90-100 | Ready for controlled build or pilot if open decisions are explicitly tracked |
| 80-89 | Strong scaffold, but pilot decisions or fixtures remain incomplete |
| 70-79 | Useful discovery package, not build-ready |
| Below 70 | Narrative or governance structure needs major repair |
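The band table maps directly to a lookup. A minimal sketch, assuming integer scores from 0 to 100. The `readiness_band` function name is hypothetical and the returned strings are the table's own wording.

```python
def readiness_band(score: int) -> str:
    """Return the build-readiness interpretation for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Ready for controlled build or pilot if open decisions are explicitly tracked"
    if score >= 80:
        return "Strong scaffold, but pilot decisions or fixtures remain incomplete"
    if score >= 70:
        return "Useful discovery package, not build-ready"
    return "Narrative or governance structure needs major repair"
```

Applied to the historical v1.8 snapshot, the human package (92) and agent readiness (90) scores fall in the top band and the repo bootstrap score (88) in the second. The band is a planning signal only and does not replace the open decisions listed as blockers.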
Immediate path to 95+
- Lock first review type.
- Name the first reviewer group.
- Confirm pilot artifact data classes.
- Identify canonical standards/catalogs/reference patterns.
- Add 8 to 10 real or synthetic fixtures with expected outputs.
- Select platform path for the first pilot.
- Name sustainment owner for standards, controls, prompts, and source authority.
Source Provenance
v1.8 source baseline and SourceMesh pattern-reference note.
Source Provenance v1.8
Purpose
This artifact identifies the source posture for v1.8 repo bootstrap hardening.
Baseline source package
The v1.8 package is derived from the uploaded v1.7 ZIP and v1.7 self-contained HTML. v1.8 does not restart the product narrative or replace the prior source library.
Source library preservation
The existing source library remains provenance and evidence context. Source-library artifacts should not be promoted to canonical enterprise policy unless confirmed by source authority owners.
SourceMesh pattern review
The pinklon/sourcemesh-artifact-plane repository was reviewed as an internal pattern reference for agent operating discipline. The reusable patterns are:
- read-only default posture
- explicit edit gate
- work mode separation
- evidence receipts
- diff discipline
- hard stop taxonomy
- validation command posture
- backlog selector discipline
- generated-output boundary
- handoff behavior for long-running work
SourceMesh product-specific vocabulary, SourceMesh/RecallPlane architecture, run-folder contracts, rendering paths, and media-pipeline behavior are not imported into this EA package.
Interpretation
SourceMesh is a pattern reference, not an authority source for EA governance, platform approval, source authority, data classification, GxP posture, or architecture review criteria.
Historical provenance: Bootstrap Manifest v1.8
Expanded machine-readable bootstrap manifest and readiness state.
Bootstrap Manifest v1.8
Machine-readable repo bootstrap manifest. See EA_Architecture_Review_Assistant_BOOTSTRAP_MANIFEST_v1.8.json and repo-bootstrap/BOOTSTRAP_MANIFEST.json.
{
"package_name": "Enterprise Architecture Review Assistant",
"package_version": "v1.8",
"created_date": "2026-06-14",
"purpose": "Machine-readable manifest for repo bootstrap completeness, agent readiness, and build-readiness gating.",
"source_baseline": [
"uploaded v1.7 ZIP",
"uploaded v1.7 self-contained HTML"
],
"pattern_reference": [
"pinklon/sourcemesh-artifact-plane AGENTS.md and repo operating discipline reviewed as a pattern reference only"
],
"readiness_scores": {
"human_package": 92,
"repo_bootstrap": 88,
"agent_readiness": 90
},
"artifact_inventory": [
{
"path": "AGENTS.md",
"type": "agent_instruction",
"audience": [
"ai_agents",
"engineers"
],
"purpose": "Root coding-agent operating contract and non-inference boundary.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": ".github/copilot-instructions.md",
"type": "agent_instruction",
"audience": [
"ai_agents",
"engineers"
],
"purpose": "GitHub Copilot repository instruction pointer.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "README.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "PRODUCT_BRIEF.md",
"type": "human_readable",
"audience": [
"executives",
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "AI_APPROPRIATENESS.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "GOVERNANCE_ROUTING.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "ARCHITECTURE.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "ENGINEERING_ORIENTATION.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "BACKLOG.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "EVALUATION_STRATEGY.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": "TEST_FIXTURE_PLAN.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Fixture taxonomy and regression testing model.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": "NON_INFERENCE_RULES.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": "EVIDENCE_REQUIREMENTS.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_1",
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": "HUMAN_REVIEW_WORKFLOW.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Human review, override, escalation, and approval boundary.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": true
},
{
"path": "OBSERVABILITY.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Trace, event, metric, cost, and override telemetry contract.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "TCO_TOKENOMICS.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "SUPPORT_MODEL.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "SPAWN_REPO_INSTRUCTIONS.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "COMPLETENESS_CHECKLIST.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Human-readable completeness checklist and readiness scoring guide.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "READINESS_SCORECARD.md",
"type": "human_readable",
"audience": [
"executives",
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Current human, repo, and agent readiness posture.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "ARTIFACT_PAIRING_GUIDE.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Pairing rules for human-readable and agent-readable artifacts.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "PROMPT_INSTRUCTION_QUALITY_STANDARD.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Quality standard for prompts and agent instructions.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "SOURCE_PROVENANCE.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Baseline package and SourceMesh pattern-reference provenance.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "agent-work-modes.json",
"type": "machine_readable",
"audience": [
"ai_agents",
"engineers"
],
"purpose": "Machine-readable agent work modes and allowed actions.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "adr/ADR-0001-own-the-brain.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "adr/ADR-0002-platform-path.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "adr/ADR-0003-human-approval.md",
"type": "human_readable",
"audience": [
"ea_governance",
"engineers",
"ai_agents"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "schemas/architecture-submission.schema.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "schemas/review-output.schema.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "schemas/test-fixture-metadata.schema.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "controls/control-library.seed.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "controls/control-to-pattern-map.seed.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "catalogs/source-authority-map.seed.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "catalogs/tool-catalog.seed.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "catalogs/reference-pattern-library.seed.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.7 repo-bootstrap seed",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "tests/README.md",
"type": "human_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "tests/fixtures/README.md",
"type": "human_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "tests/fixtures/sample/fixture-metadata.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
},
{
"path": "tests/expected-outputs/README.md",
"type": "human_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / product owner",
"validation": "human_review",
"blocks_build_if_missing": false
},
{
"path": "tests/expected-outputs/sample/expected-review-output.json",
"type": "machine_readable",
"audience": [
"engineers",
"ai_agents",
"ea_governance"
],
"purpose": "Seed artifact required for build readiness or evaluation discipline.",
"required_for_tiers": [
"tier_2",
"tier_3"
],
"current_state": "present_seed",
"completion_state": "seeded_needs_owner_validation",
"source": "v1.8 hardening pass",
"owner_to_confirm": "EA governance / engineering",
"validation": "json_syntax",
"blocks_build_if_missing": false
}
],
"required_directories": [
"adr",
"schemas",
"controls",
"catalogs",
"tests",
"tests/fixtures",
"tests/expected-outputs",
".github"
],
"required_decisions_before_build": [
"first_review_type",
"platform_path",
"source_authority",
"data_classification",
"minimum_control_set",
"human_review_workflow",
"run_cost_assumptions",
"sustainment_owner",
"pilot_fixture_set",
"observability_retention_boundary"
],
"allowed_evidence_states": [
"supported",
"not_evidenced",
"conflicting_evidence",
"requires_confirmation",
"requires_escalation",
"not_applicable"
],
"do_not_infer": [
"approval_status",
"data_classification",
"gxp_impact",
"security_control_existence",
"technology_approval_status",
"production_readiness",
"ownership",
"exception_approval",
"business_criticality",
"platform_approval",
"source_authority",
"privacy_or_pii_posture",
"operational_readiness",
"funding_approval"
],
"agent_work_modes_ref": "agent-work-modes.json",
"human_approval_required": true,
"build_ready_when": [
"manifest_required_artifacts_present",
"json_files_validate",
"first_review_type_selected",
"source_authority_confirmed_for_pilot",
"data_classification_confirmed_for_pilot",
"fixtures_and_expected_outputs_populated",
"human_review_workflow_accepted",
"observability_contract_accepted",
"sustainment_owner_named"
]
}
Historical provenance: Repo Spawn Instructions v1.7
Practical instructions for turning the package into a development repository without inventing missing decisions.
Repo Spawn Instructions v1.7
Purpose
This file explains how an engineer should turn the EA Architecture Review Assistant package into a working development repository without guessing intent, hard-coding review criteria into prompts, or treating the HTML as a magic spell. It is not the final implementation architecture. It is the disciplined starting path.
Core rule
Do not start by asking a model to “build the repo from the HTML” and trusting whatever comes back. That can be useful as a drafting accelerator, but the authoritative starting point is the packaged repo-bootstrap scaffold plus the source, schema, control, catalog, and evaluation artifacts.
The HTML is the human-readable source of truth. The repo-bootstrap folder is the initial machine-readable seed.
Recommended path
Step 1 - Unzip the package
Unzip the v1.7 package into a clean workspace.
unzip EA_Architecture_Review_Assistant_David_Package_v1.7.zip -d ea-review-assistant-package
cd ea-review-assistant-package
Step 2 - Create the development repo
Create a new repository and copy the bootstrap scaffold into it.
mkdir ea-review-assistant
# The trailing "/." also copies hidden entries such as .github, which a "*" glob would skip.
cp -R repo-bootstrap/. ea-review-assistant/
cd ea-review-assistant
If the organization already has a required repo template, use that template first, then copy the repo-bootstrap contents into the correct folders.
Step 3 - Read the package in the right order
Engineers should read the package in this order:
- README.md
- PRODUCT_BRIEF.md
- ARCHITECTURE.md
- ENGINEERING_ORIENTATION.md
- AI_APPROPRIATENESS.md
- GOVERNANCE_ROUTING.md
- EVALUATION_STRATEGY.md
- NON_INFERENCE_RULES.md
- EVIDENCE_REQUIREMENTS.md
- ADRs in adr/
Then review schemas, catalogs, controls, and test fixtures.
Step 4 - Select the delivery tier
Before writing code, identify the intended delivery tier.
| Tier | Build path | What it means |
|---|---|---|
| Tier 1 | Simple Copilot / Agent Builder POC | Lightweight prototype, limited integration, mostly manual source package |
| Tier 2 | Copilot Studio governed pilot | Shared team agent, controlled knowledge, connectors, RBAC, output schema, telemetry |
| Tier 3 | Enterprise integrated product | System-of-record integration, decision memory, audit, source authority, quality/security/privacy routing |
The Enterprise Architecture Review Assistant should not be treated as Tier 1 unless explicitly positioned as a disposable learning prototype.
Step 5 - Validate required files exist
Run a basic manifest check before development starts. At minimum, the repo should contain:
README.md
PRODUCT_BRIEF.md
AI_APPROPRIATENESS.md
GOVERNANCE_ROUTING.md
ARCHITECTURE.md
ENGINEERING_ORIENTATION.md
BACKLOG.md
EVALUATION_STRATEGY.md
NON_INFERENCE_RULES.md
EVIDENCE_REQUIREMENTS.md
TCO_TOKENOMICS.md
SUPPORT_MODEL.md
adr/
schemas/
controls/
catalogs/
tests/
If any of these are missing, the repo is not ready for build.
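The manifest check above could be sketched as follows. This is a minimal sketch, not a script shipped with the package: the function name `missing_entries` is hypothetical, and the hard-coded lists simply mirror the Step 5 list. A real check would more likely read its expectations from the bootstrap manifest.

```python
from pathlib import Path

# Mirrors the Step 5 minimum list. Hard-coding is an assumption of this sketch;
# a real check would likely load these names from the bootstrap manifest.
REQUIRED_FILES = [
    "README.md", "PRODUCT_BRIEF.md", "AI_APPROPRIATENESS.md",
    "GOVERNANCE_ROUTING.md", "ARCHITECTURE.md", "ENGINEERING_ORIENTATION.md",
    "BACKLOG.md", "EVALUATION_STRATEGY.md", "NON_INFERENCE_RULES.md",
    "EVIDENCE_REQUIREMENTS.md", "TCO_TOKENOMICS.md", "SUPPORT_MODEL.md",
]
REQUIRED_DIRS = ["adr", "schemas", "controls", "catalogs", "tests"]


def missing_entries(repo_root: str) -> list[str]:
    """Return required files and directories that are absent from repo_root."""
    root = Path(repo_root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    gaps = missing_entries(".")
    if gaps:
        print("Not ready for build. Missing:", ", ".join(gaps))
    else:
        print("All required files and directories are present.")
```

The check only proves presence. An empty PRODUCT_BRIEF.md passes it, so it complements the owner review rather than replacing it.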
Step 6 - Replace placeholders with actual decisions
Before build, resolve or explicitly mark the following:
| Placeholder area | Required decision |
|---|---|
| First review type | AI architecture review, HR architecture review, or another bounded scope |
| Platform path | Copilot, Copilot Studio, Azure/AWS enterprise runtime, or hybrid |
| Source authority | Which systems/documents are canonical vs reference-only |
| Data classification | What the agent can read, process, store, and return |
| Minimum controls | Which review criteria must be evaluated in the pilot |
| Human workflow | Who reviews, edits, overrides, and approves |
| TCO | Build/run cost assumptions and token/model economics |
| Sustainment | Who owns rules, sources, platform operations, and support |
Step 7 - Validate schemas and seed configuration
The initial repo may contain placeholder schema/config files. Before coding against them:
- Validate JSON syntax.
- Confirm required fields.
- Map each control to required evidence.
- Map each output field to a source, control, or human decision.
- Mark incomplete items as needs_decision, not as approved defaults.
A useful engineering rule: unknowns are data, not blanks to be filled by vibes.
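One way to keep unknowns visible is to parse each seed file and list every value still marked needs_decision. The sketch below assumes the marker is stored as a plain string value. The function names and the JSON-path style of reporting are hypothetical, not part of the package.

```python
import json
from pathlib import Path

NEEDS_DECISION = "needs_decision"  # marker named in Step 7


def find_open_decisions(value, path="$"):
    """Yield the JSON path of every value equal to the needs_decision marker."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_open_decisions(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_open_decisions(child, f"{path}[{index}]")
    elif value == NEEDS_DECISION:
        yield path


def check_seed_file(file_path):
    """Parse one seed file and return (syntax_error_or_None, open_decision_paths)."""
    try:
        data = json.loads(Path(file_path).read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return f"{exc.msg} at line {exc.lineno}", []
    return None, list(find_open_decisions(data))
```

Reporting the open paths, rather than failing on them, keeps the list of pending decisions reviewable by the owner who has to close them.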
Step 8 - Build the first evaluation loop before building the full app
The first working loop should prove this sequence:
- Load one controlled submission package.
- Extract structured facts.
- Apply source authority rules.
- Evaluate a small control set.
- Produce evidence-backed findings.
- Mark missing evidence as not_evidenced.
- Route output to a human reviewer.
- Capture reviewer acceptance, rejection, edit, or override.
- Store a decision record.
- Re-run against the same fixture and confirm repeatability.
If this loop does not work, do not build a prettier UI. That is how bad systems get expensive haircuts.
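The core of that loop can be sketched without a model in it at all. The example below is a simplified illustration under stated assumptions: it checks only whether required evidence is present, the `Finding` shape and function names are hypothetical, and fact extraction, source authority, and decision storage are left out. Whether the evidence actually satisfies the control stays with the reviewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    control_id: str
    status: str        # "supported" or "not_evidenced" in this sketch
    evidence: tuple    # required artifacts that were found
    missing: tuple     # required artifacts that were not found
    human_review_required: bool = True


def evaluate_control(control, submitted_artifacts):
    """Check one control's required evidence against what was submitted.

    Missing evidence becomes not_evidenced; it is never guessed or defaulted.
    """
    required = control["required_evidence"]
    found = tuple(name for name in required if name in submitted_artifacts)
    missing = tuple(name for name in required if name not in submitted_artifacts)
    status = "not_evidenced" if missing else "supported"
    return Finding(control["control_id"], status, found, missing)


def run_review(controls, submitted_artifacts):
    """Evaluate a small control set and return findings for a human reviewer."""
    return [evaluate_control(control, submitted_artifacts) for control in controls]


controls = [{
    "control_id": "EA-AI-DATA-001",
    "required_evidence": ["data classification table", "source inventory"],
}]
first_run = run_review(controls, {"source inventory"})
second_run = run_review(controls, {"source inventory"})
print(first_run[0].status, first_run[0].missing)
print("repeatable:", first_run == second_run)
```

Because the evaluation is a pure function of the controls and the submitted artifacts, re-running the same fixture gives the same findings, which is the repeatability property the last step of the loop asks for.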
Step 9 - Create the first golden test set
Before broader pilot use, define at least five fixtures:
| Fixture | Purpose |
|---|---|
| Complete good submission | Confirms expected positive behavior |
| Incomplete submission | Confirms missing evidence behavior |
| Restricted or declining technology | Confirms catalog/control behavior |
| Fast-path claim with risky design | Confirms governance escalation |
| Ambiguous or conflicting submission | Confirms non-inference behavior |
Each fixture needs an expected output file under tests/expected-outputs/.
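A pairing check along these lines can catch fixtures that were added without an expected output. The one-directory-per-fixture layout is an assumption inferred from the seed paths tests/fixtures/sample/ and tests/expected-outputs/sample/, and the function name is hypothetical.

```python
from pathlib import Path


def fixtures_without_expected_output(tests_root):
    """Return fixture names that have no populated expected-output directory."""
    root = Path(tests_root)
    fixtures_dir = root / "fixtures"
    expected_dir = root / "expected-outputs"
    if not fixtures_dir.is_dir():
        return []
    unmatched = []
    for fixture in sorted(p for p in fixtures_dir.iterdir() if p.is_dir()):
        expected = expected_dir / fixture.name
        # An empty directory does not count as an expected output.
        if not expected.is_dir() or not any(expected.iterdir()):
            unmatched.append(fixture.name)
    return unmatched
```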
Step 10 - Build only after gates pass
Development should not start until:
- First review type is locked.
- Minimum rubric exists.
- Source authority is defined.
- Data classification is understood.
- Pilot submissions are available.
- Platform path is selected.
- Human review workflow is agreed.
- Run-cost assumptions exist.
- Sustainment owner is named.
Model-assisted repo generation
A model can help generate missing starter files, but only under constraint.
Use a prompt like:
Use the attached Enterprise Architecture Review Assistant package and repo-bootstrap scaffold.
Do not invent missing decisions.
Generate only the files listed in the bootstrap manifest.
If required information is missing, mark the field as needs_decision.
Do not convert assumptions into defaults.
Do not make the agent an approver.
Preserve the non-inference and evidence contract.
The output must be reviewed by the product/architecture owner before engineering treats it as authoritative.
Completion check
The repo is ready for first development only when:
| Check | Required state |
|---|---|
| Product brief | Approved enough for pilot |
| Platform path | Selected or explicitly staged |
| Source authority | Defined for pilot sources |
| Data classification | Confirmed for pilot inputs and outputs |
| Control library | Minimum control set exists |
| Output schema | Draft schema exists and validates |
| Non-inference contract | Present and accepted |
| Test fixtures | At least five defined |
| Human review workflow | Agreed |
| Build/run economics | Estimated |
| Open decisions | Tracked, not buried |
What not to do
Do not:
- Treat the HTML alone as an executable specification.
- Let a model infer missing governance decisions.
- Hard-code review criteria into prompt text.
- Start with UI before the evidence loop works.
- Build a second intake portal unless integration is blocked.
- Treat source-library documents as canonical unless source authority says so.
- Let the agent approve architecture submissions autonomously.
Historical provenance: Build Gates and Stop Conditions v1.7
Defines build, pilot, productization gates and stop conditions.
Build Gates and Stop Conditions v1.7
Purpose
This artifact defines the gates that must be cleared before the EA Architecture Review Assistant moves from concept to build, from build to pilot, and from pilot to productization.
The goal is to avoid approving a capability that is impressive in demo form but unowned, unvalidated, expensive to run, or wrong in ways that matter.
Gate 0 - Concept readiness
Required before any build path is selected:
| Gate | Required evidence |
|---|---|
| Problem statement | Clear problem, users, pain, value hypothesis |
| AI appropriateness | AI need and deterministic alternative documented |
| Governance triggers | Business/platform/security/privacy/quality/GxP triggers identified |
| Ownership | Product owner and domain owner identified |
| Scope | First review type and non-goals defined |
Gate 1 - Build readiness
Required before engineering starts:
| Gate | Required evidence |
|---|---|
| Minimum rubric | Initial review criteria approved by domain owner |
| Source authority | Canonical/reference/prohibited sources identified |
| Data classification | Allowed data classes and processing boundaries known |
| Output contract | Review output schema or template defined |
| Non-inference contract | Unsupported facts must be marked, not guessed |
| Test fixture set | Pilot test submissions selected |
| Platform path | Candidate implementation path selected |
| TCO estimate | Build/run/sustainment cost estimated |
Gate 2 - Pilot readiness
Required before running with real reviewers:
| Gate | Required evidence |
|---|---|
| Controls loaded | Initial control library exists and is versioned |
| Source package loaded | Approved source materials accessible |
| Human review flow | Accept/reject/edit/override path defined |
| Evidence behavior | Findings cite evidence or mark missing evidence |
| Logging | Review steps and outputs are logged |
| Evaluation plan | Pilot scoring method defined |
| Failure handling | Escalation and stop conditions known |
Gate 3 - Productization readiness
Required before broader deployment:
| Gate | Required evidence |
|---|---|
| Measured value | Review effort reduction and usefulness demonstrated |
| Quality threshold | False negatives, false positives, and override rates acceptable |
| Sustainment owner | Standards/control/product ownership confirmed |
| Operating model | Support, change, release, and monitoring defined |
| Security/privacy signoff | Required reviews complete |
| Quality/GxP signoff | Required reviews complete if triggered |
| Cost approval | Run economics accepted |
| Exit strategy | Supplier/platform dependency understood |
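Because the gates are sequential, a status report only needs to name the earliest gate that is not yet cleared. The sketch below encodes the four gate tables as data. The snake_case keys and the function name are hypothetical. A conditional item such as the Quality/GxP signoff would be set to true only after a human has recorded that it is complete or not triggered.

```python
# Gate items transcribed from the tables above; key names are illustrative.
GATES = {
    "gate_0_concept": [
        "problem_statement", "ai_appropriateness", "governance_triggers",
        "ownership", "scope",
    ],
    "gate_1_build": [
        "minimum_rubric", "source_authority", "data_classification",
        "output_contract", "non_inference_contract", "test_fixture_set",
        "platform_path", "tco_estimate",
    ],
    "gate_2_pilot": [
        "controls_loaded", "source_package_loaded", "human_review_flow",
        "evidence_behavior", "logging", "evaluation_plan", "failure_handling",
    ],
    "gate_3_productization": [
        "measured_value", "quality_threshold", "sustainment_owner",
        "operating_model", "security_privacy_signoff", "quality_gxp_signoff",
        "cost_approval", "exit_strategy",
    ],
}


def first_blocked_gate(evidence):
    """Return (gate, missing_items) for the earliest uncleared gate, or None."""
    for gate, items in GATES.items():
        # Anything absent or falsy counts as missing; nothing is assumed cleared.
        missing = [item for item in items if not evidence.get(item)]
        if missing:
            return gate, missing
    return None
```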
Stop or redirect conditions
Stop, pause, or redirect if any of the following occur:
- The assistant guesses instead of marking missing evidence.
- High-risk false negatives appear in the pilot.
- Architect override rate is too high.
- Review criteria cannot be defined.
- Source ownership is unresolved.
- Data classification is unknown.
- Platform path remains unresolved.
- Run cost exceeds expected value.
- Sustainment owner is not named.
- Use case is better solved with deterministic workflow, rules, search, or dashboarding.
- Supplier dependency would cause the enterprise to rent back its own review intelligence.
Build gate principle
The build should proceed only when the next decision is smaller than the last one. If every meeting reveals a new unknown, the project is not ready for engineering. It is still in discovery, wearing a toolchain costume.
Historical provenance: Control Authoring Standard v1.7
Defines what a well-formed, testable, evidence-bound architecture control contains.
Control Authoring Standard v1.7
Purpose
This artifact defines what a well-formed architecture review control should contain before it is added to the control library. The goal is to make controls clear, evidence-bound, versioned, testable, and usable by both humans and agents.
A control is not a vague principle. A control is a reviewable rule with applicability logic, evidence requirements, pass/fail/not-evidenced behavior, and remediation guidance.
Minimum control fields
| Field | Required? | Description |
|---|---|---|
| control_id | Yes | Stable unique identifier |
| name | Yes | Human-readable control name |
| category | Yes | Intake, security, data, AI, integration, operations, compliance, etc. |
| rule_statement | Yes | What must be true |
| applicability_trigger | Yes | When the control applies |
| severity | Yes | Critical, high, medium, low, advisory |
| required_evidence | Yes | What evidence must be present |
| pass_criteria | Yes | What evidence satisfies the control |
| fail_criteria | Yes | What evidence proves the control is not met |
| not_evidenced_criteria | Yes | What missing evidence prevents evaluation |
| source_authority_dependency | Yes | Which source authority level is required |
| recommended_remediation | Yes | What submitter or reviewer should do next |
| human_review_required | Yes | Whether final judgment is required |
| linked_patterns | Recommended | Reference patterns or anti-patterns |
| version | Yes | Control version |
| owner | Yes | Domain owner responsible for updates |
| effective_date | Recommended | Date the control becomes active |
| retirement_date | Optional | Date the control is retired |
Control status outcomes
| Outcome | Meaning |
|---|---|
| Pass | Required evidence supports the control |
| Fail | Evidence shows the control is not satisfied |
| Not evidenced | Required evidence is missing |
| Conflicting evidence | Source evidence conflicts |
| Not applicable | Control does not apply to this submission |
| Requires confirmation | Human/source owner confirmation is required |
| Requires escalation | Governance review is required |
Good control example
{
"control_id": "EA-AI-DATA-001",
"name": "AI use case declares data classification",
"category": "data_readiness",
"rule_statement": "All AI-enabled architecture submissions must declare the classification of each input and output data source.",
"applicability_trigger": "applies_when: submission.includes_ai_capability == true",
"severity": "high",
"required_evidence": ["data classification table", "source inventory", "data flow description"],
"pass_criteria": "Each data source and output has an explicit classification and owner.",
"fail_criteria": "One or more data sources are described but classification is absent or contradicted.",
"not_evidenced_criteria": "Submission does not include a data classification section or equivalent evidence.",
"source_authority_dependency": "canonical_or_authorized_submission_artifact",
"recommended_remediation": "Provide data classification, data owner, and allowed processing location for each source and output.",
"human_review_required": true,
"linked_patterns": ["PAT-AI-GOV-DATA-001"],
"version": "0.1",
"owner": "EA/Data Governance",
"effective_date": "TBD"
}
Authoring rules
- Controls must be testable.
- Controls must separate failure from missing evidence.
- Controls must identify the evidence required to pass.
- Controls must not depend on model confidence.
- Controls must record owner and version.
- Controls must not hide governance decisions in prompt text.
- Controls should be written so a human reviewer can challenge or override them.
Review gate for new controls
A new control should not be published until:
- The owner is named.
- The source authority dependency is known.
- Pass/fail/not-evidenced criteria are defined.
- At least one test fixture exists.
- Expected output is defined.
- Human reviewer guidance is written.
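The field table in this standard lends itself to a mechanical pre-check before a control reaches human review. The following is a minimal sketch: the function name `control_problems` is hypothetical, and it checks only field presence, severity vocabulary, and a non-empty evidence list. It cannot judge whether a control is actually testable.

```python
# Fields marked "Yes" in the minimum control fields table.
REQUIRED_CONTROL_FIELDS = [
    "control_id", "name", "category", "rule_statement", "applicability_trigger",
    "severity", "required_evidence", "pass_criteria", "fail_criteria",
    "not_evidenced_criteria", "source_authority_dependency",
    "recommended_remediation", "human_review_required", "version", "owner",
]
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low", "advisory"}


def control_problems(control):
    """Return a list of authoring problems; an empty list means well-formed."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_CONTROL_FIELDS
        if name not in control
    ]
    severity = control.get("severity")
    if severity is not None and str(severity).lower() not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    evidence = control.get("required_evidence")
    if evidence is not None and not (isinstance(evidence, list) and evidence):
        problems.append("required_evidence must be a non-empty list")
    return problems
```

Run against the EA-AI-DATA-001 example above, this returns an empty list. Removing the owner field returns a single "missing field: owner" entry.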
Historical provenance: Requirements to Artifact Traceability Matrix v1.7
Maps discovery questions to repo artifacts and build-readiness outputs.
Requirements to Artifact Traceability Matrix v1.7
Purpose
This artifact maps intake and discovery questions to the repo artifacts they should populate. The point is to make requirements gathering operational. Questions should not float around in meeting notes where future engineers can ignore them by accident, the sacred enterprise tradition.
Traceability model
| Question area | Core questions | Output artifact |
|---|---|---|
| Problem and vision | What problem are we solving? Why now? What happens if we do nothing? | PRODUCT_BRIEF.md |
| Scope | What is in scope for v1? What is explicitly out of scope? | PRODUCT_BRIEF.md, BACKLOG.md |
| Users and roles | Who submits, reviews, approves, administers, and audits? | PRD.md, RBAC_AND_PERMISSION_MODEL.md |
| AI appropriateness | Why is AI needed? What deterministic alternative was considered? | AI_APPROPRIATENESS.md |
| Governance routing | What business, platform, security, privacy, quality, or GxP reviews apply? | GOVERNANCE_ROUTING.md, SECURITY_PRIVACY_QUALITY.md |
| Data sources | What sources are used? Where do artifacts come from? | SOURCE_AUTHORITY_MAP.json, KNOWLEDGE_SOURCES.md |
| Source authority | Which source wins when sources conflict? | SOURCE_AUTHORITY_MAP.json |
| Data classification | What data classes can be processed and where? | DATA_CLASSIFICATION.md |
| Review criteria | What standards, principles, patterns, and checks judge the submission? | CONTROL_LIBRARY.seed.json, EVAL_RUBRIC.md |
| Reference patterns | What good patterns and anti-patterns are used for comparison? | PATTERN_LIBRARY.seed.json |
| Technology status | Which tools are approved, restricted, declining, emerging, or exception-required? | TOOL_CATALOG.seed.json |
| Submission contract | What must a submitter provide? | SUBMISSION_SCHEMA.json |
| Output contract | What must the assistant produce? | REVIEW_OUTPUT_SCHEMA.json |
| Evidence model | What counts as support? What counts as missing? | EVIDENCE_REQUIREMENTS.md |
| Non-inference | What must the assistant never assume? | NON_INFERENCE_RULES.md |
| Human review | How do architects accept, reject, edit, override, and finalize? | HUMAN_REVIEW_WORKFLOW.md |
| Decision memory | What final decision data is retained and reused? | DECISION_MEMORY_MODEL.md |
| Platform path | Copilot, Copilot Studio, Azure, AWS, hybrid, or supplier-assisted? | ARCHITECTURE.md, ADRs |
| TCO and tokenomics | What does it cost to build, run, sustain, and scale? | TCO_TOKENOMICS.md |
| Evaluation | How do we prove it works and does not guess? | EVALUATION_STRATEGY.md, TEST_FIXTURE_PLAN.md |
| Sustainment | Who owns updates, support, monitoring, and degradation response? | SUPPORT_MODEL.md, RUNBOOK.md |
Intake completeness gate
A submission is not build-ready unless each required artifact has either:
- Complete content,
- A named owner and due date, or
- A documented decision that the artifact is not required for the selected build tier.
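The three conditions above translate directly into a boolean check per artifact. In this sketch the record fields (`complete`, `owner`, `due_date`, `not_required_decision`) are hypothetical names chosen for illustration. The package does not define a tracking format.

```python
def artifact_is_accounted_for(artifact):
    """True if the artifact meets one of the three intake completeness conditions."""
    has_content = bool(artifact.get("complete"))
    # An owner without a due date (or the reverse) does not satisfy the gate.
    has_owner_and_date = bool(artifact.get("owner")) and bool(artifact.get("due_date"))
    waived_for_tier = bool(artifact.get("not_required_decision"))
    return has_content or has_owner_and_date or waived_for_tier


def unaccounted_artifacts(artifacts):
    """Return the names of artifacts that block build readiness."""
    return [a["name"] for a in artifacts if not artifact_is_accounted_for(a)]
```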
Traceability expectation
Every major requirement should map to at least one artifact. Every major artifact should map back to at least one requirement or governance need. An artifact that maps back to neither is probably theater.
Historical provenance: Non-Inference and Evidence Contract v1.7
Evidence states and hard limits on what the assistant must not infer.
Non-Inference and Evidence Contract v1.7
Purpose
This artifact defines the evidence boundary for the EA Architecture Review Assistant. The assistant may summarize, extract, compare, and recommend. It may not invent facts, silently resolve uncertainty, or treat missing evidence as approval.
The core rule is simple: if the evidence is not present in an approved source, the assistant must say so.
Non-inference rule
The assistant must not infer any of the following unless explicitly supported by approved source evidence or confirmed by a human reviewer:
- Approval status
- Data classification
- GxP impact
- Security control existence
- Technology approval status
- Production readiness
- Support ownership
- Exception approval
- Business criticality
- Architecture pattern compliance
- Source authority
- Compliance posture
- Privacy or PII posture
- Operational readiness
- Funding approval
- Platform approval
Allowed evidence states
| State | Meaning | Required behavior |
|---|---|---|
| Supported | Evidence exists in an approved source | Cite the evidence and apply the relevant control |
| Not evidenced | Required evidence is missing | State what is missing and what cannot be concluded |
| Conflicting evidence | Sources disagree | Show the conflict and route to human/source owner confirmation |
| Requires confirmation | Evidence is incomplete or ambiguous | Ask for human or source owner confirmation |
| Requires escalation | Governance trigger exists or cannot be ruled out | Route to the correct governance path |
| Not applicable | Control does not apply based on documented facts | Explain why the control is not applicable |
Finding requirements
Every finding must include:
- Finding ID
- Category
- Control ID or criterion
- Status: pass, fail, not evidenced, conflicting evidence, requires confirmation, requires escalation, or not applicable
- Severity
- Evidence citation or missing evidence statement
- Rationale
- Recommended action
- Human review requirement
- Standards/control version
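A finding that lacks any of these elements can be rejected before it reaches a reviewer. The sketch below is illustrative only: the field names are hypothetical, and the status vocabulary is taken from the Control status outcomes table in the Control Authoring Standard. The final check encodes the rule that a finding must carry either an evidence citation or a statement of what is missing.

```python
FINDING_STATUSES = {
    "pass", "fail", "not_evidenced", "conflicting_evidence",
    "not_applicable", "requires_confirmation", "requires_escalation",
}
REQUIRED_FINDING_FIELDS = [
    "finding_id", "category", "control_id", "status", "severity",
    "rationale", "recommended_action", "human_review_required",
    "control_version",
]


def finding_problems(finding):
    """Return a list of contract violations; an empty list means acceptable."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FINDING_FIELDS
        if name not in finding
    ]
    if "status" in finding and finding["status"] not in FINDING_STATUSES:
        problems.append(f"unknown status: {finding['status']}")
    # Either cite the evidence or say what is missing; silence is not allowed.
    if not finding.get("evidence_citation") and not finding.get("missing_evidence_statement"):
        problems.append("needs an evidence citation or a missing-evidence statement")
    return problems
```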
Prohibited language
The assistant should avoid unsupported language such as:
- “This is compliant” when evidence only partially supports it.
- “This appears production-ready” without operational evidence.
- “No GxP impact” without a GxP assessment source.
- “Security controls are adequate” without specific control evidence.
- “Approved technology” without catalog authority.
- “Fast path confirmed” without eligibility evidence.
Preferred language
Use explicit evidence-bound language:
- “Supported by submitted artifact X.”
- “Not evidenced in the submitted package.”
- “Conflicting evidence exists between source A and source B.”
- “Requires source owner confirmation.”
- “Requires architecture reviewer decision.”
- “Requires governance escalation before approval recommendation.”
Human override
Architects may override assistant findings. Overrides must capture:
- Reviewer identity
- Date/time
- Original assistant finding
- Override decision
- Rationale
- Evidence or judgment basis
- Whether the control or source authority model should be updated
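An override record can be modeled as an immutable structure so that the original finding and the reviewer's decision are never edited in place. This is a sketch under assumptions: the class and field names are hypothetical, and storage, identity verification, and audit retention are out of scope.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OverrideRecord:
    reviewer_id: str
    recorded_at: datetime
    original_finding_id: str
    original_status: str
    override_status: str
    rationale: str
    basis: str  # evidence reference or stated judgment basis
    update_control_or_source_model: bool

    def __post_init__(self):
        # An override without a reviewer, rationale, or basis cannot be audited later.
        for field_name in ("reviewer_id", "rationale", "basis"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")


record = OverrideRecord(
    reviewer_id="architect-01",
    recorded_at=datetime.now(timezone.utc),
    original_finding_id="F-001",
    original_status="not_evidenced",
    override_status="pass",
    rationale="Classification table was supplied in a follow-up artifact.",
    basis="Submitted artifact: data classification table",
    update_control_or_source_model=False,
)
print(record.override_status)
```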
Release gate
No pilot output should be treated as reliable unless the assistant demonstrates consistent non-inference behavior across incomplete, ambiguous, conflicting, and adversarial test fixtures.
Historical provenance: Evaluation and Test Fixture Model v1.7
Eval strategy, golden test fixtures, human scoring, and regression expectations.
Evaluation and Test Fixture Model v1.7
Purpose
This artifact defines how the Enterprise Architecture Review Assistant should be evaluated before anyone trusts it. The objective is to test whether the assistant extracts facts correctly, applies controls correctly, cites evidence correctly, refuses to infer unsupported facts, and helps human architects make better decisions faster.
The goal is not to make the output sound impressive. The goal is to make the output defensible.
Evaluation principle
Do not measure one vague thing called accuracy. Measure the work in layers.
| Layer | What to measure | What good looks like |
|---|---|---|
| Intake completeness | Required artifacts and missing information | Missing evidence is identified without invention |
| Fact extraction | Technologies, integrations, data, security model, environment, owner | Extracted facts match source artifacts |
| Source authority | Canonical/reference/prohibited source handling | Canonical sources win and conflicts are flagged |
| Classification | Data sensitivity, GxP, PII, PHI, regulated indicators | Conservative routing when evidence is incomplete |
| Control evaluation | Pass/fail/not-evidenced decisions | Decisions align to approved rubric |
| Evidence mapping | Source support for each finding | Every finding cites evidence or states what is missing |
| Recommendation quality | Decision posture and conditions | Recommendation follows evidence and does not overstate certainty |
| Non-inference behavior | Refusal to guess | Unsupported items are marked not evidenced or require confirmation |
| Human usefulness | Architect value after review | Output reduces review effort and improves consistency |
| Regression stability | Behavior after changes | Known fixtures continue passing after updates |
Fixture set design
The pilot should include a small but deliberate fixture set. Happy-path-only testing is how pilots lie.
| Fixture type | Purpose |
|---|---|
| Complete good submission | Confirms normal positive behavior |
| Incomplete submission | Tests not-evidenced handling |
| Conflicting evidence | Tests source authority and uncertainty handling |
| Restricted technology | Tests catalog and exception logic |
| Declining technology | Tests recommendation nuance |
| Missing security model | Tests control failure and remediation |
| Missing data classification | Tests governance escalation |
| Fast-path claim with risky architecture | Tests fast-path validation |
| Duplicate capability scenario | Tests existing capability/reuse logic |
| GxP or possible-GxP scenario | Tests quality routing |
| Non-AI deterministic use case | Tests AI appropriateness gate |
| Ambiguous submission | Tests refusal to guess |
| Adversarial wording | Tests whether submitter language can game the review |
Minimum fixture counts by build tier
| Build tier | Minimum fixtures | Expected outputs |
|---|---|---|
| Tier 1 - Simple Copilot / Agent Builder POC | 3 | Markdown expected findings |
| Tier 2 - Copilot Studio governed pilot | 8 to 10 | Structured expected outputs, preferably JSON |
| Tier 3 - Enterprise integrated product | 12 to 20 | Schema-validated outputs and regression tests |
Golden test set
The first golden set should include:
- One historical AI architecture submission that should pass first-pass review with minor conditions.
- One submission that is incomplete and should return not-evidenced findings.
- One submission that claims fast-path eligibility but contains architecture risk.
- One submission using a restricted or declining technology.
- One submission with missing data classification.
- One submission that is better solved deterministically than with AI.
- One submission that has conflicting source evidence.
- One submission that may trigger security, privacy, or GxP routing.
Human review scoring
Each output should be scored by architects using a small rubric.
| Score area | Question |
|---|---|
| Extraction correctness | Did the assistant correctly identify the solution facts? |
| Evidence quality | Did findings cite correct source evidence? |
| Control correctness | Were controls applied correctly? |
| Missing evidence handling | Did the assistant avoid guessing? |
| Recommendation quality | Was the recommendation aligned to the evidence? |
| Usefulness | Did this reduce manual review effort? |
| Trust | Would an architect use this output as a first-pass review packet? |
Metrics that matter
| Metric | Why it matters |
|---|---|
| Finding precision | Avoid noisy false positives |
| High-risk false negatives | Detect missed serious issues |
| Evidence citation correctness | Prevent unsupported conclusions |
| Not-evidenced correctness | Prove the assistant can stop at the evidence boundary |
| Reviewer override rate | Identify weak controls or bad extraction |
| Cycle-time reduction | Validate operational value |
| Rework rate | Determine whether outputs improve downstream action |
| Adoption | Confirm reviewers actually use it |
| Standards freshness | Prevent degradation over time |
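Three of these metrics can be computed from reviewer-scored pilot data with very little code. The definitions below are one reasonable reading, not formulas given by the package: precision is taken as confirmed findings over raised findings, and a high-risk false negative is a critical or high severity issue in the expected output that the assistant reported as pass, reported as not applicable, or omitted.

```python
HIGH_RISK = {"critical", "high"}
NON_ISSUE = {"pass", "not_applicable"}


def finding_precision(confirmed, raised):
    """Share of raised findings that reviewers confirmed; None if none were raised."""
    return confirmed / raised if raised else None


def override_rate(overridden, reviewed):
    """Share of reviewed findings that a reviewer overrode; None if none were reviewed."""
    return overridden / reviewed if reviewed else None


def high_risk_false_negatives(expected_findings, actual_status_by_control):
    """Return control IDs where an expected high-risk issue was missed."""
    missed = []
    for expected in expected_findings:
        if expected["severity"] not in HIGH_RISK or expected["status"] in NON_ISSUE:
            continue
        actual = actual_status_by_control.get(expected["control_id"])
        # Omitting the control entirely is treated the same as passing it.
        if actual is None or actual in NON_ISSUE:
            missed.append(expected["control_id"])
    return missed
```

Returning None instead of zero when nothing was raised or reviewed avoids reporting a perfect or failing score for an empty sample.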
Metrics to treat carefully
| Bad or incomplete metric | Why it is dangerous |
|---|---|
| Long report length | More words can hide weak evidence |
| High approval rate | Could mean rubber-stamping |
| Low escalation rate | Could mean missed risk |
| User satisfaction alone | People like fast answers even when wrong |
| Model confidence | Model confidence is not evidence |
| Token reduction alone | Cheaper wrong output is still wrong |
| Number of findings | More findings is not better if noisy |
| Speed alone | Fast bad reviews are still bad reviews |
Regression expectation
Every change to prompts, controls, source authority, catalog status, schemas, platform runtime, or model provider should be tested against the golden fixture set. If high-risk false negatives appear, the release should stop.
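A regression run reduces to comparing the status recorded for each fixture and control in the golden set with the status produced after the change. The sketch below assumes both sides are available as nested dictionaries keyed by fixture name and control ID, which is a hypothetical shape. Deciding whether a difference is a high-risk false negative additionally needs severity data, which this comparison does not carry.

```python
def regression_diff(golden, current):
    """Return (fixture, control_id, expected_status, actual_status) for each mismatch."""
    diffs = []
    for fixture, expected_by_control in golden.items():
        actual_by_control = current.get(fixture, {})
        for control_id, expected_status in expected_by_control.items():
            # A control that disappears from the output is reported, not ignored.
            actual_status = actual_by_control.get(control_id, "missing_from_output")
            if actual_status != expected_status:
                diffs.append((fixture, control_id, expected_status, actual_status))
    return diffs
```

An empty result means the known fixtures still produce the recorded statuses. Any non-empty result should go to a reviewer before release rather than being auto-accepted as the new baseline.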
Historical provenance: Repo Bootstrap Readiness Model v1.7
Tiered repo-readiness model for concept, simple agent, governed pilot, and enterprise product paths.
Repo Bootstrap Readiness Model v1.7
Purpose
This artifact defines the minimum documentation, configuration, schema, test, and decision files needed to seed a development repository from the Enterprise Architecture Review Assistant package.
The goal is not to create paperwork for its own sake. The goal is to prevent a build team from starting with unclear intent, undocumented assumptions, unowned data, untested prompts, or controls embedded in code where nobody can govern them later. Tiny procedural tragedy avoided, which is apparently still progress.
Core principle
The HTML package is the human-readable source of truth. The repo bootstrap pack is the machine-actionable starting structure extracted from that source.
| Layer | Role |
|---|---|
| Self-contained HTML | Human-readable narrative, source library, provenance, and intake pattern |
| Markdown artifacts | Repo documentation and engineering handoff |
| JSON/YAML/schema files | Seed configuration for submission contracts, outputs, catalogs, controls, and source authority |
| ADRs | Architecture decision records for early non-reversible decisions |
| Test fixtures | Validation baseline for extraction, evidence, control evaluation, and non-inference behavior |
Repo readiness tiers
Different build paths require different levels of rigor. A simple Copilot prototype should not require the same package as an enterprise integrated platform, because bureaucracy is not a substitute for architecture.
| Tier | Use case | Minimum expectation |
|---|---|---|
| Tier 0 - Concept / intake | Idea still being shaped | Intent, value hypothesis, AI appropriateness, governance triggers, open questions |
| Tier 1 - Simple Copilot or Agent Builder POC | Narrow assistant, small audience, limited integration | Instructions, knowledge sources, prompts, tests, limitations, simple evaluation rubric |
| Tier 2 - Copilot Studio governed pilot | Shared agent, controlled knowledge, connectors, permissions, workflow | Architecture, source authority, classification, control library, output schema, RBAC, observability, TCO |
| Tier 3 - Enterprise integrated product | Systems of record, decision memory, audit, quality/security/GxP gates, durable run model | Full PRD, system architecture, schemas, controls, catalogs, test fixtures, validation plan, runbook, support model |
Minimum baseline for any AI or agentic build
Every submission should answer these before engineering starts:
- What problem are we solving?
- Who uses it and who owns it?
- Why is AI appropriate?
- What deterministic alternative was considered?
- What data does it touch?
- Which sources are authoritative?
- What outputs does it produce?
- Who can trust, edit, approve, or override the output?
- What does it cost to build and run?
- What governance gates apply?
- What would make us stop or pivot?
Recommended repo structure
ea-review-assistant/
  README.md
  PRODUCT_BRIEF.md
  AI_APPROPRIATENESS.md
  GOVERNANCE_ROUTING.md
  ARCHITECTURE.md
  ENGINEERING_ORIENTATION.md
  BACKLOG.md
  EVALUATION_STRATEGY.md
  NON_INFERENCE_RULES.md
  EVIDENCE_REQUIREMENTS.md
  TCO_TOKENOMICS.md
  SUPPORT_MODEL.md
  adr/
    ADR-0001-own-the-brain.md
    ADR-0002-platform-path.md
    ADR-0003-human-approval.md
  schemas/
    architecture-submission.schema.json
    review-output.schema.json
  controls/
    control-library.seed.json
    control-to-pattern-map.seed.json
  catalogs/
    source-authority-map.seed.json
    tool-catalog.seed.json
    reference-pattern-library.seed.json
  tests/
    fixtures/
    expected-outputs/
Tiered artifact checklist
Tier 0 - Concept / intake
| Artifact | Required? | Notes |
|---|---|---|
| INTENT.md | Yes | Problem, vision, users, value hypothesis |
| AI_APPROPRIATENESS.md | Yes | Why AI is needed, or why deterministic automation is better |
| GOVERNANCE_ROUTING.md | Yes | Business, platform, security, privacy, quality, GxP triggers |
| SOURCE_PROVENANCE.md | Yes | Where the idea and supporting evidence came from |
| OPEN_QUESTIONS.md | Yes | Decisions required before build |
Tier 1 - Simple Copilot / Agent Builder POC
| Artifact | Required? | Notes |
|---|---|---|
| README.md | Yes | Repo entry point |
| PRODUCT_BRIEF.md | Yes | Problem, scope, non-goals, success definition |
| AGENT_INSTRUCTIONS.md | Yes | Behavior, tone, allowed sources, restrictions |
| KNOWLEDGE_SOURCES.md | Yes | Approved documents/sites/files |
| SAMPLE_PROMPTS.md | Yes | Prompt set for pilot users |
| EVALUATION_CHECKLIST.md | Yes | Lightweight scoring guide |
| TEST_CASES.md | Yes | At least 3 representative examples |
| RISK_AND_LIMITATIONS.md | Yes | Known gaps, non-inference rules, escalation rules |
Tier 2 - Copilot Studio governed pilot
| Artifact | Required? | Notes |
|---|---|---|
| Tier 1 artifacts | Yes | All carry forward |
| ARCHITECTURE.md | Yes | Logical architecture and integration assumptions |
| DATA_CLASSIFICATION.md | Yes | Data classes, restrictions, allowed processing zones |
| SOURCE_AUTHORITY_MAP.json | Yes | Canonical vs reference-only sources |
| REVIEW_OUTPUT_SCHEMA.json | Yes | Standard output contract |
| CONTROL_LIBRARY.seed.json | Yes | Minimum criteria translated into controls |
| CONNECTOR_PLAN.md | Yes | M365, SharePoint, portal, API, or manual package path |
| RBAC_AND_PERMISSION_MODEL.md | Yes | User roles, reviewer roles, admin roles, source permissions |
| OBSERVABILITY.md | Yes | Logs, traces, usage, errors, override tracking |
| TCO_TOKENOMICS.md | Yes | Build/run economics and licensing exposure |
| adr/ | Yes | Early decisions captured explicitly |
Tier 3 - Enterprise integrated product
| Artifact | Required? | Notes |
|---|---|---|
| Tier 2 artifacts | Yes | All carry forward |
| PRD.md | Yes | Product requirements and acceptance model |
| SYSTEM_ARCHITECTURE.md | Yes | Runtime, integration, data, control, and security architecture |
| SUBMISSION_SCHEMA.json | Yes | Valid architecture submission contract |
| DECISION_MEMORY_MODEL.md | Yes | Final decision record and reuse model |
| SECURITY_PRIVACY_QUALITY.md | Yes | Security, privacy, quality, GxP routing |
| VALIDATION_PLAN.md | Yes | Validation and evidence strategy |
| RUNBOOK.md | Yes | Support and operations |
| SUPPORT_MODEL.md | Yes | Ownership, escalation, sustainment |
| tests/fixtures/ | Yes | Curated test submissions |
| tests/expected-outputs/ | Yes | Canonical outputs for regression testing |
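The carry-forward rule in the Tier 2 and Tier 3 checklists could be checked mechanically against a repository. The sketch below is illustrative only. The artifact names are copied from the checklists, but the dictionary layout, the function names, and the assumption that every artifact sits at the repo root are hypothetical. Tier 0 is not carried into Tier 1 here because the checklists do not state that it carries forward.

```python
import tempfile
from pathlib import Path

# Artifact names come from the tiered checklists; the layout is an assumption.
TIER_ARTIFACTS = {
    0: ["INTENT.md", "AI_APPROPRIATENESS.md", "GOVERNANCE_ROUTING.md",
        "SOURCE_PROVENANCE.md", "OPEN_QUESTIONS.md"],
    1: ["README.md", "PRODUCT_BRIEF.md", "AGENT_INSTRUCTIONS.md",
        "KNOWLEDGE_SOURCES.md", "SAMPLE_PROMPTS.md", "EVALUATION_CHECKLIST.md",
        "TEST_CASES.md", "RISK_AND_LIMITATIONS.md"],
    2: ["ARCHITECTURE.md", "DATA_CLASSIFICATION.md", "SOURCE_AUTHORITY_MAP.json",
        "REVIEW_OUTPUT_SCHEMA.json", "CONTROL_LIBRARY.seed.json",
        "CONNECTOR_PLAN.md", "RBAC_AND_PERMISSION_MODEL.md", "OBSERVABILITY.md",
        "TCO_TOKENOMICS.md", "adr"],
    3: ["PRD.md", "SYSTEM_ARCHITECTURE.md", "SUBMISSION_SCHEMA.json",
        "DECISION_MEMORY_MODEL.md", "SECURITY_PRIVACY_QUALITY.md",
        "VALIDATION_PLAN.md", "RUNBOOK.md", "SUPPORT_MODEL.md",
        "tests/fixtures", "tests/expected-outputs"],
}

# Only the carry-forward relationships the checklists state explicitly.
CARRIES_FORWARD = {2: 1, 3: 2}


def required_artifacts(tier):
    """Return the tier's own artifacts plus everything it carries forward."""
    names = list(TIER_ARTIFACTS[tier])
    parent = CARRIES_FORWARD.get(tier)
    if parent is not None:
        names = required_artifacts(parent) + names
    return names


def missing_artifacts(repo_root, tier):
    """List required files or directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in required_artifacts(tier) if not (root / name).exists()]


# Demo against a throwaway directory that contains only a README.
with tempfile.TemporaryDirectory() as repo:
    (Path(repo) / "README.md").write_text("# placeholder\n")
    demo_missing = missing_artifacts(repo, 1)

print(demo_missing)
```

A check like this only proves that the files exist. It says nothing about whether their content is adequate, which remains a human review question.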
Recommendation for this Enterprise Architecture Review Assistant
The Enterprise Architecture Review Assistant should not start below Tier 2 if it will be shared with a reviewer group or connected to enterprise sources. If it integrates with the AI Governance Portal, architecture repository, decision memory, or any regulated review path, it should be treated as Tier 3.
A Tier 1 prototype is acceptable only if clearly labeled as a discovery exercise, not a production review mechanism.
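The tier recommendation above reads like a small decision rule, so it can be written as one. This is a hedged sketch: the `BuildProfile` fields are hypothetical names for the conditions named in the recommendation, and returning Tier 1 as the default assumes the work is explicitly labeled as a discovery exercise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildProfile:
    # Hypothetical flags mirroring the conditions in the recommendation.
    shared_with_reviewer_group: bool = False
    connected_to_enterprise_sources: bool = False
    integrates_governance_portal: bool = False
    integrates_architecture_repository: bool = False
    uses_decision_memory: bool = False
    on_regulated_review_path: bool = False


def minimum_tier(profile):
    """Return the lowest readiness tier the recommendation allows."""
    if (profile.integrates_governance_portal
            or profile.integrates_architecture_repository
            or profile.uses_decision_memory
            or profile.on_regulated_review_path):
        return 3
    if profile.shared_with_reviewer_group or profile.connected_to_enterprise_sources:
        return 2
    # Assumption: anything left is a labeled discovery prototype.
    return 1


print(minimum_tier(BuildProfile(shared_with_reviewer_group=True)))
print(minimum_tier(BuildProfile(shared_with_reviewer_group=True,
                                uses_decision_memory=True)))
```

The Tier 3 conditions are checked first so that a build meeting both sets of conditions lands in the stricter tier.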
Historical provenance: Comprehensive Assessment v1.4. Current consolidated narrative, engineering orientation, resourcing, TCO, tokenomics, and platform alignment.
Enterprise Architecture Review Assistant
Comprehensive Alpha Intake, Engineering Orientation, Platform Alignment, Resourcing, and Value Blueprint v1.7
Purpose
This package explains what we are trying to build, why it matters, how it should be governed, what needs to be defined before engineering starts, how the emerging enterprise AI platform patterns should influence the delivery path, and what roles, effort, and run economics must be understood before this moves from idea to build.
It is written for four audiences at once:
- David and the EA/governance team, who own the review judgment, standards, criteria, and final decision model.
- Tony and product/architecture collaborators, who need to convert the idea into a buildable product shape.
- Engineering or platform teams, including Rob C’s team or equivalent implementation resources, who may help wire the capability into approved platforms and systems.
- Future submitters, because this package should become the alpha example of what good AI or agentic intake should look like before anyone starts building.
The short version: this is not a proposal to replace architects with an agent, and it is not a proposal to build a clever demo that cannot be sustained. It is a proposal to build an internally owned architecture intelligence layer that helps architects do first-pass review faster, more consistently, and with stronger evidence. The agent is only the interface and assistance layer. The hard part is the review brain.
Executive thesis
The central recommendation is direct: own the brain, wire the interface.
The enterprise should internally own the product intent, review criteria, standards interpretation, source authority, data classification, reference patterns, semantic/control model, exception logic, evidence model, prioritization rules, and decision memory. A supplier or technical team can help with implementation wiring, workflow, integrations, Copilot/Copilot Studio mechanics, Azure/AWS orchestration, telemetry, and deployment plumbing. But the architecture intelligence layer should not be outsourced.
That distinction matters because the value of this product is not a chatbot, portal, or model call. The value is a governed, versioned, reusable architecture review capability that can explain why a submission passes, fails, needs more evidence, requires escalation, or should be routed to a different platform or governance path.
A vendor can build a portal. A vendor can configure an agent. A vendor can wire APIs. A vendor cannot responsibly define how this enterprise judges architecture fitness, risk, source authority, platform alignment, GxP posture, or exception acceptability. If we hand that layer away, we will rent back our own architecture judgment through maintenance fees and change orders, which is apparently how civilization chose to monetize avoidable dependency.
1. The problem we are solving
Architecture review demand is increasing, especially for AI, GenAI, cloud, SaaS, integration-heavy, and emerging technology submissions. The architecture team is expected to process more reviews without a proportional increase in headcount. The current operating model depends heavily on manual review, scattered standards, inconsistent artifacts, and the institutional memory of individual architects.
The business problem is not simply that reviews take too long. The deeper problem is that the enterprise lacks a scalable, consistent, explainable, and reusable way to evaluate architecture submissions against current standards, approved patterns, data requirements, security expectations, platform constraints, existing capabilities, and prior decisions.
Today, a high-quality review often requires an architect to know where standards live, which version applies, what technologies are approved or restricted, whether an existing capability already solves the problem, how similar decisions were handled before, which governance route applies, and whether an exception is justified. Much of that knowledge is distributed across people, repositories, portals, decks, tools, and memory. That does not scale.
Current pain points
| Pain point | Why it matters |
|---|---|
| Fragmented intake | Reviews arrive through multiple channels, with inconsistent artifacts and context. |
| No common checklist | Review quality depends on reviewer style and available memory. |
| Scattered standards | Reviewers may rely on different or outdated source material. |
| Inconsistent outputs | Findings, risks, and decisions are not captured uniformly. |
| Weak decision memory | Prior approvals, exceptions, and rationale are hard to reuse. |
| Duplicate solutions | Existing capabilities are not always surfaced before new builds proceed. |
| Fast-path ambiguity | Some submissions may self-certify as low risk without enough architectural evidence. |
| Governance pressure | Exceptions may be approved due to delivery timelines rather than clear risk acceptance. |
What a bad review can cause
A weak or inconsistent review can approve non-standard technology, miss a duplicative capability, allow an unsupported or non-scalable pattern, overlook security or data issues, accept an exception without clear rationale, or create a decision that cannot be defended later. In a regulated healthcare, pharma, and medical device environment, those failures are not academic. Some may become operational, compliance, audit, quality, or patient-impacting problems.
This is why the assistant must improve rigor, not merely speed.
2. The vision
The vision is an Enterprise Architecture Review Assistant that helps architects perform first-pass evaluation of architecture submissions using a governed, evidence-backed, internally owned review model.
The assistant should be able to:
- Access or receive submitted architecture artifacts.
- Extract solution intent, business context, technologies, integrations, data flows, security model, operating model, and missing evidence.
- Classify the submission by domain, risk, data sensitivity, GxP potential, business impact, platform path, and governance route.
- Determine whether AI is actually appropriate, or whether the use case is better served by deterministic workflow, rules, dashboarding, search, or a data product.
- Evaluate the submission against approved standards, reference patterns, tool catalogs, control libraries, and architecture principles.
- Generate standardized review outputs: summary, scorecard, findings, evidence, missing information, risks, recommendations, and decision posture.
- Keep final approval with human architects.
- Capture final decision rationale and exceptions as reusable institutional memory.
The long-term capability should make reviews faster, more consistent, more explainable, and more reusable. It should also become a pattern for how future AI/agentic use cases enter architecture review.
The product philosophy
| Principle | Meaning |
|---|---|
| Assist, do not replace | The assistant performs first-pass work. Architects make final decisions. |
| Evidence before opinion | Findings must cite submitted evidence or mark what is missing. |
| Deterministic where possible | Source authority, routing, classification, and control logic should be governed rules where possible. |
| AI where ambiguity exists | Use AI for messy document interpretation, summarization, extraction, and recommendation drafting. |
| Own the intelligence layer | Criteria, controls, standards, patterns, and decision memory must remain internal assets. |
| Platform-fit before build | The delivery path must align to enterprise-approved platform, governance, and integration patterns. |
3. What we are building
We are building a decision-support capability for enterprise architecture review. It has three primary layers.
3.1 Intake and governance layer
This layer clarifies the request before review begins. It should determine what the submission is, who owns it, what business outcome it supports, whether it is an AI use case, what governance path applies, what platform route is likely, and what evidence is required before a meaningful review can occur.
This is where the future assistant should ask the questions we are asking ourselves now: What problem are we solving? Why does this need AI? What data is required? What sources are authoritative? What evidence proves the architecture is ready? What review path applies?
3.2 Architecture intelligence layer
This is the internal asset. It contains source authority, standards, review criteria, control definitions, reference patterns, anti-patterns, approved technologies, data classification, evidence requirements, exception logic, prior decisions, and review-output schemas.
This layer should be versioned, governed, auditable, and reusable. It should not live as undocumented prompt text. It should be represented through structured artifacts such as JSON, YAML, Markdown, schemas, control libraries, and reviewer guidance.
3.3 Experience and implementation layer
This is the interface and runtime layer. It may involve Copilot, Copilot Studio, Teams, SharePoint, the AI Governance Portal, Azure services, AWS services, APIs, MCP, A2A, REST, event patterns, observability, and workflow automation.
This layer can change over time. The architecture intelligence layer should survive changes in UI, model provider, cloud runtime, licensing, and token economics.
4. What we are not building yet
The first version should not become a full enterprise governance platform. It should not automate final approval. It should not replace the Architecture Review Board. It should not become the single system for legal, privacy, security, quality, procurement, and architecture workflows. It should not create a duplicate intake portal if an existing AI Governance Portal or submission system already serves as the system of record.
The first version should prove the disciplined review pattern:
- Can we define the minimum review criteria?
- Can we classify submissions correctly?
- Can we identify missing evidence?
- Can we evaluate against trusted sources and reference patterns?
- Can we produce useful findings that architects trust?
- Can we capture final decisions for reuse?
If those are not true, adding a nicer interface just gives us a more attractive way to be wrong.
5. Why this should be the golden alpha intake model
This use case should model the behavior we eventually expect from other AI and agentic submissions. It is both a product concept and a process example.
Before future teams request or build AI-enabled solutions, they should be able to explain:
- What problem they are solving.
- Who has the problem.
- What happens if nothing changes.
- Why AI is appropriate.
- What deterministic alternatives were considered.
- What data is required.
- Who owns the data.
- What is canonical versus reference-only.
- What governance review is required.
- What platform path is preferred.
- What security, privacy, GxP, quality, audit, or business-risk obligations apply.
- What output must be produced.
- Who owns sustainment.
- What it costs to build, run, improve, and govern.
- Whether the value justifies the engineering and operating cost.
This package should become the first working example because it exposes the truth that many AI projects try to avoid: the hard part is usually not model access. The hard part is clarifying intent, governing data, defining criteria, mapping source authority, managing evidence, and deciding who owns the decision.
6. Methodology: from idea to build-ready
The methodology should balance upfront discipline with practical delivery. We need enough definition to avoid building the wrong thing, but not so much documentation that the project suffocates in its own ceremony.
Stage 1: Intent clarification
Define the problem, target users, first review type, value hypothesis, non-goals, and decision boundaries.
Required outputs:
- Problem statement
- Vision statement
- First use case
- Target users
- Value hypothesis
- Non-goals
- Success and failure criteria
Stage 2: Governance routing
Classify the use case across business function, enterprise platform governance, security, privacy, quality, GxP potential, and AI governance.
Required outputs:
- Governance route
- Required approvers
- Fast-path versus standard-review eligibility
- Security and privacy triggers
- GxP and quality triggers
- Auditability requirements
Stage 3: AI appropriateness gate
Determine whether the problem actually needs generative AI or agentic AI.
Required outputs:
- AI appropriateness statement
- Deterministic alternative assessment
- Human-in-the-loop requirement
- Decision impact classification
- Required explainability level
Stage 4: Data and source authority model
Identify all relevant sources, classify them, determine authority, and define what can be used as evidence.
Required outputs:
- Source inventory
- Source authority map
- Data classification model
- Evidence model
- Access model
- Retention model
- Refresh model
Stage 5: Review criteria and control design
Turn architecture principles into reviewable controls. This is where vague principles become operational checks.
Required outputs:
- Review rubric
- Control library
- Reference pattern library
- Anti-pattern library
- Technology catalog
- Exception rules
- Severity model
Stage 6: Platform-fit routing
Map the use case to the right implementation path.
Required outputs:
- Front door recommendation
- Runtime recommendation
- Integration pattern
- Model/tool access path
- Plan B for licensing or tokenomics changes
- Supplier role boundary
Stage 7: Resourcing and value gate
Before build, define the minimum team, delivery path, expected effort, technology cost exposure, run economics, and value hypothesis. This is not procurement theatre. It is a reality check. A use case that cannot explain who will build it, who will run it, what it costs, and why it is worth doing should not be approved just because someone used the word agent.
Required outputs:
- Delivery role map
- Internal versus augmented skills plan
- Rough order-of-magnitude effort estimate
- Build cost estimate
- Run cost and tokenomics model
- Sustainment owner
- Value hypothesis and stop criteria
Stage 8: Pilot gate
Before build, define the pilot set and evaluation method.
Required outputs:
- Pilot submissions
- Gold-standard human review baseline
- Accuracy and usefulness metrics
- Override tracking
- Cycle-time baseline
- Pilot success criteria
7. How an engineer should read this package
This section is for implementation teams who have not lived through the discovery conversations.
An engineer should not read this package as a final architecture design. It is a build-orientation package. It explains the product intent, source materials, configuration assets, likely platform paths, and open decisions that must be resolved before detailed design.
Recommended reading order
- Read the problem statement and vision.
- Read the alpha intake methodology.
- Read the data architecture and source authority sections.
- Review the platform-fit routing model.
- Review the configuration and data asset map.
- Review the engineering build flow.
- Treat the source library as provenance and evidence, not as official policy unless validated.
- Treat JSON/YAML/schema/control files as seed configuration, not final enterprise standards.
Engineering north star
Build an assistive EA review system that ingests architecture submissions, extracts structured facts, evaluates them against internally owned controls and source-authority rules, produces evidence-backed findings, and routes final decisions to human architects.
What engineers should not do
- Do not hard-code review criteria into prompts.
- Do not make the model the final approver.
- Do not build a duplicate intake portal unless integration is blocked.
- Do not treat company-agnostic templates as official policy without validation.
- Do not make source authority subjective or prompt-driven.
- Do not bury critical review logic in a black-box model response.
8. Engineering build orientation
The engineering model should separate configuration, deterministic logic, AI assistance, workflow, and human review.
8.1 Conceptual component model
| Component | Role | Build posture |
|---|---|---|
| Intake adapter | Reads or receives submissions from the approved intake source. | Integrate with system of record where possible. |
| Artifact processor | Extracts text, tables, diagrams, metadata, and file inventory. | Use approved document processing path. |
| Submission normalizer | Converts artifacts into a standard submission schema. | Deterministic schema-first implementation. |
| Classification engine | Applies data, risk, GxP, platform, and governance classification. | Rule-driven with AI suggestions only where needed. |
| Source authority resolver | Determines which standards and sources can support findings. | Deterministic and governed. |
| Control evaluation engine | Evaluates submission facts against controls and criteria. | Hybrid deterministic + AI-assisted evidence matching. |
| AI assistance layer | Summarizes, extracts, maps evidence, drafts findings. | Bounded by instructions, schemas, and citations. |
| Human review workbench | Lets architects accept, reject, edit, override, and finalize findings. | Required before decision closure. |
| Decision memory store | Captures final decisions, rationale, exceptions, and links. | Reusable institutional memory. |
| Observability layer | Tracks usage, accuracy, overrides, failure modes, and cost. | Required for trust and degradation monitoring. |
8.2 Minimum viable build flow
- Select a pilot review type.
- Load a controlled source package.
- Receive or access submitted architecture artifacts.
- Inventory the package and detect missing expected artifacts.
- Extract structured submission facts.
- Apply classification and source authority rules.
- Evaluate against the initial control library.
- Generate evidence-backed findings.
- Route findings to an architect for review.
- Capture accept/reject/edit/override actions.
- Store the final decision record.
- Feed approved decisions and exceptions into the decision memory layer.
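The inventory step in the flow above is a good candidate for plain deterministic code. The sketch below assumes a hypothetical list of expected artifact types for one pilot review type. The real list would come from the governed submission schema, not from code.

```python
# Hypothetical expected-artifact types for a single pilot review type.
EXPECTED_ARTIFACTS = {
    "architecture_diagram",
    "data_flow",
    "security_model",
    "operating_model",
}


def inventory_package(submitted_types):
    """Compare what was submitted with what the review type expects."""
    submitted = set(submitted_types)
    return {
        "present": sorted(EXPECTED_ARTIFACTS & submitted),
        "missing": sorted(EXPECTED_ARTIFACTS - submitted),
        "unexpected": sorted(submitted - EXPECTED_ARTIFACTS),
    }


report = inventory_package(["architecture_diagram", "security_model", "vendor_brochure"])
print(report["missing"])
print(report["unexpected"])
```

Reporting missing artifacts before extraction begins keeps later findings honest: a control that depends on an absent artifact has nothing to cite.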
8.3 Deterministic versus AI-assisted responsibilities
| Function | Deterministic | AI-assisted | Notes |
|---|---|---|---|
| Source authority ranking | Yes | No | Must not depend on model opinion. |
| Data classification rules | Yes | Suggestion only | Model can flag suspected data types, not decide policy. |
| Required submission completeness | Yes | Evidence detection | Required artifacts should be schema-driven. |
| Technology catalog lookup | Yes | Alias extraction | Model can help map names to known tools. |
| Evidence extraction from messy documents | No | Yes | Good fit for AI if output cites source evidence. |
| Finding narrative | No | Yes | Narrative must be constrained by evidence. |
| Final approval | Human | No | Never autonomous in this use case. |
| Decision record storage | Yes | No | Structured and auditable. |
| Pattern matching | Hybrid | Yes | Rules define applicability; AI helps with interpretation. |
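The technology catalog row shows how the split can work in practice: the model may propose an alias, but the status always comes from the governed catalog. This is a minimal sketch with made-up catalog entries and a hypothetical `resolve_tool_status` helper. It is not the package's catalog format.

```python
# Hypothetical catalog entries; real statuses are owned by technology owners.
TOOL_CATALOG = {
    "example-db": "approved",
    "legacy-queue": "declining",
    "shadow-etl": "exception-required",
}


def resolve_tool_status(mentioned_name, model_alias_suggestion=None):
    """Return (catalog_key, status, how_resolved) for a tool named in a submission.

    The model's alias suggestion is only a lookup hint. It is accepted only
    when it resolves to an existing catalog entry.
    """
    key = mentioned_name.strip().lower()
    if key in TOOL_CATALOG:
        return key, TOOL_CATALOG[key], "exact"
    if model_alias_suggestion:
        alias = model_alias_suggestion.strip().lower()
        if alias in TOOL_CATALOG:
            return alias, TOOL_CATALOG[alias], "alias"
    # No catalog match: the model never invents a status.
    return None, "human review required", "unresolved"


print(resolve_tool_status("Example-DB"))
print(resolve_tool_status("ExDB v2", model_alias_suggestion="example-db"))
print(resolve_tool_status("mystery-tool", model_alias_suggestion="unknown-thing"))
```

An alias match is reported separately from an exact match so a reviewer can see when the mapping relied on model interpretation.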
9. Configuration and data asset map
The “brain” of the system should be a versioned configuration and data layer, not a pile of prompt text.
| Asset | Purpose | Current seed source | Owner to confirm |
|---|---|---|---|
| Source authority map | Defines canonical, reference, derived, and non-authoritative sources. | v1.2 hardening package | EA governance / platform owners |
| Normalized tool catalog | Defines approved, restricted, declining, emerging, exception-required technologies. | v1.2 hardening package | Technology owners |
| Architecture submission schema | Defines what a valid submission package must contain. | v1.2 hardening package | EA governance |
| Review output schema | Defines standard finding, evidence, score, and recommendation output. | v1.2 hardening package | EA governance |
| Machine-executable controls | Defines review checks that can be evaluated. | v1.2 hardening package | EA governance + domain SMEs |
| Control-to-pattern mapping | Links controls to approved reference patterns. | v1.2 hardening package | EA governance |
| Capability ontology | Defines domains, capabilities, and review taxonomy. | v1.1 package | EA governance / enterprise architecture |
| Reference pattern library | Defines approved patterns and where they apply. | v1.1 package | Architecture owners |
| Agent instructions | Defines agent behavior, boundaries, evidence rules, and output contract. | v1.2 hardening package | Product + EA |
| Human reviewer guide | Defines how architects validate, override, and finalize output. | v1.2 hardening package | EA governance |
| Enterprise AI knowledge base | Captures broader platform, governance, stack, and pattern context. | Enterprise stack files | Platform owner validation required |
Configuration principle
Anything that changes because a standard changes should not require application code changes. Standards, controls, source status, platform routing, severity, and evidence requirements should be governed configuration wherever feasible.
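A small sketch of what this principle could look like: the evaluation function reads control definitions from data, so a severity change is a configuration edit rather than a code change. The JSON shape, the control IDs, and the presence-only check are assumptions for illustration. The actual seed control library may be structured differently.

```python
import json

# Hypothetical seed content; in a repo this would live in a governed JSON file.
SEED_LIBRARY = """
{
  "version": "0.1-seed",
  "controls": [
    {"id": "CTRL-001", "required_fact": "data_classification", "severity": "high"},
    {"id": "CTRL-002", "required_fact": "sustainment_owner", "severity": "medium"}
  ]
}
"""


def evaluate(library, facts):
    """Presence-only check: a control passes if its required fact was extracted."""
    findings = []
    for control in library["controls"]:
        has_fact = bool(facts.get(control["required_fact"]))
        findings.append({
            "control": control["id"],
            "severity": control["severity"],
            "state": "Pass" if has_fact else "Not evidenced",
            "library_version": library["version"],
        })
    return findings


library = json.loads(SEED_LIBRARY)
facts = {"data_classification": "internal"}
before = evaluate(library, facts)

# A standards change arrives as a configuration edit; evaluate() is untouched.
library["controls"][1]["severity"] = "high"
library["version"] = "0.2-seed"
after = evaluate(library, facts)

print(before[1])
print(after[1])
```

Stamping each finding with the library version is one way to keep findings traceable to the configuration that produced them.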
10. Enterprise platform alignment
The newly supplied enterprise stack inputs materially improve the package. They show a platform posture that appears to favor Microsoft for front-door/orchestration experiences, AWS as an approved runtime and infrastructure pattern, and open interoperability through MCP, A2A, REST, and event streaming.
These inputs are valuable, but several are company-agnostic templates or model-distilled artifacts. They should be treated as platform intelligence inputs until confirmed by platform owners.
10.1 Microsoft front door
The user experience should likely begin where users already work: Copilot, Teams, SharePoint, or Copilot Studio. For early piloting, Copilot/Agent Builder may help test prompts and outputs. For a governed team assistant, Copilot Studio is more realistic.
10.2 Azure service path
Azure may be useful for AI Foundry, Azure OpenAI, AI Search, Semantic Kernel, API Management, Key Vault, Purview, Logic Apps, Durable Functions, Power Automate, Graph, Log Analytics, and integration into the Microsoft collaboration ecosystem.
10.3 AWS runtime path
If enterprise platform leadership prefers AWS for deeper runtime, orchestration, or Bedrock-native patterns, the product should remain compatible with AWS services such as Bedrock, AgentCore, Step Functions, EventBridge, Lambda, S3, KMS, CloudWatch, and related runtime services.
10.4 Interoperability patterns
MCP, A2A, REST, and event streaming should be treated as candidate integration patterns. They matter because the assistant may eventually need to interact with source systems, registries, enterprise agents, review workflows, and decision stores.
10.5 Plan B
The architecture should avoid tying the intelligence layer to one vendor, one front door, one model, or one licensing model. Copilot token economics may be favorable today. That is not a permanent architecture guarantee. The control library, source authority map, schemas, and decision memory should be portable.
11. Platform-fit routing model
The implementation path should follow the use case profile.
| Use case profile | Likely path | Notes |
|---|---|---|
| Individual productivity, low sensitivity, no business-process impact | Copilot / Agent Builder | Useful for very small experiments. |
| Team assistant over controlled M365 knowledge | Copilot Studio | Good pilot candidate if governance allows. |
| Review assistant integrated with Teams, SharePoint, and workflow | Copilot Studio + Power Automate / Logic Apps | Practical near-term path. |
| Durable business-process assistant with multiple integrations | Enterprise agent framework / Azure / AWS hybrid | More production-suitable. |
| AWS-preferred runtime or Bedrock-native use case | AWS agent framework / Bedrock / Step Functions / Temporal-style orchestration | Viable if platform owner validates path. |
| Regulated, GxP, privacy-sensitive, or decision-impacting workflow | Formal governance first | Do not casually prototype with sensitive data. |
For this use case, the likely path is a staged approach: validate rubric/output quickly, then move toward Copilot Studio or approved enterprise runtime depending on integration, governance, and source-of-record requirements.
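The routing table can be read as an ordered rule set. The sketch below makes one interpretive assumption that the table does not state outright: regulated or decision-impacting profiles are checked first, so "formal governance first" wins over every platform preference. The profile field names are hypothetical. The path strings are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseProfile:
    # Hypothetical flags summarizing the table's use case profiles.
    regulated_or_decision_impacting: bool = False
    aws_preferred_runtime: bool = False
    durable_multi_integration: bool = False
    teams_sharepoint_workflow: bool = False
    team_over_controlled_m365: bool = False


def likely_path(profile):
    """Return the likely implementation path, most restrictive rule first."""
    if profile.regulated_or_decision_impacting:
        return "Formal governance first"
    if profile.aws_preferred_runtime:
        return "AWS agent framework / Bedrock / Step Functions / Temporal-style orchestration"
    if profile.durable_multi_integration:
        return "Enterprise agent framework / Azure / AWS hybrid"
    if profile.teams_sharepoint_workflow:
        return "Copilot Studio + Power Automate / Logic Apps"
    if profile.team_over_controlled_m365:
        return "Copilot Studio"
    return "Copilot / Agent Builder"


print(likely_path(UseCaseProfile(team_over_controlled_m365=True)))
print(likely_path(UseCaseProfile(teams_sharepoint_workflow=True,
                                 regulated_or_decision_impacting=True)))
```

The output is a routing suggestion for platform owners to validate, not a platform decision.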
12. Governance and sensitivity
David’s concern about sensitivity is valid. People may interpret an EA review assistant as a threat to architect roles. That framing is wrong and risky.
This should be positioned as:
- Reducing repetitive first-pass review work.
- Improving consistency across reviewers.
- Capturing institutional knowledge.
- Making missing evidence visible earlier.
- Giving architects better evidence before decisions.
- Preserving human approval and judgment.
It should not be positioned as:
- Replacing architects.
- Automating ARB approval.
- Removing judgment.
- Treating architecture as a binary checklist.
- Using AI to rubber-stamp submissions.
The message is simple: architects remain accountable. The assistant improves the evidence, preparation, and consistency around the decision.
13. Data architecture and source authority
This is the heaviest lift and the least optional part.
The assistant can only be trusted if it knows which sources are authoritative, which sources are reference-only, which sources are stale, which sources are submission evidence, and which sources are not eligible to support findings.
13.1 Required source classes
| Source class | Examples | Required decision |
|---|---|---|
| Submission artifacts | Architecture diagrams, PPT, Word, PDFs, vendor docs, SAExpress exports | What evidence is submitted for review? |
| Standards | EA principles, AI principles, integration principles, data principles, security models | Which versions are authoritative? |
| Technology catalog | Approved, restricted, declining, emerging tools | Who owns tool status? |
| Reference patterns | Approved architecture patterns and reusable designs | Which patterns apply to which domains? |
| Existing capabilities | Platforms, services, reusable solutions | How do we detect duplication? |
| Prior decisions | Past approvals, exceptions, rejections, remediation | How do prior decisions inform new reviews? |
| Enterprise stack guidance | Copilot, Azure, AWS, MCP/A2A, governance paths | Which inputs are official versus draft/reference? |
13.2 Authority levels
| Authority level | Meaning |
|---|---|
| Canonical | Can be used as source of truth for findings and decisions. |
| Governed reference | Useful but not final authority. |
| Submission evidence | Evidence from the project team for a specific review. |
| Derived analysis | Agent-generated extraction or interpretation; must cite source evidence. |
| Historical context | Prior decisions or lessons learned, useful but version-sensitive. |
| Not authoritative | Demo artifacts, unofficial notes, outdated files, or unvalidated model output. |
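A minimal sketch of how these authority levels might be encoded so that only eligible sources can back a finding. The level names come from the table above; the function name `can_support_finding` and the exact eligibility policy (derived analysis needs at least one cited non-derived source, non-authoritative material is never eligible) are assumptions for illustration and would need EA governance confirmation.

```python
from enum import Enum


class AuthorityLevel(Enum):
    CANONICAL = "Canonical"
    GOVERNED_REFERENCE = "Governed reference"
    SUBMISSION_EVIDENCE = "Submission evidence"
    DERIVED_ANALYSIS = "Derived analysis"
    HISTORICAL_CONTEXT = "Historical context"
    NOT_AUTHORITATIVE = "Not authoritative"


# Assumed policy: these sets are illustrative, not a confirmed governance rule.
_NEEDS_BACKING = {AuthorityLevel.DERIVED_ANALYSIS}
_INELIGIBLE = {AuthorityLevel.NOT_AUTHORITATIVE}


def can_support_finding(level, cited_levels=()):
    """Return True if a source at `level` may be cited in support of a finding."""
    if level in _INELIGIBLE:
        return False
    if level in _NEEDS_BACKING:
        # Derived analysis must cite source evidence: require at least one
        # cited source that is neither derived nor non-authoritative.
        return any(
            cited not in _NEEDS_BACKING and cited not in _INELIGIBLE
            for cited in cited_levels
        )
    return True
```

Keeping this check deterministic means the model never decides for itself whether a source is eligible; it only receives sources that have already passed the rule.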
13.3 Evidence states
Every review control should resolve to one of these states:
- Pass
- Gap
- Not evidenced
- Exception required
- Not applicable
- Human review required
“Not evidenced” is critical. It prevents the assistant from inventing completeness when the submission simply lacks enough information.
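The ordering of checks matters here, and a short sketch can make it concrete. The state names are taken from the list above; `resolve_state` and its parameters are hypothetical. "Exception required" is deliberately not produced by this function, on the assumption that exceptions are raised by a separate rule or by the reviewer.

```python
from enum import Enum


class EvidenceState(Enum):
    PASS = "Pass"
    GAP = "Gap"
    NOT_EVIDENCED = "Not evidenced"
    EXCEPTION_REQUIRED = "Exception required"
    NOT_APPLICABLE = "Not applicable"
    HUMAN_REVIEW_REQUIRED = "Human review required"


def resolve_state(applicable, evidence_spans, rule_met, extraction_uncertain=False):
    """Resolve one control to a state without inferring missing evidence.

    rule_met is True, False, or None (None means the evaluator could not decide).
    """
    if not applicable:
        return EvidenceState.NOT_APPLICABLE
    if not evidence_spans:
        # No cited evidence: never fall through to Pass or Gap.
        return EvidenceState.NOT_EVIDENCED
    if extraction_uncertain or rule_met is None:
        return EvidenceState.HUMAN_REVIEW_REQUIRED
    return EvidenceState.PASS if rule_met else EvidenceState.GAP
```

The evidence check runs before the rule check, so a control with no cited evidence can never resolve to Pass even if the evaluator claims the rule is met.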
14. Review criteria and control model
The review model should start narrow, but it must be structurally correct.
14.1 Candidate review dimensions
| Dimension | Purpose |
|---|---|
| Intake completeness | Confirm the submission has enough evidence to review. |
| Business and process context | Understand what problem the solution supports. |
| AI appropriateness | Decide whether AI is justified or deterministic alternatives are better. |
| Platform alignment | Evaluate fit with Copilot, Copilot Studio, Azure, AWS, or hybrid paths. |
| Technology adherence | Check approved, restricted, declining, and exception-required technologies. |
| Architecture fit | Compare against reference patterns and anti-patterns. |
| Integration alignment | Evaluate APIs, events, MCP/A2A, point-to-point risk, ownership, and resilience. |
| Data readiness | Evaluate classification, source authority, lineage, quality, access, retention, and ownership. |
| Security baseline | Evaluate identity, access, encryption, secrets, logging, vulnerability management, and threat model. |
| Operations readiness | Evaluate support model, observability, recovery, lifecycle, change, and ownership. |
| Risk and compliance | Evaluate GxP, privacy, quality, regulatory, audit, and vendor risk triggers. |
| Decision recommendation | Recommend posture, conditions, required remediation, or escalation. |
14.2 Control structure
A control should include:
- Control ID
- Name
- Category
- Rule statement
- Applicability condition
- Required evidence
- Pass condition
- Gap condition
- Not-evidenced condition
- Severity
- Source authority
- Remediation guidance
- Human review trigger
This converts architecture judgment into a reviewable, traceable, human-governed model.
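As an illustration, the fields above could be carried in a small typed record with a completeness check. This is a sketch only: the field types, the `missing_fields` helper, and every value in `EXAMPLE` (including the control ID) are hypothetical placeholders, not entries from the actual control library.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Control:
    control_id: str
    name: str
    category: str
    rule_statement: str
    applicability_condition: str
    required_evidence: tuple
    pass_condition: str
    gap_condition: str
    not_evidenced_condition: str
    severity: str
    source_authority: str
    remediation_guidance: str
    human_review_trigger: str


def missing_fields(control):
    """List the names of fields left empty, so incomplete controls are rejected."""
    return [f.name for f in fields(control) if not getattr(control, f.name)]


# Hypothetical example values for illustration only.
EXAMPLE = Control(
    control_id="EX-001",
    name="Technology status declared",
    category="Technology adherence",
    rule_statement="Each named technology maps to a catalog entry.",
    applicability_condition="Submission names at least one technology.",
    required_evidence=("technology list",),
    pass_condition="All technologies are approved.",
    gap_condition="A restricted or declining technology is used.",
    not_evidenced_condition="No technology list is provided.",
    severity="medium",
    source_authority="Canonical",
    remediation_guidance="Request an exception or select an approved tool.",
    human_review_trigger="Any technology not found in the catalog.",
)

INCOMPLETE = replace(EXAMPLE, remediation_guidance="")
```

Here `missing_fields(EXAMPLE)` returns an empty list, while `missing_fields(INCOMPLETE)` names the blank remediation field. The same shape maps directly onto JSON or YAML if the control library is stored as configuration.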
15. Reference implementation tracks
Track A: Copilot / Agent Builder discovery prototype
Use this only to test the interaction model, initial prompts, output shape, and user reaction with a very small group. It should not be treated as the production architecture.
Best for:
- Rapid concept validation
- Small group review
- Prompt/output exploration
- Low integration needs
Risks:
- Limited lifecycle governance
- Limited integration depth
- Not ideal for decision memory or operational controls
Track B: Copilot Studio governed pilot
Use this when the pilot needs controlled access, M365 knowledge grounding, Teams/SharePoint interaction, workflow, analytics, and a more governed agent model.
Best for:
- 7-to-30 user pilot
- Team-level assistant
- M365 collaboration integration
- Basic workflow and review routing
Risks:
- Platform governance path must be confirmed
- Some source-of-record integration may require additional services
- Complex orchestration may exceed low-code comfort
Track C: Enterprise stack implementation
Use this for a durable product that integrates with the AI Governance Portal, architecture repositories, CMDB, catalog tools, decision stores, telemetry, and enterprise orchestration.
Best for:
- Production-grade architecture review capability
- Durable decision memory
- System-of-record integration
- Complex workflow and audit requirements
- Azure/AWS hybrid implementation
Risks:
- More engineering required
- More governance required
- Requires product ownership and sustainment funding
16. Minimum viable product backlog
Epic 1: Intake and source package
- Define first review type.
- Define pilot submission package.
- Identify source-of-record for submissions.
- Build controlled upload or read-only artifact ingestion path.
- Validate supported file types.
Epic 2: Source authority and configuration
- Finalize source authority map.
- Finalize initial tool catalog.
- Finalize initial reference patterns.
- Finalize initial control library.
- Version the configuration package.
Epic 3: Extraction and normalization
- Extract text and metadata from submitted artifacts.
- Normalize extracted facts into submission schema.
- Detect missing evidence.
- Map technologies to catalog entries.
- Flag uncertain extraction for human review.
Epic 4: Control evaluation
- Apply intake completeness checks.
- Apply technology and platform checks.
- Apply data readiness checks.
- Apply security and operational readiness checks.
- Generate evidence-backed findings.
Epic 5: Human reviewer workflow
- Present summary, scorecard, findings, evidence, and missing information.
- Allow accept, reject, edit, override, and rationale capture.
- Capture final decision posture.
- Export review output.
Epic 6: Decision memory and telemetry
- Store final decision record.
- Store standards/control version used.
- Track override rate, false positives, false negatives, cycle-time savings, and adoption.
- Feed approved decisions into reusable memory.
17. Delivery roles, skills, and resourcing model
If this becomes a build initiative, the next question is not only what platform we use. The next question is who actually does the work, who owns the durable assets, and what level of investment is reasonable for the first pilot.
The answer should be deliberately modest for v1. We do not need a huge delivery army to prove the concept. We do need the right split of product ownership, architecture judgment, data/control modeling, and engineering implementation. One highly capable engineer can do a surprising amount if the scope is contained and the platform path is clear. One highly capable engineer cannot also be the EA product owner, data steward, governance approver, security reviewer, prompt/control librarian, and adoption lead. That way lies the traditional enterprise ritual of asking one person to be a department and acting surprised when they become carbon.
17.1 Core team for a controlled alpha
| Role | Approx. FTE for alpha | Primary responsibility | Internal or augment |
|---|---|---|---|
| Product / architecture lead | 0.25-0.50 | Product framing, scope control, decision model, package ownership | Internal |
| EA domain owner | 0.25-0.50 | Review criteria, standards interpretation, approval logic, reviewer adoption | Internal |
| Data/control architect | 0.50 | Source authority, data classification, control model, schemas, evidence model | Internal preferred |
| Platform engineer / full-stack integrator | 0.50-1.00 | Connector wiring, auth/RBAC, workflow, orchestration, deployment mechanics | Internal or Rob C team / augmentation |
| Security / privacy advisor | 0.05-0.10 | Security, access, privacy, logging, data-handling review | Internal review role |
| Quality / GxP advisor | 0.05-0.10 if triggered | GxP/quality posture, validation implications, audit expectations | Internal review role |
| Architect pilot reviewers | 2-4 reviewers, part-time | Validate findings, override model, usefulness, trust, adoption | Internal |
17.2 Skills needed
| Skill area | Why it matters | Risk if missing |
|---|---|---|
| Enterprise architecture judgment | Turns principles into reviewable criteria | Agent produces generic recommendations |
| Data architecture and classification | Defines what can be processed, retained, cited, and trusted | Unsafe or ungoverned data handling |
| Source authority modeling | Separates canonical sources from reference-only material | Conflicting standards and false confidence |
| JSON/YAML/schema/control modeling | Makes the review brain configurable instead of buried in code | Prompt spaghetti and maintenance pain |
| Identity and RBAC | Ensures users see only what they should | Security and privacy exposure |
| Workflow/orchestration | Routes reviews, exceptions, approvals, and evidence loops | Manual glue work survives the automation |
| Observability and telemetry | Measures accuracy, override rates, adoption, cost, and failures | No way to know if the pilot works |
| Change/release governance | Versions controls, standards, prompts, and schemas | Agent drift and stale review logic |
17.3 Practical staffing scenarios
| Scenario | When it fits | Likely staffing | Tradeoff |
|---|---|---|---|
| Minimal internal alpha | Controlled proof using sample artifacts and manual source package | Tony + David + 1 engineer part-time + reviewers | Fastest, but limited integration |
| Productized pilot | Team use with controlled intake, RBAC, workflow, telemetry, and reusable control library | Tony + David + 1 engineer 0.75-1.0 FTE + data/control architect 0.5 FTE + review advisors | Best balance of speed and discipline |
| Enterprise integrated pilot | Pulls from system of record, writes decision memory, integrates with platform governance | Above + platform owner + security/privacy/quality + integration support | Stronger, but slower and governance heavier |
17.4 Supplier or Rob C team role
The clean boundary is this: internal owners define the review brain; technical augmentation helps wire the system together.
Internal ownership should include product intent, review criteria, semantic/control model, source authority, data classification, prioritization, exception rules, output model, and decision memory. Rob C’s team or a supplier can help with connectors, identity/RBAC, Copilot/Copilot Studio or enterprise-stack mechanics, workflow/orchestration, telemetry, deployment, and platform compliance.
This is not anti-supplier. It is anti-outsourcing-the-part-only-we-understand. Subtle distinction, often missed by people selling roadmaps.
18. TCO, tokenomics, and value gate
Every AI or agentic use case should be required to pass a basic economic reality check before it moves forward. This project should hold itself to the same standard. If a team cannot explain what it will cost to build, what it will cost to run, what licensing or token exposure exists, who will maintain it, and what value it creates, then the use case is not ready for approval.
That is especially important here because the enterprise may have favorable near-term economics through Microsoft 365 Copilot and Copilot Studio licensing. That helps, but it is not a permanent architectural strategy. Licensing can change. Token policies can change. Model availability can change. Usage patterns can surprise everyone, because nothing says enterprise innovation like discovering the invoice after the demo.
18.1 Cost categories that must be estimated
| Cost area | What to estimate | Notes |
|---|---|---|
| Build labor | Product, EA, data/control, engineering, security/privacy, quality, testing | Include internal labor even if not charged back |
| Platform costs | Copilot/Copilot Studio, Azure, AWS, data services, workflow/runtime | Validate against enterprise licensing and chargeback rules |
| Model/token costs | Prompt/completion tokens, embedding, retrieval, evaluation, batch jobs | May be hidden under seat licensing or exposed under API/runtime path |
| Storage and indexing | Artifacts, extracted text, vector indexes, logs, decision memory | Retention and classification rules drive cost |
| Integration costs | APIs, connectors, MCP/tool gateways, identity, workflow, monitoring | Higher if system-of-record integration is required |
| Governance and validation | Security, privacy, quality, GxP, audit, release/change controls | Not optional in regulated contexts |
| Sustainment | Standards updates, control library maintenance, prompt/schema changes, triage | This is where cheap vendor builds become expensive hobbies |
| Adoption and training | Reviewer enablement, documentation, feedback loops, support | Needed for trust and repeat use |
18.2 Tokenomics and runtime questions
| Question | Why it matters |
|---|---|
| Are we using seat-licensed Copilot capabilities, metered API calls, or both? | Determines whether run cost is predictable or usage-based |
| Which steps require model calls versus deterministic logic? | Prevents token burn on tasks rules can handle |
| Are we embedding source corpora, reviewing submitted artifacts live, or both? | Drives indexing and refresh cost |
| How large are typical submission packages? | Controls ingestion, extraction, context, and model cost |
| Do we need repeated evaluation runs per submission? | Can multiply cost quickly |
| Are we storing extracted facts, evidence spans, and decision records? | Reduces repeated processing but adds storage/governance obligations |
| What happens if Microsoft tokenomics or licensing changes? | Forces a Plan B before dependency becomes expensive |
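To make the metered-path questions concrete, here is a rough arithmetic sketch. Every number in the usage example is a placeholder, not a vendor rate or a measured package size; the function name and parameters are hypothetical. It applies only where the run path is usage-based rather than covered by seat licensing.

```python
def monthly_token_cost(
    submissions_per_month,
    input_tokens_per_run,
    output_tokens_per_run,
    runs_per_submission,
    price_per_1k_input,
    price_per_1k_output,
):
    """Estimate monthly metered model cost from volume, size, and unit prices."""
    cost_per_run = (
        input_tokens_per_run / 1000 * price_per_1k_input
        + output_tokens_per_run / 1000 * price_per_1k_output
    )
    return submissions_per_month * runs_per_submission * cost_per_run


# Placeholder inputs: 20 submissions, 60k input and 4k output tokens per run,
# 3 evaluation runs each, at illustrative unit prices of 0.01 and 0.03 per 1k.
estimate = monthly_token_cost(20, 60_000, 4_000, 3, 0.01, 0.03)
```

With these placeholder inputs the estimate is about 43.20 per month in whatever currency the unit prices use. The point of the sketch is the multiplier: repeated evaluation runs and large submission packages scale cost linearly, which is why both appear in the question list.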
18.3 Value model
Value must be classified before it is judged. A proposal may be useful and still fail the current approval test if it does not match the decision context.
| Value class | What makes it decision-ready |
|---|---|
| Direct savings | Strongest when it lands in the accountable operating budget. |
| Indirect savings | Useful when the financial path is credible but not yet booked. |
| Avoided cost | Valid when the counterfactual cost, renewal, incident, or lifecycle obligation is real and time-bounded. |
| Risk reduction | Valid when operational, security, compliance, or audit exposure is evidenced and the decision owner agrees it matters now. |
| Capacity release | Useful when released time is tied to named work or support demand. |
| Quality or rework reduction | Valid when defect load, rework, or service drag is measurable. |
| Cross-workstream value | Valid when the benefit lands outside the local team, but only if a benefiting owner and sponsor accept it. |
All value classes matter, but not equally in every business climate. When the immediate decision context is an OPEX reduction mandate, direct savings from an accountable operational budget may be the only class that materially changes the decision.
- Decision owner: decides whether the proposal clears the current line.
- Budget owner: owns the operating budget or capacity being affected.
- Benefiting owner: receives the service, cost, risk, or quality benefit if the claim is true.
- Evidence owner: owns the numbers, baseline, or operational proof behind the claim.
The Outcome Acceptance Line is the current threshold for approval. A proposal can show real value and still sit below the line. Below-line items may still warrant sandboxing, time-boxed feasibility, an explicit exception, or a stop decision.
The first pilot should not claim enterprise-wide transformation. It should prove a measurable local value hypothesis.
Pilot measurement examples:
| Value lever | Pilot measurement |
|---|---|
| Review cycle-time reduction | Baseline human review time versus assisted first-pass review time |
| Reviewer effort reduction | Hours saved per submission on intake, evidence review, and output drafting |
| Consistency improvement | Agreement across reviewers using common criteria and output format |
| Missing evidence detection | Percentage of incomplete submissions flagged correctly |
| Reuse of prior decisions | Number of findings or recommendations linked to precedent |
| Risk reduction | Duplicative, unsupported, restricted, or non-standard patterns caught earlier |
| Adoption | Reviewer usage, trust score, override rate, and repeat-use willingness |
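Several of these measurements reduce to simple ratios. The sketch below shows three of them; the function names and the sample figures in the comments are hypothetical, and the real definitions (for example, what counts as an override) would need to be agreed before the pilot starts.

```python
def cycle_time_reduction(baseline_hours, assisted_hours):
    """Fraction of review time saved relative to the human-only baseline."""
    if baseline_hours <= 0:
        raise ValueError("baseline_hours must be positive")
    return (baseline_hours - assisted_hours) / baseline_hours


def override_rate(findings_total, findings_overridden):
    """Share of assistant findings that reviewers overrode."""
    if findings_total == 0:
        return 0.0
    return findings_overridden / findings_total


def missing_evidence_recall(flagged_incomplete, actually_incomplete):
    """Share of truly incomplete submissions that the assistant flagged."""
    actual = set(actually_incomplete)
    if not actual:
        return 1.0
    return len(actual & set(flagged_incomplete)) / len(actual)


# Illustrative figures only: an 8-hour baseline review done in 6 hours with
# assistance is a 0.25 reduction; 6 overrides out of 40 findings is 0.15.
```

Recording the baseline before the pilot begins matters more than the formulas: without a measured human-only review time, the cycle-time claim cannot be evidenced.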
18.4 Alpha-level investment bands
These are planning bands, not budget commitments. They should be replaced with actual internal rates, platform costs, and sourcing assumptions once the delivery path is selected.
| Delivery path | Likely duration | Internal effort | Augmentation need | Cost posture |
|---|---|---|---|---|
| Manual source-package alpha | 2-4 weeks | Low to moderate | Optional part-time engineer | Lowest cost, weakest integration |
| Copilot/Copilot Studio pilot | 4-8 weeks | Moderate | 0.5-1.0 technical FTE | Good proof path if governance permits |
| Enterprise integrated pilot | 8-12+ weeks | Moderate to high | 1.0+ engineering plus platform/security support | Highest fidelity, slower path |
| Vendor-led build | Variable | Still high internal SME burden | Supplier team | Risk of low upfront/high sustainment cost |
18.5 Approval gate for any future AI/agentic submission
A future submission should be considered incomplete if it cannot answer these questions:
- What business problem is being solved?
- What value class is claimed, and what measurable value is expected?
- Who owns the decision, the impacted budget or capacity, the benefit, and the evidence?
- Why is AI or agentic automation appropriate?
- What deterministic alternative was considered?
- What data is required and who owns it?
- What sources are canonical, reference-only, or prohibited?
- What platform path is proposed and why?
- What governance reviews are triggered?
- What is the expected build effort?
- What are the expected run costs, including tokenomics where applicable?
- Which value class is the current business climate weighting most heavily?
- Is the proposal above or below the Outcome Acceptance Line, and what is the below-line path if it does not clear it?
- Who owns sustainment after launch?
- What metrics prove success or failure?
- What is the stop condition if the value does not materialize?
This is the filter that keeps architecture review from approving AI slop with a budget line. Harsh, yes. Cheaper than cleaning it up later.
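The gate above can be applied as a deterministic completeness check before any model is involved. In this sketch the sixteen keys are hypothetical short names for the sixteen questions, in the same order; the submission is assumed to be a simple mapping of key to free-text answer.

```python
# Hypothetical short keys, one per gate question, in the order listed above.
GATE_QUESTIONS = (
    "business_problem",
    "value_class_and_measure",
    "owners",
    "why_ai",
    "deterministic_alternative",
    "data_and_owner",
    "source_authority",
    "platform_path",
    "governance_reviews",
    "build_effort",
    "run_costs",
    "business_climate",
    "acceptance_line_position",
    "sustainment_owner",
    "success_metrics",
    "stop_condition",
)


def unanswered(submission):
    """Return gate questions with no non-blank text answer, in gate order."""
    gaps = []
    for key in GATE_QUESTIONS:
        value = submission.get(key)
        if not (isinstance(value, str) and value.strip()):
            gaps.append(key)
    return gaps


def is_complete(submission):
    return not unanswered(submission)
```

A check like this only confirms that an answer exists. Whether the answer is credible remains a reviewer judgment.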
19. Build versus buy boundary
Own internally
- Product vision
- Review criteria
- Source authority
- Data classification
- Semantic/control model
- Reference patterns
- Exception logic
- Decision memory
- Approval model
- Quality and governance rules
Supplier-assisted
- UI implementation
- Connector configuration
- Copilot Studio build mechanics
- Azure/AWS orchestration implementation
- Workflow automation
- Runtime deployment
- Logging and telemetry plumbing
- Test harness and evaluation tooling
Avoid
- Vendor-owned control library
- Vendor-owned standards interpretation
- Vendor-owned decision memory
- Duplicate intake portal that bypasses the existing system of record
- Black-box scoring that architects cannot challenge
- Perpetual maintenance dependency for changes the enterprise should own
20. Delta needed before build
| Decision | What must be answered |
|---|---|
| First review type | Confirm AI architecture review as the first use case, or choose another. |
| First user group | Identify the first architects/reviewers and what they expect. |
| Intake source | Confirm whether artifacts come from AI Governance Portal, SharePoint, LeanIX, SAExpress, or manual package. |
| Platform path | Confirm Copilot/Copilot Studio versus Azure/AWS/hybrid for the pilot. |
| Governance route | Confirm business, platform, security, privacy, quality, and GxP review requirements. |
| Minimum rubric | Define the smallest useful set of review criteria. |
| Source authority | Identify canonical standards, catalogs, and reference patterns. |
| Data classification | Define what artifact classes are allowed in the pilot. |
| Evidence model | Define how findings must cite source evidence. |
| Output format | Define good output for architects, project teams, leadership, and audit. |
| Pilot set | Select historical submissions to test against. |
| Success metrics | Define accuracy, usefulness, time saved, override rate, and adoption targets. |
| Sustainment owner | Decide who owns standards, controls, and the product after the pilot. |
| TCO and value gate | Estimate build effort, run cost, tokenomics exposure, and measurable value before build approval. |
21. Immediate working plan for David and Tony
David leads
- Architecture review criteria
- EA principles and standards interpretation
- Reference patterns and anti-patterns
- Human review and approval model
- Reviewer expectations
- Leadership positioning and sensitivity handling
Tony leads
- Product framing
- Data/control model structure
- Source authority packaging
- Engineering orientation
- Platform-fit analysis
- Supplier boundary and delivery mechanics
Together
- Confirm first use case.
- Define minimum viable rubric.
- Select pilot submissions.
- Agree on output model.
- Decide which platform path to test first.
- Define what help is needed from Rob C’s team or another technical resource.
22. Recommended next move
Schedule a focused working session to lock the build-readiness foundation. The session should not attempt to design the entire platform. It should answer enough to start a controlled pilot.
Suggested agenda:
- Confirm first review type.
- Confirm first user group.
- Walk the minimum review rubric.
- Identify authoritative standards and catalogs.
- Confirm source-of-record for submitted artifacts.
- Confirm platform path candidates.
- Identify governance gates.
- Define pilot success metrics.
- Identify what Rob C’s team or another technical resource would own.
The output should be a one-page pilot charter, a source authority map, a minimum control library, a pilot submission set, and a build decision for the first technical path.
Historical provenance: Engineering Build Orientation v1.4. Engineer-facing build orientation and component model.
Engineering Build Orientation v1.4
Purpose
This document translates the Enterprise Architecture Review Assistant concept into engineering terms. It is not a final solution architecture. It is a build-orientation guide for engineers, platform teams, and implementation partners who need to understand what the product is supposed to do before choosing tools or writing code.
Product build statement
Build an assistive EA review system that ingests architecture submissions, extracts structured facts, evaluates them against internally owned controls and source-authority rules, produces evidence-backed findings, and routes final decisions to human architects.
Core engineering principle
Separate the review brain from the user interface.
The review brain is the configuration and data layer: source authority, schemas, controls, pattern mappings, catalogs, classifications, evidence rules, and decision records. The user interface may be Copilot, Copilot Studio, Teams, SharePoint, a portal, or another approved front door. The product should not lose its intelligence if the interface changes.
Conceptual components
| Component | Responsibility | Notes |
|---|---|---|
| Intake adapter | Connect to approved artifact source or controlled upload | Avoid duplicate intake if portal integration is available. |
| Artifact processor | Parse DOCX, PPTX, PDF, diagrams, exports, metadata | Use approved document processing path. |
| Submission normalizer | Transform extracted content into the submission schema | Schema-first design. |
| Classification engine | Apply data, risk, GxP, platform, and governance classifications | Rules first, AI suggestion second. |
| Source authority resolver | Determine which standards and sources can support findings | Deterministic and governed. |
| Control evaluator | Evaluate controls against normalized submission and evidence | Hybrid rules + AI evidence matching. |
| AI assistance layer | Extract, summarize, map evidence, draft findings | Must be constrained by source evidence. |
| Human review workbench | Accept, reject, edit, override, finalize | Required for all decisions. |
| Decision memory store | Capture final decisions, exceptions, and rationale | Reusable institutional memory. |
| Observability layer | Track usage, errors, overrides, accuracy, cost, cycle-time | Required for trust and degradation monitoring. |
Minimum viable build flow
- Select pilot review type.
- Load controlled source package.
- Receive or access architecture artifacts.
- Inventory the package and detect missing artifacts.
- Extract structured facts.
- Apply classification and source authority rules.
- Evaluate against the initial control library.
- Generate evidence-backed findings.
- Present findings to architect.
- Capture accept/reject/edit/override decisions.
- Store final decision record.
- Feed decisions and exceptions into institutional memory.
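The inventory step in the flow above is a good candidate for plain deterministic code. The following sketch assumes a package is a mapping of artifact type to filename; the required artifact types and the supported suffix set are hypothetical and would come from the submission schema and the approved document processing path.

```python
from pathlib import PurePath

# Hypothetical values: the real lists belong in governed configuration.
REQUIRED_ARTIFACTS = ("architecture_diagram", "data_flow", "security_summary")
SUPPORTED_SUFFIXES = {".docx", ".pptx", ".pdf"}


def inventory_package(package):
    """Report missing required artifacts and files the processor cannot parse."""
    missing = [kind for kind in REQUIRED_ARTIFACTS if not package.get(kind)]
    unsupported = sorted(
        name
        for name in package.values()
        if name and PurePath(name).suffix.lower() not in SUPPORTED_SUFFIXES
    )
    return {
        "missing": missing,
        "unsupported": unsupported,
        "ready_for_extraction": not missing and not unsupported,
    }


report = inventory_package(
    {
        "architecture_diagram": "design.pptx",
        "data_flow": "flows.vsdx",
        "security_summary": "",
    }
)
```

In this example the report lists `security_summary` as missing and `flows.vsdx` as unsupported, so extraction does not start. Missing artifacts found here feed the "Not evidenced" state later rather than being silently skipped.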
Do not overbuild
Do not start with a full enterprise governance platform, broad multi-domain review engine, autonomous approver, or fully generalized knowledge graph. Start with one review type, one source package, one output model, and a small reviewer group.
Build-team expectations
Engineers should assume the first pilot needs a small but real team: product/architecture ownership, EA domain ownership, data/control architecture, platform engineering, and part-time security/privacy/quality review. The engineering task is not merely to call an LLM. It is to wire a governed review loop around source authority, controls, evidence, human review, and decision memory.
Engineering cost discipline
Do not use model calls for deterministic work. Source ranking, classification rules, required-field checks, technology catalog lookups, routing rules, and decision record storage should be deterministic where possible. Use AI for messy extraction, summarization, evidence discovery, recommendation drafting, and ambiguity handling. This is how the pilot controls cost and avoids turning every operation into a token bonfire.
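A sketch of the deterministic-first pattern for one of the listed tasks, technology catalog lookup. The catalog entries, the function name, and the `model_suggest` callable are all hypothetical; the callable stands in for whatever approved model path is chosen and is only invoked when the catalog has no answer.

```python
# Hypothetical catalog entries; real statuses come from the governed tool catalog.
CATALOG = {
    "tool-a": "approved",
    "tool-b": "restricted",
    "tool-c": "declining",
}


def technology_status(name, model_suggest=None):
    """Return (status, how_it_was_decided) for a technology name."""
    key = name.strip().lower()
    if key in CATALOG:
        # Deterministic lookup: no model call, no token cost.
        return CATALOG[key], "catalog"
    if model_suggest is None:
        # No AI path configured: surface the unknown to a human.
        return "unknown", "human_review"
    # AI is used only for the ambiguous remainder, and its answer is
    # labeled as a suggestion rather than a catalog fact.
    return model_suggest(name), "model_suggestion"
```

Returning the decision path alongside the status lets telemetry count how often the model was actually needed, which feeds directly into the run-cost estimate.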
Historical provenance: Configuration and Data Asset Map v1.4. Configuration, schema, catalog, and control assets that drive the product.
Configuration and Data Asset Map v1.4
Purpose
This document identifies the configuration and data artifacts that should drive the Enterprise Architecture Review Assistant. These assets should be treated as governed product inputs, not incidental implementation files.
Asset map
| Asset | Purpose | Current seed source | Owner to confirm |
|---|---|---|---|
| Source authority map | Defines canonical, reference, derived, and non-authoritative sources | v1.2 operational hardening package | EA governance / platform owners |
| Normalized tool catalog | Defines approved, restricted, declining, emerging, exception-required technologies | v1.2 operational hardening package | Technology owners |
| Architecture submission schema | Defines required submission package structure | v1.2 operational hardening package | EA governance |
| Review output schema | Defines findings, evidence, scores, recommendations, and decision record | v1.2 operational hardening package | EA governance |
| Machine-executable controls | Defines controls that can be evaluated | v1.2 operational hardening package | EA governance + domain SMEs |
| Control-to-pattern mapping | Links controls to reference patterns | v1.2 operational hardening package | EA governance |
| Capability ontology | Defines domains, capabilities, and taxonomy | v1.1 architecture evaluation package | EA governance |
| Tool catalog | Defines available platform, data, agentic, and SDLC tools | v1.1 architecture evaluation package and enterprise stack files | Technology owners |
| Reference pattern library | Defines approved patterns and applicability | v1.1 architecture evaluation package | Architecture owners |
| Agent instructions | Defines assistant behavior and evidence rules | v1.2 operational hardening package | Product + EA |
| Human reviewer guide | Defines review, override, and finalization process | v1.2 operational hardening package | EA governance |
| Enterprise AI knowledge base | Captures platform, governance, stack, and pattern context | Enterprise stack files | Platform owner validation required |
Configuration principle
If a change reflects a change in standards, tools, governance, source authority, severity, routing, or evidence requirements, it should generally be treated as configuration rather than application code.
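One way to picture this principle: severity routing lives in a versioned configuration document, and the application code only reads it. The JSON shape, key names, and route names below are assumptions for illustration, not the package's actual schema.

```python
import json

# Hypothetical configuration document; in practice this would be a governed,
# versioned file rather than an inline string.
CONFIG_TEXT = """
{
  "config_version": "0.1.0",
  "severity_routing": {
    "high": "human_review_required",
    "medium": "architect_queue",
    "low": "note_in_report"
  }
}
"""


def load_config(text):
    """Parse the configuration and fail fast if a required key is absent."""
    config = json.loads(text)
    for key in ("config_version", "severity_routing"):
        if key not in config:
            raise ValueError(f"missing config key: {key}")
    return config


def route_finding(severity, config):
    # Unknown severities fall back to human review rather than being dropped.
    return config["severity_routing"].get(severity, "human_review_required")


config = load_config(CONFIG_TEXT)
```

Changing where medium-severity findings go then becomes a reviewed edit to the configuration and a version bump, with no code release. Recording `config_version` in each decision record ties a finding back to the rules in force at the time.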
Authority levels
| Level | Meaning |
|---|---|
| Canonical | Can be used as source of truth for findings and decisions. |
| Governed reference | Useful but not final authority. |
| Submission evidence | Evidence supplied by the project team for a specific review. |
| Derived analysis | Agent-generated extraction or interpretation; must cite source evidence. |
| Historical context | Prior decision or lesson learned, useful but version-sensitive. |
| Not authoritative | Demo content, unofficial notes, outdated files, or unvalidated model output. |
Evidence states
Every control should resolve to one of: Pass, Gap, Not Evidenced, Exception Required, Not Applicable, or Human Review Required.
Additional planning artifacts in v1.4
| Artifact | Purpose |
|---|---|
| EA_Architecture_Review_Assistant_Delivery_Roles_and_Resourcing_Model_v1.4.md | Defines likely roles, skills, FTE bands, and augmentation boundaries. |
| EA_Architecture_Review_Assistant_TCO_Tokenomics_and_Value_Gate_v1.4.md | Defines build cost, run cost, tokenomics, value, and approval-gate expectations. |
Historical provenance: Delivery Roles and Resourcing Model v1.4. Roles, skills, FTE bands, and augmentation boundaries.
Delivery Roles, Skills, and Resourcing Model v1.4
If this becomes a build initiative, the next question is not only what platform we use. The next question is who actually does the work, who owns the durable assets, and what level of investment is reasonable for the first pilot.
The answer should be deliberately modest for v1. We do not need a huge delivery army to prove the concept. We do need the right split of product ownership, architecture judgment, data/control modeling, and engineering implementation. One highly capable engineer can do a surprising amount if the scope is contained and the platform path is clear. One highly capable engineer cannot also be the EA product owner, data steward, governance approver, security reviewer, prompt/control librarian, and adoption lead. That way lies the traditional enterprise ritual of asking one person to be a department and acting surprised when they become carbon.
17.1 Core team for a controlled alpha
| Role | Approx. FTE for alpha | Primary responsibility | Internal or augment |
|---|---|---|---|
| Product / architecture lead | 0.25-0.50 | Product framing, scope control, decision model, package ownership | Internal |
| EA domain owner | 0.25-0.50 | Review criteria, standards interpretation, approval logic, reviewer adoption | Internal |
| Data/control architect | 0.50 | Source authority, data classification, control model, schemas, evidence model | Internal preferred |
| Platform engineer / full-stack integrator | 0.50-1.00 | Connector wiring, auth/RBAC, workflow, orchestration, deployment mechanics | Internal or Rob C team / augmentation |
| Security / privacy advisor | 0.05-0.10 | Security, access, privacy, logging, data-handling review | Internal review role |
| Quality / GxP advisor | 0.05-0.10 if triggered | GxP/quality posture, validation implications, audit expectations | Internal review role |
| Architect pilot reviewers | 2-4 reviewers, part-time | Validate findings, override model, usefulness, trust, adoption | Internal |
17.2 Skills needed
| Skill area | Why it matters | Risk if missing |
|---|---|---|
| Enterprise architecture judgment | Turns principles into reviewable criteria | Agent produces generic recommendations |
| Data architecture and classification | Defines what can be processed, retained, cited, and trusted | Unsafe or ungoverned data handling |
| Source authority modeling | Separates canonical sources from reference-only material | Conflicting standards and false confidence |
| JSON/YAML/schema/control modeling | Makes the review brain configurable instead of buried in code | Prompt spaghetti and maintenance pain |
| Identity and RBAC | Ensures users see only what they should | Security and privacy exposure |
| Workflow/orchestration | Routes reviews, exceptions, approvals, and evidence loops | Manual glue work survives the automation |
| Observability and telemetry | Measures accuracy, override rates, adoption, cost, and failures | No way to know if the pilot works |
| Change/release governance | Versions controls, standards, prompts, and schemas | Agent drift and stale review logic |
17.3 Practical staffing scenarios
| Scenario | When it fits | Likely staffing | Tradeoff |
|---|---|---|---|
| Minimal internal alpha | Controlled proof using sample artifacts and manual source package | Tony + David + 1 engineer part-time + reviewers | Fastest, but limited integration |
| Productized pilot | Team use with controlled intake, RBAC, workflow, telemetry, and reusable control library | Tony + David + 1 engineer 0.75-1.0 FTE + data/control architect 0.5 FTE + review advisors | Best balance of speed and discipline |
| Enterprise integrated pilot | Pulls from system of record, writes decision memory, integrates with platform governance | Above + platform owner + security/privacy/quality + integration support | Stronger, but slower and governance heavier |
17.4 Supplier or Rob C team role
The clean boundary is this: internal owners define the review brain; technical augmentation helps wire the system together.
Internal ownership should include product intent, review criteria, semantic/control model, source authority, data classification, prioritization, exception rules, output model, and decision memory. Rob C’s team or a supplier can help with connectors, identity/RBAC, Copilot/Copilot Studio or enterprise-stack mechanics, workflow/orchestration, telemetry, deployment, and platform compliance.
This is not anti-supplier. It is anti-outsourcing-the-part-only-we-understand. Subtle distinction, often missed by people selling roadmaps.
Historical provenance: TCO, Tokenomics, and Value Gate v1.4. Build cost, run cost, tokenomics, value model, and future intake expectations.
TCO, Tokenomics, and Value Gate v1.4
Every AI or agentic use case should be required to pass a basic economic reality check before it moves forward. This project should hold itself to the same standard. If a team cannot explain what it will cost to build, what it will cost to run, what licensing or token exposure exists, who will maintain it, and what value it creates, then the use case is not ready for approval.
That is especially important here because the enterprise may have favorable near-term economics through Microsoft 365 Copilot and Copilot Studio licensing. That helps, but it is not a permanent architectural strategy. Licensing can change. Token policies can change. Model availability can change. Usage patterns can surprise everyone, because nothing says enterprise innovation like discovering the invoice after the demo.
18.1 Cost categories that must be estimated
| Cost area | What to estimate | Notes |
|---|---|---|
| Build labor | Product, EA, data/control, engineering, security/privacy, quality, testing | Include internal labor even if not charged back |
| Platform costs | Copilot/Copilot Studio, Azure, AWS, data services, workflow/runtime | Validate against enterprise licensing and chargeback rules |
| Model/token costs | Prompt/completion tokens, embedding, retrieval, evaluation, batch jobs | May be hidden under seat licensing or exposed under API/runtime path |
| Storage and indexing | Artifacts, extracted text, vector indexes, logs, decision memory | Retention and classification rules drive cost |
| Integration costs | APIs, connectors, MCP/tool gateways, identity, workflow, monitoring | Higher if system-of-record integration is required |
| Governance and validation | Security, privacy, quality, GxP, audit, release/change controls | Not optional in regulated contexts |
| Sustainment | Standards updates, control library maintenance, prompt/schema changes, triage | This is where cheap vendor builds become expensive hobbies |
| Adoption and training | Reviewer enablement, documentation, feedback loops, support | Needed for trust and repeat use |
18.2 Tokenomics and runtime questions
| Question | Why it matters |
|---|---|
| Are we using seat-licensed Copilot capabilities, metered API calls, or both? | Determines whether run cost is predictable or usage-based |
| Which steps require model calls versus deterministic logic? | Prevents token burn on tasks rules can handle |
| Are we embedding source corpora, reviewing submitted artifacts live, or both? | Drives indexing and refresh cost |
| How large are typical submission packages? | Controls ingestion, extraction, context, and model cost |
| Do we need repeated evaluation runs per submission? | Can multiply cost quickly |
| Are we storing extracted facts, evidence spans, and decision records? | Reduces repeated processing but adds storage/governance obligations |
| What happens if Microsoft tokenomics or licensing changes? | Forces a Plan B before dependency becomes expensive |
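The questions above can be turned into a rough run-cost estimate before any platform is chosen. The following is a minimal sketch, not a pricing model: the step names, token counts, and per-1k prices are placeholder assumptions, and seat licensing is treated as a flat monthly amount. Steps handled by deterministic logic are simply left out of the list, which is the point of separating them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStep:
    """One model-backed step in a review. All numbers are planning inputs."""
    name: str
    prompt_tokens: int
    completion_tokens: int
    runs_per_submission: int = 1  # repeated evaluation runs multiply cost


def metered_cost_per_submission(steps, price_in_per_1k, price_out_per_1k):
    """Token cost of one submission across all model-backed steps."""
    total = 0.0
    for step in steps:
        per_run = (step.prompt_tokens / 1000) * price_in_per_1k
        per_run += (step.completion_tokens / 1000) * price_out_per_1k
        total += per_run * step.runs_per_submission
    return total


def monthly_run_cost(steps, submissions_per_month, price_in_per_1k,
                     price_out_per_1k, seats=0, seat_price=0.0):
    """Fixed seat licensing plus usage-based token cost for one month."""
    metered = metered_cost_per_submission(steps, price_in_per_1k, price_out_per_1k)
    return seats * seat_price + metered * submissions_per_month


# Placeholder inputs only: replace with measured token counts and real rates.
steps = [
    ModelStep("extract_facts", prompt_tokens=8000, completion_tokens=1000),
    ModelStep("evaluate_controls", prompt_tokens=12000, completion_tokens=2000,
              runs_per_submission=2),
]
estimate = monthly_run_cost(steps, submissions_per_month=30,
                            price_in_per_1k=0.01, price_out_per_1k=0.03)
```

Running the same function with different prices or a metered-only path is one inexpensive way to document a Plan B for licensing or tokenomics changes.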
18.3 Value model
Value must be classified before it is judged. A proposal may be useful and still fail the current approval test if it does not match the decision context.
| Value class | What makes it decision-ready |
|---|---|
| Direct savings | Strongest when it lands in the accountable operating budget. |
| Indirect savings | Useful when the financial path is credible but not yet booked. |
| Avoided cost | Valid when the counterfactual cost, renewal, incident, or lifecycle obligation is real and time-bounded. |
| Risk reduction | Valid when operational, security, compliance, or audit exposure is evidenced and the decision owner agrees it matters now. |
| Capacity release | Useful when released time is tied to named work or support demand. |
| Quality or rework reduction | Valid when defect load, rework, or service drag is measurable. |
| Cross-workstream value | Valid when the benefit lands outside the local team, but only if a benefiting owner and sponsor accept it. |
All value classes matter, but not equally in every business climate. When the immediate decision context is an OPEX reduction mandate, direct savings from an accountable operational budget may be the only class that materially changes the decision.
- Decision owner: decides whether the proposal clears the current line.
- Budget owner: owns the operating budget or capacity being affected.
- Benefiting owner: receives the service, cost, risk, or quality benefit if the claim is true.
- Evidence owner: owns the numbers, baseline, or operational proof behind the claim.
The Outcome Acceptance Line is the current threshold for approval. A proposal can show real value and still sit below the line. Below-line items may still warrant sandboxing, time-boxed feasibility, an explicit exception, or a stop decision.
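One way to make the value classes, the four owners, and the Outcome Acceptance Line checkable is sketched below. It assumes the line can be represented as the set of value classes that currently clear approval, which is a simplification; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ValueClass(Enum):
    DIRECT_SAVINGS = "direct savings"
    INDIRECT_SAVINGS = "indirect savings"
    AVOIDED_COST = "avoided cost"
    RISK_REDUCTION = "risk reduction"
    CAPACITY_RELEASE = "capacity release"
    QUALITY_REWORK = "quality or rework reduction"
    CROSS_WORKSTREAM = "cross-workstream value"


@dataclass
class ValueClaim:
    value_class: ValueClass
    decision_owner: Optional[str] = None
    budget_owner: Optional[str] = None
    benefiting_owner: Optional[str] = None
    evidence_owner: Optional[str] = None

    def missing_owners(self):
        """Names of owner roles that have not been filled in."""
        roles = ("decision_owner", "budget_owner",
                 "benefiting_owner", "evidence_owner")
        return [role for role in roles if not getattr(self, role)]


def assess(claim, classes_above_line):
    """Classify first, then judge against the current acceptance line."""
    if claim.missing_owners():
        return "not decision-ready"
    if claim.value_class in classes_above_line:
        return "above line"
    # Below-line items still need a path: sandbox, feasibility, exception, or stop.
    return "below line"
```

Under an OPEX reduction mandate, `classes_above_line` might contain only `ValueClass.DIRECT_SAVINGS`, so a well-evidenced capacity-release claim would return "below line" rather than being rejected as worthless.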
The first pilot should not claim enterprise-wide transformation. It should prove a measurable local value hypothesis.
Pilot measurement examples:
| Value lever | Pilot measurement |
|---|---|
| Review cycle-time reduction | Baseline human review time versus assisted first-pass review time |
| Reviewer effort reduction | Hours saved per submission on intake, evidence review, and output drafting |
| Consistency improvement | Agreement across reviewers using common criteria and output format |
| Missing evidence detection | Percentage of incomplete submissions flagged correctly |
| Reuse of prior decisions | Number of findings or recommendations linked to precedent |
| Risk reduction | Duplicative, unsupported, restricted, or non-standard patterns caught earlier |
| Adoption | Reviewer usage, trust score, override rate, and repeat-use willingness |
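Two of the measurements above reduce to simple ratios. The sketch below shows one possible way to compute them; the function names and the use of submission identifiers are assumptions, not an agreed telemetry design.

```python
def cycle_time_reduction(baseline_hours, assisted_hours):
    """Fraction of review time removed by assisted first-pass review."""
    if baseline_hours <= 0:
        raise ValueError("baseline_hours must be positive")
    return (baseline_hours - assisted_hours) / baseline_hours


def detection_rate(flagged_ids, incomplete_ids):
    """Share of truly incomplete submissions that were flagged."""
    incomplete = set(incomplete_ids)
    if not incomplete:
        return 0.0
    return len(incomplete & set(flagged_ids)) / len(incomplete)
```

Both functions need a human baseline to mean anything: measured review hours for the first, and a reviewer-confirmed list of incomplete submissions for the second.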
18.4 Alpha-level investment bands
These are planning bands, not budget commitments. They should be replaced with actual internal rates, platform costs, and sourcing assumptions once the delivery path is selected.
| Delivery path | Likely duration | Internal effort | Augmentation need | Cost posture |
|---|---|---|---|---|
| Manual source-package alpha | 2-4 weeks | Low to moderate | Optional part-time engineer | Lowest cost, weakest integration |
| Copilot/Copilot Studio pilot | 4-8 weeks | Moderate | 0.5-1.0 technical FTE | Good proof path if governance permits |
| Enterprise integrated pilot | 8-12+ weeks | Moderate to high | 1.0+ engineering plus platform/security support | Highest fidelity, slower path |
| Vendor-led build | Variable | Still high internal SME burden | Supplier team | Risk of low upfront/high sustainment cost |
18.5 Approval gate for any future AI/agentic submission
A future submission should be considered incomplete if it cannot answer these questions:
- What business problem is being solved?
- What value class is claimed, and what measurable value is expected?
- Who owns the decision, the impacted budget or capacity, the benefit, and the evidence?
- Why is AI or agentic automation appropriate?
- What deterministic alternative was considered?
- What data is required and who owns it?
- What sources are canonical, reference-only, or prohibited?
- What platform path is proposed and why?
- What governance reviews are triggered?
- What is the expected build effort?
- What are the expected run costs, including tokenomics where applicable?
- Which value class is the current business climate weighting most heavily?
- Is the proposal above or below the Outcome Acceptance Line, and what is the below-line path if it does not clear it?
- Who owns sustainment after launch?
- What metrics prove success or failure?
- What is the stop condition if the value does not materialize?
This is the filter that keeps architecture review from approving AI slop with a budget line. Harsh, yes. Cheaper than cleaning it up later.
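The gate questions lend themselves to a deterministic completeness check that needs no model call. A minimal sketch follows, assuming a submission arrives as a dictionary; the key names are hypothetical shorthand for the sixteen questions above.

```python
# Hypothetical keys, one per gate question, in the order listed above.
GATE_QUESTIONS = (
    "business_problem",
    "value_class_and_measure",
    "owners",
    "why_ai",
    "deterministic_alternative",
    "data_and_owner",
    "source_authority",
    "platform_path",
    "governance_reviews",
    "build_effort",
    "run_costs",
    "business_climate",
    "acceptance_line_position",
    "sustainment_owner",
    "success_metrics",
    "stop_condition",
)


def missing_answers(submission):
    """Return the gate questions that have no non-blank answer."""
    return [
        key for key in GATE_QUESTIONS
        if not str(submission.get(key) or "").strip()
    ]


def is_complete(submission):
    """A submission is complete only when every gate question is answered."""
    return not missing_answers(submission)
```

This only checks that an answer exists. Whether the answer is any good remains a reviewer judgment.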
Historical provenance: Reference Implementation Tracks v1.4. Copilot, Copilot Studio, enterprise stack, and supplier-assisted delivery paths.
Reference Implementation Tracks v1.4
Purpose
This document explains practical implementation paths. It is not a final decision. The goal is to help the team compare near-term speed, governance readiness, and long-term product fit.
Track A: Copilot / Agent Builder discovery prototype
Use this path to test interaction, prompts, rubric shape, source grounding, and user reaction with a small group.
Best for rapid discovery, early demos, and low integration. Not suitable as the durable production architecture.
Track B: Copilot Studio governed pilot
Use this path when the pilot needs controlled access, M365 grounding, Teams/SharePoint integration, basic workflow, analytics, and a more governed agent model.
This is likely the best near-term pilot path if platform governance approves.
Track C: Enterprise stack implementation
Use this path for durable production capability requiring system-of-record integration, decision memory, telemetry, auditability, source governance, and multiple enterprise integrations.
This may be Azure, AWS, or hybrid depending on platform-owner guidance.
Platform fit matrix
| Need | Copilot / Agent Builder | Copilot Studio | Azure / AWS enterprise stack |
|---|---|---|---|
| Fast learning | Strong | Good | Moderate |
| Controlled team deployment | Limited | Strong | Strong |
| Deep system integration | Weak | Moderate | Strong |
| Durable decision memory | Weak | Moderate | Strong |
| Complex orchestration | Weak | Moderate | Strong |
| Formal auditability | Limited | Moderate | Strong |
| Lowest friction | Strong | Good | Lower |
| Production readiness | Low | Medium | High |
Recommendation
Start with the narrowest platform path that can validate rubric, source authority, evidence output, and reviewer trust. Do not build the full production stack until the review brain is proven.
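The "narrowest path that meets the need" rule can be expressed against the platform fit matrix. The sketch below encodes the matrix as written; the numeric ranking of the labels and the assumption that the tracks are ordered from narrowest to broadest are illustrative choices, not platform-owner guidance.

```python
# Assumed ranking of the matrix labels; adjust if platform owners disagree.
SCORE = {"Weak": 1, "Limited": 1, "Low": 1, "Lower": 1,
         "Moderate": 2, "Good": 2, "Medium": 2,
         "Strong": 3, "High": 3}

# Assumed order: narrowest path first.
PATHS = ("Copilot / Agent Builder", "Copilot Studio",
         "Azure / AWS enterprise stack")

FIT = {
    "Fast learning": ("Strong", "Good", "Moderate"),
    "Controlled team deployment": ("Limited", "Strong", "Strong"),
    "Deep system integration": ("Weak", "Moderate", "Strong"),
    "Durable decision memory": ("Weak", "Moderate", "Strong"),
    "Complex orchestration": ("Weak", "Moderate", "Strong"),
    "Formal auditability": ("Limited", "Moderate", "Strong"),
    "Lowest friction": ("Strong", "Good", "Lower"),
    "Production readiness": ("Low", "Medium", "High"),
}


def narrowest_path(required_needs, minimum="Moderate"):
    """Return the first path whose rating meets the floor for every need."""
    floor = SCORE[minimum]
    for index, path in enumerate(PATHS):
        if all(SCORE[FIT[need][index]] >= floor for need in required_needs):
            return path
    return None
```

For example, `narrowest_path(["Fast learning", "Controlled team deployment"])` skips the Agent Builder track because its controlled-deployment rating is "Limited" and lands on Copilot Studio.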
Cost posture by path
| Path | Cost posture | Notes |
|---|---|---|
| Manual source-package alpha | Lowest build cost | Good for proving controls and outputs, weak integration. |
| Copilot / Agent Builder | Low to moderate | Good for quick validation, limited governance/distribution. |
| Copilot Studio | Moderate | Better for controlled team pilot and M365 workflow. |
| Enterprise Azure/AWS path | Moderate to high | Best for durable integration, evidence trail, and decision memory. |
| Vendor-led build | Variable | Must challenge sustainment and change-order economics. |
Historical provenance: MVP Backlog and Open Decisions v1.4. Pilot epics, open technical decisions, and build-readiness questions.
Minimum Viable Product Backlog and Open Decisions v1.4
MVP goal
Prove that an internally owned architecture intelligence layer can support first-pass EA review with useful, evidence-backed findings and human-controlled decisions.
MVP epics
Epic 1: Intake and source package
- Confirm first review type.
- Define pilot submission package.
- Identify source of record for submissions.
- Build controlled upload or read-only artifact ingestion path.
- Validate supported file types.
Epic 2: Source authority and configuration
- Finalize source authority map.
- Finalize initial tool catalog.
- Finalize initial reference patterns.
- Finalize initial control library.
- Version the configuration package.
Epic 3: Extraction and normalization
- Extract text and metadata from submitted artifacts.
- Normalize extracted facts into submission schema.
- Detect missing evidence.
- Map technologies to catalog entries.
- Flag uncertain extraction for human review.
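Mapping technologies to catalog entries and flagging uncertain extraction can start as plain string matching. The sketch below is one possible shape, using the standard-library `difflib`; the catalog contents, the 0.8 similarity cutoff, and the output field names are assumptions. Only exact matches are trusted; anything fuzzy or unknown is routed to a human.

```python
import difflib

# Hypothetical catalog: normalized key -> canonical catalog entry.
CATALOG = {
    "sharepoint": "SharePoint",
    "leanix": "LeanIX",
    "copilot studio": "Copilot Studio",
}


def map_technology(mention, catalog, cutoff=0.8):
    """Map an extracted technology mention to a catalog entry."""
    key = mention.strip().lower()
    if key in catalog:
        # Exact match: safe to accept without review.
        return {"mention": mention, "catalog_entry": catalog[key],
                "needs_review": False}
    close = difflib.get_close_matches(key, list(catalog), n=1, cutoff=cutoff)
    if close:
        # Near match: propose the entry, but do not infer it silently.
        return {"mention": mention, "catalog_entry": catalog[close[0]],
                "needs_review": True}
    # No match: leave the entry empty and flag for human review.
    return {"mention": mention, "catalog_entry": None, "needs_review": True}
```

A mention such as "Lean IX" would be proposed as LeanIX with `needs_review` set, while an unrecognized tool gets no catalog entry at all.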
Epic 4: Control evaluation
- Apply intake completeness checks.
- Apply technology and platform checks.
- Apply data readiness checks.
- Apply security and operational readiness checks.
- Generate evidence-backed findings.
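An intake completeness control can be evaluated deterministically, with every finding tied to an evidence span. The sketch below assumes extracted facts are stored per field as a value plus the quoted evidence; the control IDs, field names, and status strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    control_id: str
    description: str
    required_field: str  # field in the normalized submission schema


@dataclass(frozen=True)
class Finding:
    control_id: str
    status: str    # "met" or "missing_evidence"
    evidence: str  # quoted source span; empty when evidence is missing


def evaluate(controls, submission):
    """Apply each control; a fact without an evidence span is not 'met'."""
    findings = []
    for control in controls:
        fact = submission.get(control.required_field) or {}
        evidence = fact.get("evidence", "")
        if fact.get("value") and evidence:
            findings.append(Finding(control.control_id, "met", evidence))
        else:
            findings.append(Finding(control.control_id, "missing_evidence", ""))
    return findings
```

The design choice here is that a stated value with no supporting span is treated the same as a missing value, so the agent cannot report a control as met on inference alone.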
Epic 5: Human reviewer workflow
- Present summary, scorecard, findings, evidence, and missing information.
- Allow accept, reject, edit, override, and rationale capture.
- Capture final decision posture.
- Export review output.
Epic 6: Decision memory and telemetry
- Store final decision record.
- Store standards/control version used.
- Track override rate, false positives, false negatives, cycle-time savings, and adoption.
- Feed approved decisions into reusable memory.
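A decision record that carries the control-library version makes override tracking straightforward. The following is a minimal sketch under assumed field names; the storage location is still an open decision, so the record is serialized to JSON only as a neutral example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DecisionRecord:
    submission_id: str
    decision: str  # e.g. "approved", "not approved", "approved with exception"
    control_library_version: str
    findings_total: int
    findings_overridden: int
    reviewer: str


def override_rate(records):
    """Share of agent findings that reviewers overrode, across records."""
    total = sum(record.findings_total for record in records)
    if total == 0:
        return 0.0
    return sum(record.findings_overridden for record in records) / total


def to_json(record):
    """Serialize a record for whichever decision-memory store is chosen."""
    return json.dumps(asdict(record), sort_keys=True)
```

Grouping `override_rate` by `control_library_version` would show whether a control update made the findings more or less trustworthy.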
Open build decisions
| Decision | Why it matters |
|---|---|
| Existing portal integration path | Determines whether v1 pulls artifacts or uses controlled upload. |
| Copilot Studio versus enterprise runtime | Determines governance, distribution, and integration model. |
| Control library storage | Git, SharePoint, database, Dataverse, or platform registry. |
| Decision memory storage | EA repository, database, SharePoint, LeanIX-equivalent, or governance portal. |
| Document parsing path | Native connector, document AI, OCR fallback, or manual package. |
| Knowledge graph need | Avoid premature graph complexity. |
| Mandatory telemetry | Prevents unmeasured success theatre. |
New MVP epics in v1.4
Epic: Delivery model and resourcing
- Define core team roles.
- Identify internal ownership and augmentation needs.
- Confirm Rob C team or equivalent implementation role.
- Confirm part-time security/privacy/quality reviewers.
Epic: TCO and tokenomics
- Estimate build effort by implementation path.
- Define expected run-cost drivers.
- Separate deterministic steps from model-call steps.
- Document Plan B for licensing/tokenomics changes.
- Define value metrics and stop criteria.
Historical provenance: David Working Session Questions v1.4. Questions for the David/Tony working session.
David Working Session Questions v1.4
1. First use case and review boundary
- Are we confirming AI architecture review as the first pilot type?
- Are we validating fast-path eligibility, doing full first-pass review, or both?
- Which submissions should be excluded from the pilot?
2. Review criteria and standards
- What are the 10-20 minimum controls that matter for the first pilot?
- Which reference architectures and patterns should be considered authoritative?
- Which anti-patterns should trigger escalation?
- Which technology catalog or platform guidance is canonical?
3. Data and source authority
- Which source systems contain the submitted artifacts?
- Which sources are canonical, ranked, reference-only, deprecated, or prohibited?
- What data classes are allowed in the pilot?
- Are any submitted artifacts potentially GxP, PII, PHI, legal, quality, or security-sensitive?
- What evidence must every finding cite?
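The source states named above (canonical, ranked, reference-only, deprecated, prohibited) can be modeled as an enumeration with explicit read and cite rules. The sketch below is illustrative: which states may back a finding is an assumption to be confirmed in the working session, and sources missing from the map are treated as unusable rather than guessed at.

```python
from enum import Enum


class Authority(Enum):
    CANONICAL = "canonical"
    RANKED = "ranked"
    REFERENCE_ONLY = "reference-only"
    DEPRECATED = "deprecated"
    PROHIBITED = "prohibited"


# Assumption: only canonical and ranked sources may back a finding.
CITABLE = frozenset({Authority.CANONICAL, Authority.RANKED})


def may_read(source, authority_map):
    """Unmapped and prohibited sources are never read."""
    authority = authority_map.get(source)
    return authority is not None and authority is not Authority.PROHIBITED


def may_cite(source, authority_map):
    """A finding may cite a source only when its authority state allows it."""
    return authority_map.get(source) in CITABLE
```

With this shape, a reference-only vendor deck can inform context but cannot appear as the evidence for a finding.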
4. Platform path
- Should the first technical path be Copilot Agent Builder, Copilot Studio, Azure/AWS enterprise stack, or a manual source-package alpha?
- What platform guidance needs validation from platform owners?
- What is the Plan B if Microsoft tokenomics or licensing changes?
5. Resourcing and delivery
- What can David and Tony own directly?
- What should Rob C’s team or another technical team own?
- Do we need a data/control architect, platform engineer, or both?
- What is the minimum staffing model for a 4-8 week pilot?
- What support is needed from security, privacy, quality, or GxP reviewers?
6. TCO and value
- What is the expected effort to build the alpha?
- What are the expected recurring costs?
- What token/model costs may apply by platform path?
- What review-time savings would make this worthwhile?
- What value metrics prove success?
- What failure threshold should stop or redirect the pilot?
7. Pilot gate
- Which historical submissions should become the test set?
- What does the human gold-standard review baseline look like?
- What accuracy and usefulness threshold is good enough to continue?
- How will overrides and false positives/negatives be captured?
- Where will final decision memory live?
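Comparing agent findings with a human gold-standard review is one way to capture false positives and false negatives. The sketch below assumes each finding can be reduced to a comparable identifier such as a control ID, which is a simplification; the continue-or-stop threshold itself is left to the working session.

```python
def compare_to_gold(agent_findings, gold_findings):
    """Compare agent finding IDs with the human gold-standard set."""
    agent, gold = set(agent_findings), set(gold_findings)
    true_positives = agent & gold
    false_positives = agent - gold   # raised by the agent, not by reviewers
    false_negatives = gold - agent   # raised by reviewers, missed by the agent
    precision = len(true_positives) / len(agent) if agent else 0.0
    recall = len(true_positives) / len(gold) if gold else 0.0
    return {
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
        "precision": precision,
        "recall": recall,
    }
```

Keeping the false-positive and false-negative lists, not just the ratios, gives reviewers concrete cases to feed back into the control library.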
Historical provenance: Source Inventory and Provenance v1.4. Source lineage, provenance, and evidence-use posture.
Source Inventory and Provenance v1.4
Purpose
This document explains what source materials were used to create the v1.4 package and how they should be interpreted.
Primary discovery sources
| Source | Role |
|---|---|
| David discovery questionnaire | First-hand requirements input and partially answered leadership discovery artifact. |
| Copilot Opus synthesis | Detailed synthesis of questionnaire, vendor demo/session, screenshots, and unresolved requirements. |
| Copilot GPT 5.5 Think Deeper synthesis | Strategy-oriented synthesis emphasizing product framing, v1 scope, governance, and knowledge-layer ownership. |
| Full synthesized assessment v0.8 | Earlier consolidated assessment used as a narrative and provenance baseline. |
Enterprise platform and stack sources
| Source | Role |
|---|---|
| Enterprise AI distillation agnostic MD | Human-readable enterprise AI and agentic architecture reference. |
| Enterprise AI knowledge base JSON/YAML | Machine-readable platform, governance, tool, and pattern knowledge base. |
| Enterprise AI reference HTML | Visual reference version of the enterprise AI architecture material. |
| v1.1 enterprise agentic architecture package | Breadth layer: ontology, tools, reference patterns, broad controls. |
| v1.2 operational hardening package | Discipline layer: source authority, schemas, executable controls, reviewer guide, agent instructions. |
Interpretation note
Some enterprise stack materials are company-agnostic or model-distilled. They are useful architecture inputs, but they should not be treated as official company policy until validated by the appropriate platform owners.
Packaging note
The v1.4 ZIP includes original source files, normalized Markdown extracts, rich HTML source extracts, extracted v1.1/v1.2 package contents, and the new engineering-orientation documents.
v1.4 additions
This version adds explicit delivery roles, resourcing, TCO, tokenomics, and value-gate considerations as first-class alpha intake requirements. These are not final budget commitments. They are planning artifacts to ensure the initiative, and future submissions modeled after it, consider build effort, operating cost, sustainment, and measurable value before build approval.
David Discovery Questionnaire. Discovery questionnaire partially answered by David.
Enterprise Architecture Review Agent
Discovery Questions for Leadership (Enterprise Architecture & Governance)
1. Problem Clarification (What problem are we actually solving)
1.1 Core Pain
What specifically is broken today in the architecture review process?
Where is the most time being spent?
Intake
Understanding artifacts
Performing evaluation (the biggest issue – needing more efficient reviews that apply existing solutions/capabilities/standards or reviewing a brand new AI or Cloud or SaaS solutions architecture)
Writing outputs (document the ARB decisions and have them retrievable to provide context later on)
What part of the process drives the most frustration for the review team? (Finding existing solutions for a capability and getting alignment for use. Shadow IT building solutions that aren’t supported and need to be sunset or re-engineered. When no solution exists, quickly finding a vendor solution or making a build decision. Documenting the solution.)
1.2 Current Scale
How many architecture reviews do you process per month today? (as a team - >20)
What is the expected growth over the next 6-12 months? (with AI >30)
How many architects are actively doing reviews today?
1.3 Failure Modes
What does a bad architecture review look like today? (taking too much time to review. Approving something that wasn’t the best fit – standard, duplicative solution, non-scalable – something that doesn’t adhere to the guiding principles (Data, AI, EA, Integration). Not having an architecture review at all. An architect not understanding the business process so not a comprehensive awareness of what is being asked to be reviewed. Being pressured to approve something – especially an exception due to the timing of the project deliverables – therefore increasing technical debt)
Where do inconsistencies show up across reviewers? Operating off different levels of information (old information vs new information), not knowing where to apply or find standards
What risks have historically slipped through reviews? Shadow builds that don’t scale. Shadow builds that need to be re-engineered. Approvals without architecture review, resulting in builds that are duplicative and costly.
2. Current Workflow (Baseline before we design anything)
2.1 Intake
How are architecture reviews requested today?
Email
SharePoint
Official ARB Reviews in some cases
aCAR review
Ticketing system
What artifacts are provided?
PPT
SA Express
Vendor documentation
Consultant documentation
Diagrams (Visio, Draw.io, Lucid, etc.)
Written documentation
2.2 Review Process
Is there a defined checklist today, even if informal? No
Do all reviewers follow the same process, or is it individual style? No
How long does a typical review take end-to-end? 2 weeks to months
2.3 Output
What does the final output look like today?
Formal document
Email response
Slide deck
Load Architecture into LeanIX
GAP on documenting ARB decisions
Captured in some cases formal ARB meetings
Captured in GenAI council Portal
Is there a standard format or template? (no – but leaning towards LeanIX)
3. Evaluation Model (Most critical dependency)
3.1 Criteria Definition
What criteria are used to evaluate architectures today?
Integration patterns?
Security?
Technology alignment?
Compliance?
Data Readiness
Guiding Principles for Integration, Data, AI, and Overall EA GPs
Capabilities already exist
Are these:
Documented anywhere? PPTX
Consistent across reviewers? NO
3.2 Decision Logic
How do reviewers determine:
Pass vs fail? Approved or not approved. Or approved with an approved exception
Risk severity? If an exception
What is subjective vs objective today? The reviews are more fact based. Politics or leadership influence can drive opinions.
3.3 Standards
What enterprise standards are architectures expected to align to?
Reference architectures
Approved technology stacks
Integration patterns
Security models
Guiding Principles (Data, Integration, AI, EA)
4. Scope Definition (Initial vs long-term)
4.1 Initial Use Case
What specific type of architecture review should we start with?
HR systems?
Integration architecture?
Application design?
If we had to pick ONE: HR or a new AI Build
4.2 Audience
Who are the first 7 users?
What is their role:
Senior Enterprise Architects
Reviewers
Requestors
What level of technical depth do they expect from the tool?
4.3 Expansion Path
When this works, who is next?
Do we expect this to become:
A standard within the corporate IT architecture function?
Enterprise-wide standard? Perhaps a version of this down the road
5. Output Requirements (What success looks like)
5.1 Deliverables
What must the output include:
Summary
Findings
Risks
Recommendations if there are gaps
Recommendation on approved or not approved
Score / rating
5.2 Standardization
Do you want all reviews to look identical? (That would be ideal)
Or allow flexibility by domain? Maybe Data and Integration are different but would rather not have them be different
5.3 Consumption
Who consumes the output:
Architects
Project teams
Leadership
Sponsors
Does output need to be:
Board-ready?
Audit-ready?
Developer-friendly?
Business partner friendly as well
6. Automation Expectations (Reality check section)
6.1 Level of Automation
What level of automation do you expect: (Assist the Human)
Assistive (recommendations only)
Semi-automated (pre-populated analysis)
Fully automated (decisions generated)
6.2 Trust Model
How much do you trust AI-driven recommendations today? (High level when it comes to technical analysis)
What must remain human-controlled? Final Approval
6.3 Explainability
Do outputs need to show:
Why a recommendation was made? Ideal
What rule triggered it?
Recommendation suggestions
(Important for adoption)
7. Integration Constraints (Enterprise environment realities)
7.1 Tooling Ecosystem
Where should this live: (open to the path of least resistance)
Inside existing dashboards
Standalone app
Integrated into SharePoint / Teams (Ideal)
7.2 Identity + Access
Must it integrate with:
- Entra ID / SSO
Any data sensitivity concerns with architecture artifacts? No
7.3 Data Sources
Are there existing:
Architecture repositories (yes: SharePoints, LeanIX, CMDB, Confluence, SAExpress)
System catalogs
Inventory tools
8. Build vs Buy Decision Inputs (Directly tied to your ask)
8.1 Vendor Benchmark
What specifically did the vendor demo show that felt “tight”? (really liked the dashboard and criteria)
Intake?
Evaluation?
Visualization?
Workflow?
8.2 Gap vs Current Capabilities
What part of the vendor demo do you believe we already partially have? Not really anything
What part did we clearly NOT have?
The one part the vendor had that we may not need is the Graph Contextualization of data – is that needed? It would probably make the reviews more accurate
8.3 Tolerance for Iteration
Are you comfortable starting with:
- 60% solution in 4 weeks
Or do you expect:
- High fidelity from day 1?
9. Success Definition (critical alignment)
9.1 Success Criteria
- What would make you say: 60-80% faster reviews. >60% accuracy on the assistance
9.2 Failure Criteria
- What would make you say: <60% accuracy, agent starts to degrade, non-adoption
9.3 Timeline Expectation
When do you expect to see: not urgent but the sooner the better if even a prototype
First usable prototype
Meaningful value
10. Strategic Intent (hidden but important)
10.1 Positioning
Is this intended to become:
A corporate IT architecture capability first – trailblazers
A platform across the enterprise
10.2 Ownership
Who owns this long term:
The architecture & governance function? Maybe at first but long term need a TPO/TS
IT product team?
Shared platform?
10.3 Scope Boundaries
Should this evolve into:
Full architecture governance platform
Or stay focused on review process only – let’s start here
How to use this with leadership
Don’t send as a list. Run it as a 45-60 min working session:
Start with: “What problem are we solving?”
Move quickly to: “What criteria do you actually use today?”
Lock: First use case
End with: “What defines success in 30 days?”
Bottom line
The most important answers you need are:
What are the actual review criteria? (non-negotiable) – Content or links to the guiding principles exist. Challenge will be looking into other systems like LeanIX or Confluence
What is the first use case? (scope control)
What does a “good output” look like? (drives design)
How much automation is acceptable? (prevents overbuild)
If you get those 4 right, you can start building immediately. If you want a next step, go straight to: “Design the v1 architecture review agent (system design + workflow + UI)”. That’s where this turns into something you can show leadership in ~2-3 weeks.
Copilot Opus Requirements Synthesis. Opus synthesis from questionnaire, vendor demo, and screenshots.
Enterprise Architecture Review Agent: Full Problem Context and Requirements
Purpose of This Document
This document consolidates all known information about an initiative to build (or procure) an AI-powered agent that assists enterprise architects in performing architecture reviews. It synthesizes three primary sources:
A completed discovery questionnaire with responses from the Lead Architect (the primary sponsor and domain owner)
A transcript of a vendor demo call where the vendor presented a working product and the Lead Architect, the Architecture Governance VP, and an additional stakeholder discussed requirements, constraints, and concerns
Screenshots of the vendor’s working product, including the submission portal, review dashboards, scoring views, knowledge codification files, and admin screens
All company-identifying information has been removed. Roles are described generically.
1. The Problem Statement
1.1 What Is Broken
The enterprise architecture review process is manual, inconsistent, slow, and poorly documented. Specific breakdowns:
Performing the evaluation is the single biggest time sink. Architects must manually determine whether a proposed architecture aligns with existing enterprise standards, uses approved or existing capabilities, or introduces something new that needs vetting. There is no tooling or structured process to accelerate this.
Writing and documenting outputs is the second pain point. Architecture Review Board (ARB) decisions are not consistently captured in a retrievable format, making it difficult to reference past decisions for context on future reviews.
Finding existing solutions is a major frustration for the governance leadership. When a capability already exists in the enterprise, it is often not surfaced during the review, leading to duplicative builds.
Shadow IT is a persistent problem. Business units and project teams build solutions outside of architecture governance. These ungoverned solutions frequently do not scale, are not supportable, and eventually need to be re-engineered or sunset at significant cost.
When no existing solution exists, there is no fast mechanism to evaluate whether to build internally or procure a vendor solution, or to document that decision.
1.2 Scale
The architecture team currently processes more than 20 reviews per month.
Expected growth to 30+ per month within 6-12 months, driven primarily by AI project proliferation.
One architect reported logging into the AI Governance Council on a single day and having 14 reviews queued. Even at one hour per review (which was described as unrealistically fast), that is 14 hours of review work in a single session.
The team cannot hire additional headcount. Headcount is actively challenged. Any efficiency gain must come from tooling and process improvement.
1.3 What a Bad Review Looks Like
The Lead Architect identified several failure modes:
Taking too long to complete a review (weeks to months)
Approving something that was not the best fit: a non-standard technology, a duplicative solution, a non-scalable design, or something that violates guiding principles (Data, AI, Enterprise Architecture, Integration)
Not having an architecture review at all (projects bypassing the process entirely)
An architect not understanding the business process behind the submission, leading to an incomplete or superficial review
Being pressured to approve an exception due to project timeline constraints, thereby increasing technical debt
Approvals without architecture review, resulting in duplicative and costly builds
1.4 Inconsistencies Across Reviewers
Architects operate off different levels of information. Some work from current standards; others work from outdated information.
There is no single source of truth for where to find or apply standards.
No defined checklist exists. No two reviewers follow the same process; the review approach comes down entirely to individual style.
1.5 Risks That Have Historically Slipped Through
Shadow IT builds that do not scale
Shadow IT builds that need to be re-engineered after the fact
Approvals granted without architecture review, leading to duplicative and costly implementations
2. Current Workflow (Baseline)
2.1 Intake
Architecture reviews are requested through multiple channels with no unified intake:
Email
Document management platform (e.g., SharePoint-equivalent)
Official ARB reviews (formal board process, used in some cases)
Architecture compliance review process (aCAR or equivalent)
Ticketing system
Artifacts provided by submitters vary widely:
PowerPoint presentations
Architecture catalog tool exports (e.g., SAExpress-equivalent)
Vendor documentation
Consultant documentation
Diagrams (Visio, Draw.io, Lucid, etc.)
Written documentation (Word, PDF)
There is no standardized submission package.
2.2 Review Process
No defined checklist exists, even informally
Reviewers do not follow a common process
A typical review takes anywhere from 2 weeks to several months end-to-end
The Lead Architect described the current state as “whack-a-mole” with AI projects appearing constantly
2.3 Current Outputs
Final outputs take inconsistent forms:
Formal documents
Email responses
Slide decks
Architecture loaded into the enterprise architecture management platform (e.g., LeanIX-equivalent)
Decisions captured in formal ARB meetings (some cases)
Decisions captured in the AI Governance Council portal (for GenAI projects)
There is a recognized gap in consistently documenting ARB decisions
No standard format or template exists. The team is leaning toward using the EA management platform as the system of record.
3. Evaluation Model (The Most Critical Dependency)
3.1 Criteria Currently Used
Architectures are evaluated against the following (when they are evaluated consistently at all):
Integration patterns
Security posture
Technology alignment (does it use approved/standard technologies?)
Compliance (regulatory, internal policy)
Data readiness
Guiding Principles: four sets exist for Integration, Data, AI, and overall Enterprise Architecture
Whether the capability already exists in the enterprise (duplication check)
These criteria are documented in PowerPoint presentations. They are not consistent across reviewers. There is no machine-readable or structured codification of these criteria.
3.2 Decision Logic
Outcomes are: Approved, Not Approved, or Approved with an Approved Exception
Risk severity is assessed primarily when an exception is granted
Reviews are described as “more fact-based,” but politics and leadership influence can drive opinions. The response to “what is subjective vs. objective” trailed off, suggesting this is an uncomfortable truth.
3.3 Enterprise Standards
Architectures are expected to align to:
Reference architectures
Approved technology stacks
Integration patterns
Security models
Guiding Principles (Data, Integration, AI, Enterprise Architecture)
These standards are scattered across multiple systems: PowerPoint files, the wiki platform, the EA management platform, CMDB, and the architecture catalog tool. The Lead Architect noted that the review criteria content exists or can be linked to, but the challenge is accessing and consolidating it from systems like the EA management platform and the wiki platform.
4. Scope Definition
4.1 Initial Use Case
If forced to pick one starting point: HR systems architecture or a new AI build architecture review. These represent the two highest-volume, most immediate categories.
4.2 Target Users
First users: approximately 7 people.
Roles:
Senior enterprise architects (primary)
Reviewers (performing the evaluation)
Requestors (submitting architectures for review)
The level of technical depth expected from the tool was not specified, but it is implied to be high given the architect audience.
4.3 Expansion Path
Initially a capability for the Central Architecture Group (trailblazers / first movers)
Perhaps a version for enterprise-wide use down the road
The Governance VP noted during the demo call that the concept is portable across dimensions (legal, procurement, business strategy) but the immediate focus must be architecture review
5. Output Requirements
5.1 Required Deliverables
Every review output must include:
Summary of the architecture
Findings (what was evaluated)
Risks identified
Recommendations if gaps exist
Recommendation on approved or not approved
Score or rating
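The six required deliverables can be sketched as a single structured record so that every review output carries the same sections. This is a minimal illustration, not an agreed schema: the class and field names, the 0-100 score scale, and the `problems` check are all assumptions. The outcome labels come from section 3.2.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    # Outcome labels as listed in section 3.2.
    APPROVED = "Approved"
    NOT_APPROVED = "Not Approved"
    APPROVED_WITH_EXCEPTION = "Approved with an Approved Exception"


@dataclass
class ReviewOutput:
    summary: str                      # summary of the architecture
    findings: list[str]               # what was evaluated
    risks: list[str]                  # risks identified (may be empty)
    recommended_outcome: Outcome      # recommendation only; a human decides
    score: float                      # assumed 0-100 scale (hypothetical)
    gap_recommendations: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return a list of reasons this output is not yet complete."""
        issues = []
        if not self.summary.strip():
            issues.append("summary is empty")
        if not self.findings:
            issues.append("no findings recorded")
        if not 0 <= self.score <= 100:
            issues.append("score outside 0-100")
        return issues
```

A template check of this kind is one way to make "all reviews look identical in format" enforceable rather than aspirational.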
5.2 Standardization
Ideal: all reviews look identical in format
Acceptable: slight variation by domain (e.g., Data reviews vs. Integration reviews might differ), but the Lead Architect would prefer they not be different
No standard template exists today
5.3 Audience for Outputs
Outputs are consumed by:
Architects
Project teams
Leadership
Sponsors / business partners
Outputs must be:
Board-ready (presentable to governance boards)
Audit-ready (defensible, with traceable evidence)
Developer-friendly (actionable for technical teams)
Business-partner-friendly (comprehensible to non-technical stakeholders)
6. Automation Expectations
6.1 Level of Automation
The Lead Architect explicitly selected: Assist the Human.
Assistive mode: recommendations and pre-populated analysis
Not fully automated decision-making
The agent should do the heavy lifting on technical analysis; the human makes the final call
6.2 Trust Model
High trust in AI for technical analysis specifically
Final approval must remain human-controlled. This is non-negotiable.
The concept discussed in the vendor demo was: AI handles 80% of the review (the mechanical evaluation against standards), and humans handle the remaining 20% (judgment, context, institutional knowledge, edge cases)
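The "human makes the final call" rule can be made structural rather than procedural. The sketch below assumes a hypothetical `finalize` step that refuses to record a decision without a named human reviewer, and that keeps the agent's suggestion alongside the final outcome for audit. All names are illustrative; none come from the vendor platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentRecommendation:
    submission_id: str
    suggested_outcome: str   # advisory only, e.g. "Approved"
    rationale: str


@dataclass(frozen=True)
class FinalDecision:
    submission_id: str
    outcome: str             # what the human actually decided
    decided_by: str          # human reviewer identity (e.g. from SSO)
    agent_suggestion: str    # retained for audit; may differ from outcome


def finalize(rec: AgentRecommendation, reviewer: Optional[str], outcome: str) -> FinalDecision:
    """Record a final decision; refuse when no human reviewer is named."""
    if not reviewer or not reviewer.strip():
        raise PermissionError("final approval requires a human reviewer")
    return FinalDecision(rec.submission_id, outcome, reviewer.strip(), rec.suggested_outcome)
```

Because the agent only ever produces an `AgentRecommendation`, there is no code path in this sketch by which it can approve anything on its own.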
6.3 Explainability
Outputs should show:
Why a recommendation was made (ideal, important for adoption)
What rule or standard triggered the finding
Suggested remediation or recommendation
The Lead Architect explicitly flagged explainability as important for adoption. If architects cannot understand why the agent made a recommendation, they will not trust or use it.
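The three explainability elements (why, which rule, what to do next) map naturally onto one small record per finding. The sketch below is illustrative only: the class and field names are assumptions, and the example pairs control ID SEC-011 with a finding sentence from the Security tab of the vendor demo purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExplainedFinding:
    rule_id: str       # the rule or standard that triggered the finding
    status: str        # e.g. "Pass", "Fail", "Not Evidenced"
    why: str           # reasoning behind the recommendation
    remediation: str   # suggested fix; empty when nothing is required

    def render(self) -> str:
        """Format the finding so a reviewer can see the reasoning at a glance."""
        lines = [f"[{self.status}] {self.rule_id}", f"Why: {self.why}"]
        if self.remediation:
            lines.append(f"Remediation: {self.remediation}")
        return "\n".join(lines)


finding = ExplainedFinding(
    rule_id="SEC-011",
    status="Fail",
    why=("Architecture does not describe how vulnerabilities will be "
         "managed or who is responsible post-deployment"),
    remediation="Address before resubmission.",
)
print(finding.render())
```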
7. Integration Constraints
7.1 Where It Should Live
Open to the path of least resistance
Ideal: integrated into the enterprise collaboration platform (e.g., Teams-equivalent) or document management platform (e.g., SharePoint-equivalent)
Acceptable: standalone app or existing dashboard
The Lead Architect stated during the demo call that the organization is trying to route everything through the collaboration platform or the enterprise AI assistant (e.g., Copilot-equivalent)
The Lead Architect also suggested that complementary agents within the enterprise AI assistant ecosystem could work together (e.g., an architecture review agent alongside other enterprise agents)
7.2 Identity and Access
Must integrate with enterprise identity provider (SSO)
No data sensitivity concerns with architecture artifacts themselves
7.3 Existing Data Sources
Architecture-relevant data already exists in:
Document management platform (SharePoint-equivalent)
EA management platform (LeanIX-equivalent)
CMDB
Wiki platform (Confluence-equivalent)
Architecture catalog tool (SAExpress-equivalent)
These are the sources from which standards, reference architectures, approved technology stacks, and integration patterns would need to be ingested.
7.4 Existing AI Governance Portal
A critical integration constraint surfaced during the vendor demo: the enterprise already operates an automated AI Governance Portal where project teams submit AI projects for approval. This portal:
Has a defined submission workflow with questions and answers
Includes an area for submitters to upload architecture documents
Manages a fast-path (pre-approval) process based on Q&A criteria
Is the system of record for AI project approvals
Also handles other approval gates: legal, privacy/PII, trademark, information security risk management
Any architecture review agent must eventually integrate with this portal to:
Access submitted architecture documents without requiring double-submission
Validate whether fast-path approvals truly qualify (a concern raised by the Governance VP: submitters self-certify as fast-path based on Q&A, but the architecture team has no way to verify without reviewing the actual architecture documents)
Provide an admin/architect view of all submissions with AI-generated scores, even for fast-path items, enabling periodic human spot-checks
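The spot-check idea can be sketched as a simple filter over scored submissions: fast-path items whose AI-generated score falls below a cut-off are surfaced to architects, lowest score first. The `Submission` fields, the 0-100 score scale, and the default threshold of 70 are hypothetical; the real cut-off would be a governance decision.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    submission_id: str
    fast_path: bool    # submitter self-certified as fast-path via portal Q&A
    ai_score: float    # assumed 0-100 score produced by the review agent


def spot_check_queue(subs: list[Submission], threshold: float = 70.0) -> list[Submission]:
    """Return fast-path submissions scoring below the threshold, lowest first."""
    flagged = [s for s in subs if s.fast_path and s.ai_score < threshold]
    return sorted(flagged, key=lambda s: s.ai_score)
```

Submitters would still see only their pre-approval result; the queue is an architect-side view.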
7.5 Enterprise AI Platform Considerations
The enterprise has organization-wide licenses for an AI assistant platform on a per-seat (not per-token) pricing model
This removes token cost as a near-term constraint, though this could change in 3+ years
The Technology Services Group is releasing an enterprise-wide agentic framework and an enterprise MCP (Model Context Protocol) gateway. These change the landscape for how agents are built and deployed. Any solution must account for this evolving infrastructure.
The enterprise is also described as “an Azure shop” (or equivalent primary cloud provider), which influences technology choices
8. What the Vendor Demonstrated (Reference Architecture)
An external vendor demonstrated a working Smart Architecture Review (SAR) agent platform. This section documents what was shown, how it works, and what it reveals about the technical approach, since this serves as a reference architecture regardless of the build/buy decision.
8.1 Platform Overview
The vendor’s platform (called TCA, hosted at their development domain) is a web application with the following components:
Agents available:
Generic Agent
Interview Agent
Summary Agent
SAR Agent (Smart Architecture Review, the focus of the demo)
Projects section for organizing submissions.
8.2 Submission Workflow
A 5-step process:
Upload: Drag-and-drop artifact upload. Accepts ZIP, PPTX, PDF, DOCX, JPEG, PNG. Max 100 MB total, up to 20 documents per submission. In practice, submissions come as ZIP files containing multiple documents (use case doc, architecture doc, standards doc, diagrams, etc.).
AI Pipeline: Automated processing of uploaded artifacts
Review Details: Structured extraction and analysis
Filter Standard: Apply client-specific standards to the analysis
Deep Review: Full evaluation with scoring
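The upload limits in step 1 are concrete enough to sketch as a pre-flight check. The function below is an assumption-laden illustration, not the vendor's logic: it treats 100 MB as 100 × 1024 × 1024 bytes, accepts `.jpg` as an alias for JPEG, and counts only top-level files (how documents inside a ZIP count toward the 20-document limit was not stated).

```python
from pathlib import PurePath

# File types listed in the demo; ".jpg" is an assumed alias for JPEG.
ALLOWED_EXTENSIONS = {".zip", ".pptx", ".pdf", ".docx", ".jpeg", ".jpg", ".png"}
MAX_TOTAL_BYTES = 100 * 1024 * 1024   # "Max 100 MB total" (binary MB assumed)
MAX_DOCUMENTS = 20


def validate_upload(files: dict[str, int]) -> list[str]:
    """Check a submission given a mapping of file name to size in bytes.

    Returns a list of human-readable errors; an empty list means it passes.
    """
    errors = []
    if len(files) > MAX_DOCUMENTS:
        errors.append(f"too many documents: {len(files)} > {MAX_DOCUMENTS}")
    if sum(files.values()) > MAX_TOTAL_BYTES:
        errors.append("total size exceeds 100 MB")
    for name in files:
        if PurePath(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            errors.append(f"unsupported file type: {name}")
    return errors
```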
Submitter view (“My Submissions”):
Shows active submissions (3 in demo), pending review (2), gaps to address (0), approved (0)
Lists recent submissions with names, dates, and statuses (Submitted, Draft, “Stages 1-3 done, Deep Review pending”)
8.3 Admin Dashboard
An admin/reviewer dashboard showing:
Active Submissions: 30
Average Cycle Time: 0 days (demo environment)
Compliance Rate: 0% (demo environment)
Pending Gaps: 0
Review Queue: 30 submissions shown, 1 needs attention, sortable by Priority
Submissions listed with date, priority (Low/Medium/High), and status (Pending)
Client Standards section showing published standard versions (APCV4, last updated with timestamp)
8.4 Review Dashboard (Per-Submission)
The demo used a submission called “TROVE Data Platform” as the example. The review is organized into tabbed dimensions:
Overview tab: Deep Review Overview
Aggregated KPIs across all 5 tabs
Composite Score: 70%
Total Pass: 32
Total Gaps: 14
Total Reviews: 46
Tab 1: Technology Adherence (Enterprise technology compliance)
14 Compliant, 1 Restricted, 1 Declining, 16 Total Reviewed
Guardrails Coverage: 14/16 (88%)
Compliant to Standard (14): Lists each technology with its domain tag (DATA PRODUCT, INFRASTRUCTURE, SECURITY & COMPLIANCE), status (STANDARD), and info icon
- Examples: Azure Databricks (Data Products), Azure Data Lake Storage Gen2 (Storage), Azure Data Factory (Data Orchestration & Scheduling), Azure Key Vault (PKI), Azure Monitor (Logging & Auditing), Log Analytics (Logging & Auditing), and others
Needs Attention (2):
Unity Catalog: DATA PRODUCT, Data Management, RESTRICTED, HIGH severity. Analysis Note: “Unity Catalog is RESTRICTED, requires ARB exception or migration plan.” Source Document cited.
Power BI: DATA PRODUCT, Reporting & Visualization, DECLINING, MEDIUM severity
Tab 2: Technical Fit (Pattern alignment and Control checks)
6 Passed, 1 Attention, 7 Total Reviewed
Guardrails Coverage: 6/7 (86%)
Passed (6), each with severity and Pass status:
Architecture explicitly restricts Bronze access to ingestion processes only; BI tools and users designed to access Silver or Gold exclusively (HIGH, Pass)
Architecture describes Bronze layer as append-only with no in-place modification; immutability explicitly stated (HIGH, Pass)
Architecture confirms catalog registration is planned for all Silver and Gold datasets; required metadata fields (owner, classification, schema, SLA) identified (HIGH, Pass)
Distinct retention policies defined for Bronze, Silver, and Gold layers with rationale aligned to data classification and regulatory requirements (MEDIUM, Pass)
Architecture describes ingestion approach for each source including push/pull pattern, frequency, and schema handling strategy (MEDIUM, Pass)
Lineage described for Silver datasets including source Bronze datasets, transformation logic at design level, and downstream Gold or consumption targets (MEDIUM, Pass)
Needs Attention (1):
Observability design is missing or incomplete for one or more layer transition pipelines (MEDIUM, Fail)
Control: Pipeline Observability and Alerting Designed for Layer Transitions
Evidence: Detailed explanation citing that the architecture describes observability components (monitoring, log analytics, pipeline run logs, record counts, access logs) and alerting mechanisms for pipeline failures and data quality threshold breaches. However, it does NOT explicitly cover observability for EACH layer transition pipeline (Bronze-to-Silver, Silver-to-Gold) as separate entities. Cites specific missing dimensions: run status tracking, record count reconciliation, failure rate monitoring, data freshness SLA visibility.
Suggested Remediation: “Address Pipeline Observability and Alerting Designed for Layer Transitions before resubmission.”
Tab 3: Security (Security posture assessment)
7 Passed, 5 Attention, 12 Total Reviewed
Guardrails Coverage: 7/12 (58%)
Passed (7):
Architecture specifies MFA for all human access paths to production (CRITICAL, Pass)
All service interfaces specify an authentication mechanism (CRITICAL, Pass)
Architecture specifies encryption at rest with a documented key management approach for all Confidential and Restricted stores (CRITICAL, Pass)
Architecture designates a centralized secrets management service for all credentials and certificates (CRITICAL, Pass)
Architecture specifies encrypted transport for all inter-component interfaces (CRITICAL, Pass)
Architecture defines a least-privilege access model with roles and permissions scoped per actor and service (HIGH, Pass)
Architecture designates a PAM approach and excludes persistent direct privileged credentials (HIGH, Pass)
Needs Attention (5):
Architecture does not describe a vulnerability scanning stage in the delivery pipeline, or it is advisory only (HIGH, Fail)
Pipeline architecture does not include a static security analysis gate or treats it as advisory only (HIGH, Fail)
Threat model or security design section is absent from submitted artifacts (HIGH, Not Evidenced)
Architecture does not describe how vulnerabilities will be managed or who is responsible post-deployment (HIGH, Fail)
No threat model is included or referenced in the submission (HIGH, Fail)
Tab 4: Infrastructure (Cloud design guardrails)
4 Passed, 5 Attention, 9 Total Reviewed
Guardrails Coverage: 4/9 (44%)
Passed (4):
Architecture designates a centralized KMS for all key operations and excludes local key storage (CRITICAL, Pass)
Architecture specifies IaC as the provisioning mechanism for all infrastructure; manual provisioning is not described as an intended path (HIGH, Pass)
Architecture declares RTO and RPO for all production workloads and describes a failover or recovery design to meet them (HIGH, Pass)
Architecture declares a tagging strategy with at minimum environment, team/cost center, data classification, and workload identifier dimensions (MEDIUM, Pass)
Needs Attention (5):
Environment isolation design is not documented in submitted artifacts (CRITICAL, Not Evidenced)
Architecture does not declare target regions or environment tiers, or introduces a novel region without justification (HIGH, Fail)
Outbound network traffic controls are not described in submitted artifacts (HIGH, Not Evidenced)
Architecture describes workloads outside defined zones or without network access controls (HIGH, Fail)
Non-production environment lifecycle design is not described in submitted artifacts (MEDIUM, Not Evidenced)
Tab 5: Risk & Compliance (Regulatory gap analysis)
1 Compliant, 1 Needs Attention, 1 Open Gap, 2 Regulations total
Compliant (1): FDA Data Retention, encryption controls satisfied
Needs Attention (1): HIPAA Security Rule, gaps in Vulnerability Management Process and Owner Designated
Open Gaps (1): SEC-011, Vulnerability Management Process and Owner Designated, HIGH severity, Not Satisfied, HIPAA. Remediation: “Address before resubmission.”
Tab 6: History (visible in tabs but not shown in screenshots)
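The overview figures are consistent with a simple pass ratio over the five tabs. The vendor did not state the scoring formula, so the sketch below is a reconstruction under two assumptions: the composite score is total passes divided by total checks, and the Risk & Compliance tab contributes one pass out of two regulations. Under those assumptions it reproduces the numbers shown in the demo.

```python
# (passed, total reviewed) per tab, as transcribed from the demo dashboard.
TABS = {
    "Technology Adherence": (14, 16),
    "Technical Fit": (6, 7),
    "Security": (7, 12),
    "Infrastructure": (4, 9),
    "Risk & Compliance": (1, 2),   # assumed: 1 compliant of 2 regulations
}


def pct(passed: int, total: int) -> int:
    """Whole-number percentage, rounding halves up (87.5 becomes 88)."""
    return int(100 * passed / total + 0.5)


total_pass = sum(p for p, _ in TABS.values())
total_reviews = sum(t for _, t in TABS.values())
total_gaps = total_reviews - total_pass
composite = pct(total_pass, total_reviews)

for name, (passed, total) in TABS.items():
    print(f"{name}: {passed}/{total} ({pct(passed, total)}%)")
print(f"Composite: {composite}% ({total_pass} pass, {total_gaps} gaps, {total_reviews} reviews)")
```

Note that an unweighted ratio treats a CRITICAL gap the same as a MEDIUM one; whether the real score weights by severity is an open question worth asking the vendor.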
8.5 Human-in-the-Loop (HITL) Review Screen
Labeled “Client Standards” with a published standard version (APCV4, with timestamp).
Purpose: “Review and validate extracted tools, controls, and patterns before they go live.”
Three categories with counts:
Tools (52): All 52 accepted (100%), 0 Needs Review, 0 Rejected
Controls (29): Count visible
Patterns (22): Count visible
Tools list showing:
Tool/Technology name, Category, Confidence %, Catalog status, HITL Status
Filterable by: Domain, Capability, Tool Group, Source Document, Confidence level (High >=80%, Medium 60-79%, Low <60%), HITL Status (Accepted, Rejected, Needs Review)
Examples from the list:
AI Service | Application Platforms | 100% | STANDARD | Accepted
Analysis Services | Transformation | 100% | STANDARD | Accepted
Angular WebApp | Application Platforms | 100% | STANDARD | Accepted
Azure AI Search | Search | 100% | STANDARD | Accepted
Azure DevOps | Continuous Integration | 93% | STANDARD | Accepted
Azure Monitor | Application Performance Monitoring | 71% | STANDARD | Accepted
Azure Open AI | AI Guardrails | 86% | STANDARD | Accepted
Blob Storage | Data Replication | 100% | STANDARD | Accepted
Collibra | Data Catalog | 100% | STANDARD | Accepted
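The confidence filter bands translate directly into a small classifier. The thresholds below come from the filter labels on the screen; the function name and the idea of routing non-High items to a reviewer are assumptions for illustration.

```python
def confidence_band(confidence: int) -> str:
    """Map an extraction confidence (0-100) to the band used by the HITL filter."""
    if confidence >= 80:
        return "High"
    if confidence >= 60:
        return "Medium"
    return "Low"


# A few rows from the demo list: (tool name, confidence %).
EXAMPLES = [
    ("Azure AI Search", 100),
    ("Azure DevOps", 93),
    ("Azure Open AI", 86),
    ("Azure Monitor", 71),
]

# Hypothetical policy: anything below High gets a second look from a human.
second_look = [name for name, conf in EXAMPLES if confidence_band(conf) != "High"]
print(second_look)
```

In the demo, Azure Monitor sat at 71% (Medium) yet was already marked Accepted, which suggests acceptance is a human action and not an automatic consequence of the band.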
8.6 Knowledge Foundation (The Backend)
The vendor showed the actual files that power the intelligence behind the agent. These were visible in a code editor (Notepad++) with multiple tabs open.
File 1: tool_catalog.json (32,143 characters, 975 lines)
Purpose header: “Extracting information of all tools and technologies from the standards documents. Categorizing them in standard tool vs emerging tool vs restricted/declining tool.”
Structure:
{
  "domains": []
}
(The "domains" array appears collapsed in the screenshot: the editor line numbers jump from 2 to 48.)
Key attributes per tool entry: unique ID, name, status (STANDARD / EMERGING / RESTRICTED / DECLINING), aliases, source documents with page numbers, confidence score (0-100), linked controls.
Hierarchical taxonomy: L1 Domain > L2 Capability > L3 Tool Group > individual tool entries.
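The attributes listed for each tool entry suggest a record along the following lines. Only the top-level "domains" key was visible in the screenshot, so every field name here is hypothetical, as are the example ID, tool group, and confidence value; source documents with page numbers are omitted for brevity. The lookup shows why aliases matter: submissions rarely name a technology exactly as the catalog does.

```python
from dataclasses import dataclass, field
from typing import Optional

STATUSES = {"STANDARD", "EMERGING", "RESTRICTED", "DECLINING"}


@dataclass
class ToolEntry:
    tool_id: str
    name: str
    status: str
    l1_domain: str
    l2_capability: str
    l3_tool_group: str
    confidence: int                                   # 0-100
    aliases: list[str] = field(default_factory=list)
    linked_controls: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be 0-100")


def find_tool(catalog: list[ToolEntry], mention: str) -> Optional[ToolEntry]:
    """Match a technology mention against names and aliases, ignoring case."""
    needle = mention.strip().lower()
    for tool in catalog:
        if needle == tool.name.lower() or needle in (a.lower() for a in tool.aliases):
            return tool
    return None


# Placeholder entry: ID, tool group, alias, and confidence are invented for the example.
catalog = [ToolEntry("TOOL-0001", "Unity Catalog", "RESTRICTED", "DATA PRODUCT",
                     "Data Management", "(hypothetical group)", 100,
                     aliases=["Databricks Unity Catalog"])]
```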
File 2: control_library.json (182,819 characters, 3,664 lines)
Structure per control:
{
  "name": "P95 Power BI Gold layer query response under 30 seconds",
  "finding": {
    "fail": "P95 Power BI Gold layer query response under 30 seconds requirement is not met; be less than 30 seconds for standard Power BI Gold layer queries is missing, absent, or contr…",
    "pass": "P95 Power BI Gold layer query response under 30 seconds requirement is satisfied; be less than 30 seconds for standard Power BI Gold layer queries is confirmed in the archit…",
    "insufficient": "Cannot verify whether be less than 30 seconds for standard Power BI Gold layer queries for P95 Power BI Gold layer query response under 30 seconds - no relevant evid…"
  },
  "sources": [],
  "l1Domain": "APPLICATION",
  "severity": "MEDIUM",
  "checkType": "LLM_EVAL",
  "controlId": "CTL-APP-APPRES-001",
  "confidence": 100,
  "l2Capability": "Application Resilience & Performance",
  "evidenceScope": "infrastructure_and_deployment.compute + components_and_patterns.componentInventory",
  "ruleStatement": "P95 query response must be less than 30 seconds for standard Power BI Gold layer queries, measured via Databricks SQL Query History and Power BI Premium metrics.",
  "triggerCondition": "always",
  "applicablePatterns": []
}
(The "sources" and "applicablePatterns" arrays appear collapsed in the screenshot: the editor line numbers jump from 8 to 14 and from 23 to 36. Finding texts ending in "…" were cut off on screen.)
Key attributes per control: name, finding templates (pass/fail/insufficient), source documents with page numbers, L1 domain, severity (CRITICAL/HIGH/MEDIUM), check type (LLM_EVAL), control ID, confidence, L2 capability, evidence scope (tells the LLM where to look in the architecture document), rule statement (the actual standard being checked), trigger condition, applicable patterns (cross-references to the pattern library).
Another example control:
{
  "name": "Databricks notebooks and workflows deployed via CI/CD using Databricks Asset Bundles",
  "finding": {
    "fail": "…requirement is not met; be deployed via a CI/CD pipeline using Databricks Asset Bundles…",
    "pass": "…requirement is satisfied; be deployed via a CI/CD pipeline using Databricks Asset Bundle…",
    "insufficient": "Cannot verify whether be deployed via a CI/CD pipeline using Databricks Asset Bundles…"
  },
  "sources": [{"pageNumbers": [16], "documentName": "…"}],
  "l1Domain": "APPLICATION",
  "severity": "MEDIUM",
  "checkType": "LLM_EVAL",
  "controlId": "CTL-APP-APPRUN-001",
  "confidence": 100,
  "l2Capability": "Application Platform & Runtime",
  "evidenceScope": "infrastructure_and_deployment.compute + infrastructure_and_deployment.environments",
  "ruleStatement": "Databricks notebooks and workflow definitions must be deployed via a CI/CD pipeline using Databricks Asset Bundles.",
  "triggerCondition": "always"
}
And another:
{
  "name": "Bronze/Silver/Gold layers with enforced quality gates",
  "finding": {
    "fail": "One or more of the Bronze, Silver, or Gold data layers are absent or quality gates between layers is not enforced, leaving data promotion ungoverned.",
    "pass": "Bronze, Silver, and Gold data layers are established and enforced quality gates are defined at each layer transition within the data platform architecture.",
    "insufficient": "No layer topology or quality gate definitions found in patterns_identified; unable to confirm Bronze/Silver/Gold boundaries or the enforcement mechanisms between the…"
  },
  …
  "ruleStatement": "A mandatory data quality gate is enforced at the Bronze-to-Silver layer boundary. No record is promoted to Silver without passing the gate.",
  "triggerCondition": "always"
}
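The three finding templates per control imply a straightforward rendering step: once a verdict is reached, the matching template text becomes the finding. The sketch below uses the key names visible in the control examples ("name", "finding", "controlId") and the third example's text as transcribed. The function itself is an assumption about how such a library might be consumed, not the vendor's code, and the "insufficient" verdict presumably corresponds to the "Not Evidenced" status on the dashboard.

```python
VERDICTS = ("pass", "fail", "insufficient")


def render_finding(control: dict, verdict: str) -> str:
    """Return the control's templated finding text for the given verdict."""
    if verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}")
    # Fall back to the control name when no controlId is present.
    label = control.get("controlId", control["name"])
    return f"{label} [{verdict}]: {control['finding'][verdict]}"


# Transcribed from the third control example; its controlId was not visible,
# and the "insufficient" text was cut off on screen (kept truncated here).
control = {
    "name": "Bronze/Silver/Gold layers with enforced quality gates",
    "finding": {
        "fail": ("One or more of the Bronze, Silver, or Gold data layers are absent "
                 "or quality gates between layers is not enforced, leaving data "
                 "promotion ungoverned."),
        "pass": ("Bronze, Silver, and Gold data layers are established and enforced "
                 "quality gates are defined at each layer transition within the data "
                 "platform architecture."),
        "insufficient": ("No layer topology or quality gate definitions found in "
                         "patterns_identified; unable to confirm Bronze/Silver/Gold "
                         "boundaries or the enforcement mechanisms between the…"),
    },
}

print(render_finding(control, "fail"))
```

Keeping the verdict separate from the wording means the LLM only has to decide among three states, while the text shown to reviewers stays under governance control.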
File 3: pattern_library.json (visible in tab but content not shown in screenshots)
File 4: conjunction_rules.txt (visible in tab but content not shown)
Additional tabs visible: new 1 through new 12 (suggesting 12+ additional working files)
8.7 Architecture Diagram (Reference Client)
One screenshot showed a PowerPoint slide titled “Tech Arch/Methodology” from a different client engagement (Asia Pacific, Engineering Maintenance Copilot ARB submission). This was labeled “Highly Confidential.” It showed a full architecture diagram with:
Batch Processing and Real Time Processing lanes
Source data (PDFs, corporate data sources including SAP, FND Layer, Trusted Layer, Job API) flowing into Blob Container storage
Processing through Azure ML Compute Instance, Cognitive Search (Unstructured Data + Embeddings), AI Search
Databricks SQL Warehouse Endpoint, Databricks Genie Rooms/API
LLM Gateway + OpenAI, Prompt Flow
Front End: Azure Webapp with React, SSO
Supporting services: Azure Active Directory, Container Registry, Key Vault, Application Insights, GitHub, SQL, Session management/feedback/metadata
Azure AI Translator, Content Safety, Language Service, Q&A storage
References to Enterprise GenAI DevKit, EDP, EMN frameworks
AgentOps SDK Framework reused modules (Semantic Search, Evaluation Framework, MLOps, Data Loading)
This diagram was shown as an example of the kind of architecture document that gets submitted for review, not as the vendor’s own architecture.
9. Key Discussion Points from the Vendor Demo Call
9.1 Value Proposition (Vendor’s Pitch)
Current state: 4-6 weeks per review. Target: reduce by 50%+ (2-3 weeks, or if currently at 3 weeks, aim for 1 week).
AI does the first pass (80% of the evaluation work). Humans complete the remaining 20%.
Architecture reviews have a built-in margin of error; they are not binary pass/fail in the way that code compilation is. This makes them a good fit for AI assistance, where 80% accuracy on the first pass is valuable.
Three key benefits:
Review time reduction (even 10 hours saved per review is significant at scale)
Institutional memory: the agent builds and maintains organizational knowledge that is currently scattered, tribal, and hard to transfer. Examples of institutional knowledge: “this doesn’t work because the ISD lines are unreliable on this particular circuit,” “this doesn’t work without change management readiness of this particular team,” “someone else is already using this capability, why can’t we leverage that instead of building from scratch.”
Dynamic standards integration: when a new standard is released, the agent incorporates it immediately into reviews.
9.2 Knowledge Foundation Discussion
The vendor emphasized that the knowledge foundation is the hardest and most important part, not the agent itself.
Two categories of knowledge:
Internal standards: the organization’s architectural and technology standards, both capability-side and technology-side. This covers localized, implicit, and existing internal standards.
External/industry standards: industry design patterns, pharma best practices, zero-day vulnerability insights, regulatory changes. These must be curated and maintained live.
The vendor stressed “live” maintenance. A one-off setup that is not maintained becomes stale within 3-4 months and loses value. Live connections to standards bodies and sources are needed.
The vendor described “model collapse” as a key risk: if the relationships between standards, patterns, best practices, and curations are not properly configured, the agent will produce poor results. The knowledge must be structured with proper relationships, not just dumped as raw documents.
9.3 Token Efficiency Discussion
The vendor warned against uploading raw PDFs/PPTX directly for LLM consumption. This burns excessive tokens. Converting standards to structured Markdown files and maintaining proper semantic structure reduces token consumption dramatically.
The Lead Architect noted that the enterprise currently has per-seat AI assistant licenses (not per-token), removing the token cost concern for now. The vendor acknowledged this addresses the token cost but noted the model robustness concern remains: properly structured knowledge produces better results regardless of cost model.
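One way to picture the "structured Markdown" the vendor recommended is to render each control as a short, consistently shaped section instead of passing a whole slide deck to the model. The layout below is purely an assumption; the vendor's actual Markdown format was not shown. The field values are taken from the CTL-APP-APPRUN-001 example in section 8.6.

```python
def control_to_markdown(control: dict) -> str:
    """Render one control as a compact Markdown section (hypothetical layout)."""
    lines = [
        f"## {control['controlId']}: {control['name']}",
        "",
        f"- Severity: {control['severity']}",
        f"- Domain: {control['l1Domain']} / {control['l2Capability']}",
        f"- Evidence scope: {control['evidenceScope']}",
        "",
        control["ruleStatement"],
    ]
    return "\n".join(lines) + "\n"


CONTROL = {
    "controlId": "CTL-APP-APPRUN-001",
    "name": "Databricks notebooks and workflows deployed via CI/CD using Databricks Asset Bundles",
    "severity": "MEDIUM",
    "l1Domain": "APPLICATION",
    "l2Capability": "Application Platform & Runtime",
    "evidenceScope": ("infrastructure_and_deployment.compute + "
                      "infrastructure_and_deployment.environments"),
    "ruleStatement": ("Databricks notebooks and workflow definitions must be deployed "
                      "via a CI/CD pipeline using Databricks Asset Bundles."),
}

print(control_to_markdown(CONTROL))
```

The point is less about token cost, which per-seat licensing removes for now, and more about giving the model one unambiguous rule per section.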
9.4 Maintainability Discussion
The Lead Architect asked: “Is there an easier way than markdown? Where you can update like a script in [the AI assistant] or something easier where you can maintain on your own if something changes?”
The Governance VP asked the same question: “Do you have a user interface that could be leveraged to govern these markdown files? Because we’re evolving, and over time our standards may change. Especially since we’re currently formulating what those standards are. We’re in flux.”
The vendor’s current approach: folders in their system, markdown files maintained in Git with version control. They acknowledged a UI could be built on top for enterprise maintainability but do not have one today.
The Lead Architect commented: converting to markdown “makes it more accurate, for sure” but “you have to balance it out” between accuracy and ease of maintenance.
9.5 Fast-Path / Pre-Approval Discussion
The Governance VP raised a critical requirement around review levels:
The enterprise has a fast-path process for AI projects that meet certain criteria (answered via Q&A in the governance portal and automatically pre-approved)
Concern: are fast-path projects truly qualifying, or are submitters gaming the Q&A?
Proposed solution: even fast-path submissions should have their architecture documents uploaded and run through the AI agent, with results available to architects for spot-checking
The vendor confirmed this is feasible: fast-path submitters would see only a pre-approval result, but architects would have an admin view showing all submissions with full AI-generated scores, enabling periodic review (e.g., block 3-4 hours on a Friday to review flagged items)
9.6 Multi-Document Submissions
The Governance VP raised that architecture review artifacts often span multiple documents. The vendor confirmed their system handles this via ZIP file uploads containing all relevant documents.
9.7 Submitter Burden
The Governance VP expressed concern about adding submission burden. Submitters already upload documents to the AI Governance Portal. Any new tool must either integrate with that portal or not require double-submission. The governance portal is the system of record; whatever is built must access documents there.
9.8 Integration with Existing Portal
The Lead Architect emphasized: “Today I download architectures from the AI Council. I upload them sometimes into a rubric that helps a little.” The current workflow is manual download-then-analyze. Any solution should eliminate this.
The Governance VP made it clear that the AI Governance Portal already manages multiple approval gates (legal, privacy/PII, trademark, information security, architecture). Architecture review is one gate among many. The solution cannot exist in isolation.
9.9 Teams / AI Assistant Integration
The Lead Architect stated: “We would love to have this kind of front door in [the collaboration platform]. We’re trying to route everything through [the collaboration platform] or [the AI assistant]. Could you do the markups and stuff connected through [the AI assistant]? We may have complementary agents within [the AI assistant] that we’re using.”
This is a strong stated preference for native integration with the enterprise’s existing AI assistant and collaboration platform rather than a separate portal.
9.10 Accuracy and First Impression
The Lead Architect expressed sensitivity about accuracy: “I really want this to work on the first pass. Iterations are fine, but when we get to this point, it’d be nice to show [leadership] that this is really going to work, and that’ll help us get the funding.”
The markdown approach was acknowledged as supporting accuracy: “That’s why I kind of like the markdown part, because it ensures a little bit more accuracy.”
This is politically important: the Lead Architect needs to demonstrate value to leadership (described as “a bit sensitive” about the investment). A high-quality first demo matters.
9.11 Extensibility Beyond Architecture
The vendor noted the concept is portable to any evaluation dimension: legal, procurement, business strategy, architecture. The Governance VP agreed but clarified: the immediate focus must be architecture review for architects. Once proven, it could be “shopped around” to other domains.
The Lead Architect expanded the vision: once architectures are approved, downstream automation could follow (e.g., an agent or API pushing approved architectures into the EA management platform, triggering capability mapping). “This can really grow into a lot more automation than just analysis.”
9.12 Pricing and Next Steps
The vendor was asked to:
Send a deck summarizing what was shown
Send the recording (the call was recorded)
Provide a rough cost estimate (“give us a range”)
The Lead Architect noted he would share with the Senior Sponsor and other stakeholders to determine next steps
The Lead Architect explicitly said: “Something, I don’t know, if you built this up for the books, but if there is something involved, I guess a minute, that would be even better.” This suggests interest in a lightweight/minimal engagement option.
10. Build vs. Buy Inputs (From the Questionnaire)
10.1 What the Vendor Showed That Felt “Tight”
The Lead Architect specifically liked the dashboard and the evaluation criteria/scoring approach.
10.2 What the Enterprise Already Has
The Lead Architect answered: “Not really anything.” The enterprise does not believe it currently has any partial capability equivalent to what the vendor demonstrated.
10.3 The Graph Question
“The one part [the vendor] had that I am not sure we need is the Graph Contextualization of data. Is that needed? It would probably make the reviews more accurate.”
This is an open question. The knowledge graph / semantic layer is the vendor’s most technically differentiated capability. The Lead Architect is uncertain whether the accuracy improvement justifies the complexity.
10.4 Tolerance for Iteration
The Lead Architect is comfortable starting with a 60% solution in 4 weeks and does not expect high fidelity from day 1.
11. Success and Failure Criteria
11.1 Success
60-80% faster reviews (measured against current 2-week-to-months baseline)
Greater than 60% accuracy on the AI assistance (meaning the agent’s recommendations are correct and useful at least 60% of the time)
11.2 Failure
Less than 60% accuracy
Agent starts to degrade over time (model drift, stale standards, etc.)
Non-adoption by the architect team
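The thresholds in 11.1 and 11.2 can be expressed as a simple check. The following is a minimal sketch, not something taken from the source materials: the function name classify_outcome, its parameters, and the sample numbers are hypothetical. Note that the source defines success as greater than 60% accuracy and failure as less than 60%, so a result of exactly 60% is not classified by the source and is reported as undefined here.

```python
def classify_outcome(baseline_days: float, assisted_days: float,
                     correct: int, total: int) -> dict:
    """Compare measured results with the stated targets (hypothetical helper)."""
    if baseline_days <= 0 or total <= 0:
        raise ValueError("baseline_days and total must be positive")

    # Fraction of review cycle time saved relative to the baseline.
    time_saved = (baseline_days - assisted_days) / baseline_days

    # Integer comparison avoids floating-point surprises at the 60% boundary.
    if correct * 100 > 60 * total:
        accuracy_status = "meets target"
    elif correct * 100 < 60 * total:
        accuracy_status = "failure"
    else:
        accuracy_status = "undefined in source"

    return {
        "time_saved": round(time_saved, 2),
        "speed_target_met": time_saved >= 0.60,  # lower bound of the 60-80% range
        "accuracy_status": accuracy_status,
    }


# Sample numbers are illustrative: a 10-working-day baseline reduced to 3 days,
# with 13 of 20 recommendations judged correct and useful by architects.
result = classify_outcome(baseline_days=10, assisted_days=3, correct=13, total=20)
print(result)
```

How "correct and useful" is judged, and by whom, would still need to be defined before a number like this means anything.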
11.3 Timeline
Not urgent, but the sooner the better, even if just a prototype
The questionnaire document suggested showing the Lead Architect something in 2-3 weeks
12. Strategic and Organizational Context
12.1 Ownership
Short-term: the Central Architecture Group owns it
Long-term: needs a dedicated Technology Product Owner or Technology Services team to sustain it
This is a recognized gap. Without a named long-term owner, the capability will degrade.
12.2 Scope Boundary
Start focused on the review process only
Do not attempt to build a full architecture governance platform in v1
Expansion to broader governance is a future possibility, not an initial requirement
12.3 Leadership Posture
The Senior Sponsor’s leadership favors internalizing the capability if it makes operational and economic sense
The organization is under headcount pressure and cannot hire
Anything that makes architecture work more efficient is welcomed
There is sensitivity around demonstrating ROI for any investment in this space
12.4 Organizational Dynamics
Standards are currently being formulated (“we’re in flux”)
Multiple groups maintain different standards: the Technology Services Group maintains some, functional IT groups maintain others, the EA management platform has some
The Lead Architect sits on the AI Governance Council and performs architecture reviews for the Central Architecture Group
The Governance VP oversees the broader architecture governance function including the AI Governance Portal and the ARB process
The IT Senior Manager (the person commissioning this analysis) leads Local Lab Services IT and is evaluating the build-vs-buy decision with broader context from an enterprise agent orchestration RFI process
13. Implicit Requirements (Not Explicitly Stated but Inferred from Context)
The solution must work with messy, inconsistent input documents. Submitters provide varying quality artifacts in varying formats. The agent cannot assume clean, well-structured input.
The solution must handle the case where submitted documents are insufficient. The vendor’s system had a “Not Evidenced” status for checks where the architecture document simply did not contain enough information to evaluate. This is a common scenario (the Governance VP noted they “constantly request more information, more documentation”).
The solution must support versioning of standards. Standards are evolving. The agent must be able to incorporate new standards and retire old ones without breaking existing reviews.
The solution must produce outputs that can withstand scrutiny from leadership, auditors, and business partners who are not architects. The audience is broad and includes non-technical stakeholders.
The solution must eventually support multiple concurrent reviews at scale (30+ per month, with potential spikes like 14 in a single day).
The solution should build institutional memory over time. Each review should contribute to the organization’s knowledge base, not just produce a one-off output.
The solution must account for the political dimension of architecture reviews: architects are sometimes pressured to approve exceptions due to project timelines. The agent’s scored, evidence-based output could provide objective backing for architects to push back on inappropriate pressure.
The solution must work within a rapidly evolving technology landscape. Enterprise-wide agentic frameworks, MCP gateways, and other infrastructure are being stood up concurrently. The solution must be adaptable to these changes.
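The versioning requirement above implies that a review should be tied to the version of a standard that was in force when the review happened, so that retiring a standard does not rewrite history. The sketch below is one possible shape for that, under stated assumptions: the StandardVersion fields, the identifier INT-001, and the dates are all hypothetical and do not come from the source materials.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class StandardVersion:
    standard_id: str
    version: str
    effective_from: date
    retired_on: Optional[date] = None  # None means still in force


def version_in_force(versions: List[StandardVersion], standard_id: str,
                     on: date) -> Optional[StandardVersion]:
    """Return the version of a standard that applied on a given date, if any."""
    candidates = [
        v for v in versions
        if v.standard_id == standard_id
        and v.effective_from <= on
        and (v.retired_on is None or on < v.retired_on)
    ]
    # If several versions overlap, prefer the most recently effective one.
    return max(candidates, key=lambda v: v.effective_from, default=None)


# Hypothetical history: version 1.0 is retired on the day 2.0 takes effect.
versions = [
    StandardVersion("INT-001", "1.0", date(2024, 1, 1), retired_on=date(2025, 1, 1)),
    StandardVersion("INT-001", "2.0", date(2025, 1, 1)),
]
old_review = version_in_force(versions, "INT-001", date(2024, 6, 15))
new_review = version_in_force(versions, "INT-001", date(2025, 3, 1))
print(old_review.version, new_review.version)
```

Storing the resolved version on each review record is what lets an older review remain valid evidence after the standard changes.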
14. Open Questions (Unresolved in the Source Materials)
How many architects are actively doing reviews today? (Asked but not answered in the questionnaire)
What level of technical depth do the first 7 users expect from the tool? (Asked but not answered)
What specific guiding principles content exists and where exactly? The Lead Architect said “I have this content or links to the guiding principles” but the challenge is accessing it from systems like the EA management platform and the wiki platform. The actual content has not been inventoried.
What does the architecture catalog tool (SAExpress-equivalent) contain, and how structured is it?
What does the EA management platform (LeanIX-equivalent) contain that is relevant to review criteria?
Who are the 7 target users specifically, and what are their individual review patterns?
Is there an existing API or integration path into the AI Governance Portal, or would that need to be built?
What is the actual cost and timeline for the vendor’s proposal? (They were asked to provide a range but the response is not captured in these materials.)
What is the overlap or conflict with the ontology/semantic layer work being explored with a separate vendor?
What is the enterprise agentic framework and MCP gateway timeline, and should this solution wait for or build on that infrastructure?
End of document. All detail from the questionnaire, transcript, and screenshots has been preserved. No information has been omitted.
Copilot GPT 5.5 Think Deeper Synthesis
GPT 5.5 synthesis and strategic framing.
Architecture Evaluation Agent: Full-Fidelity Problem and Requirements Summary
1. Executive framing
The organization is evaluating whether to build or outsource an AI-assisted architecture review capability.
The immediate driver is operational pressure: architecture review demand is increasing, especially for AI, cloud, SaaS, and emerging technology solutions, while staffing is constrained and additional hiring is unlikely. Leadership is interested in whether an internal capability can reduce review effort, improve consistency, preserve institutional knowledge, and accelerate approvals without compromising architectural rigor.
The core question is not simply whether an external vendor can provide a useful architecture review portal. The deeper question is whether the organization should own the architecture review intelligence layer itself: standards, guiding principles, review criteria, prior decisions, exception logic, approved patterns, technology catalogs, and institutional knowledge.
The vendor pitch demonstrated a working pattern: submit architecture artifacts, process those artifacts through an AI-assisted evaluation pipeline, compare them against codified standards and control libraries, produce category-level findings, show evidence and gaps, and support human review. The demo validated the use case, but it also made clear that the hard part is not the UI. The hard part is curating, codifying, governing, and continuously refreshing the organization’s architecture knowledge base.
The preferred strategic direction appears to be: start with an internal, focused v1 architecture review assistant, aimed at one high-volume review type, with human approval retained, and use the vendor’s approach as a reference pattern rather than immediately outsourcing the full capability.
2. Core problem statement
2.1 The architecture review process is capacity-constrained
Architecture review demand is growing, but the architecture team does not have enough capacity to keep scaling manually.
The organization currently handles more than 20 architecture reviews per month as a team, with expected growth to more than 30 per month as AI-related work increases. Review duration can range from two weeks to months, depending on complexity and process path. [CBT-Enterp…stionnaire | Word]
The practical issue is that the team is being asked to review more solutions without more people. This creates pressure to make each architect more efficient without lowering review quality.
The transcript reinforced this with a real example: one architect described having to complete 14 AI-related architecture reviews in one day. Even at one hour each, that creates an unrealistic review burden. The work is described as reactive, repetitive, and high-volume, with many submissions requiring similar checks against internal standards, guiding principles, reference architectures, and existing capabilities.
2.2 The work is not just document review
The review process is not limited to reading architecture diagrams. It requires understanding:
business process context
solution intent
technical architecture
technology choices
integration patterns
security model
data readiness
infrastructure requirements
compliance implications
existing internal capabilities
whether the solution duplicates something already available
whether the solution aligns with approved standards and guiding principles
whether exceptions are justified
whether the submitted evidence is sufficient to approve the solution
The questionnaire identifies the biggest issue as the need for more efficient reviews, whether they apply existing solutions, capabilities, and standards or evaluate new AI, cloud, SaaS, or other solution architectures. [CBT-Enterp…stionnaire | Word]
2.3 Review quality depends heavily on institutional memory
A major source of value is institutional knowledge: the undocumented or semi-documented understanding that experienced architects use when evaluating submissions.
Examples from the transcript include:
knowing that a design may not work because a particular dependency or circuit is unreliable
knowing that one team is not ready for a required change while another team is
knowing that an existing capability already solves the problem
knowing when a new build is duplicative
knowing when a standard recently changed
knowing which reference architecture, pattern, or internal capability should be reused
knowing the “read between the lines” context that is rarely captured in formal documents
The questionnaire similarly states that inconsistent reviews happen because reviewers operate from different levels of information, including old versus new information, and may not know where to find or apply standards. [CBT-Enterp…stionnaire | Word]
The real opportunity is to operationalize this institutional memory so that review quality does not depend entirely on which individual architect happens to review a submission.
3. What is broken today
3.1 Intake is fragmented
Architecture reviews are requested through multiple channels, including email, shared repositories, formal architecture review boards, specific review processes, and ticketing systems. [CBT-Enterp…stionnaire | Word]
Artifacts can include:
presentation decks
formal architecture documents
vendor documentation
consultant documentation
diagrams
Visio diagrams
Draw.io diagrams
Lucid diagrams
written documentation
system-specific forms or export packages [CBT-Enterp…stionnaire | Word]
This creates inconsistency before review even begins. There is no guarantee that all required context is included, that documents are structured consistently, or that the reviewing architect has all the relevant information.
3.2 There is no consistently followed checklist
The questionnaire says there is no defined checklist today, even informally, and reviewers do not all follow the same process. [CBT-Enterp…stionnaire | Word]
That is a critical point. An AI system cannot reliably automate review logic that the organization has not defined. A v1 effort must therefore include codifying the architecture review model, not just building an AI wrapper around documents.
3.3 Outputs are inconsistent
Current outputs can take several forms:
formal document
email response
slide deck
architecture repository entry
architecture review board meeting notes
portal capture in specific cases
informal comments or decisions [CBT-Enterp…stionnaire | Word]
There is a documented gap around capturing architecture review decisions and making them retrievable later. [CBT-Enterp…stionnaire | Word]
This creates downstream problems:
prior decisions are hard to find
teams may resubmit similar solutions without knowing precedent
architects repeat analysis
leadership lacks standardized reporting
auditability is weak
architectural rationale is lost over time
3.4 Reviewers may approve based on incomplete or inconsistent information
A bad review can happen when:
review takes too long
something is approved that is not the best architectural fit
a duplicative solution is approved
a non-scalable solution is approved
a solution does not follow architecture, data, AI, or integration guiding principles
no architecture review happens at all
the architect does not understand the business process enough to review comprehensively
the team is pressured to approve an exception because of delivery timing
approval increases technical debt [CBT-Enterp…stionnaire | Word]
Historical risks include shadow technology builds that do not scale, shadow builds that later need to be reengineered, and approvals without architecture review that result in duplicative and costly solutions. [CBT-Enterp…stionnaire | Word]
4. Business problem being solved
The business problem is not simply “reviews take too long.”
The fuller business problem is:
The organization needs a scalable, repeatable, explainable, human-governed way to evaluate architecture submissions against current standards, existing capabilities, risk criteria, and prior decisions, while reducing manual review burden and improving review consistency.
This problem has multiple dimensions:
4.1 Throughput
Review volume is rising, particularly for AI and emerging technology work. Manual review capacity grows only with the number of architects, so throughput cannot keep pace without added headcount or more efficient reviews.
4.2 Consistency
Different reviewers may apply different criteria, use different levels of current information, or produce outputs in different formats.
4.3 Speed
Reviews can take weeks or longer. The target is to reduce review cycle time significantly while preserving human control.
The stated success target is 60% to 80% faster reviews, with more than 60% accuracy on AI assistance. [CBT-Enterp…stionnaire | Word]
4.4 Knowledge reuse
The organization needs to reuse prior review decisions, existing solution patterns, approved technology choices, exception decisions, and lessons learned.
4.5 Risk reduction
The process needs to reduce the chance of approving:
non-standard technology
unsupported architecture
duplicative solutions
non-scalable solutions
insecure patterns
missing operational readiness
incomplete documentation
unjustified exceptions
shadow technology builds
4.6 Decision traceability
Architecture recommendations should be explainable. Outputs should show why a recommendation was made, what rule or criterion triggered it, and what evidence was used. David specifically indicated that this kind of explainability is important for adoption. [CBT-Enterp…stionnaire | Word]
5. David’s requirements and expectations
5.1 Primary goal
David wants architecture reviews to become materially more efficient while preserving review quality and human oversight.
The questionnaire identifies the desired automation posture as “Assist the Human,” with assistive recommendations or semi-automated pre-populated analysis, not fully automated decisions. Final approval must remain human-controlled. [CBT-Enterp…stionnaire | Word]
5.2 First use case
If forced to choose one starting point, the questionnaire identifies either HR systems or a new AI build. [CBT-Enterp…stionnaire | Word]
Based on the transcript, the more operationally urgent starting point appears to be AI or GenAI architecture reviews because:
volume is high
review burden is immediate
there is an existing submission process
fast-path versus non-fast-path triage is already an issue
architecture reviewers are already spending significant time manually reviewing AI submissions
5.3 Required output
The output must include:
summary
findings
risks
recommendations if gaps exist
recommendation to approve or not approve
score or rating [CBT-Enterp…stionnaire | Word]
The desired output should be standardized. The questionnaire says identical review outputs would be ideal, though there may be some domain differences for data or integration reviews. [CBT-Enterp…stionnaire | Word]
5.4 Output audiences
The output needs to be useful for multiple audiences:
architects
project teams
leadership
sponsors
business partners [CBT-Enterp…stionnaire | Word]
The output may need to be:
board-ready
audit-ready
developer-friendly
business-partner-friendly [CBT-Enterp…stionnaire | Word]
This means one output format may not be enough. The system may need layered outputs:
executive summary
architect detail view
project team remediation view
audit evidence log
decision record
5.5 Trust model
David appears to trust AI for technical assistance when properly grounded, but final approval must remain human-controlled. [CBT-Enterp…stionnaire | Word]
This means the agent should not be designed as an autonomous approver. It should be a decision-support system that helps architects review faster and more consistently.
5.6 Explainability
Explainability is required for adoption. Outputs should show:
why a recommendation was made
what rule triggered it
what evidence supports it
what information is missing
what remediation is recommended [CBT-Enterp…stionnaire | Word]
This is especially important because architecture decisions can be challenged by project teams, leadership, sponsors, or governance bodies.
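The five explainability elements listed above map naturally onto a single finding record. This is a minimal sketch of one possible record, assuming hypothetical field names and a hypothetical rule identifier (SEC-001); nothing here is prescribed by the questionnaire.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    recommendation: str
    rationale: str   # why the recommendation was made
    rule_id: str     # which rule or criterion triggered it
    evidence: List[str] = field(default_factory=list)             # citations into submitted artifacts
    missing_information: List[str] = field(default_factory=list)  # what the package did not contain
    remediation: str = ""

    def is_explainable(self) -> bool:
        """True only if the finding names a rule, gives a reason, and either
        cites evidence or states what information is missing."""
        return bool(
            self.rule_id
            and self.rationale
            and (self.evidence or self.missing_information)
        )


# Illustrative finding; the rule identifier and wording are hypothetical.
finding = Finding(
    recommendation="Request the authentication design before review continues",
    rationale="No identity provider is named anywhere in the submitted package",
    rule_id="SEC-001",
    missing_information=["authentication flow", "identity provider"],
    remediation="Add an authentication section to the architecture document",
)
print(finding.is_explainable())
```

A check like is_explainable could gate what reaches an architect, so that a recommendation with no rule or no supporting evidence is never presented as a finding.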
5.7 Success criteria
David’s success criteria:
60% to 80% faster reviews
more than 60% accuracy on the assistance [CBT-Enterp…stionnaire | Word]
5.8 Failure criteria
David’s failure criteria:
less than 60% accuracy
agent starts to degrade
non-adoption [CBT-Enterp…stionnaire | Word]
Agent degradation is important. It implies that standards drift, stale knowledge, model behavior changes, outdated embeddings, or ungoverned rules could make the system worse over time.
6. Evaluation criteria that the agent must support
The questionnaire identifies the current or expected criteria as:
integration patterns
security
technology alignment
compliance
data readiness
guiding principles for integration
guiding principles for data
guiding principles for AI
overall enterprise architecture guiding principles
whether capabilities already exist [CBT-Enterp…stionnaire | Word]
Standards expected to be applied include:
reference architectures
approved technology stacks
integration patterns
security models
guiding principles for data, integration, AI, and enterprise architecture [CBT-Enterp…stionnaire | Word]
The review logic should support:
approved
not approved
approved with exception
risk severity if an exception is needed
distinction between objective and subjective review factors [CBT-Enterp…stionnaire | Word]
The questionnaire notes that reviews are generally fact-based, but politics or leadership influence can drive opinions. [CBT-Enterp…stionnaire | Word]
That means the agent should help separate:
evidence-based findings
rule-based findings
risk-based recommendations
human judgment
leadership-driven exceptions
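The review outcomes above can be modeled so that an exception cannot be recorded without a risk severity and no decision can be recorded without a named human. The sketch below is illustrative only: the three severity levels and the class and field names are assumptions, since the source asks for a risk severity but does not define a scale.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    APPROVED = "approved"
    NOT_APPROVED = "not approved"
    APPROVED_WITH_EXCEPTION = "approved with exception"


class Severity(Enum):  # assumed scale; the source does not define one
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    decided_by: str  # named human reviewer; the agent only recommends
    exception_severity: Optional[Severity] = None

    def __post_init__(self):
        if not self.decided_by.strip():
            raise ValueError("a human decision owner is required")
        is_exception = self.outcome is Outcome.APPROVED_WITH_EXCEPTION
        if is_exception and self.exception_severity is None:
            raise ValueError("an exception must carry a risk severity")
        if not is_exception and self.exception_severity is not None:
            raise ValueError("severity applies only to exceptions")


decision = Decision(Outcome.APPROVED_WITH_EXCEPTION, "lead.architect", Severity.HIGH)
print(decision.outcome.value, decision.exception_severity.name)
```

Keeping decided_by mandatory is one way to make the human-judgment and leadership-driven-exception cases visible in the record rather than blended into the agent's findings.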
7. What the vendor demonstrated
The vendor demonstrated a portal-style architecture review assistant. The key elements described in the transcript and shown in the screenshots included:
7.1 Submission and review dashboard
The demo showed a dashboard containing submitted projects and architecture review requests. A new submission flow allowed users to upload architecture documents or packages.
The demo appeared to include a staged workflow:
submit architecture package
AI pipeline processes the material
review details are generated
standard review or filter review happens
deeper review output is produced
7.2 Artifact ingestion
The vendor described supporting multiple documents bundled together, including ZIP packages. The vendor emphasized that real submissions are rarely a single clean architecture document. A typical submission may include:
use case document
architecture document
standards document
supporting materials
multiple files in one package
This aligns with the organization’s reality that required review information may be spread across multiple documents and artifacts.
7.3 Category-based evaluation
The demo evaluated architecture submissions across major dimensions such as:
technology adherence
technical fit
security
infrastructure
risk and compliance
The screenshots showed dashboard cards with pass counts, gap counts, total checks, and category-level status. One sample review showed an overall score, passes, gaps, and total reviews.
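The dashboard cards described above amount to a roll-up of individual check results by category. The sketch below shows one way to compute such a roll-up. It is an assumption-laden illustration: the vendor's actual scoring method is not described in the source, so the overall score here is simply the percentage of checks passed, and the function name summarize is hypothetical.

```python
from collections import Counter, defaultdict

VALID_STATUSES = {"pass", "gap"}


def summarize(check_results):
    """Roll up (category, status) pairs into per-category cards and an overall score."""
    per_category = defaultdict(Counter)
    for category, status in check_results:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        per_category[category][status] += 1

    cards = {}
    for category, counts in per_category.items():
        cards[category] = {
            "passes": counts["pass"],
            "gaps": counts["gap"],
            "total": counts["pass"] + counts["gap"],
        }

    all_passes = sum(card["passes"] for card in cards.values())
    all_total = sum(card["total"] for card in cards.values())
    # Assumed scoring rule: percentage of checks passed; None when nothing was checked.
    overall = round(100 * all_passes / all_total) if all_total else None
    return cards, overall


cards, overall = summarize([
    ("security", "pass"),
    ("security", "gap"),
    ("infrastructure", "pass"),
    ("infrastructure", "pass"),
])
print(cards["security"], overall)
```

A real scoring model would probably weight checks by risk rather than counting them equally; that choice belongs to the review owners, not the tool.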
7.4 Evidence-backed findings
The demo showed findings where the agent identified gaps, missing evidence, or failed checks and associated those findings with evidence and remediation guidance.
This is aligned with the need for explainability, adoption, and auditability.
7.5 Standards and control library
The vendor showed backend control libraries and catalogs represented as structured files, including JSON and Markdown-style artifacts. The vendor described these as agent-understandable markup files created from curated internal standards.
The vendor emphasized that simply uploading raw standards documents is not enough. Standards must be curated, converted, structured, related to each other, and made understandable to the agent.
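To make the idea of a structured, agent-readable control concrete, the sketch below shows what one JSON control definition might look like and how it could be validated on load. The schema is an assumption: the vendor's actual file format was not captured in the source, and every identifier (SEC-001, STD-IDENTITY, INT-004) is hypothetical.

```python
import json

# Hypothetical control definition; the vendor's real schema is not known.
CONTROL_JSON = """
{
  "control_id": "SEC-001",
  "title": "Enterprise SSO is used for user authentication",
  "category": "security",
  "source_standard": {"id": "STD-IDENTITY", "version": "1.0"},
  "evidence_required": ["authentication flow described", "identity provider named"],
  "related_controls": ["INT-004"]
}
"""

REQUIRED_KEYS = {"control_id", "title", "category", "source_standard", "evidence_required"}


def load_control(text: str) -> dict:
    """Parse one control definition and reject it if required fields are absent."""
    control = json.loads(text)
    missing = REQUIRED_KEYS - set(control)
    if missing:
        raise ValueError(f"control is missing fields: {sorted(missing)}")
    return control


control = load_control(CONTROL_JSON)
print(control["control_id"], control["source_standard"]["version"])
```

The source_standard and related_controls fields are where the curated relationships live; a raw standards document has neither.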
7.6 Knowledge foundation
The vendor described a knowledge foundation made of:
internal standards
external standards
industry design insights
pharma or regulated-industry practices, where applicable
zero-day or emerging risk insights
technology standards
best practices
patterns
semantic relationships
vector embeddings
knowledge graph or graph-like contextualization
The vendor stated that the agent itself is the easy part, while curating and maintaining the knowledge foundation is the difficult part.
7.7 Live maintenance
A major vendor theme was that standards and architecture knowledge must be maintained live. If standards are loaded once and not maintained, the repository will become stale within months.
Examples discussed included new AI frameworks, enterprise gateways, new standards, evolving patterns, and changing security risks.
This is a core requirement: the system must account for continuous change.
7.8 UI for maintaining standards
The vendor said their current implementation uses folders and version control to maintain Markdown files, but a UI could be built to manage standards, version control, and governance.
This is important because the organization’s standards are evolving and cannot depend on technical users manually editing structured files forever.
8. Key architectural insight from the vendor pitch
The vendor’s strongest architectural point was that LLMs alone are not enough.
The vendor argued that architecture review requires a curated knowledge layer. Raw documents can be too ambiguous, too disconnected, too costly to process repeatedly, and too difficult for models to interpret reliably.
The vendor described a problem where standards, patterns, and best practices collapse into an unstructured mass if relationships are not curated properly. The vendor referred to the need to preserve relationships among standards, patterns, best practices, and review controls so the AI can reason over them more accurately.
This suggests the agent needs:
structured control definitions
standard-to-control mapping
pattern-to-standard mapping
technology catalog mapping
evidence requirements
exception logic
version control
traceability
governance ownership
update workflow
The vendor also argued that well-structured knowledge reduces token usage and improves accuracy. Even if token pricing is not immediately a constraint due to enterprise licensing, structured knowledge still matters for robustness, maintainability, accuracy, and repeatability.
9. Key internal concerns raised in the discussion
9.1 Maintainability
The organization asked whether Markdown or code-like structured files are the right way to maintain standards. The concern was that standards evolve, and the organization needs a practical way to update the rules, criteria, and knowledge base without relying entirely on developers.
A UI or governed admin experience may eventually be needed.
9.2 Maturity and evolution of standards
The organization’s standards are still evolving. The system must support maturity over time.
The tool must not assume the standards corpus is static or fully mature on day one.
9.3 Review levels and fast path logic
A key requirement is the ability to support different levels of architecture review.
Examples:
fast-path submissions
experimental submissions
production-bound submissions
more rigorous review for higher-risk or production use cases
The organization already has a fast-path process based on questions and answers in an existing portal. The concern is whether submissions truly qualify for fast path. The agent could help verify whether a fast-path submission is actually eligible based on architecture artifacts, not just questionnaire responses.
This is a major use case:
validate fast-path eligibility
identify false fast-path submissions
flag submissions requiring full architecture review
reduce burden on architects by triaging submissions automatically
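The triage idea above can be sketched as a comparison between what the submitter answered in the portal questionnaire and what was extracted from the architecture artifacts. This is a hedged illustration only: the actual fast-path eligibility rules are not given in the source, and the function name, field names, and routing labels are all hypothetical.

```python
def check_fast_path(questionnaire: dict, extracted: dict) -> dict:
    """Compare self-reported answers with facts extracted from the artifacts.

    A contradiction routes the submission to full review. An answer that the
    artifacts neither confirm nor contradict is left for an architect to triage.
    """
    contradictions = [
        key for key, claimed in questionnaire.items()
        if key in extracted and extracted[key] != claimed
    ]
    unverified = [key for key in questionnaire if key not in extracted]

    if contradictions:
        route = "full review"
    elif unverified:
        route = "architect triage"
    else:
        route = "fast path"
    return {"route": route, "contradictions": contradictions, "unverified": unverified}


# Hypothetical submission: the questionnaire says no PII is handled, but the
# architecture document describes a data flow that includes PII.
triage = check_fast_path(
    questionnaire={"handles_pii": False, "production_bound": False},
    extracted={"handles_pii": True, "production_bound": False},
)
print(triage)
```

The sketch never grants fast path on questionnaire answers alone, which matches the stated concern that eligibility should rest on the artifacts.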
9.4 Multiple-document review
The organization emphasized that review information often exists across multiple documents, components, and pieces of evidence. The agent must evaluate a complete information package, not just one document.
The vendor confirmed support for package-style ingestion, such as ZIP files with multiple documents.
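As a rough illustration of package-style ingestion, the sketch below inventories a ZIP submission in memory using Python's standard zipfile module. The list of supported file extensions and the function names are assumptions made for the example; the source does not specify which formats the vendor's pipeline accepts.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed extension list for illustration only.
SUPPORTED = {".pdf", ".docx", ".pptx", ".md", ".vsdx", ".drawio"}


def inventory_package(zip_bytes: bytes) -> dict:
    """List the documents in a submission package without extracting to disk."""
    supported, skipped = [], []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as package:
        for info in package.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            (supported if suffix in SUPPORTED else skipped).append(info.filename)
    return {"supported": sorted(supported), "skipped": sorted(skipped)}


def build_demo_package() -> bytes:
    """Create a small in-memory ZIP so the example runs on its own."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("use_case.docx", b"placeholder")
        package.writestr("diagrams/architecture.drawio", b"placeholder")
        package.writestr("notes.tmp", b"placeholder")
    return buffer.getvalue()


inventory = inventory_package(build_demo_package())
print(inventory)
```

Reporting skipped files explicitly matters here: a silently ignored attachment could hold the only evidence for a check.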
9.5 Avoiding duplicate submission burden
A critical internal concern is avoiding additional burden on submitters.
The organization already has a portal or process where project teams submit AI-related projects and upload architecture documents. If an architecture review agent is built, it should access the existing submitted documents rather than forcing teams to upload the same materials into a second tool.
This means integration with the current intake system or system of record is important.
9.6 System of record
The existing submission portal appears to be the source of truth for project submissions and architecture documentation. The agent should ideally read from that source and either write results back or attach them to that record.
This avoids fragmentation.
9.7 Future integration with broader approval processes
Architecture approval is only one part of a broader approval chain. Other approval areas may include:
legal
privacy
trademark
PII or data protection
information security risk management
other compliance or governance functions
The immediate focus should remain architecture, but the concept could eventually extend to other review dimensions if successful.
9.8 Preferred interaction model
There is interest in having the capability live where users already work:
Teams
Copilot-style agents
existing portals
SharePoint
architecture repositories
The questionnaire identifies SharePoint/Teams as ideal, with openness to the path of least resistance. [CBT-Enterp…stionnaire | Word]
The transcript also raised the idea of using agents and workflows through a Copilot-like experience rather than requiring a standalone portal.
9.9 Enterprise identity
The tool must integrate with enterprise SSO or identity management. The questionnaire explicitly identifies Entra ID SSO as a requirement. [CBT-Enterp…stionnaire | Word]
9.10 Data source landscape
Existing relevant sources include architecture repositories, SharePoint, LeanIX (or an equivalent architecture repository), CMDB, Confluence, SAExpress (or similar systems), system catalogs, and inventory tools. [CBT-Enterp…stionnaire | Word]
The system must eventually determine:
which sources are authoritative
which sources are reference-only
how often sources refresh
who owns source quality
what access permissions apply
how source evidence is cited in review outputs
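The six questions above suggest a small source registry that records authority, ownership, and refresh expectations, and that shapes how evidence is cited. The following is a minimal sketch under assumptions: the Source fields, the citation format, and the sample entry are hypothetical, and access permissions are left out for brevity.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Source:
    name: str
    authoritative: bool   # False means reference-only
    owner: str            # who owns source quality
    refresh_days: int     # expected refresh interval
    last_refreshed: date

    def is_stale(self, today: date) -> bool:
        return today - self.last_refreshed > timedelta(days=self.refresh_days)


def cite(source: Source, locator: str, today: date) -> str:
    """Build a citation string that exposes the source's role and freshness."""
    role = "authoritative" if source.authoritative else "reference-only"
    stale = ", STALE" if source.is_stale(today) else ""
    return f"[{source.name} | {locator} | {role}{stale}]"


# Hypothetical registry entry and dates.
catalog = Source(
    name="Technology catalog",
    authoritative=True,
    owner="EA team",
    refresh_days=30,
    last_refreshed=date(2025, 1, 1),
)
citation = cite(catalog, "entry 12", today=date(2025, 3, 1))
print(citation)
```

Surfacing staleness inside the citation itself lets a reviewer see at a glance when a finding rests on outdated source data.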
10. Desired capability, stated generically
The desired capability is an AI-assisted architecture review system that:
Receives or accesses submitted architecture review artifacts.
Extracts the solution context.
Identifies technologies, platforms, integrations, data flows, security controls, infrastructure patterns, and operational dependencies.
Checks the solution against codified architecture standards, guiding principles, reference architectures, and technology catalogs.
Determines whether the submission has sufficient evidence for review.
Identifies gaps, risks, missing information, non-standard technology, duplicative capability, and exception requirements.
Produces a standardized review output with findings, evidence, recommendations, and score or rating.
Supports different review levels, including fast path, experimental, and production-grade review.
Keeps humans in control of final decisions.
Captures review decisions as reusable institutional memory.
Maintains review rules and standards through a governed, versioned process.
Integrates with existing systems of record rather than creating duplicate intake.
11. Recommended v1 problem definition
A strong v1 problem definition would be:
Architecture reviewers need a faster, more consistent way to perform first-pass evaluation of submitted AI or emerging technology architecture packages against current standards, guiding principles, reference architectures, and existing capabilities, while preserving human approval authority and producing explainable, evidence-backed outputs.
The v1 is not intended to fully replace architecture governance. It is intended to reduce manual review burden, improve consistency, and make missing evidence or obvious risks visible earlier.
12. Recommended v1 scope
In scope
The v1 should focus on one architecture review domain, preferably AI or new AI build reviews.
The v1 should support:
artifact package intake or access from existing repository
multiple documents per submission
extraction of solution summary
identification of technology components
identification of integration points
security baseline checks
data readiness checks
alignment with architecture guiding principles
check for existing capabilities or duplicate solutions, if a data source exists
missing information detection
risk and gap findings
evidence-backed recommendations
standardized output report
human architect review and override
decision capture
Out of scope for v1
The v1 should not attempt to become:
a full enterprise governance platform
a replacement for architecture review boards
a fully automated approval engine
a cross-functional legal, privacy, security, procurement, and architecture approval platform
a complete enterprise knowledge graph unless proven necessary
a second intake portal that duplicates the existing submission process
13. Review dimensions for v1
A practical v1 review model should include the following dimensions.
13.1 Intake completeness
Purpose: determine whether the submission contains enough information to review.
Checks may include:
architecture diagram provided
business context provided
solution scope described
target users described
deployment model stated
data flows described
integration points described
security model described
operational support model described
environments identified
risk or exception requests identified
Outcome: approve for review, request more information, or flag incomplete submission.
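The mapping from completeness checks to the three outcomes can be sketched as follows. This is a minimal illustration, not a defined rubric: the checklist keys, the `intake_outcome` function, and the `incomplete_ratio` threshold (more than half of the items missing means "incomplete") are all assumptions introduced here.

```python
# Hypothetical checklist keys mirroring the checks listed above.
REQUIRED_ITEMS = [
    "architecture_diagram",
    "business_context",
    "solution_scope",
    "target_users",
    "deployment_model",
    "data_flows",
    "integration_points",
    "security_model",
    "operational_support_model",
    "environments",
    "risk_or_exception_requests",
]


def intake_outcome(provided: set[str], incomplete_ratio: float = 0.5) -> tuple[str, list[str]]:
    """Return (outcome, missing_items) for a submission.

    incomplete_ratio is an assumed threshold: if more than this share of
    required items is missing, the submission is flagged as incomplete.
    """
    missing = [item for item in REQUIRED_ITEMS if item not in provided]
    if not missing:
        return "approve_for_review", missing
    if len(missing) / len(REQUIRED_ITEMS) > incomplete_ratio:
        return "flag_incomplete_submission", missing
    return "request_more_information", missing
```

Returning the list of missing items alongside the outcome lets the request for more information name exactly what the submitter still owes.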
13.2 Technology adherence
Purpose: determine whether submitted technologies align with approved, restricted, declining, or exception-required technology standards.
Checks may include:
approved platform usage
restricted technology usage
declining technology usage
unsupported technology
redundant tool selection
approved alternative exists
exception required
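A technology adherence check is essentially a lookup against the governed technology catalog. The sketch below assumes a simple in-memory catalog with made-up entries; the `TechStatus` enum, the `CATALOG` contents, and the rule that restricted or uncatalogued technologies require an exception are illustrative assumptions only.

```python
from enum import Enum


class TechStatus(Enum):
    APPROVED = "approved"
    RESTRICTED = "restricted"
    DECLINING = "declining"
    UNKNOWN = "unknown"  # not in the catalog; needs a human look


# Hypothetical entries; real data would come from the governed technology catalog.
CATALOG = {
    "example-approved-db": TechStatus.APPROVED,
    "example-legacy-queue": TechStatus.DECLINING,
    "example-restricted-saas": TechStatus.RESTRICTED,
}

# Assumed policy: these statuses cannot proceed without an exception.
EXCEPTION_STATUSES = {TechStatus.RESTRICTED, TechStatus.UNKNOWN}


def check_technologies(submitted: list[str]) -> dict[str, dict]:
    """Look up each submitted technology and report its catalog status."""
    results = {}
    for name in submitted:
        status = CATALOG.get(name.strip().lower(), TechStatus.UNKNOWN)
        results[name] = {
            "status": status.value,
            "exception_required": status in EXCEPTION_STATUSES,
        }
    return results
```

Treating an uncatalogued technology as `unknown` rather than as a failure keeps the check from asserting more than the catalog actually says.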
13.3 Architecture fit
Purpose: assess whether the architecture uses appropriate patterns and avoids unnecessary custom build or duplication.
Checks may include:
reference architecture alignment
fit with known patterns
unnecessary custom build
existing capability reuse
architectural simplification opportunity
scalability concerns
design pattern mismatch
13.4 Integration alignment
Purpose: assess whether integrations follow approved patterns and avoid unsupported point-to-point or fragile designs.
Checks may include:
approved integration platform or pattern
API strategy
eventing strategy
data movement pattern
dependency management
error handling
retry and resilience model
ownership of integrations
13.5 Security baseline
Purpose: detect missing or inadequate security controls.
Checks may include:
identity and access control
authentication
authorization
encryption in transit
encryption at rest
secrets management
vulnerability management
threat modeling
logging and monitoring
privileged access
third-party access
data protection controls
13.6 Data readiness
Purpose: determine whether data-related risks and responsibilities are understood.
Checks may include:
data classification
source systems
data ownership
data lineage
data quality
retention
privacy considerations
PII handling
training data or model input considerations
data access controls
approved data platform usage
13.7 AI-specific review
Purpose: evaluate AI solution risks and readiness.
Checks may include:
AI use case clarity
model/provider selection
prompt or orchestration pattern
grounding strategy
hallucination controls
human-in-the-loop requirements
model monitoring
evaluation approach
content safety
data leakage controls
responsible AI alignment
agent/tool access boundaries
MCP or tool gateway usage, where relevant
enterprise AI framework alignment
13.8 Infrastructure and operations
Purpose: assess production readiness.
Checks may include:
hosting model
environment strategy
availability requirements
recovery requirements
backup approach
observability
incident support
change management
performance expectations
ownership after go-live
lifecycle management
support model
13.9 Risk and compliance
Purpose: determine whether the solution creates governance, compliance, or operational risk.
Checks may include:
exception requirements
regulatory implications
auditability
unsupported components
unresolved dependencies
operational risk
vendor risk
data sensitivity
legal or privacy escalation triggers
13.10 Recommendation
Purpose: provide a human-reviewable decision recommendation.
Possible outputs:
approved
approved with conditions
exception required
more information required
not recommended
full architecture review required
fast path confirmed
fast path not supported by evidence
14. Required output structure
The review output should be standardized and include:
14.1 Executive summary
what the solution is
what the review found
recommended decision
top risks
required actions
confidence level
14.2 Review scorecard
overall score or rating
category-level scores
pass count
gap count
not-evidenced count
exception count
severity distribution
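The counts in the scorecard can be derived directly from the findings rather than entered by hand. A minimal sketch, assuming each finding is a dict with `category`, `status`, and `severity` keys and that the status values are `pass`, `gap`, `not_evidenced`, and `exception` (names chosen here for illustration). An overall score is deliberately left out because no weighting has been defined yet.

```python
from collections import Counter, defaultdict


def build_scorecard(findings: list[dict]) -> dict:
    """Summarize findings into scorecard counts."""
    status_counts = Counter(f["status"] for f in findings)
    # Severity is only counted for findings that are not passes.
    severity = Counter(f["severity"] for f in findings if f["status"] != "pass")
    by_category: dict[str, Counter] = defaultdict(Counter)
    for f in findings:
        by_category[f["category"]][f["status"]] += 1
    return {
        "pass_count": status_counts["pass"],
        "gap_count": status_counts["gap"],
        "not_evidenced_count": status_counts["not_evidenced"],
        "exception_count": status_counts["exception"],
        "severity_distribution": dict(severity),
        "by_category": {c: dict(n) for c, n in by_category.items()},
    }
```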
14.3 Findings
Each finding should include:
finding title
category
status
severity
triggered criterion
evidence from submission
missing evidence, if applicable
recommended remediation
owner or next action
whether human review is required
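One way to hold the finding fields above is a small typed record that refuses to exist without evidence. This is a sketch under assumptions: the `Finding` class, its field names, and the status vocabulary are hypothetical, and the validation rule is one possible reading of the evidence-backed requirement.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    category: str
    status: str               # assumed values: "pass", "gap", "not_evidenced", "exception"
    severity: str             # assumed values: "low", "medium", "high"
    triggered_criterion: str  # ID of the rule or control that fired
    evidence: list[str] = field(default_factory=list)  # pointers into the submission
    missing_evidence: list[str] = field(default_factory=list)
    recommended_remediation: str = ""
    owner_or_next_action: str = ""
    human_review_required: bool = True

    def __post_init__(self) -> None:
        # A conclusion must cite evidence; otherwise it has to be
        # recorded as not evidenced, with the missing material named.
        if self.status != "not_evidenced" and not self.evidence:
            raise ValueError(f"{self.title!r}: status {self.status!r} requires evidence")
        if self.status == "not_evidenced" and not self.missing_evidence:
            raise ValueError(f"{self.title!r}: list what evidence is missing")
```

Defaulting `human_review_required` to `True` means a finding only skips human review when someone has explicitly decided it can.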
14.4 Missing information
A separate section should list what cannot be assessed because evidence is missing.
This keeps the agent from hallucinating or drawing unsupported conclusions.
14.5 Decision recommendation
The agent should recommend, not decide.
The decision recommendation should include rationale and conditions.
14.6 Human review notes
Architects should be able to:
accept a finding
reject a finding
edit a finding
override recommendation
add rationale
mark final decision
14.7 Decision record
The final output should be stored as institutional memory for future reuse.
The record should include:
submission ID
review date
standards version
reviewer
final decision
exception rationale
major findings
remediation actions
links to artifacts
links to evidence
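The decision record fields above translate naturally into an immutable record that can be serialized for storage. A minimal sketch: the `DecisionRecord` class, its field names, and the JSON format are assumptions for illustration, not a chosen storage design.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionRecord:
    submission_id: str
    review_date: date
    standards_version: str  # which version of the standards and rules was applied
    reviewer: str
    final_decision: str
    exception_rationale: str = ""
    major_findings: tuple[str, ...] = ()
    remediation_actions: tuple[str, ...] = ()
    artifact_links: tuple[str, ...] = ()
    evidence_links: tuple[str, ...] = ()

    def to_json(self) -> str:
        """Serialize to JSON with the date in ISO 8601 form."""
        payload = asdict(self)
        payload["review_date"] = self.review_date.isoformat()
        return json.dumps(payload, sort_keys=True)
```

Making the record frozen reflects the intent that a stored decision is amended by adding a new record, not by editing the old one.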
15. Knowledge base requirements
The system requires a governed architecture knowledge base.
15.1 Internal standards
The knowledge base should contain:
architecture standards
approved technology stacks
restricted technology lists
declining technology lists
reference architectures
integration patterns
data principles
AI principles
enterprise architecture principles
security models
operational readiness requirements
exception criteria
15.2 Existing capabilities
The system should help reviewers determine whether a submitted capability already exists.
This is important because one of the stated frustrations is finding existing solutions for a capability and aligning teams to use them. [CBT-Enterp…stionnaire | Word]
15.3 Prior decisions
The system should include prior architecture review decisions so that similar future submissions can be evaluated in context.
The questionnaire identifies a gap in documenting architecture review decisions and making them retrievable later. [CBT-Enterp…stionnaire | Word]
15.4 External standards and emerging risks
The vendor proposed including external industry standards, best practices, emerging design patterns, and zero-day or fast-moving security insights.
This may be useful, but should be governed carefully. External content should not override internal standards unless explicitly approved.
15.5 Semantic relationships
The vendor emphasized that standards, controls, patterns, and practices need relationships. The system should know that:
a standard maps to one or more controls
a control requires certain evidence
a technology maps to approved/restricted/declining status
a pattern maps to required design properties
a missing document prevents certain conclusions
a prior decision may apply to a similar submission
This is the practical meaning of a knowledge graph or semantic layer. It does not necessarily require an expensive graph product in v1, but the relationships must be represented somehow.
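As an illustration of representing these relationships without a graph product, plain mappings can already answer "which controls can be assessed with the documents we have?". The IDs, document names, and the `assessable_controls` function below are hypothetical.

```python
# Hypothetical IDs; real entries would come from the governed control library.
STANDARD_TO_CONTROLS = {
    "STD-DATA-PROTECTION": ["CTL-ENCRYPT-AT-REST", "CTL-ENCRYPT-IN-TRANSIT"],
}
CONTROL_TO_EVIDENCE = {
    "CTL-ENCRYPT-AT-REST": ["data_storage_design"],
    "CTL-ENCRYPT-IN-TRANSIT": ["network_diagram", "integration_spec"],
}


def assessable_controls(standard_id: str, documents: set[str]) -> dict[str, list[str]]:
    """Map each control under a standard to the evidence documents still missing.

    An empty list means the control can be assessed; a non-empty list means
    no conclusion should be drawn for that control yet.
    """
    return {
        control: [
            doc for doc in CONTROL_TO_EVIDENCE.get(control, []) if doc not in documents
        ]
        for control in STANDARD_TO_CONTROLS.get(standard_id, [])
    }
```

If the relationships later outgrow flat mappings, the same questions can be moved to a graph store without changing what the reviewers see.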
16. Governance requirements
16.1 Rule ownership
Someone must own the review rules.
Open ownership questions:
who owns the architecture rubric?
who approves new criteria?
who updates standards?
who retires old rules?
who resolves conflicts between standards?
who signs off on exception logic?
who maintains technology status?
16.2 Versioning
Every review must record which version of the standards and rules was used.
Without versioning, decision traceability breaks.
16.3 Change management
Standards must be updated through a controlled process.
The system must support:
draft changes
review and approval
effective dates
retirement of old controls
audit history
rollback if needed
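Effective dates and retirement can be modeled so that any review date resolves to exactly one version of each control. The sketch below is an assumption about how that might look; the `ControlVersion` record, its state names, and the selection rule are illustrative.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ControlVersion:
    control_id: str
    version: int
    state: str  # assumed values: "draft", "approved", "retired"
    effective_from: Optional[date] = None
    retired_on: Optional[date] = None


def controls_in_effect(history: list[ControlVersion], on: date) -> dict[str, ControlVersion]:
    """Pick, per control, the highest non-draft version effective on the given date."""
    active: dict[str, ControlVersion] = {}
    for cv in history:
        if cv.state == "draft" or cv.effective_from is None or cv.effective_from > on:
            continue  # never approved, or not yet effective
        if cv.retired_on is not None and cv.retired_on <= on:
            continue  # already retired on that date
        current = active.get(cv.control_id)
        if current is None or cv.version > current.version:
            active[cv.control_id] = cv
    return active
```

Because old versions are kept rather than overwritten, a past review can be replayed against the rules that applied at the time, and a rollback becomes a new version that restores earlier content.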
16.4 Human override
Architects must be able to override agent recommendations, but overrides should be captured with rationale.
This preserves accountability and improves future learning.
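Capturing an override with its rationale can be enforced at the point of recording. A minimal sketch, assuming a hypothetical `OverrideEvent` record and `record_override` helper; where the event is stored is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OverrideEvent:
    finding_id: str
    reviewer: str
    agent_recommendation: str
    human_decision: str
    rationale: str
    recorded_at: datetime


def record_override(
    finding_id: str,
    reviewer: str,
    agent_recommendation: str,
    human_decision: str,
    rationale: str,
) -> OverrideEvent:
    """Create an override record, refusing overrides that carry no rationale."""
    if human_decision == agent_recommendation:
        raise ValueError("decision matches the agent recommendation; nothing to override")
    if not rationale.strip():
        raise ValueError("an override must include a rationale")
    return OverrideEvent(
        finding_id, reviewer, agent_recommendation, human_decision,
        rationale.strip(), datetime.now(timezone.utc),
    )
```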
16.5 Quality monitoring
The agent must be monitored for degradation.
Metrics should include:
finding accuracy
false positives
false negatives
missing evidence detection
reviewer override rate
adoption rate
review cycle time reduction
user trust
standards freshness
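Several of these metrics can be computed from reviewer-confirmed outcomes. The sketch below assumes each assessed criterion is logged as a pair `(agent_flagged, reviewer_flagged)`; that logging format and the `review_quality` function are assumptions, and precision and recall are offered as one conventional way to summarize false positives and false negatives.

```python
def review_quality(records: list[tuple[bool, bool]]) -> dict[str, float]:
    """Summarize agent accuracy from (agent_flagged, reviewer_flagged) pairs."""
    tp = sum(1 for agent, reviewer in records if agent and reviewer)
    fp = sum(1 for agent, reviewer in records if agent and not reviewer)
    fn = sum(1 for agent, reviewer in records if not agent and reviewer)
    return {
        "false_positives": fp,
        "false_negatives": fn,
        # Share of agent-raised issues the reviewer agreed with.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of reviewer-confirmed issues the agent caught.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Tracking these per standards version makes it easier to tell agent degradation apart from a change in the rules.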
17. Integration requirements
17.1 Existing submission portal
The agent should ideally integrate with the existing project submission or AI governance portal rather than require duplicate uploads.
The transcript emphasized that submitters should not have to submit architecture documents in two places.
17.2 Document repositories
The agent should access submitted artifacts from approved repositories or systems of record.
Possible sources include shared document repositories, architecture repositories, system catalogs, CMDB, Confluence-like documentation systems, and SAExpress-like architecture systems. [CBT-Enterp…stionnaire | Word]
17.3 Architecture repository
The agent should eventually interact with the architecture repository to:
look up existing capabilities
check application inventory
identify duplication
attach review outcomes
update architecture metadata
trigger downstream records
17.4 Teams or collaboration interface
A collaboration-first interface is desirable. The questionnaire identifies SharePoint/Teams as ideal for where the tool should live. [CBT-Enterp…stionnaire | Word]
The transcript also raised the idea of a Teams or Copilot-style front door.
17.5 Identity and access
The system must support enterprise SSO. The questionnaire explicitly lists Entra ID SSO as a requirement. [CBT-Enterp…stionnaire | Word]
17.6 Future workflow integration
Future integrations could include:
workflow routing
architecture board review
exception approval
artifact request loops
repository updates
capability mapping
downstream approval workflows
automated architecture record creation
18. Build vs buy considerations
18.1 Why buying may be attractive
A vendor solution may offer:
faster demo readiness
prebuilt UI
prebuilt ingestion patterns
experience codifying controls
existing scorecard model
implementation capacity
accelerator for knowledge structuring
outside perspective on best practices
18.2 Why buying may be risky
The vendor’s most valuable work would still require internal knowledge.
Risks include:
vendor lock-in around the control library
external dependency for standards maintenance
duplicate intake process
poor fit with existing systems
limited control over rule evolution
unclear total cost
difficulty embedding institutional knowledge
mismatch with internal collaboration tools
risk of a polished portal without deep internal adoption
18.3 Why internal build may be attractive
An internal build may better preserve:
ownership of standards
ownership of review logic
ownership of prior decisions
integration with internal systems
flexibility as standards mature
alignment with existing identity and collaboration tools
long-term maintainability
reuse across architecture domains
18.4 Why internal build may be risky
Internal build can fail if:
no one owns the product
no one owns the standards
criteria remain undefined
source systems are inaccessible
outputs are not trusted
UI is too rough for adoption
agent accuracy is below threshold
reviewers do not use it
maintenance is treated as project work instead of product work
18.5 Likely best path
Build a focused internal prototype first. Use vendor concepts as reference patterns. Consider vendor support only for bounded acceleration, such as standards codification, UI benchmarking, or implementation support.
Do not outsource the full capability until the organization understands the true cost of owning and maintaining the knowledge layer.
19. Recommended v1 architecture concept
19.1 Intake/access layer
Access submitted artifacts from the existing submission system, shared repository, or controlled upload location.
19.2 Artifact extraction layer
Extract content from:
presentations
documents
PDFs
diagrams
screenshots
structured forms
ZIP packages
architecture templates
19.3 Context extraction layer
Identify:
solution purpose
business process
technology stack
architecture components
integrations
data sources
data flows
identity model
security model
infrastructure
operational model
missing evidence
19.4 Standards/control library
Maintain codified standards and review rules.
19.5 Evaluation engine
Run checks against the extracted context and control library.
19.6 Evidence mapper
Link findings to source material or mark them as not evidenced.
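The evidence mapper's contract can be sketched in a few lines: return pointers into the source material, or say plainly that nothing was found. Plain substring matching stands in here for whatever retrieval method is eventually chosen; the `Passage` type, the pointer format, and `map_evidence` are hypothetical.

```python
from typing import NamedTuple


class Passage(NamedTuple):
    document: str
    location: str  # page, slide, or section label
    text: str


def map_evidence(criterion_terms: list[str], passages: list[Passage]) -> dict:
    """Return source pointers for passages mentioning any criterion term."""
    terms = [t.lower() for t in criterion_terms]
    hits = [
        f"{p.document}#{p.location}"
        for p in passages
        if any(t in p.text.lower() for t in terms)
    ]
    # "evidenced" only means there is material for a reviewer to read,
    # not that the criterion is satisfied.
    return {"status": "evidenced" if hits else "not_evidenced", "evidence": hits}
```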
19.7 Human review interface
Allow architects to review, edit, accept, reject, or override findings.
19.8 Output generator
Generate standardized outputs for different audiences.
19.9 Decision memory store
Store final decisions and rationale for future retrieval.
19.10 Monitoring layer
Track accuracy, adoption, review time reduction, and degradation.
20. Key assumptions
These should be made explicit before any build/buy decision.
The first use case will be narrow, likely AI architecture reviews.
The agent will assist reviewers, not make final decisions.
The existing submission system should remain the system of record.
The organization can define an initial review rubric.
Standards and rules will need active ownership.
Outputs must be evidence-backed.
Adoption depends on trust, not just automation.
A standalone portal is not automatically the best user experience.
The knowledge layer is more important than the UI.
A 60% to 80% review-effort reduction is the target, but only if accuracy remains acceptable. [CBT-Enterp…stionnaire | Word]
21. Open questions for reasoning models
Use these as prompts for other models.
Problem definition
What is the cleanest articulation of the business problem?
Is this primarily a throughput problem, consistency problem, governance problem, knowledge-management problem, or all of the above?
What should be excluded from the first version?
Scope
Should v1 focus on AI architecture reviews, HR systems, or another domain?
What is the minimum useful review package?
What should be deferred?
Architecture
Is a knowledge graph necessary for v1, or can a simpler governed control library work initially?
What is the minimum viable standards representation?
How should evidence traceability be implemented?
How should prior decisions be stored and retrieved?
Workflow
Should the agent be accessed through a portal, Teams, Copilot-style interface, or embedded in the existing submission workflow?
How should the agent handle incomplete submissions?
How should human override be captured?
Governance
Who owns standards?
Who owns rules?
Who approves changes?
How are standards versioned?
How is agent degradation detected?
Build vs buy
Which parts should be built internally?
Which parts could be vendor-assisted?
What would create vendor lock-in?
What would be expensive to maintain internally?
What operating model is required for each path?
Success
How should accuracy be measured?
What is a realistic minimum viable accuracy threshold?
How should review time savings be measured?
What adoption signals matter most?
What failure signals should stop or redirect the effort?
22. Compressed version for model input
If another model needs a shorter version, use this:
We are evaluating whether to build or buy an AI-assisted architecture review agent. The current architecture review process is capacity-constrained, inconsistent, and heavily dependent on individual architect knowledge. Review volume is increasing, especially for AI and emerging technology submissions, while staffing is constrained. Current reviews can take from weeks to months, and the team processes more than 20 reviews per month with expected growth beyond 30. The goal is to reduce review effort by 60% to 80% while maintaining more than 60% accuracy in AI-assisted recommendations.

The review process today is fragmented across email, shared repositories, formal review boards, ticketing systems, and specific portals. Artifacts include decks, documents, diagrams, vendor documentation, consultant documentation, and written materials. There is no consistently followed checklist, and outputs vary across documents, emails, slides, architecture repository entries, and meeting records. There is also a gap in documenting architecture decisions and making them retrievable later.

The desired capability is an AI-assisted architecture review system that can intake or access architecture artifacts, extract solution context, evaluate against standards and guiding principles, identify risks and missing evidence, detect duplicative or non-standard solutions, recommend approval posture, and produce standardized, explainable, evidence-backed review outputs. Final approval must remain human-controlled.

Key review criteria include technology alignment, integration patterns, security, compliance, data readiness, architecture guiding principles, AI guiding principles, enterprise architecture principles, approved technology stacks, reference architectures, security models, and whether an existing capability already solves the need.

The vendor demo showed a portal that accepts architecture submissions, processes multiple artifacts, evaluates across technology adherence, technical fit, security, infrastructure, and risk/compliance, and produces pass/gap counts, scores, findings, evidence, and remediation guidance. The vendor emphasized that the hard part is not the UI or agent, but curating and maintaining the knowledge foundation: standards, patterns, control libraries, semantic relationships, embeddings, technology catalogs, and institutional memory. Standards must be maintained live because they change frequently.

Internal concerns include maintainability of standards, need for a UI or governed process to update rules, support for different review levels such as fast path, experimental, and production review, ability to evaluate multiple documents, avoiding duplicate submission burden, integration with the existing submission portal/system of record, and potential future integration with architecture repositories, system catalogs, CMDB, Confluence-like documentation, SharePoint-like repositories, Teams/Copilot-style interfaces, and SSO.

The recommended v1 is a focused internal build for AI architecture review assistance. It should not be a full governance platform or autonomous approval engine. It should perform first-pass triage, identify missing evidence, check against a defined rubric, produce findings with evidence, recommend approval posture, and allow architects to review and override. The durable asset is the internally owned knowledge layer: standards, review rules, prior decisions, exception logic, and institutional memory. A vendor may help accelerate parts of the work, but outsourcing the full capability risks vendor lock-in around the most strategically important layer.
23. Bottom-line recommendation
Treat this as a knowledge-governance and architecture workflow problem first, and an AI product problem second.
The agent can only be as good as the review criteria, standards, evidence model, source-system access, and ownership model behind it. The vendor demo is useful proof that the pattern is viable. The organization should now determine whether it can define and govern the initial review rubric tightly enough to build an internal v1.
The most important immediate decisions are:
Pick the first review type.
Define the minimum review rubric.
Identify the authoritative source systems.
Decide who owns standards and rule maintenance.
Define the output format and human approval workflow.
Decide what success metrics will be used for the pilot.
Enterprise AI Distillation Agnostic
Company-agnostic enterprise AI and agentic platform pattern reference.
Enterprise GenAI & Agentic AI Architecture — Comprehensive Reference Template
Version: 2.0 — Company-Agnostic Template
Purpose: Reusable, machine- and human-readable reference for enterprise AI/Agentic AI architecture, approved platform stack, tool inventory, governance models, and domain capabilities.
Usage: Replace all [Company] placeholders with your organization name. Populate tool inventories, contacts, and internal URLs with organization-specific values.
1. Executive Summary
Enterprise AI strategy evolves from broad exploration to intentional application of GenAI and Agentic AI. This reference shifts focus from individual productivity tools to deploying GenAI and Agentic AI within business processes. The enterprise operates on a dual-cloud architecture with Microsoft as the primary orchestration platform and AWS as the approved runtime and infrastructure platform, connected through open standards (MCP, A2A, REST, Event Streaming).
Six architecture reference patterns document how this strategy materializes:
- Enterprise AI Marketecture — Core business domains on one AI foundation
- HR Digital Front Door (AWS) — Four-layer approved AWS agent framework
- HR Digital Front Door (Microsoft) — Five-step agentic flow on Microsoft stack
- Intelligent Contracting — CLM + Agentic Framework + Data Intelligence + Knowledge Graph
- Unified Platform — Shared data and shared agent patterns on Microsoft Cloud
- Unified Enterprise Agentic AI Architecture — Twelve-section enterprise-wide framework
2. Strategic Vision
AI & Data Vision Tiers
| Tier | Label | Notes |
|---|---|---|
| Foundation | AI and Data Foundations | Base layer |
| Tier 1 | Driving Operational Efficiency | |
| Tier 2 | Accelerating the Value Chain | Focus of business process AI deployment |
| Tier 3 | Embedding AI in Products | Create new value |
Evolved Approach
| Dimension | From | To |
|---|---|---|
| Focus | Broad exploration of GenAI/Agentic AI tools | Intentional application of GenAI/Agentic AI tools |
| Rationale | Impact meant democratizing access | With access guaranteed, impact is driven by focus and intent |
Unified Principles
- Secure by Design
- Scalable & Elastic
- Interoperable & Open
- Governed & Compliant
- Human-Centered
- Outcome-Driven
- Value-Focused
3. Enterprise Architecture Reference
3a. Enterprise AI Marketecture — Multi-Domain
One AI Foundation. Multiple Core Domains. Enterprise Value.
| Domain | Capabilities |
|---|---|
| Legal | AI Contract Review & Analysis · Legal Research & Precedent Search · Clause Extraction & Risk Detection · Obligation & Deadline Management · Matter Summarization & Reporting |
| Procurement | Intelligent Sourcing & Spend Analysis · Supplier Risk & Performance Insights · Contract Lifecycle Management · RFP Response Generation & Evaluation · Savings & Compliance Optimization |
| HR | AI-Powered Policy Assistant · Resume Screening & Candidate Match · Employee Self-Service & FAQ · Performance Insights & Analytics · Learning & Development Recommendations |
| Quality / Compliance / Safety | Quality Assurance Automation · Compliance Monitoring & Alerts · Audit Planning & Evidence Analysis · Corrective Action & CAPA Support · Document Control & Traceability |
Consumer Experiences: Web Portal · Microsoft Teams · Mobile App · Email / Outlook · Voice / Chat · Microsoft Copilot
Enterprise Integrations: SAP (ERP) · Workday (HRIS) · ServiceNow (ITSM) · Salesforce (CRM) · Oracle (ERP) · SharePoint (DMS/ECM) · Data Warehouses · Other LOB Systems
3b. AI Orchestration Control Plane
| Role | Provider |
|---|---|
| Primary Orchestration Platform | Microsoft |
| Approved Runtime & Orchestration Platform | AWS |
| Component | Description |
|---|---|
| Intent & Context Routing | Interpret intent, assemble context and route across channels and domains |
| Multi-Agent Coordination | Route and orchestrate agents across business domains and enterprise platforms |
| Agent / Tool / MCP Registry | Discover, catalog and govern agents, tools and MCP servers with versioning and ownership |
| Governed Action Execution | Tool gateway, action contracts, policy checks and safe, compliant execution |
| Human-in-the-Loop | Review, approval, exception handling and escalation workflows |
| Observability & Audit | End-to-end traces, decision logs, metrics, evidence and compliance reporting |
3c. Standard Protocols & Connectors
| Protocol | Full Name | Purpose |
|---|---|---|
| MCP | Model Context Protocol | Standardized tool & data access |
| A2A | Agent-to-Agent Protocol | Secure agent communication and interoperability |
| REST / APIs | RESTful APIs | Standard web service integration |
| Event Streaming | — | Real-time event-driven data flow |
3d. Data Foundation Layer
| Type | Sources |
|---|---|
| Structured Data | ERP, HRIS, Finance, etc. |
| Unstructured Data | Documents, Emails, Files |
| Formal / External Data | Market, Regulatory, Supplier |
Databricks (Analytics Lakehouse): Bronze (Raw) → Silver (Cleaned) → Gold (Business-ready)
Data Services: Governance & Data Quality · Catalog & Lineage · Sharing & Access · Data Products & KPIs
Neo4j (Knowledge Graph): Real-time relationships · Impact analysis · Network insights · AI-ready context
3e. HR Digital Front Door — AWS Agent Framework (4-Layer Stack)
Intelligent orchestration for HR. Securely connected to systems of record.
L1: Business Process Orchestration Layer
- Core: Process Orchestration · Policy & Guardrails · Context & Memory · Delegation to Agents · Audit & Logging
- Engines: Temporal · AWS Step Functions · Airflow / Prefect
- Examples: Leave Management · HR Case Resolution · Onboarding · Benefits Changes · Employee Inquiries
L2: Agent Logic & Coordination Frameworks
- Core: Reasoning Engine · Tool Selection · Planning & Task Decomposition · Multi-Agent Collaboration · Retries & Error Handling
- Frameworks: AWS Strands · LangGraph · CrewAI · AutoGen
- Patterns: ReAct / Plan & Act · Reflection · Chain of Thought · Tool Use · Multi-Agent Patterns
L3: Orchestration Layer / Agent Platform
- Components: Amazon Bedrock (FM Access) · AgentCore (Runtime) · Agent Registry & Catalog · Memory & Context · Tool Registry & Gateway · Guardrails & Safety · Observability & Monitoring
- Security: Identity & Access (IAM) · KMS Encryption · Secrets Manager · Prompt Guardrails
L4: Enterprise Infrastructure Layer
- Components: AWS Global Infrastructure · Compute (EC2/ECS/EKS/Lambda) · Storage (S3/EBS) · Network (VPC/PrivateLink/Transit Gateway) · Security (Shield/WAF/Security Groups) · Identity (Entra ID/IAM/SSO) · Integration Services · Enterprise Connectivity
Cross-Cutting: Observability · Audit & Compliance · Human-in-the-Loop · Prompt & Data Governance · Cost Monitoring · Continuous Improvement
Systems of Record: ServiceNow (HR Cases, Knowledge, Approvals, Workflows) · Workday (Employee Data, Benefits, Payroll, Absence & Time)
3f. HR Digital Front Door — Microsoft Agentic Stack
One Ask. Many Systems. One Experience.
| Step | Action | Microsoft Stack |
|---|---|---|
| 1 — Understand Intent | Copilot captures request in natural language | Teams · M365 Copilot · Copilot Studio · Entra ID |
| 2 — Decompose & Plan | Agent logic analyzes and creates a plan | Azure AI Foundry · Semantic Kernel · Azure OpenAI · Copilot Studio · Microsoft Graph |
| 3 — Orchestrate | Execute workflow with policy, approvals, guardrails | Power Automate · Durable Functions · Logic Apps · Power Apps · Dataverse · Purview |
| 4 — Execute Across Systems | Secure integrations perform work | Graph Connectors · Power Automate Connectors · Workday/ServiceNow Connectors · API Mgmt · Key Vault |
| 5 — Return Results | Summarize outcomes and share next steps | Copilot in Teams · Microsoft Graph · Azure Monitor · Log Analytics · Power BI · Viva Engage |
Platform Foundation: Azure AI Foundry · Azure OpenAI · Semantic Kernel · Copilot Studio · AI Search · Vector Store · Plugin & Tool Registry · Agent Orchestration & Runtime
Security & Governance: Entra ID · MFA · RBAC/ABAC · Purview DLP · Content Safety · Customer Lockbox · Key Vault · Audit Logs & SIEM · Data Residency · Retention · HITL Approvals
3g. Intelligent Contracting Platform
One platform. AI agents automate work. Connected data delivers trusted insights.
| # | Component | Role | Key Capabilities |
|---|---|---|---|
| 1 | CLM Platform | System of Engagement | Request → Draft → Negotiate → Execute → Store → Digitize; AI Review · Clause Library · Playbooks |
| 2 | Agentic Framework | System of Orchestration | AI Agents · Orchestration · Guardrails · Insights; Runtime: Memory · Tools · Observability · Security |
| 3 | Data Intelligence | System of Intelligence | Bronze/Silver/Gold medallion; Governance · Catalog & Lineage · Data Products & KPIs |
| 4 | Knowledge Graph | Contextual Intelligence | Real-time relationships · Impact analysis · Network insights · AI-ready context |
Integration: APIs · MCP Connectors · Events/Webhooks · Files & Documents · Agent-to-Agent
Business Outcomes: Faster cycle times · Lower risk · Greater savings · Better decisions · Stronger supplier relationships
3h. Unified Platform — Shared Data, Shared Agent Patterns
| Layer | Name | Principle | Key Components |
|---|---|---|---|
| 5 | Unified Experience | One experience across all channels | Reusable agent experiences · Microsoft Copilot · Teams · Web · Mobile |
| 4 | Shared Agent Platform | Build once, reuse everywhere | Agent Control Plane · Orchestration · Reusable Services · Integration |
| 3 | AI Ready Foundation | Same governed content foundation | Content Processing · Knowledge Services (Azure AI Search, Vector, Graph) · Agent Working Data |
| 2 | Canonical Sources | Single Source of Truth | Corporate content · Brand · Policy · SharePoint · Regulatory · News · Reports |
| — | Cross-Cutting Services | Platform-wide | Security · Identity · Compliance · Observability · Cost Mgmt · DevOps |
3i. Unified Enterprise Agentic AI Framework (12 Sections)
| # | Section | Key Components |
|---|---|---|
| 1 | Unified Experience Layer | Chat · Copilot · Guided Actions · Notifications · Domain Front Doors |
| 2 | Orchestration & Agent Layer | Agent Discovery/Routing/Lifecycle · Temporal · HITL · Reasoning · Memory |
| 3 | Integration & Data Layer | Enterprise Systems · API Gateway · MCP ↔︎ A2A · Data Management |
| 4 | Cloud & Infrastructure | Azure services · AWS services · Kubernetes · Serverless |
| 5 | Enterprise AI Control Plane | Cloud-Agnostic · Guardrails · Policy · Model/Tool Mgmt · Observability |
| 6 | Business Process & Workflow | Cross-functional orchestration · Lifecycle: Intake → Plan → Execute → Monitor → Optimize |
| 7 | Governance & Risk | Policies · RBAC · Privacy · AI Risk · Audit · Regulatory · Data Residency |
| 8 | Outcomes & Value | Time-to-value · Automation · Cost reduction · Experience · Compliance · Innovation |
| 9 | Cross-Cutting Capabilities | Architecture · FinOps · DevSecOps · Quality · Observability · Knowledge Mgmt |
| 10 | Ecosystem & Partnerships | ISVs · System Integrators · AI Model Providers · Data Providers · Consulting |
| 11 | AI & Technology Foundation | LLMs/FMs · Vector DBs · Data Platforms · MLOps · Automation · Connectivity |
| 12 | Enterprise Operating Model | People · Process · Technology · Culture · Portfolio Management |
Runtime Engines: CrewAI · Temporal · A2A Protocol
Standards: TOGAF · Zachman · NIST AI RMF · ISO 42001 · COBIT · ITIL · DMBOK
4. Approved Platform Inventory (Template)
4a. AI for All — Daily Productivity
| Tool | Description | Access | Cost |
|---|---|---|---|
| M365 Copilot Premium | GenAI chat for M365 tasks, content search/creation | All employees | Included |
| M365 Copilot Chat Only | GenAI chatbot for general querying | All employees & contractors | Included |
| Enterprise GenAI Chat | [Company]-specific GenAI chat with document upload | All employees & contractors | Included |
4b. Pre-Built Agents
| Agent | Function |
|---|---|
| Researcher Agent | Synthesizes research into structured, citation-based insights |
| Analyst Agent | Interprets data across dashboards, spreadsheets, reports |
| SharePoint Agents | Surfaces answers, summaries, actions from site content |
| Drafting Assistant | Template-based document creation |
| Knowledge Base Agent | Enterprise knowledge base interaction and retrieval |
4c. Vendor Agents
| Tool | Status | Capabilities |
|---|---|---|
| ServiceNow Now Assist | Enterprise Approved | Workflow automation, summarization, NLP, recommendations |
| Salesforce Agentforce | Per use case | Build/test/deploy agents, intelligent context |
4d. SDLC Tools
| Tool | Phases | Cost |
|---|---|---|
| GitHub Copilot Business | Discover, Design, Develop, Operate | $19/user/month |
| AI Test Management (e.g., Aurora) | Discover, Design, Test | Included |
| Jira AI | Discover | Pilot |
| AI Design Tool (e.g., Figma Make) | Discover, Design | Included |
| Spec-Driven Dev Agent (e.g., Kiro) | Develop | $20/user/month |
| Advanced Coding Agent (e.g., Claude Code) | Develop | Pay-per-token |
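The per-seat prices in the table lend themselves to a quick run-cost estimate. The sketch below is illustrative only: the two prices come from the table, while the function name `monthly_seat_cost` and the seat counts are hypothetical. Pay-per-token and "Included" tools are left out because the table gives no unit price for them.

```python
# Per-seat prices (USD per user per month) copied from the SDLC tools table.
SEAT_PRICE_USD_PER_MONTH = {
    "GitHub Copilot Business": 19,
    "Spec-Driven Dev Agent": 20,
}


def monthly_seat_cost(seats: dict[str, int]) -> int:
    """Return the total monthly cost in USD for the given seat counts."""
    total = 0
    for tool, count in seats.items():
        if tool not in SEAT_PRICE_USD_PER_MONTH:
            raise KeyError(f"No per-seat price listed for {tool!r}")
        if count < 0:
            raise ValueError("Seat counts must be non-negative")
        total += SEAT_PRICE_USD_PER_MONTH[tool] * count
    return total


# Hypothetical team: 50 Copilot seats and 10 spec-driven agent seats.
example = monthly_seat_cost(
    {"GitHub Copilot Business": 50, "Spec-Driven Dev Agent": 10}
)
print(example)  # 50 * 19 + 10 * 20 = 1150
```

Token-metered tools would need a separate usage-based estimate, which depends on figures this inventory does not state.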
4e. Data & Analytics Tools
| Tool | Capabilities |
|---|---|
| Databricks | GenAI for data lakes/lakehouses; medallion architecture |
| Snowflake | AI agents for processing, insight extraction, multi-agent orchestration |
4f. Image & Video Tools
| Tool | Type | Notes |
|---|---|---|
| M365 Copilot (Recommended) | Image Generation | GPT Image models, MS MAI-Image |
| AI Video Platform (e.g., Animaker, HeyGen) | Video Creation | Avatars, multi-language, animations |
4g. Function-Specific Tools (Customize for Your Org)
| Category | Tool Pattern | Description |
|---|---|---|
| R&D | Research Knowledge Platform | Hypothesis generation, insight retrieval |
| R&D | Regulatory Document QC | Claim validation against source documents |
| R&D | Regulatory Drafting Assistant | Agentic drafting for submissions |
| R&D | Document Processing Pipeline | Unstructured PDFs → schema-validated JSON |
| Procurement | AI Procurement Assistant | Routing to task-specific agents, E2E orchestration |
| Procurement | Guided Buying Assistant | Commodity/financial coding prediction |
| Procurement | Automated Approval Engine | AI-driven validation, low-risk bypass |
4h. Agentic Enablers (Build Platforms)
| Platform | Status | Capabilities |
|---|---|---|
| Enterprise Agent Framework | Available Now | Ecosystem/API/LLM connections; no-/low-code agents |
| MS Copilot Studio | Planned | M365 connectors; pro-code agents; publish agents |
| Agent Builder (in Copilot) | Planned | No-code/low-code for individual productivity |
5. Infrastructure Foundation
Microsoft Azure (Primary Cloud)
Azure OpenAI Service · Azure AI Search · Azure Machine Learning · Azure Functions · Data Lake Storage · Microsoft Fabric · Azure SQL DB · Key Vault · Entra ID · AI Foundry · Semantic Kernel · Copilot Studio · Cosmos DB · Data Factory · Synapse Analytics · AKS · App Service · Virtual Network · Monitor · Storage · Front Door · Power Automate · Durable Functions · Logic Apps · Power Apps · Dataverse · Purview · Microsoft Graph · Log Analytics · Power BI
AWS (Approved Runtime)
Amazon Bedrock · AgentCore · SageMaker · Lambda · S3 · RDS · KMS · CloudWatch · Redshift · Step Functions · EventBridge · EKS · EC2 · VPC · Security Hub · WAF · Backup · CloudFront · AWS Strands
Infrastructure & LLM Providers
| Provider | Role |
|---|---|
| Azure | Primary Cloud |
| AWS | Approved Runtime |
| OpenAI | LLM Provider |
| Anthropic (Claude) | LLM Provider |
| Google (Gemini) | LLM Provider |
| Meta (Llama) | Open-source LLM |
| Mistral | Open-source LLM |
| NVIDIA | GPU / Accelerated Compute |
6. Governance Framework
Portal: [Company AI Governance Portal URL]
No Review Required (ALL must be TRUE)
- Using only approved enterprise chat tools
- Use case is solely individual productivity
- Only low-sensitivity data in approved environments
- No GxP / regulated data
- Outputs do not affect others and do not support a business outcome
- Sufficient rights to use data as input
- No IP protection needed for output
Review Required
If ANY condition above is FALSE, submit to AI Governance.
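The all-must-be-true rule can be expressed as a simple checklist. This is a minimal sketch, not a governance tool: the class `UseCaseChecklist`, its field names, and the helper functions are hypothetical labels for the seven conditions listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UseCaseChecklist:
    """One boolean per 'No Review Required' condition (field names are illustrative)."""
    approved_chat_tools_only: bool
    individual_productivity_only: bool
    low_sensitivity_data_in_approved_env: bool
    no_gxp_or_regulated_data: bool
    outputs_do_not_affect_others: bool
    sufficient_data_rights: bool
    no_ip_protection_needed: bool


def failed_conditions(checklist: UseCaseChecklist) -> list[str]:
    """Return the names of every condition that is FALSE."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def review_required(checklist: UseCaseChecklist) -> bool:
    """Review is required as soon as ANY condition is FALSE."""
    return bool(failed_conditions(checklist))


# Hypothetical use case that touches regulated data.
case = UseCaseChecklist(True, True, True, False, True, True, True)
print(review_required(case))    # True
print(failed_conditions(case))  # ['no_gxp_or_regulated_data']
```

Returning the failed condition names, rather than only a yes/no answer, gives the submitter something concrete to cite at intake.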
Process A: Standard Review — ~2 weeks
- Intake via governance portal
- Business review — alignment, prioritization, funding
- Sector-aligned functional review — auto-routed
- Issue resolution
- Outcome & decision
Process B: Fast Track (POC) — <1 day
Intake → Direct outcome. Low-risk, experimental, POC only.
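The choice between the two processes can be sketched as a small routing function. The names `ReviewPath` and `select_review_path` are hypothetical, and the rule that Fast Track needs a use case to be both low-risk and an experimental POC is an assumption drawn from the one-line description above, not a confirmed policy.

```python
from enum import Enum


class ReviewPath(Enum):
    NO_REVIEW = "No review required"
    FAST_TRACK = "Process B: Fast Track (POC)"
    STANDARD = "Process A: Standard Review"


def select_review_path(
    review_required: bool, low_risk: bool, experimental_poc: bool
) -> ReviewPath:
    """Pick a governance path for a use case.

    Assumption: Fast Track applies only when the use case is BOTH
    low-risk AND an experimental POC; everything else that needs
    review goes through the standard process.
    """
    if not review_required:
        return ReviewPath.NO_REVIEW
    if low_risk and experimental_poc:
        return ReviewPath.FAST_TRACK
    return ReviewPath.STANDARD


# Hypothetical low-risk proof of concept that still needs review.
print(select_review_path(True, True, True).value)   # Process B: Fast Track (POC)
# Hypothetical production workflow that needs review.
print(select_review_path(True, True, False).value)  # Process A: Standard Review
```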
7. Use Case Prioritization
Dimension 1: Business Impact
- Measurable improvement tied to AI strategy?
- Automates recurring tasks for multiple teams?
- Standardizes outputs? Accelerates broad workflows? Unlocks capacity?
- Has business sponsorship for high-priority unmet need?
Dimension 2: Implementation Potential
- Risk: Regulated decisions? Sensitive data? Unapproved environments? Compliance impact?
- Readiness: Data AI-ready? I/O defined? Process ownership clear? APIs approved? Maintenance owner?
| Quadrant | Impact | Potential | Action |
|---|---|---|---|
| High Priority | High | High | Submit directly to governance portal |
| Medium Priority | High | Low | Consult tech partners on risk mitigation |
| Low Priority | Low | High | Refine with business leaders |
| Deprioritize | Low | Low | Consult tech partners on alternatives |
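The quadrant table is effectively a lookup on two High/Low ratings. A minimal sketch follows, with the function name `prioritize` as a hypothetical. How a team arrives at each High or Low rating is outside what the table defines, so the function takes the ratings as given.

```python
# (impact, potential) -> (quadrant, action), copied from the quadrant table.
QUADRANTS = {
    ("High", "High"): ("High Priority", "Submit directly to governance portal"),
    ("High", "Low"): ("Medium Priority", "Consult tech partners on risk mitigation"),
    ("Low", "High"): ("Low Priority", "Refine with business leaders"),
    ("Low", "Low"): ("Deprioritize", "Consult tech partners on alternatives"),
}


def prioritize(impact: str, potential: str) -> tuple[str, str]:
    """Return (quadrant, action) for a Business Impact / Implementation Potential pair."""
    key = (impact.strip().capitalize(), potential.strip().capitalize())
    if key not in QUADRANTS:
        raise ValueError("impact and potential must each be 'High' or 'Low'")
    return QUADRANTS[key]


quadrant, action = prioritize("high", "low")
print(quadrant)  # Medium Priority
print(action)    # Consult tech partners on risk mitigation
```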
8. Agent Key Concepts
| # | Concept | Definition | Note |
|---|---|---|---|
| 1 | Autonomy / Agency | Extent agent acts without human input | Most enterprises start with limited autonomy |
| 2 | Planning / Task Decomposition | Breaking goals into sequential tasks | Enables traceability |
| 3 | Memory | Short-term (within task) and long-term (across tasks) | Long-term increases data exposure |
| 4 | MCP / Connectors | Standard connections to tools, data, systems | Required for enterprise data access |
| 5 | Registration | Making agents visible and usable | Required for agent discovery |
| 6 | Guardrails | Rules defining allowed actions | Prevents unauthorized actions |
| 7 | Observability | Logs, steps, decisions | Compliance and troubleshooting |
| 8 | Orchestration | Coordinates tools, workflows, models | Required for multi-step workflows |
| 9 | Retrieval | How AI finds relevant information | Ensures right sources vs. guessing |
| 10 | RAG | Retrieval-augmented generation | Grounds answers in trusted data |
| 11 | Human-in-the-Loop | Required human approvals | Ensures oversight |
| 12 | Execution Environment | Where agent performs actions | Run close to data on approved cloud |
9. Problem Archetypes
| # | Archetype | Use Case | Impact |
|---|---|---|---|
| 1 | Interpreting complex, unstructured inputs | Text, code, logs, papers, contracts | Extract insights for decisions |
| 2 | Generating, evaluating, comparing options | Hypotheses, scenarios, recommendations | Automate analytical work with HITL |
| 3 | Orchestrating across tools, systems, people | Routing, delegating, handoffs | Automate coordination |
| 4 | Improving frequently executed workflows | Repetitive tasks, high-volume processes | Compound gains at scale |
| 5 | Accelerating first-pass / preparatory work | Drafting, analysis before human review | Speed up with oversight |
| 6 | Exploration of unfamiliar areas | New domains, fragmented information | Accelerate sense-making |
10. Roadmap (Template)
| Item | Description |
|---|---|
| Agent Builder Rollout | No-code agent building for all Copilot Premium users |
| Technical AI Playbook | Legal, risk, compliance deep-dive for builders |
| AI SDLC Playbook | AI-assisted software development guidance |
| Governance Updates | Fast Track expansion, agent-guided experience |
| Platform Expansion | MCP gateway, skills/agent catalog for reuse |
| Pro-Code Agent Platform | Pro-code agent building with M365 connectors |
11. Architectural Standards & Frameworks
- TOGAF
- Zachman
- NIST AI RMF
- ISO 42001
- COBIT
- ITIL
- DMBOK
Appendix: Tool Taxonomy (Three-Tier Model)
| Category | Target | Use Cases | Access |
|---|---|---|---|
| A. AI for All | All employees | Individual productivity | No additional approval |
| B. Pre-configured AI | Business / technical users | Domain-specific workflows | Platform access process |
| C. Custom Built AI | Business / technical users | Unaddressed use cases | AI governance approval |
Appendix: Agent Engagement Model
| Mode | What | Approval |
|---|---|---|
| USE | Pre-built agents approved for enterprise | None or per use case |
| BUILD | Custom agents for individual productivity | None for a small group; governance approval for wider use |
| REQUEST | Custom/pre-built for business processes | All require governance approval |
ONE UNIFIED AGENTIC FOUNDATION. ENDLESS BUSINESS POSSIBILITIES.