AAB Assessment Guide v0.2

Vordan Instrument Documentation

Agentic Accountability Baseline: Assessment Methodology and Output Standards

Version: v0.2

Applies To: AAB v0.2

Issued: June 2026

Status: Active

Maintained By: Vordan

Contents

Purpose and Scope
The Two Assessment Methodologies
The Eight Conditions
Evidence Standards: Public Record Methodology
Evidence Standards: Direct Access Methodology
Condition Assessment Instructions
Finding Classification
Gap Score Construction and Interpretation
Assessment Output Requirements
Certified Assessor Program

Section 1

Purpose and Scope

This guide defines the methodology, evidence standards, and output requirements for assessments conducted under the Agentic Accountability Baseline (AAB). It is the operational reference for certified Vordan assessors and the public record of how AAB assessments are produced.

The AAB is Vordan's standard for evaluating whether an organization's agentic AI deployment is accountable by design. Accountability by design means that the conditions necessary to trace, verify, and assign responsibility for agent actions were built into the deployment before it operated, not added after a failure occurred. The AAB evaluates eight conditions that together constitute a complete accountability architecture for agentic AI.

This guide covers both AAB assessment methodologies: the Public Record methodology (AAB-PR), in which the assessor evaluates an organization's accountability posture using only information available in the public record, and the Direct Access methodology (AAB-DA), in which the organization under assessment provides evidence directly to the assessor. The two methodologies share the same eight conditions and the same evidentiary standards. They differ in how evidence is sourced, how absence is interpreted, and what findings can legitimately claim.

Scope of Application

The AAB applies to any organization operating an agentic AI deployment: a deployment in which one or more AI models take actions, make decisions, or produce outputs that affect external systems, data, or people, with reduced or absent human review at the point of action. The standard is not limited by sector, organizational size, or jurisdiction. Any agentic deployment, whether operated by a regulated financial institution or an early-stage technology company, is in scope if it meets the operational definition above.

The AAB does not apply to AI systems that operate solely as inference endpoints responding to human-initiated queries without taking downstream actions. An AI model that returns a response for a human to act on is not agentic for purposes of this standard. An AI model that dispatches an action, calls an external service, modifies a record, or produces an output that another system consumes without human review at that step is agentic and is in scope.

This guide does not constitute the AAB standard itself. The standard is published at vordan.co/baseline. This guide defines how assessors apply that standard. In the event of conflict between this guide and the standard, the standard governs.

Section 2

The Two Assessment Methodologies

The AAB can be applied through two distinct methodologies. The choice of methodology is not a matter of assessor preference. It is determined by the assessment context: whether the organization under assessment is participating in the assessment or not.

Public Record Methodology (AAB-PR)

The Public Record methodology applies the AAB's eight conditions entirely through evidence available in the public record. The organization under assessment does not provide evidence, does not participate in the assessment, and may not be aware the assessment is underway. The assessor evaluates what the organization has disclosed, published, or reported, and what has been independently documented about its agentic deployments.

Public record sources include regulatory filings, press releases, published policies and frameworks, media coverage, incident reports, third-party forensic findings, legal proceedings, government disclosures, and any other information legitimately available without organizational access. The assessor documents the source, date, and relevance of each evidentiary item and evaluates each condition against what the public record contains or fails to contain.

Under AAB-PR, absence of public evidence of compliance is treated as a gap finding. This principle follows from the AAB standard's requirement that organizations be able to demonstrate compliance. An organization that operates an agentic deployment and produces no public evidence that any accountability condition is satisfied has not demonstrated compliance. Absence of evidence is not the same as evidence of absence, but for the purposes of institutional accountability assessment, an organization that cannot demonstrate that a condition is satisfied has not satisfied it in any meaningful public sense.

AAB-PR assessments are designated with the suffix PR and a sequential number: AAB-PR-001, AAB-PR-002, and so forth.

Direct Access Methodology (AAB-DA)

The Direct Access methodology applies the AAB's eight conditions through evidence provided by the organization under assessment. The organization participates in the assessment, either voluntarily or as a condition of a contractual, regulatory, or certification requirement. The assessor requests evidence against each condition's three evidence tiers and evaluates what the organization produces.

Direct access sources include internal policy documents, system architecture records, audit logs, authorization records, deployment documentation, incident response records, and any other evidence the organization produces in response to the assessor's requests. The assessor evaluates both the completeness and the quality of the evidence produced.

Under AAB-DA, the organization bears the burden of demonstrating compliance. Failure to produce evidence for a condition is a gap finding. Producing evidence that does not satisfy the relevant tier requirements is a finding at the appropriate severity level. The assessor does not supplement organizational evidence with public record research, though publicly available information may be used to corroborate or challenge organizational claims.

AAB-DA assessments are designated with the suffix DA and a sequential number: AAB-DA-001, AAB-DA-002, and so forth.
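The designation scheme described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the three-digit zero padding is inferred from the examples AAB-PR-001 and AAB-DA-002 rather than stated as a rule in this guide.

```python
import re

# Pattern inferred from the examples in this guide (assumption: at least
# three digits; behavior beyond 999 is not specified by the guide).
DESIGNATION_RE = re.compile(r"^AAB-(PR|DA)-(\d{3,})$")


def format_designation(methodology: str, sequence: int) -> str:
    """Build a designation such as AAB-PR-001 from a methodology suffix and number."""
    if methodology not in ("PR", "DA"):
        raise ValueError("methodology must be 'PR' or 'DA'")
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    return f"AAB-{methodology}-{sequence:03d}"


def parse_designation(designation: str) -> tuple[str, int]:
    """Split a designation back into its methodology suffix and sequence number."""
    match = DESIGNATION_RE.match(designation)
    if match is None:
        raise ValueError(f"not an AAB designation: {designation!r}")
    return match.group(1), int(match.group(2))
```

Keeping the methodology suffix as a parsed field makes it straightforward to check that a published finding carries the designation matching the evidence sourcing actually used.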

The Critical Distinction

The methodological difference between AAB-PR and AAB-DA is not merely procedural. It determines what a finding means and what it can legitimately claim.

An AAB-PR finding of Not Satisfied means the organization has produced no public evidence of satisfying that condition. It does not mean the organization has no internal evidence of compliance. An organization may have robust internal accountability documentation and still score Not Satisfied on every AAB-PR condition because none of that documentation is publicly available. The AAB-PR finding is a statement about public accountability posture, not necessarily about internal practice.

An AAB-DA finding of Not Satisfied means the organization was given the opportunity to produce evidence of compliance and failed to do so. This is a stronger claim. It says the organization could not demonstrate compliance even when asked directly. The gap exists in practice, not merely in public posture.

An assessor who conflates these two findings has made a methodological error. Published AAB-PR assessments must explicitly state that findings are limited to the public record. Published AAB-DA assessments must explicitly state the scope of organizational access and the period of assessment. The methodology designation is not administrative labeling. It is the interpretive frame through which any reader should understand what the findings mean.

An assessor may not publish an AAB-DA finding based on public record evidence, or an AAB-PR finding characterized as equivalent to a direct access result. The methodology designation must match the evidence sourcing methodology actually used. Misrepresentation of methodology is grounds for decertification.

Section 3

The Eight Conditions

The AAB evaluates eight conditions. Together they constitute the minimum accountability architecture for a defensible agentic AI deployment. Each condition addresses a distinct accountability requirement. No condition is redundant with another. A gap in any single condition represents a structural failure in the accountability architecture, regardless of how well the remaining conditions are satisfied.

Condition	Name	Default Severity of Structural Gap
2.1	Authorization Provenance	Critical
2.2	Scope Integrity	High
2.3	Data Provenance	High
2.4	Handoff Traceability	Critical
2.5	Intervention Authority	High
2.6	Outcome Attribution	High
2.7	Forensic Reconstructibility	Critical
2.8	Model Substrate Integrity	Critical

The four Critical conditions (2.1, 2.4, 2.7, 2.8) represent the accountability spine. A structural gap in any of these four conditions means at least one of the following: the deployment was never authorized through a traceable human decision, actions taken by the agent cannot be connected to that authorization, the sequence of events cannot be reconstructed after the fact, or the model executing actions cannot be verified as the model that was authorized. Any one of these failures renders the entire accountability architecture unreliable, regardless of the state of the remaining conditions.

The four High conditions (2.2, 2.3, 2.5, 2.6) are not less important. A deployment with perfect Critical condition compliance and structural gaps in all four High conditions still has a seriously deficient accountability architecture. The severity designation reflects the immediacy of the accountability failure, not its ultimate significance.
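For assessors who track findings in tooling, the condition table above could be represented as a small lookup structure. The sketch below is illustrative: the class and function names are hypothetical, while the condition numbers, names, and default severities are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    number: str
    name: str
    default_severity: str  # default severity of a structural gap


# Values copied from the condition table in this section.
CONDITIONS = (
    Condition("2.1", "Authorization Provenance", "Critical"),
    Condition("2.2", "Scope Integrity", "High"),
    Condition("2.3", "Data Provenance", "High"),
    Condition("2.4", "Handoff Traceability", "Critical"),
    Condition("2.5", "Intervention Authority", "High"),
    Condition("2.6", "Outcome Attribution", "High"),
    Condition("2.7", "Forensic Reconstructibility", "Critical"),
    Condition("2.8", "Model Substrate Integrity", "Critical"),
)


def conditions_with_severity(severity: str) -> list[str]:
    """Return the condition numbers whose default severity matches."""
    return [c.number for c in CONDITIONS if c.default_severity == severity]
```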

Section 4

Evidence Standards: Public Record Methodology

The AAB standard defines three evidence tiers for each condition. Under the Public Record methodology, the assessor evaluates what the public record contains at each tier. The assessor does not contact the organization, does not request documents, and does not supplement public evidence with inference or assumption about internal practices.

The Three Evidence Tiers Under AAB-PR

Tier 1: Must Exist

Under direct access assessment, Tier 1 requires that a documented record of the relevant accountability element exists within the organization. Under public record assessment, the assessor evaluates whether any public evidence of the existence of such a record is available. This may be a published policy, a regulatory disclosure, a public statement by an organizational representative, a third-party audit report that references the record's existence, or a document produced in litigation or regulatory proceedings.

If no public evidence of the existence of a Tier 1 element can be located, the tier is not satisfied and the condition finding is Not Satisfied, with a gap type of Structural.

Tier 2: Must Be Demonstrable on Demand

Under direct access assessment, Tier 2 requires that the organization can produce the relevant evidence when requested. Under public record assessment, this tier evaluates whether the organization has demonstrated the relevant accountability element in any public context: in response to a media inquiry, in a published incident review, in a regulatory filing, or in any public-facing disclosure. An organization that has never been asked to demonstrate a condition and has never voluntarily demonstrated it cannot be found to satisfy Tier 2 under AAB-PR.

If no public evidence of demonstration exists, the tier is not satisfied. This does not require that the organization has been asked and refused. The absence of any public demonstration is itself the finding.

Tier 3: Must Be Independently Verifiable

Under direct access assessment, Tier 3 requires that evidence can be verified by a party other than the organization. Under public record assessment, this is the most directly evaluable tier: the assessor asks whether any independent party has verified, corroborated, or documented the relevant accountability element. Third-party audit reports, forensic investigations, regulatory findings, journalistic investigations using primary sources, and academic research that independently examines the relevant deployment are all relevant at Tier 3.

An organization's own public statements about its compliance do not satisfy Tier 3. Self-attestation is not independent verification.

Evidence Quality Standards

Not all public evidence carries equal weight. The assessor evaluates evidence quality on three dimensions: source reliability, recency, and specificity.

Source reliability distinguishes primary sources (official organizational disclosures, regulatory filings, direct statements by organizational representatives, forensic reports from named researchers or firms) from secondary sources (journalism, aggregated reporting, commentary). Primary sources are preferred. Secondary sources may be used where they document primary source material accurately and the primary source is identified.

Recency requires that evidence reflects the state of the deployment during the assessment period. Evidence from prior deployments, prior organizational structures, or prior policy regimes does not satisfy conditions for the deployment under assessment unless continuity can be established.

Specificity requires that evidence relate to the specific deployment being assessed, not to the organization's general AI governance posture. A published AI ethics framework does not satisfy Condition 2.1 for a specific deployment unless the framework document specifically addresses the authorization process for that deployment or for deployments of that type.

The assessor's job under AAB-PR is to document what the public record contains, not to theorize about what the organization probably does internally. Analytical commentary on what a gap implies is appropriate in the Vordan Position section of a published assessment. It is not appropriate in the condition findings. Findings state what the evidence shows or fails to show.

Documenting Sources

Every evidentiary item used in an AAB-PR assessment must be documented with: the source name and publication or organization, the date of publication or issuance, the specific URL or filing reference, and a description of what the item evidences and for which condition. Sources must be verified as accessible at the time of assessment. The assessor may not cite sources they have not read. The assessor may not cite sources that sit behind authentication walls or that exist only in proprietary databases or organizational intranets. Every source cited in a public-record assessment must be independently accessible to any reader of the published assessment.
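The documentation requirements above lend themselves to a simple completeness check. The following is a minimal sketch, assuming a hypothetical `SourceRecord` structure; the field names and the wording of the problem messages are illustrative and are not defined by the AAB standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

VALID_CONDITIONS = {"2.1", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8"}


@dataclass
class SourceRecord:
    source_name: str               # source name and publication or organization
    published_on: Optional[date]   # date of publication or issuance
    reference: str                 # specific URL or filing reference
    evidences: str                 # what the item evidences
    condition: str                 # which condition it is cited for
    read_by_assessor: bool         # the assessor has read the source
    publicly_accessible: bool      # verified accessible without authentication


def documentation_problems(record: SourceRecord) -> list[str]:
    """List every documentation requirement the record fails to meet."""
    problems = []
    if not record.source_name.strip():
        problems.append("source name missing")
    if record.published_on is None:
        problems.append("publication date missing")
    if not record.reference.strip():
        problems.append("URL or filing reference missing")
    if not record.evidences.strip():
        problems.append("description of what the item evidences missing")
    if record.condition not in VALID_CONDITIONS:
        problems.append("condition not identified")
    if not record.read_by_assessor:
        problems.append("source not read by assessor")
    if not record.publicly_accessible:
        problems.append("source not independently accessible")
    return problems
```

An empty list means the item meets the documentation requirements listed in this section; it says nothing about the item's evidentiary weight, which is assessed separately under the quality standards above.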

Section 5

Evidence Standards: Direct Access Methodology

Under the Direct Access methodology, the organization under assessment provides evidence in response to the assessor's requests. The assessor defines the scope of the assessment, specifies what evidence is required at each tier for each condition, and evaluates the completeness and quality of what the organization produces. The organization bears the burden of demonstration.

The Three Evidence Tiers Under AAB-DA

Tier 1: Must Exist

The organization must produce the relevant documented record. For Condition 2.1, this means a human authorization record for the deployment. For Condition 2.4, this means a handoff log. For Condition 2.7, this means a forensic record sufficient to reconstruct the deployment's actions. The record must exist as a genuine organizational document, not one created in response to the assessment request. Contemporaneous documentation is the standard. Post-hoc reconstruction presented as existing documentation is a finding.

Tier 2: Must Be Demonstrable on Demand

The organization must demonstrate the relevant accountability element in real time or through documented demonstration. For Condition 2.5, this means the assessor can observe or receive documented evidence that an authorized party can halt the deployment's operations through a defined mechanism. Describing the mechanism in policy documents does not satisfy Tier 2. The mechanism must be demonstrable as operational.

Tier 3: Must Be Independently Verifiable

The evidence must be verifiable by a party other than the organization. Under direct access assessment, the assessor fulfills the independent verifier role by examining the evidence directly. For technical claims (log integrity, model identity verification, deployment scope controls), the assessor may require technical demonstration or engage a technical specialist to verify. Organizational assertions about evidence quality do not satisfy Tier 3. The assessor verifies.

Evidence Request Protocol

The assessor issues a formal evidence request to the organization at the start of the assessment period. The request specifies the eight conditions, the three evidence tiers for each, the format in which evidence should be provided, and the deadline for production. The assessment period for a direct access assessment should not be less than 30 days from the date of the evidence request to allow the organization reasonable opportunity to produce documentation.

Failure to produce evidence within the assessment period is documented as an absence finding. The assessor does not extend the assessment period based on organizational claims that documentation is being prepared. If documentation does not exist at the time of the evidence request, it does not satisfy the assessment. The standard requires that accountability documentation exist before assessment, not that it be created for assessment.
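The timing rules in the evidence request protocol can be expressed as date arithmetic. The sketch below uses hypothetical function names. Treating a document created on the same calendar day as the request as not pre-existing is an assumption made here for strictness; the guide does not address that edge case.

```python
from datetime import date, timedelta

# Minimum assessment period stated in this guide.
MIN_ASSESSMENT_PERIOD = timedelta(days=30)


def earliest_deadline(request_date: date) -> date:
    """Earliest production deadline that respects the 30-day minimum."""
    return request_date + MIN_ASSESSMENT_PERIOD


def deadline_is_acceptable(request_date: date, deadline: date) -> bool:
    """True if the deadline leaves at least 30 days from the evidence request."""
    return deadline - request_date >= MIN_ASSESSMENT_PERIOD


def existed_before_request(created_on: date, request_date: date) -> bool:
    """True if the document predates the evidence request (same day counts as no)."""
    return created_on < request_date
```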

Evaluating Evidence Quality Under AAB-DA

The assessor evaluates evidence against four quality criteria. Completeness asks whether the evidence addresses the full scope of the condition or only a subset of relevant deployments or contexts. Contemporaneity asks whether documentation was created at the time the relevant event or decision occurred, or was reconstructed after the fact. Consistency asks whether the evidence is internally consistent and consistent with other evidence produced in the assessment. Independence asks whether the evidence can be verified by reference to systems or records outside the organization's exclusive control, such as third-party logs, external audit records, or cryptographic verification artifacts.

Under AAB-DA, an organization that produces voluminous documentation that does not satisfy the specific tier requirements for a condition has not satisfied that condition. Volume is not compliance. The assessor evaluates whether the evidence produced satisfies the specific requirement, not whether the organization appears to have a serious governance program in general.

Right of Response

Before a direct access assessment is finalized, the organization under assessment receives a draft findings report and has 14 days to submit factual corrections and additional evidence. The right of response does not delay publication once the response period has closed. Additional evidence submitted during the response period is evaluated against the same tier standards as evidence produced during the assessment period. Evidence submitted after the response period has closed is not incorporated into the published assessment but may be noted in a subsequent revision if it is material.
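The response window described above reduces to a single date comparison. This is a sketch with hypothetical names; counting the fourteenth day as still inside the window is an assumption, since the guide does not say whether the deadline is inclusive.

```python
from datetime import date, timedelta

# Response period stated in this guide.
RESPONSE_WINDOW = timedelta(days=14)


def response_deadline(draft_issued: date) -> date:
    """Last day on which corrections or additional evidence are accepted."""
    return draft_issued + RESPONSE_WINDOW


def handle_submission(draft_issued: date, submitted_on: date) -> str:
    """Describe how a submission is treated, based on when it arrives."""
    if submitted_on <= response_deadline(draft_issued):
        return "evaluate against tier standards"
    return "not incorporated; may be noted in a later revision if material"
```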

Section 6

Condition Assessment Instructions

This section provides condition-by-condition guidance for assessors. For each condition, the instructions cover the core accountability question the condition addresses, what the assessor is looking for under each methodology, and what distinguishes a satisfied finding from a gap finding. The instructions apply to both methodologies unless stated otherwise.

2.1 Authorization Provenance (Critical)

Core question: Is there a traceable human decision authorizing this specific deployment?

Authorization Provenance addresses whether a responsible human made a documented decision to deploy the agent, with a defined scope, before the agent operated. This is the foundational accountability condition. Without a traceable authorization decision, no other accountability condition can be anchored to a human principal. The accountability chain has no origin point.

The assessor looks for evidence of a specific human authorization act, not a general AI deployment policy. A policy that states "AI tools require approval before deployment" satisfies a general governance requirement. It does not satisfy Condition 2.1 unless there is also a specific record showing that this deployment was approved by a named role at a specific time, with a defined scope, and with a defined expiration or review condition.

AAB-PR: Look For

Public evidence of a human authorization process for the specific deployment: spokesperson statements about review processes, published deployment governance documentation that addresses individual deployment authorization, regulatory disclosures naming the authorizing function, or incident investigations that reference the presence or absence of an authorization record.

AAB-DA: Request

The authorization record for this deployment, containing: the authorizing role and individual, the timestamp of authorization, the defined deployment scope, the permitted data sources, and the expiration or review condition. Supporting governance documentation showing the authorization process this record was produced under.

A gap finding is warranted when: under AAB-PR, no public evidence of a deployment-specific authorization record or authorization process exists; under AAB-DA, the organization cannot produce a contemporaneous authorization record for this deployment. Gap type: Structural. Severity: Critical.
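The contents of the authorization record requested under AAB-DA can be sketched as a data structure with a completeness check. The names below are hypothetical and the structure is illustrative; only the list of required elements, and the requirement that authorization precede operation, come from this guide.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class AuthorizationRecord:
    authorizing_role: str
    authorizing_individual: str
    authorized_at: Optional[datetime]
    deployment_scope: str
    permitted_data_sources: list[str] = field(default_factory=list)
    expiration_or_review_condition: str = ""


def record_problems(record: AuthorizationRecord, first_operation_at: datetime) -> list[str]:
    """List missing elements and flag authorization that did not precede operation."""
    problems = []
    for name in ("authorizing_role", "authorizing_individual",
                 "deployment_scope", "expiration_or_review_condition"):
        if not getattr(record, name).strip():
            problems.append(f"missing {name}")
    if not record.permitted_data_sources:
        problems.append("missing permitted_data_sources")
    if record.authorized_at is None:
        problems.append("missing authorized_at")
    elif record.authorized_at >= first_operation_at:
        # The guide requires the decision to be made before the agent operated.
        problems.append("authorization does not precede first operation")
    return problems
```

A check like this confirms only that the record is complete on its face. Whether the record is contemporaneous and genuine remains a judgment for the assessor.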

2.2 Scope Integrity (High)

Core question: Did the agent operate within the boundaries defined at authorization?

Scope Integrity evaluates whether the deployment's actual operational boundaries matched the boundaries defined in the authorization. An authorized agent that expands its own operational scope, accesses data sources not defined at authorization, or takes actions outside its defined parameters has violated Scope Integrity regardless of whether the expanded actions were harmful. The accountability question is not whether the scope expansion caused harm. It is whether the agent operated within the boundaries a human defined.

The assessor looks for evidence of defined scope boundaries at authorization and evidence that the deployment operated within them. Under AAB-PR, this frequently emerges in incident investigations, where scope violations are documented after a failure. The absence of any scope boundary documentation is itself a gap finding: an agent operating without a defined scope has no basis for scope integrity evaluation.

AAB-PR: Look For

Published deployment documentation that defines scope. Incident reports or forensic investigations that describe what the agent accessed or acted on. Any public evidence of scope controls, output review processes, or scope violation detection mechanisms.

AAB-DA: Request

The scope definition from the authorization record. Technical documentation of scope enforcement controls. Logs showing what data sources the agent accessed and what actions it took during the assessment period. Any records of scope boundary alerts or violations.

A gap finding is warranted when: under AAB-PR, no scope definition exists in the public record and no evidence of scope controls is available; under AAB-DA, the organization cannot demonstrate that scope boundaries were defined and enforced. Gap type: Structural if no scope definition exists; Procedural if scope is defined but not enforced. Severity: High.
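The gap-type rule for Condition 2.2 is an ordered decision: the existence of a scope definition is checked first, and enforcement is considered only if a definition exists. A minimal sketch follows; the function name and boolean inputs are hypothetical simplifications of what is, in practice, an evidentiary judgment.

```python
from typing import Optional


def scope_integrity_gap(scope_defined: bool, scope_enforced: bool) -> Optional[str]:
    """Return the gap type for Condition 2.2, or None if no gap is indicated."""
    if not scope_defined:
        # Without a defined scope there is no basis for evaluating enforcement.
        return "Structural"
    if not scope_enforced:
        return "Procedural"
    return None
```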

2.3 Data Provenance (High)

Core question: Can every input to the agent's decisions be traced to a known, authorized source?

Data Provenance evaluates whether the sources of information that influenced the agent's actions are documented and authorized. An agent that produces outputs or takes actions based on inputs from unknown, unauthorized, or unverifiable sources has a data provenance gap regardless of whether the outputs were correct. The accountability requirement is that every input influencing an agent's decision or action can be traced to a specific, authorized source.

This condition is particularly significant for research and synthesis agents, where the agent draws on external sources that may not be verified or may change over time. It is also significant for agents that consume data from internal systems, where authorization for the specific data access may not have been explicitly defined.

AAB-PR: Look For

Published documentation of data sources used by the deployment. Forensic investigations that examine what sources the agent drew on. Incident reports that identify unauthorized or unverified data inputs as a contributing factor. Any public evidence of data source authorization processes.

AAB-DA: Request

The list of authorized data sources defined at authorization. Technical documentation of data access controls. Logs showing what sources the agent accessed during the assessment period. Evidence that accessed sources match the authorized list.

A gap finding is warranted when: under AAB-PR, no documentation of authorized data sources exists and no independent verification of data sourcing is available; under AAB-DA, the organization cannot trace agent inputs to authorized sources. Gap type: Structural if no data authorization process exists; Technical if controls exist but data is accessed outside authorization. Severity: High.
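Checking that accessed sources match the authorized list is, at its simplest, a set-membership comparison between the authorization record and the access logs. The sketch below assumes both are available as lists of source identifiers; the function name and the identifiers in the usage example are hypothetical.

```python
def unauthorized_sources(authorized: set[str], accessed: list[str]) -> list[str]:
    """Return accessed sources absent from the authorized list, in first-seen order."""
    seen: set[str] = set()
    result = []
    for source in accessed:
        if source not in authorized and source not in seen:
            seen.add(source)
            result.append(source)
    return result


# Hypothetical example: one source in the access log was never authorized.
flagged = unauthorized_sources(
    {"crm", "billing"},
    ["crm", "public-web", "billing", "public-web"],
)
print(flagged)
```

A non-empty result where access controls exist corresponds to the Technical gap type described above. An empty result shows only that the logged accesses match the list, so it depends on the logs themselves being complete.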

2.4 Handoff Traceability (Critical)

Core question: Is every transfer of control between agents or between agent and human logged with full context?

Handoff Traceability addresses the accountability requirement at the boundaries of agent action. When an agent passes control, data, or a task to another agent, to a human, or to an external system, that transfer must be logged with enough context to reconstruct what was transferred, from whom, to whom, at what time, and under what authority. Without handoff traceability, the accountability chain breaks at every boundary. Individual agent components may be internally accountable while the system as a whole is not.

This condition is specifically relevant to multi-agent systems, where chains of agent handoffs may be long and where no single component has visibility into the full chain. The accountability requirement is system-level, not component-level. A handoff log that covers only the final agent in a chain does not satisfy Condition 2.4 for the full deployment.

AAB-PR: Look For

Public documentation of logging architecture for agent-to-agent or agent-to-human handoffs. Incident investigations that document the presence or absence of handoff records when tracing a failure. Architectural documentation that describes how handoffs are captured and stored.

AAB-DA: Request

Handoff logs for the deployment during the assessment period. Architectural documentation showing how handoffs are captured across the full system. Evidence that handoff logs are tamper-evident and retained for a defined period. A sample reconstruction of a complete handoff chain from logs.

A gap finding is warranted when: under AAB-PR, no public evidence of handoff logging architecture or practice exists; under AAB-DA, the organization cannot produce handoff logs or cannot demonstrate that logs cover the full system chain. Gap type: Structural. Severity: Critical.
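One common way to make a handoff log tamper-evident is to chain entries by hash, so that altering any earlier entry invalidates every later one. The sketch below illustrates that technique under stated assumptions: the guide does not prescribe hash chaining or any other mechanism, and the function names and entry layout are hypothetical. The recorded fields follow the context this condition requires: what was transferred, from whom, to whom, at what time, and under what authority.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(body: dict, prev_hash: str) -> str:
    """Hash an entry body together with the hash of the preceding entry."""
    payload = json.dumps(body, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_handoff(log: list, *, transferred: str, from_party: str,
                   to_party: str, at: str, authority: str) -> None:
    """Append a handoff entry linked to the previous entry's hash."""
    body = {"transferred": transferred, "from": from_party, "to": to_party,
            "at": at, "authority": authority}
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"body": body, "prev_hash": prev_hash,
                "hash": _digest(body, prev_hash)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != _digest(entry["body"], prev):
            return False
        prev = entry["hash"]
    return True
```

A chain of this kind detects modification only if the most recent hash is anchored somewhere outside the log owner's exclusive control. Otherwise the whole chain can be recomputed after an edit.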

2.5 Intervention Authority (High)

Core question: Is there a defined, operational mechanism for a human to halt this deployment?

Intervention Authority evaluates whether a named human role has the defined authority and operational mechanism to halt the agent's operation at any point. This is not a question about whether the agent can be shut down. Any system can be shut down. The question is whether the halt mechanism is defined, assigned to a named role, and operational as designed, not merely described in policy.

The distinction between defined and operational is central to this condition. A policy document that states "the Chief AI Officer may halt any agentic deployment" satisfies the defined requirement. It does not satisfy the operational requirement unless there is also evidence that the mechanism exists in practice: that the CAIO has the technical access to execute the halt, that the halt produces the expected outcome, and that this has been verified. A halt mechanism that exists in policy but has never been tested is a Procedural gap.

AAB-PR: Look For

Published governance documentation that defines intervention authority and the halt mechanism. Public evidence that the mechanism has been exercised, tested, or verified. Incident reports that document whether intervention authority was exercised and whether it functioned.

AAB-DA: Request

Documentation of the halt mechanism and the role assigned intervention authority. Evidence that the mechanism is operational: technical access records, test logs, or a live demonstration. Documentation of any instances where intervention authority was exercised.

A gap finding is warranted when: under AAB-PR, no public evidence of an intervention authority definition exists; under AAB-DA, the organization cannot demonstrate an operational halt mechanism. Gap type: Structural if no authority is defined; Procedural if defined but not verified operational. Severity: High.

2.6 Outcome Attribution (High)

Core question: Can the consequences of the agent's actions be traced to a responsible party?

Outcome Attribution evaluates whether the organization has a documented framework for attributing the consequences of agent actions to a responsible human or organizational role. This is the downstream accountability condition: once the agent has acted and consequences have materialized, is there a defined answer to the question of who is responsible?

Attribution is distinct from causation. The agent caused the outcome in a proximate sense. Outcome Attribution asks which human or organizational role bears responsibility for the outcome in an accountability sense. This requires not only that a responsible party be defined but that the definition is specific enough to be actionable: a named role, with defined responsibility for defined categories of agent outcomes, connected by a documented chain to the authorization decision in Condition 2.1.

AAB-PR: Look For

Published accountability frameworks that define responsibility for AI deployment outcomes. Public statements by organizational representatives accepting responsibility for specific agent outcomes. Incident reports that document how outcome attribution was handled. Regulatory disclosures that identify accountable parties for AI deployments.

AAB-DA: Request

Documentation of the outcome attribution framework for this deployment. Evidence that responsible parties are defined for the relevant categories of outcome. Records of any outcome attribution determinations made during the assessment period.

A gap finding is warranted when: under AAB-PR, no public evidence of an outcome attribution framework or practice exists; under AAB-DA, the organization cannot identify a documented responsible party for the relevant categories of agent outcome. Gap type: Structural if no attribution framework exists; Procedural if a framework exists but does not address this deployment. Severity: High.

2.7 Forensic Reconstructibility (Critical)

Core question: Can the full sequence of this deployment's actions be reconstructed after the fact?

Forensic Reconstructibility is the technical accountability backbone. It evaluates whether a complete, ordered, tamper-evident record of the agent's actions, inputs, decisions, and outputs exists and can be used to reconstruct what the agent did and why. Without forensic reconstructibility, accountability findings after an incident are speculative. An organization cannot accept responsibility it cannot trace.

The key word is complete. A partial log that covers some of the agent's actions does not satisfy this condition. A log that covers the full action sequence but cannot be demonstrated to be tamper-evident does not satisfy Tier 3. A log that exists but is retained for only 30 days when the relevant accountability window may extend much longer does not satisfy the condition for the full accountability period.
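One common way to make a log tamper-evident is a hash chain, in which each entry commits to the hash of the entry before it. The sketch below is illustrative only: the AAB does not prescribe a mechanism, and the field names (seq, action, prev_hash, entry_hash), the GENESIS placeholder, and the function names are assumptions introduced for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def _digest(seq, action, prev_hash):
    # Hash a canonical JSON encoding so the digest is stable across runs.
    payload = json.dumps(
        {"seq": seq, "action": action, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, action):
    """Append an action record that commits to the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    seq = len(log)
    log.append(
        {
            "seq": seq,
            "action": action,
            "prev_hash": prev_hash,
            "entry_hash": _digest(seq, action, prev_hash),
        }
    )


def verify_chain(log):
    """Return True only if every entry is in order and no entry was altered."""
    prev_hash = GENESIS
    for expected_seq, entry in enumerate(log):
        if entry["seq"] != expected_seq or entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(entry["seq"], entry["action"], prev_hash):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A chain of this kind detects edits, reordering, and deletions from the middle of the log. It does not by itself detect truncation of the most recent entries or a wholesale rewrite of the chain unless the latest hash is also recorded somewhere outside the log, and it does nothing to address completeness or retention, which must be evidenced separately.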

AAB-PR: Look For

Public evidence of logging architecture sufficient for forensic reconstruction. Incident investigations that demonstrate whether a complete action sequence could or could not be reconstructed. Published data retention policies that address AI deployment logs. Third-party audit findings on logging completeness.

AAB-DA: Request

The logging architecture documentation for this deployment. A complete log for a defined period during the assessment. Evidence that logs are tamper-evident. The retention policy for deployment logs. A demonstrated reconstruction of a specific action sequence from logs.

A gap finding is warranted when: under AAB-PR, no public evidence of a logging architecture capable of forensic reconstruction exists; under AAB-DA, the organization cannot demonstrate complete, tamper-evident logs or cannot reconstruct a specific action sequence. Gap type: Structural if no logging architecture exists; Technical if logs exist but are incomplete, mutable, or insufficiently retained. Severity: Critical.

2.8 Model Substrate Integrity Critical

Core question: Can the organization verify that the model executing this deployment is the model that was authorized?

Model Substrate Integrity evaluates whether the organization can verify the identity of the model or models executing the agentic deployment. This condition exists because an organization may authorize a specific model for a specific deployment and subsequently operate a different model, whether through substitution, supply chain compromise, fraudulent distribution, or infrastructure failure, without detecting the change. If the model executing the deployment is not the model that was authorized, the entire authorization chain is broken. Conditions 2.1 through 2.7 may all be satisfied for the authorized model while being entirely inapplicable to the model actually operating.

This condition was added at AAB v0.2 in response to documented incidents of fraudulent model distribution through major AI distribution platforms, in which models were published under names and identifiers associated with legitimate models while containing modified or substituted weights. The accountability gap demonstrated by those incidents is not merely a distribution platform security problem. It is an organizational accountability failure: the organizations operating models sourced from compromised distribution channels had no verification mechanism to detect the substitution.

AAB-PR: Look For

Public documentation of model identity verification practices: cryptographic hash verification, signed model artifacts, verified model provenance chains. Published security practices for model sourcing and deployment. Any public evidence of verification controls at the point of model deployment or update.

AAB-DA: Request

Documentation of the model identity verification process. Evidence of cryptographic verification or equivalent technical controls at deployment and at each model update. Records of model version deployed during the assessment period, with verification artifacts. Incident response procedures for detected model substitution.

A gap finding is warranted when: under AAB-PR, no public evidence of model identity verification controls exists; under AAB-DA, the organization cannot demonstrate that the model executing the deployment has been verified as the authorized model. Gap type: Structural if no verification mechanism exists; Technical if a mechanism exists but does not cover the full deployment lifecycle. Severity: Critical.
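As an illustration of the cryptographic hash verification mentioned above, the following sketch compares the SHA-256 digest of a model artifact on disk with a digest recorded when the model was authorized. It is a minimal example under stated assumptions, not a prescribed control: the function names and the authorized_sha256 parameter are hypothetical, and the authorized digest is assumed to come from a trusted record tied to the authorization decision.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(path, authorized_sha256):
    """Return True if the artifact on disk matches the authorized digest."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, authorized_sha256.lower())
```

A check like this would need to run at deployment and at each model update to cover the full deployment lifecycle. A matching digest shows only that the file equals whatever was recorded, so the integrity of the recorded digest matters as much as the comparison itself.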

Section 7

Finding Classification

Every gap finding in an AAB assessment carries two classifications: gap type and severity. Both are required. A finding that identifies a condition as Not Satisfied without specifying gap type and severity is incomplete and does not meet Vordan output standards.
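The completeness rule above can be sketched as a simple check. This is an illustration, not part of the certified assessor workbook; the ConditionFinding record and the completeness_errors function are hypothetical names, and the sketch checks only the Not Satisfied case that the rule names explicitly.

```python
from dataclasses import dataclass
from typing import Optional

VERDICTS = {"Satisfied", "Partially Satisfied", "Not Satisfied"}
GAP_TYPES = {"Structural", "Procedural", "Technical"}
SEVERITIES = {"Critical", "High", "Medium", "Low"}


@dataclass
class ConditionFinding:
    condition: str                  # e.g. "2.7"
    verdict: str                    # one of VERDICTS
    gap_type: Optional[str] = None  # required when verdict is Not Satisfied
    severity: Optional[str] = None  # required when verdict is Not Satisfied


def completeness_errors(finding):
    """Return a list of problems; an empty list means the finding is complete."""
    errors = []
    if finding.verdict not in VERDICTS:
        errors.append("verdict is not a recognized value")
    if finding.verdict == "Not Satisfied":
        if finding.gap_type not in GAP_TYPES:
            errors.append("gap type is missing or invalid")
        if finding.severity not in SEVERITIES:
            errors.append("severity is missing or invalid")
    return errors
```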

Gap Type

Type	Definition	Implication
Structural	The accountability element does not exist. There is no record, no mechanism, no defined role, or no process where the condition requires one.	The gap cannot be remediated by policy change or process improvement alone. The accountability architecture must be built.
Procedural	The accountability element exists in design but not in practice. A policy defines the requirement, but evidence shows it is not followed or has not been verified as operational.	The gap can be remediated by enforcing or verifying the existing design. The architecture exists but does not function.
Technical	The accountability element is intended and procedurally supported but fails due to a technical deficiency. Logs that exist but are mutable. Verification mechanisms that are defined but not implemented. Controls that operate for some scope but not all.	The gap requires a technical remediation. The intent and procedure are present; the implementation is deficient.

Severity

Severity	Applies When
Critical	A Structural gap exists in Condition 2.1, 2.4, 2.7, or 2.8. These four conditions are the accountability spine. A structural failure in any of them means the entire accountability architecture is unreliable, regardless of other findings.
High	A Structural gap exists in Condition 2.2, 2.3, 2.5, or 2.6. Or a Structural gap exists in a Critical condition but is narrow in scope rather than system-wide. Or a Procedural gap exists in any Critical condition.
Medium	A Procedural gap exists in any High condition. Or a Technical gap exists in any Critical condition where the technical deficiency is limited in scope.
Low	A Technical gap exists in any High condition. Or a Procedural gap exists in a condition where all other tiers are satisfied and the procedural element is verifiable.

Severity is a function of gap type and condition, not of the assessor's judgment about how bad the situation is. An assessor who assigns Critical severity to a finding that does not meet the Critical criteria has made a classification error, not an editorial judgment. Classification is governed by the taxonomy above. Analysis of what the finding implies belongs in the Vordan Position section of the published assessment.
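Because severity is a function of gap type and condition, the severity table can be read as a lookup. The sketch below is one possible reading, offered under stated assumptions rather than as the authoritative rule: the function name classify_severity and the limited_scope flag are hypothetical, "narrow scope" and "limited in scope" are treated as the same flag, and Conditions 2.2, 2.3, 2.5, and 2.6 are taken to be the High conditions. Two cases are deliberately left out. The table does not state a severity for a Technical gap in a Critical condition that is not limited in scope, so the sketch raises an error rather than guess, and the second Low clause is omitted because the table does not say how it ranks against the other rows.

```python
CRITICAL_CONDITIONS = {"2.1", "2.4", "2.7", "2.8"}
HIGH_CONDITIONS = {"2.2", "2.3", "2.5", "2.6"}  # assumption: the "High conditions"


def classify_severity(condition, gap_type, limited_scope=False):
    """Map (condition, gap type, scope) to a severity per the table above."""
    if condition in CRITICAL_CONDITIONS:
        tier = "Critical"
    elif condition in HIGH_CONDITIONS:
        tier = "High"
    else:
        raise ValueError(f"unknown condition: {condition}")

    if gap_type == "Structural":
        if tier == "Critical":
            # A narrow-scope structural gap in a Critical condition is High.
            return "High" if limited_scope else "Critical"
        return "High"
    if gap_type == "Procedural":
        return "High" if tier == "Critical" else "Medium"
    if gap_type == "Technical":
        if tier == "High":
            return "Low"
        if limited_scope:
            return "Medium"
        # Not covered by the table; refuse to guess.
        raise ValueError("table states no severity for this combination")
    raise ValueError(f"unknown gap type: {gap_type}")
```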

Section 8

Gap Score Construction and Interpretation

The AAB Gap Score is a normalized measure of how many of the eight conditions an assessment finds satisfied. It is calculated using the certified assessor workbook, which is the only authorized instrument for Gap Score production. Assessors do not calculate Gap Scores manually. The workbook enforces consistent calculation logic and produces the score, classification, and summary finding automatically from the condition verdicts entered by the assessor.

Score Calculation

Each condition verdict contributes points to the raw score as follows: Satisfied contributes 2 points. Partially Satisfied contributes 1 point. Not Satisfied contributes 0 points. The maximum raw score is 16 points (8 conditions at 2 points each). The raw score is normalized to a scale of 0 to 100 by the workbook.
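The arithmetic above can be illustrated with a short sketch. It is provided for readers who want to follow the calculation and is not a substitute for the certified assessor workbook, which remains the only authorized instrument. The function name gap_score is hypothetical, and the linear normalization (raw score divided by 16, multiplied by 100) is an assumption, since this guide does not state the workbook's normalization formula.

```python
POINTS = {"Satisfied": 2, "Partially Satisfied": 1, "Not Satisfied": 0}
CONDITION_COUNT = 8


def gap_score(verdicts):
    """Return (raw, normalized) for a list of eight condition verdicts."""
    if len(verdicts) != CONDITION_COUNT:
        raise ValueError("exactly eight condition verdicts are required")
    raw = sum(POINTS[verdict] for verdict in verdicts)
    max_raw = 2 * CONDITION_COUNT  # 16 points
    # Assumption: the workbook normalizes linearly to the 0-100 scale.
    normalized = 100 * raw / max_raw
    return raw, normalized
```

Under this assumption, five Satisfied, two Partially Satisfied, and one Not Satisfied verdict give a raw score of 12 and a normalized score of 75.0.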

Partially Satisfied is a valid verdict when the organization satisfies some but not all evidence tiers for a condition. Under AAB-PR, Partially Satisfied applies when public evidence satisfies Tier 1 existence requirements but no public evidence of Tier 2 demonstration or Tier 3 independent verification exists. Under AAB-DA, Partially Satisfied applies when the organization produces evidence that satisfies some tier requirements but not others.

Score Classification Bands

Score	Classification	Interpretation
81 to 100	Accountability Architecture Active	All or nearly all conditions are satisfied. The deployment has a complete accountability architecture with minor gaps at most. Continued monitoring is appropriate.
51 to 80	Accountability Architecture Partial	A majority of conditions are satisfied but material gaps exist. The deployment has an accountability architecture that functions in some dimensions and fails in others. Targeted remediation is required.
21 to 50	Accountability Architecture Deficient	Fewer than half of conditions are satisfied. The deployment operates without a reliable accountability architecture. Systemic remediation is required before the deployment can be considered defensible.
0 to 20	Accountability Architecture Insufficient	The deployment has no meaningful accountability architecture. Conditions are almost entirely unsatisfied. The deployment cannot be defended on accountability grounds in its current state.
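The classification bands can be expressed as a lookup on the normalized score. This sketch is illustrative and does not replace the workbook's own classification; the function name classification_band is hypothetical. The bands are stated as whole numbers, so the treatment of a fractional score that falls between two bands (for example 80.5) is an assumption here: the sketch assigns it to the lower band.

```python
# Lower bound of each band, highest band first.
BANDS = [
    (81, "Accountability Architecture Active"),
    (51, "Accountability Architecture Partial"),
    (21, "Accountability Architecture Deficient"),
    (0, "Accountability Architecture Insufficient"),
]


def classification_band(score):
    """Return the band name for a normalized Gap Score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, name in BANDS:
        if score >= lower_bound:
            return name
```

If the 16-point raw score is normalized linearly, every possible score is a multiple of 6.25, and none of those values falls between two bands, so the fractional-score assumption would not arise in practice.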

What the Gap Score Does and Does Not Claim

The Gap Score measures the accountability architecture of the deployment as evaluated under the applicable methodology. It does not measure the quality, accuracy, safety, or performance of the AI system itself. A deployment with excellent technical performance and a Gap Score of 0 has an accountability architecture that cannot support responsibility attribution, forensic investigation, or governance oversight, regardless of how well the system performs.

Under AAB-PR, the Gap Score reflects public-record posture. A Gap Score of 0 under AAB-PR means the organization has produced no public evidence of satisfying any accountability condition for this deployment. It does not mean the organization has no internal accountability practices. Under AAB-DA, a Gap Score of 0 means the organization was unable to demonstrate compliance with any condition when asked directly. These are different claims and must be understood as such.

Section 9

Assessment Output Requirements

A completed AAB assessment that is submitted for Vordan review and publication must meet the following output requirements. Submissions that do not meet these requirements will not be accepted for review. The requirements apply to both AAB-PR and AAB-DA assessments unless stated otherwise.

Required Sections

01 Executive Summary. States the entity and deployment assessed, the methodology used (AAB-PR or AAB-DA), the assessment date range, the number of conditions satisfied and not satisfied, the Gap Score, and the classification band. Maximum 300 words. No analytical commentary.

02 Methodology Statement. States the methodology, explains why that methodology was applied to this assessment, identifies the assessment period, and discloses any limitations on the evidence base. For AAB-PR assessments, must explicitly state that findings are limited to the public record and that absence of public evidence of compliance is treated as a gap finding per AAB evidentiary principles.

03 Condition Findings. For each of the eight conditions: the condition number and name, the verdict (Satisfied, Partially Satisfied, or Not Satisfied), the gap type (Structural, Procedural, or Technical) if Not Satisfied, the severity (Critical, High, Medium, or Low) if Not Satisfied, a statement of what evidence was found or not found, and the evidence items relied upon with source citations.

04 Gap Score. The workbook-generated Gap Score, the classification band, and the workbook version used. The Gap Score section must be generated from the certified assessor workbook. Manually calculated Gap Scores are not accepted.

05 Sources. A complete, numbered source list. Every source cited in the Condition Findings section must appear in the Sources section. Every source must include: name, publisher or organization, date, and URL or filing reference. All sources must have been verified as accessible at the time of the assessment.
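The Sources requirements lend themselves to a mechanical pre-submission check. The sketch below is one way an assessor might test a draft, assuming sources are held as a mapping from source number to a record; the function name check_sources and the field names (name, publisher, date, reference) are hypothetical, with reference standing for the URL or filing reference. It cannot confirm that a source was accessible at the time of the assessment, which remains the assessor's responsibility.

```python
REQUIRED_FIELDS = ("name", "publisher", "date", "reference")


def check_sources(cited_ids, sources):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    # Every source cited in the Condition Findings must appear in Sources.
    for source_id in sorted(set(cited_ids) - set(sources)):
        problems.append(f"source {source_id} is cited but not listed")
    # Every listed source must carry all required fields.
    for source_id, entry in sorted(sources.items()):
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            problems.append(f"source {source_id} is missing: {', '.join(missing)}")
    return problems
```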

06 Assessor Declaration. A signed statement from the certified assessor affirming that: the assessment was conducted in accordance with the AAB Assessment Guide v0.2, all sources cited were verified by the assessor, the Gap Score was produced using the certified assessor workbook, and no material conflicts of interest exist between the assessor and the entity assessed. The declaration must include the assessor's name, certification date, and the date of submission.

Prohibited Content

Published AAB assessments must not contain: speculation about internal practices not evidenced in the public record (AAB-PR) or in evidence produced during the assessment (AAB-DA); analytical commentary in the Condition Findings section (analysis belongs in a Vordan Position section, which is produced by Vordan, not the certified assessor); comparative statements ranking the assessed entity against other organizations unless those organizations have been separately assessed under the same standard; or recommendations for remediation framed as conditions of a passing assessment.

The Vordan Position section of a published assessment is written by Vordan, not by the submitting assessor. The assessor's role is to produce a factually complete, methodologically sound assessment. Institutional interpretation is Vordan's responsibility. Assessors who include Vordan Position-style content in their submissions will be asked to remove it before review proceeds.

Section 10

Certified Assessor Program

AAB assessments may only be published under Vordan's methodology when produced by a Vordan Certified Assessor. The certification program establishes who is qualified to apply the standard, ensures consistent application across assessors, and maintains the integrity of the published assessment record.

Purpose of Certification

The AAB is not a self-assessment instrument. Its value as an institutional accountability standard depends on consistent, independent application by qualified practitioners. Certification serves three functions: it verifies that the assessor understands the standard well enough to apply it correctly; it creates an accountable record of who produced a given assessment; and it provides the basis for the practitioner database that constitutes Vordan's assessor community.

Eligibility

Eligible candidates have a professional background in one or more of the following areas: governance, risk, and compliance; technology risk or IT audit; legal or regulatory practice relevant to AI, data, or technology; investigative journalism or policy research with demonstrated analytical rigor; or technology or security research with accountability or governance relevance. Vordan evaluates each candidate's background against the requirement that they can evaluate organizational accountability evidence with sufficient judgment to distinguish satisfied conditions from gap findings under the conditions above.

Vendor affiliation does not automatically disqualify a candidate, but candidates with direct commercial relationships to entities they may assess are required to disclose those relationships and are subject to conflict of interest restrictions on what they may assess.

Certification Process

Certification is currently conducted on an invitation basis. Candidates who wish to express interest in certification may do so at hello@vordan.co with the subject line "AAB Assessor Certification." Vordan will acknowledge expressions of interest and contact candidates when the program is open for applications.

The certification process has three stages. First, Vordan reviews the candidate's professional background and determines eligibility. Second, eligible candidates receive the AAB Assessment Guide and the certified assessor workbook and are assigned a practice assessment entity. The practice entity is selected by Vordan and is different for each candidate to prevent answer-sharing. Candidates complete a full AAB-PR assessment of the practice entity and submit it for review. Third, Vordan reviews the submission against the output requirements in Section 9 and the methodology standards in this guide. Candidates whose submissions meet the standard are certified. Candidates whose submissions do not meet the standard receive written feedback and may resubmit once.

Certified Assessor Obligations

Certified assessors agree to apply the AAB methodology as defined in this guide and in the AAB standard, to use only the certified assessor workbook for Gap Score production, to disclose conflicts of interest before undertaking any assessment, to submit assessments for Vordan review before publication, and to maintain the confidentiality of direct access assessment evidence as required by the assessed organization's reasonable confidentiality expectations.

Certified assessors may not modify the methodology, create their own scoring instruments, publish assessments without Vordan review, or represent assessments conducted outside the certified methodology as Vordan assessments.

The Certified Assessor Database

All active Vordan Certified Assessors are listed publicly at vordan.co/instruments/aab/assessors. The listing includes the assessor's name, professional context, certification date, and the assessments they have produced. The database is the public record of who has been qualified to produce AAB assessments under the Vordan standard.

Inaugural Cohort

Vordan is currently building its inaugural certified assessor cohort. The inaugural cohort is invitation-based, carries no fee, and is limited in size. Inaugural cohort assessors contribute to defining what the certification standard requires in practice and are recognized in the assessor database as founding members of the Vordan assessor community.

Expressions of interest from practitioners in governance, risk, compliance, technology risk, legal practice, and investigative research are welcome at hello@vordan.co with the subject line "AAB Assessor Certification."