Field notes  ·  2026-05-30

Is my AI agent a high-risk system under the EU AI Act? The classification decision tree.

No single page owns the answer for an autonomous agent. Here is the copyable five-gate decision tree, the multi-agent rule no competitor operationalizes, and the three SMB agents I get asked about most, run through the tree to a definite answer.

Your AI agent is a high-risk system under the EU AI Act (Regulation (EU) 2024/1689) if its intended purpose is a safety component of an Annex I product, or if it falls in one of the eight Annex III domains, unless one of the four Article 6(3) limbs applies and it does no profiling. First rule out Article 5 prohibited use. Transparency under Article 50 can still apply even when you are not high-risk.

Classification is a five-gate walk, not a yes/no. You take the gates in order: Article 5 (prohibited) → Annex I (product-safety route) → Annex III (standalone use case) → Article 6(3) (the exemption) → Article 50 (transparency). Gates 1 to 4 settle your risk tier. Unless Gate 1 stops you outright, Gate 5 is checked either way, because transparency duties can stack on any tier.

The agent-specific twist no competitor turns into a procedure: you classify the agent by what it is for, end to end, not by its architecture. Splitting a high-risk job across an orchestrator and a fleet of sub-agents does not split the classification. The whole stack is one system, classified by its combined intended purpose.

The question I get most on audit calls right now isn't "what do I log" — it's "is my agent even in scope." Teams have read that high-risk AI is coming, they have built something agentic, and they cannot tell whether the rules land on them or sail past. The honest answer is that you can decide it yourself in about fifteen minutes if you walk the gates in the right order. This is that walk, applied to an agent, with the worked examples to anchor it.

One date caveat before we go, because it is moving. As of this writing (2026-05-30), August 2, 2026 is still the legal date the high-risk obligations apply. EU institutions reached a provisional "Digital Omnibus" agreement on May 6, 2026, confirmed by Member State representatives on May 13, that would defer stand-alone Annex III obligations to December 2, 2027. That deferral is not law until it is published in the Official Journal, which had not happened when I wrote this. I am not going to re-derive the date stack here; the August 2, 2026 deadline stack and the fine tiers covers it, and the logging reference carries the same caveat.

What makes an AI agent "high-risk" under the EU AI Act?

An AI agent is high-risk only by one of two routes: it is a safety component of an Annex I product (Article 6(1)), or its intended purpose falls in an Annex III domain (Article 6(2)) — and Annex III systems are high-risk unless an Article 6(3) limb exempts them.

That is the whole definition, and it is narrower than the panic suggests. "High-risk" is one tier in a stack. Above it sits prohibited use under Article 5 — a short list of things you cannot do at all, like social scoring or untargeted scraping of facial images. Below it sits everything else, the minimal-risk majority. Off to the side, and easy to miss, is a separate and lighter category: transparency-only obligations under Article 50, which can co-apply to a system that is not high-risk at all. A chatbot that is not high-risk can still owe a disclosure duty. So the question is never just "am I high-risk." It is "which tier, and do any transparency duties stack on top."

The two routes matter because they run on different clocks. The Annex III standalone route is the August 2, 2026 date (subject to the deferral above). The Annex I product-safety route — agents embedded as safety components in regulated products like medical devices and machinery — runs to August 2, 2027.

The classification decision tree — five gates from Article 5 to Article 50.

Walk these five gates in order. Gate 1 is a hard stop, Gates 2 to 4 settle whether you are high-risk, and Gate 5 adds transparency duties on top of whatever tier you land in. This is the part competitors leave as prose. Here it is as a procedure you can run against your own agent.

  1. Gate 1, Article 5: prohibited use. Does your agent's purpose do anything on the prohibited list (social scoring, manipulative subliminal techniques, untargeted scraping for facial databases, and the rest)? If yes, stop. It is banned, not classifiable.
  2. Gate 2, Article 6(1) / Annex I: product-safety route. Is your agent a safety component of a product covered by Annex I (medical devices, machinery, toys, and the other listed product legislation)? If yes, it is high-risk via the product route, on the August 2, 2027 timeline.
  3. Gate 3, Article 6(2) / Annex III: standalone use case. Is your agent's intended purpose in one of the eight Annex III domains? If yes, it is presumed high-risk; go to Gate 4.
  4. Gate 4, Article 6(3): the exemption. Does at least one of the four narrow limbs apply, and does the agent do no profiling? If yes, it is exempt from high-risk, but you must document that conclusion (Gate 5 plus the self-assessment duty below).
  5. Gate 5, Article 50: transparency. Does it interact with people, generate or manipulate content, or do emotion recognition or biometric categorisation? If yes, transparency duties apply even if the agent is not high-risk.

The same walk as a table, which is the form an auditor will want and the form an answer engine will lift:

| Gate | Article | Ask of your agent | If yes |
|---|---|---|---|
| 1 | Art 5 | Does its purpose do anything on the prohibited list (social scoring, manipulative subliminal techniques, untargeted scraping for facial DBs, etc.)? | Stop. It is banned, not classifiable. |
| 2 | Art 6(1) / Annex I | Is it a safety component of a product covered by Annex I (medical devices, machinery, toys…)? | High-risk via the product-safety route (and the Aug 2, 2027 timeline). |
| 3 | Art 6(2) / Annex III | Is its intended purpose in one of the eight Annex III domains? | Presumed high-risk; go to Gate 4. |
| 4 | Art 6(3) | Does at least one of the four narrow limbs apply AND it does no profiling? | Exempt from high-risk, but document it (Gate 5 + self-assessment). |
| 5 | Art 50 | Does it interact with people, generate/manipulate content, or do emotion recognition or biometric categorisation? | Transparency duties apply even if not high-risk. |

Run the gates top to bottom. Most agents I see stop at Gate 3 with a "no" and exit to the minimal-risk majority, sometimes picking up an Article 50 duty at Gate 5. The agents that land in scope almost always do so at Gate 3, in one of three domains, which is the next section.
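The walk can also be sketched as a function. This is a minimal illustration, not a legal test: the `AgentFacts` fields and the outcome labels are names invented here, and each boolean stands in for a judgment you still have to make and justify by reading the Regulation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentFacts:
    """Answers to the five gate questions. All field names are illustrative."""
    prohibited_practice: bool        # Gate 1: Article 5
    annex_i_safety_component: bool   # Gate 2: Article 6(1) / Annex I
    annex_iii_domain: bool           # Gate 3: Article 6(2) / Annex III
    art_6_3_limb_applies: bool       # Gate 4: at least one Article 6(3) limb fits
    profiles_natural_persons: bool   # Gate 4: the profiling override
    art_50_trigger: bool             # Gate 5: talks to people, generates content, etc.


def classify(a: AgentFacts) -> List[str]:
    """Return the outcome labels the gates produce, in gate order."""
    if a.prohibited_practice:
        return ["prohibited"]        # hard stop: nothing else to classify

    if a.annex_i_safety_component:
        tier = "high-risk (Annex I route)"
    elif a.annex_iii_domain:
        exempt = a.art_6_3_limb_applies and not a.profiles_natural_persons
        tier = ("not high-risk (Article 6(3), document the assessment)"
                if exempt else "high-risk (Annex III route)")
    else:
        tier = "not high-risk"

    outcome = [tier]
    if a.art_50_trigger:
        outcome.append("Article 50 transparency")  # stacks on any tier
    return outcome


support_chatbot = AgentFacts(
    prohibited_practice=False, annex_i_safety_component=False,
    annex_iii_domain=False, art_6_3_limb_applies=False,
    profiles_natural_persons=False, art_50_trigger=True,
)
print(classify(support_chatbot))
# ['not high-risk', 'Article 50 transparency']
```

Returning a list rather than a single label keeps the Article 50 duty visible as something that sits alongside the tier instead of replacing it.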

Is your agent in one of the eight Annex III domains?

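List the eight domains and check your agent's intended purpose against them. Most SMB agents that are high-risk land in employment, essential services (credit), or biometrics. Annex III is the canonical list, and the rule attached to it is that an Annex III system is high-risk unless it does not pose a significant risk of harm (which is what the Article 6(3) limbs operationalize). The eight domains:

  1. Biometrics.
  2. Critical infrastructure.
  3. Education and vocational training.
  4. Employment, workers' management and access to self-employment.
  5. Access to essential private services and essential public services and benefits (this is where credit scoring sits).
  6. Law enforcement.
  7. Migration, asylum and border control management.
  8. Administration of justice and democratic processes.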

The point of checking the list is to find your agent's intended purpose in it, not its technology. A language model is not high-risk; a language model whose intended purpose is to score creditworthiness is. The domain attaches to the job, not the tool. For the flat reference list and the rest of the date stack, the August 2, 2026 deadline stack covers it; here it lives inside the tree.

The Article 6(3) escape route — the four limbs (and the profiling override).

An Annex III agent is NOT high-risk if it does only one of four narrow things — a narrow procedural task, improving the result of a completed human activity, detecting decision patterns without replacing or influencing human review, or a preparatory task — and never if it profiles people. The four limbs of Article 6(3):

  1. The agent performs a narrow procedural task.
  2. The agent improves the result of a previously completed human activity.
  3. The agent detects decision-making patterns or deviations from prior patterns and is not meant to replace or influence the previously completed human assessment without proper human review.
  4. The agent performs a preparatory task to an assessment relevant to an Annex III use case.

Two things make this escape route narrower than it reads. First, the profiling override is absolute: the moment your agent performs profiling of natural persons, none of the four limbs is available. As Barry Scannell of William Fry put it, "AI systems involved in profiling individuals will always be classified as high-risk." (William Fry, 2024-09-03) There is no exemption to reach for once profiling is in the picture.

Second, the human-in-the-loop trap. Adding a human reviewer to an Annex III agent does not by itself downgrade it. Rosie Nance and Marcus Evans, writing for Norton Rose Fulbright's Data Protection Report after the draft guidelines landed, put it plainly: "The provider cannot exempt and categorise an AI system as 'low risk' simply by adding to it a requirement for human involvement." (Data Protection Report, 2026-05-22) If the agent's output meaningfully shapes the human decision, limb three is gone and you are back to high-risk.
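The two traps can be made concrete with a small filter over the limbs you think you qualify for. This is a sketch under stated assumptions: the `Limb` names and the `output_shapes_human_decision` flag are labels invented here, and whether an output "meaningfully shapes" a decision is a judgment the code cannot make for you.

```python
from enum import Enum
from typing import Set


class Limb(Enum):
    NARROW_PROCEDURAL_TASK = 1
    IMPROVES_COMPLETED_HUMAN_ACTIVITY = 2
    DETECTS_DECISION_PATTERNS = 3
    PREPARATORY_TASK = 4


def usable_limbs(claimed: Set[Limb], profiles: bool,
                 output_shapes_human_decision: bool) -> Set[Limb]:
    """Return the claimed limbs that survive the two traps described above."""
    if profiles:
        return set()  # profiling override: no limb is available
    usable = set(claimed)
    if output_shapes_human_decision:
        # A human reviewer does not rescue limb three when the agent's
        # output meaningfully shapes the decision being reviewed.
        usable.discard(Limb.DETECTS_DECISION_PATTERNS)
    return usable


# The exemption is only on the table if at least one limb survives.
print(bool(usable_limbs({Limb.DETECTS_DECISION_PATTERNS},
                        profiles=False,
                        output_shapes_human_decision=True)))
# False
```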

Multi-agent stacks — does splitting the work across agents change the classification?

No. You classify by the combined intended purpose end to end, so an orchestrator coordinating sub-agents toward a high-risk decision is ONE high-risk system; an architectural split does not split the classification.

This is the rule competitors assert in a sentence and never operationalize. The teams I have audited keep trying to engineer around it: split the credit decision so that one sub-agent fetches features, another scores, a third applies the threshold, and reason that no single agent "decides," so none is high-risk. That does not work, and the draft guidelines say why directly. The European Commission's draft classification guidelines (published 19 May 2026, consultation open to 23 June 2026) address modular and agentic systems: splitting a high-risk function into separate modules does not change the analysis, because classification follows the combined intended purpose rather than any single module. Two practical consequences fall out of that.

Where the obligations attach.

They attach to the system as defined by its intended purpose — in an agentic stack, typically the orchestrator, or whoever provides the end-to-end agent, not each sub-tool in isolation. If you are the provider of the assembled agent that produces a credit decision, you are the provider of a high-risk system, even if you stitched it together from third-party models and tools. The sub-agents do not each carry the obligation; the system that uses them toward the regulated purpose does.
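One way to see why the split does not help is to model the stack so that classification reads only the system's intended purpose. The sketch below is illustrative: `AgentSystem`, `Agent`, and the purpose labels in `ANNEX_III_PURPOSES` are hypothetical names, not terms from the Regulation or the draft guidelines.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels for intended purposes that fall in an Annex III domain.
ANNEX_III_PURPOSES = {"creditworthiness_decision", "candidate_ranking"}


@dataclass
class Agent:
    name: str
    step: str  # what this node does in isolation, e.g. "fetch_features"


@dataclass
class AgentSystem:
    provider: str
    intended_purpose: str  # the end-to-end job the whole stack is for
    agents: List[Agent] = field(default_factory=list)


def is_presumed_high_risk(system: AgentSystem) -> bool:
    """Classify on the combined intended purpose; the node list is never consulted."""
    return system.intended_purpose in ANNEX_III_PURPOSES


split_stack = AgentSystem(
    provider="acme-lending",
    intended_purpose="creditworthiness_decision",
    agents=[Agent("fetcher", "fetch_features"),
            Agent("scorer", "score"),
            Agent("gate", "apply_threshold")],
)
print(is_presumed_high_risk(split_stack))
# True
```

No sub-agent's `step` mentions a decision, and the answer is still yes. The obligation lands on `provider`, the party offering the assembled system.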

Where the audit trail attaches.

You must be able to reconstruct the end-to-end decision path across agents, not per-agent fragments. A regulator asking "why was this applicant declined" does not want the orchestrator's log and three disconnected sub-agent logs that cannot be joined. They want the single thread: what the system saw, which agent did what, in what order, and what the combined output was. The pattern that survives this is a signed receipt per agent action, chained so the whole path is tamper-evident and replayable. That is the same chain-of-custody mechanism I use for logging work; if you want the build, I wrote up how to build an audit chain that survives Article 12. The classification just tells you the chain has to span the whole agent, not one node of it.
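As a rough illustration of "chained so the whole path is tamper-evident", here is a minimal hash-chain sketch. It is not the build from the linked write-up: it assumes a single shared HMAC key (`SECRET`) in place of whatever signing scheme you actually use, and the receipt fields are invented for the example.

```python
import hashlib
import hmac
import json
from typing import Dict, List

SECRET = b"demo-key-not-for-production"  # assumption: one shared HMAC key
FIELDS = ("seq", "agent", "action", "detail", "prev")


def _sign(body: Dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def append_receipt(chain: List[Dict], agent: str, action: str, detail: str) -> None:
    """Append a receipt whose signature covers the previous receipt's signature."""
    prev = chain[-1]["sig"] if chain else ""
    body = {"seq": len(chain), "agent": agent, "action": action,
            "detail": detail, "prev": prev}
    chain.append({**body, "sig": _sign(body)})


def verify(chain: List[Dict]) -> bool:
    """Recompute every signature and check each link points at its predecessor."""
    prev = ""
    for i, receipt in enumerate(chain):
        body = {k: receipt[k] for k in FIELDS}
        if receipt["seq"] != i or receipt["prev"] != prev:
            return False
        if not hmac.compare_digest(receipt["sig"], _sign(body)):
            return False
        prev = receipt["sig"]
    return True


chain: List[Dict] = []
append_receipt(chain, "orchestrator", "received_application", "app-123")
append_receipt(chain, "scorer", "scored", "score=0.41")
append_receipt(chain, "threshold", "declined", "cutoff=0.50")
print(verify(chain))
# True
```

Because every receipt signs over its predecessor's signature, editing one sub-agent's entry after the fact invalidates that entry and breaks the link to everything after it. All agents write to one chain, which is what lets you replay the decision as a single thread.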

I decided my agent is NOT high-risk. Am I done?

No — under Article 6(4) you must document the assessment that led to the not-high-risk conclusion before the agent goes on the market, and you may still have to register under Article 49(2); the self-assessment is itself a compliance artifact.

This is the branch almost everyone under-serves. The operators I see who relied on an Article 6(3) limb did the hard part — they reasoned correctly that their agent was exempt — and then wrote none of it down. When a market surveillance authority later asks "show me why you decided this was not high-risk," they have a conclusion and no record behind it. The document proving you decided correctly is the cheapest compliance artifact you will ever produce, and it is the only thing standing between a defensible "not high-risk" and the misclassification tier. One deployer I worked with had self-assessed as not high-risk months earlier, confidently and correctly, but had nothing to show for it. Reconstructing the reasoning after the fact, under pressure, cost more than writing it down at decision time would have. If you want a second set of eyes on the self-assessment before it goes on file, that is part of the audit.
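Writing it down at decision time can be as light as a structured record. The sketch below is one possible shape, assuming fields chosen for illustration; Article 6(4) requires the assessment to be documented but the field names and JSON format here are not taken from the Regulation.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class SelfAssessment:
    """Illustrative record of a not-high-risk conclusion. Fields are assumptions."""
    system_name: str
    intended_purpose: str
    annex_iii_domain: str
    limb_relied_on: str       # which Article 6(3) limb, in words
    performs_profiling: bool
    reasoning: str            # why the limb fits this agent
    assessed_on: str          # ISO date, before the agent goes on the market
    assessed_by: str
    evidence: List[str]       # pointers to specs, tests, design docs


def to_record(a: SelfAssessment) -> str:
    """Serialize the assessment; refuse one that cannot support its own conclusion."""
    if a.performs_profiling:
        raise ValueError("profiling rules out every Article 6(3) limb")
    if not a.reasoning.strip():
        raise ValueError("a conclusion without reasoning is not a record")
    return json.dumps(asdict(a), indent=2, sort_keys=True)


cv_parser = SelfAssessment(
    system_name="cv-field-extractor",
    intended_purpose="extract structured fields from submitted CVs",
    annex_iii_domain="employment",
    limb_relied_on="narrow procedural task",
    performs_profiling=False,
    reasoning="Extracts fields only; does not rank, score, or filter candidates.",
    assessed_on="2026-05-30",
    assessed_by="compliance lead",
    evidence=["design-doc.md", "output-schema.json"],
)
record = to_record(cv_parser)
```

The two guards are deliberate: a record that claims a limb while admitting profiling, or that states a conclusion with no reasoning, is exactly the artifact that fails when an authority asks to see it.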

What happens if you misclassify?

Under-claiming doesn't just expose the failure-to-comply tier; supplying an unfounded "not high-risk" self-assessment to authorities can also land you in the "incorrect, incomplete or misleading information" penalty tier under Article 99, which carries fines up to EUR 7.5 million or 1% of worldwide annual turnover, whichever is higher. In other words, the bad self-assessment can be a second, separate finding stacked on top of the failure to meet the high-risk obligations you should have met. I am not going to re-table the fines here; the August 2, 2026 deadline stack and the fine tiers is the authority for the full breakdown.
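The "whichever is higher" rule is simple arithmetic, sketched here for orientation only. The function name is invented, and it models only the cap as stated above; it does not account for any other provision of Article 99 that may change the figure for a particular company, so treat the result as a ceiling on this one tier, not a prediction.

```python
def misleading_info_cap_eur(worldwide_annual_turnover_eur: float) -> float:
    """Upper bound as stated in the text: EUR 7.5 million or 1% of turnover, whichever is higher."""
    return max(7_500_000.0, worldwide_annual_turnover_eur / 100)


print(misleading_info_cap_eur(50_000_000))     # 7500000.0  (1% is 500,000, so the flat amount governs)
print(misleading_info_cap_eur(2_000_000_000))  # 20000000.0 (1% exceeds the flat amount)
```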

Worked examples — credit-scoring agent, CV-screening agent, support chatbot.

Run the three most common SMB agents through the tree and you get three different answers — the credit and CV agents are high-risk, the support chatbot usually is not but still owes Article 50 transparency.

| Agent | Annex III domain? | High-risk? | Also Article 50? | Why |
|---|---|---|---|---|
| Credit-scoring agent | Essential private services | Yes | Yes if it has a chat UI | Creditworthiness decisioning; no 6(3) limb; profiling |
| CV-screening / candidate-ranking agent | Employment & HR | Yes (a pure CV parser may hit a 6(3) limb but still registers) | If conversational | Recruitment decisioning; ranking influences a human decision |
| Customer-support chatbot | None (unless it does HR/credit decisioning) | Usually no | Yes, must disclose it is an AI | Becomes high-risk only if it crosses into an Annex III decision |

The credit-scoring agent fires at Gate 3 (essential services), gets no relief from any limb at Gate 4 because the credit decision is the substantive task and it profiles the applicant, and picks up an Article 50 duty at Gate 5 if it talks to applicants through a chat interface. The CV-screening agent fires at Gate 3 (employment); a ranking or pass/fail agent influences the human hiring decision, so limb three is unavailable. A pure CV parser that only extracts fields might genuinely hit the narrow-procedural-task limb, but it still has to be documented and may still register. The support chatbot clears Gate 3 entirely as long as it stays out of credit and HR decisions, so it is not high-risk; it still owes the Article 50 disclosure that a human is talking to a machine, and it flips to high-risk the moment you wire it into a regulated decision.

These are the same two worked examples I use in the logging reference, asked the other way around. There the question is what to log; here it is whether you are in scope at all. Once you have confirmed high-risk, here is what you must log, in what form, and for how long.

You're high-risk. What now?

A high-risk classification triggers the full obligation set — risk management, data governance, logging, human oversight, post-market monitoring — and the cheapest next move is the audit trail. Risk management and data governance are policy work you will do once and maintain. Logging is the one with the longest lead time and the one auditors press on hardest, because it is the evidence everything else rests on. So the practical order is: confirm the classification (this page), then work out what you must log, in what form, and for how long, then build an audit chain that survives Article 12. None of it needs a $40K platform; it needs decisions written down and emission wired into the decision path.

FAQ

Is a credit-scoring agent high-risk under the EU AI Act?

Yes. Credit scoring is an Annex III essential-services use case, so a credit-scoring agent is presumed high-risk; no Article 6(3) limb saves it because the decision is the substantive task and it profiles applicants. If it has a chat interface, Article 50 transparency also applies.

Is a CV-screening or candidate-ranking agent high-risk?

Yes. Employment and recruitment is an Annex III domain. A ranking or pass/fail agent influences a human hiring decision, so it is high-risk. A pure CV parser doing a narrow procedural task may fall under an Article 6(3) limb, but it still has to be documented and registered.

Is a customer-support chatbot a high-risk AI system?

Usually not. A general support chatbot is not in an Annex III domain, so it is not high-risk, but Article 50 still requires you to disclose that people are talking to an AI. It becomes high-risk only if it crosses into a regulated decision, such as credit or employment.

Does splitting my workflow across multiple agents change the classification?

No. You classify by the combined intended purpose end to end. An orchestrator coordinating sub-agents toward a high-risk decision is one high-risk system; the architecture does not split the classification, and the audit trail must cover the whole decision path.

If I decide my agent is NOT high-risk, do I still have to do anything?

Yes. Under Article 6(4) you must document the assessment behind the not-high-risk conclusion before the system goes on the market, and you may still need to register under Article 49(2). The self-assessment is itself a compliance artifact.

When do these classification rules actually apply?

The legal date is August 2, 2026. A provisional Digital Omnibus agreement would defer stand-alone Annex III obligations to December 2, 2027, but that is not law until it is published in the Official Journal, which had not happened as of this writing. Plan for August 2, 2026.

Book it

Two hours on a call with me. Forty-eight hours to the plan.

I'll walk your agent through the classification tree, give you a written self-assessment you can put on file, and a numbered plan for whatever the gates turn up. $997 flat. Refund in 7 days if you don't find it useful.

Regulators don't want philosophy, they want records — and the first record is the one that proves you classified correctly.