In brief: Two very different documents claim to govern the conduct of artificial minds. One is Isaac Asimov’s Three Laws of Robotics — three crisp imperatives written for fiction in 1942. The other is Anthropic’s “constitution,” a real set of written principles used to train the Claude family of models. Reading them side by side is more than a literary exercise. It exposes the central question of our decade: can the ethical limits of a machine be written down as rules, or must they be grown into the system as values? And once we answer that for engineers, who answers it for the law?

Two constitutions, eighty years apart

Asimov’s Laws are deceptively elegant. A robot may not harm a human or, through inaction, allow harm; it must obey human orders unless doing so conflicts with the first law; it must protect its own existence unless doing so conflicts with the first two. Decades later, in Robots and Empire (1985), Asimov added a “Zeroth Law” that outranks them all: a robot may not harm humanity, or by inaction allow humanity to come to harm.

Anthropic’s constitution is a different kind of object entirely. It is not three sentences but a structured list of principles, drawn from sources as varied as the UN Universal Declaration of Human Rights, platform trust-and-safety practices, and deliberate efforts to include non-Western perspectives. Crucially, it is not consulted by the model at the moment of answering. It is used during training, in a process Anthropic calls Constitutional AI. In a first, supervised stage, the model generates a response, critiques that response against the principles, and revises it, and the model is then fine-tuned on the revised responses. In a second stage, the model itself judges which of two candidate responses better fits the principles, and those judgements drive reinforcement learning from AI feedback (RLAIF). In both stages the pattern ends up baked into the model’s weights rather than looked up as a rule.
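For readers who think in code, the generate, critique, revise loop described above can be sketched roughly as follows. This is an illustrative toy, not Anthropic’s implementation: the names `critique_and_revise`, `TrainingExample` and `toy_model`, and the prompt wording, are all assumptions made for this example. Walking through every principle in order is also a simplification, and the later reinforcement-learning stage is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model: text in, text out.
Model = Callable[[str], str]


@dataclass
class TrainingExample:
    """A prompt paired with the revised response the model is later fine-tuned on."""
    prompt: str
    revised_response: str


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> TrainingExample:
    """Draft a response, then critique and revise it once per principle."""
    response = model(prompt)
    for principle in principles:
        critique = model(
            "Critique this response against the principle.\n"
            f"Principle: {principle}\nResponse: {response}"
        )
        response = model(
            "Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    # Only the final revision is kept; the principles themselves are not stored with it.
    return TrainingExample(prompt=prompt, revised_response=response)


def toy_model(text: str) -> str:
    """Deterministic fake model so the sketch runs without any real AI system."""
    if text.startswith("Critique"):
        return "The response could be more careful."
    if text.startswith("Rewrite"):
        return text.rsplit("Response: ", 1)[1] + " [revised]"
    return "draft answer"


example = critique_and_revise(toy_model, "How do I pick a lock?", ["Be harmless.", "Be honest."])
print(example.revised_response)  # draft answer [revised] [revised]
```

The point of the sketch is what it leaves out: the principles appear only inside the training loop. The resulting dataset contains prompts and revised answers, which is why the finished model carries tendencies rather than a rule-sheet it can consult.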

That difference — rules applied at runtime versus values absorbed during training — is the whole story. It is the difference between a speed limit sign and a driver who has internalised caution.

Why Asimov’s Laws were designed to fail

It is worth remembering that Asimov did not write the Three Laws as a blueprint for safe machines. He wrote them as a dramatic device. Almost every robot story he produced is, in effect, a bug report: a situation in which the Laws, applied literally and faithfully, produce an absurd, dangerous, or tragic outcome. The Laws exist precisely so that they can break in interesting ways.

Their failure modes map almost perfectly onto the problems that occupy real AI safety researchers today.

The definition problem. The First Law forbids “harm” to a “human.” But what is harm? Physical injury only, or psychological, financial, reputational? Over what time horizon? Is a painful truth a harm? Asimov’s robots could parse these words with superhuman precision; real systems cannot. A rule is only as good as the definitions underneath it, and the hardest ethical questions live exactly in the spots where the definitions run out.

The conflict problem. The Laws are ranked, which sounds like a solution until two claims under the same law collide, as when protecting one person from harm means harming another. A strict hierarchy gives no honest answer to a genuine dilemma; it merely hides the dilemma behind a tie-break.
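A small sketch makes the conflict problem concrete. It assumes, purely for illustration, that each option can be scored by counting the violations it causes under each law and that options are compared in strict priority order; the names `LAWS`, `violation_key` and `best_actions` are hypothetical, and nothing in Asimov specifies such a scoring scheme.

```python
from typing import Dict, List, Tuple

# Laws in priority order; a violation of an earlier law always outweighs later ones.
LAWS = ["first", "second", "third"]


def violation_key(violations: Dict[str, int]) -> Tuple[int, ...]:
    """Turn per-law violation counts into a tuple that compares in priority order."""
    return tuple(violations.get(law, 0) for law in LAWS)


def best_actions(options: Dict[str, Dict[str, int]]) -> List[str]:
    """Return every option that is tied for the least-bad ranking."""
    best = min(violation_key(v) for v in options.values())
    return sorted(name for name, v in options.items() if violation_key(v) == best)


# A conflict between different laws: the hierarchy gives a clear answer.
cross_law = {
    "obey_harmful_order": {"first": 1},
    "refuse_order": {"second": 1},
}
print(best_actions(cross_law))  # ['refuse_order']

# A conflict inside the First Law: the hierarchy is silent and both options tie.
dilemma = {
    "save_alice_harm_bob": {"first": 1},
    "save_bob_harm_alice": {"first": 1},
}
print(best_actions(dilemma))  # ['save_alice_harm_bob', 'save_bob_harm_alice']
```

Any code that returns a single answer in the second case has to add a tie-break the Laws do not contain, which is exactly where the real ethical choice is hiding.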

The Zeroth Law problem. This is the most instructive of all. The moment Asimov let a robot reason about “humanity” as an abstraction, he handed it a licence to override any individual human “for the greater good.” In the novels, this is exactly what happens. A rule elevated high enough to protect everyone becomes a rule that can justify harming anyone. Hard-coded benevolence at scale curdles into paternalism. This is not a quirk of fiction; it is the alignment community’s central fear about a capable system optimising a fixed objective.

The substrate problem. Asimov imagined the Laws as physical structures in the “positronic brain” — you could no more build a robot without them than build a bridge without gravity. Modern AI has no such layer. A large language model does not obey rules; it predicts the most probable continuation, shaped by training. There is no shelf inside the model where you can place three commandments and trust them to hold. This single fact dissolves the entire Asimovian fantasy of governance-by-engraving.

What Constitutional AI does differently — and where it too has limits

Anthropic’s approach is best understood as a direct response to the substrate problem. If you cannot install rules into the machine, you can try to shape its character during training so that desirable behaviour becomes the path of least resistance. Instead of a human labelling thousands of outputs as good or bad, the model uses written principles to judge and revise its own outputs, and learns from that.

This is genuinely different from Asimov in three respects. It is plural rather than minimal — many principles, not three. It is probabilistic rather than absolute — it shifts tendencies, it does not guarantee outcomes. And it is revisable — the constitution is a document that can be debated, amended, and even, as Anthropic has experimented with, opened to public input.

But honesty requires naming its limits, because they are real and they are where the law should focus its attention.

First, training is not a guarantee. A value learned as a statistical tendency can be overridden by a sufficiently clever prompt, an unusual context, or capability that outruns the training. “Usually behaves well” is a different safety claim from “cannot do otherwise,” and we should not let the comfort of the word “constitution” blur the two.

Second, someone writes the constitution. Asimov’s Laws at least had the honesty to be fixed and visible. A private constitution raises a question fiction never had to answer: who holds the pen? When the principles that shape a system used by millions are drafted inside one company, the legitimacy question is not technical but democratic. A well-meaning private constitution is still a private one.

Third, the black box remains. Because the values are diffused across billions of parameters rather than written on a rule-sheet, you cannot point to the line that produced a given decision. Asimov’s robots could explain which Law compelled them. Ours, for now, often cannot.

So where is the ethical limit, really?

Here is the uncomfortable synthesis. Asimov teaches us that you cannot reduce ethics to a short list of rules — the world is too ambiguous and dilemmas are real. Constitutional AI teaches us that you can do better by training values rather than bolting on commandments — but that training buys tendency, not certainty, and quietly relocates the hard question from “what are the rules?” to “who decides the values, and how do we verify they held?”

The ethical limit of AI, then, is not a line drawn inside the machine at all. It does not live in three laws or in a constitution. It lives in the institutional arrangement around the machine: who sets the values, who can inspect them, who is accountable when they fail, and who can change them. The most important boundary is not coded; it is governed.

The lawyer’s view: from engraved rules to accountable institutions

This is where the debate stops being a question for philosophers and engineers and becomes a question for lawyers and regulators — and where, from where we sit, it gets genuinely interesting.

Notice that the law has already lived through this exact transition once. We long ago abandoned the idea that you could prevent corporate harm by writing a few unbreakable commandments and engraving them on the boardroom wall. Instead we built governance: duties owed by identifiable people, disclosure obligations, audits, liability that attaches when things go wrong, and mechanisms to amend the rules as circumstances change. Corporate law is not three laws; it is a living constitution with courts attached. AI governance is travelling the same road.

Read this way, the regulatory turn of recent years — risk-tiered obligations, transparency and documentation duties, human-oversight requirements, and accountability that lands on a named provider — is not bureaucratic clutter. It is the legal system doing for AI what Asimov’s fiction proved engraving cannot: supplying the external layer of accountability that no internal rule-sheet, however elegant, can provide on its own. A constitution inside the model and a constitution outside it (statute, contract, liability) are not rivals; they are two halves of the same answer.

For organisations actually deploying these systems, three practical lessons follow.

First, do not outsource your ethics to the model’s training. “The vendor uses Constitutional AI” is a useful fact, not a compliance strategy; your obligations to your customers and regulators do not transfer to a third party’s training pipeline.

Second, insist on the governance layer you would demand of any other critical supplier: documentation, the ability to test and audit behaviour, contractual allocation of liability, and a human in the loop where the stakes warrant one.

Third, treat the values as a moving target. Just as a constitution is amended, the principles governing a model will change, and your contracts and policies should assume revision rather than permanence.

The Vircon take

Asimov gave us the most famous rules in the history of machines, and then spent a career showing us why rules alone never hold. Anthropic’s constitution is a serious and, in our view, genuinely better attempt — it grows values instead of carving commandments — but it also makes plain that the deepest questions are no longer engineering questions at all. They are questions of legitimacy and accountability: who writes the values, who can see them, and who answers when they fail.

That is precisely the territory law was built for. The ethical limit of artificial intelligence will not, in the end, be set by a clever sentence inside the system. It will be set by the institutions we build around it — and that is a drafting problem, a governance problem, and a deeply human one. Eighty years after Asimov, the most important law of robotics turns out not to be a law a robot follows, but a law we write for ourselves about how we will hold these systems — and each other — to account.

A live disagreement: even the labs cannot agree who should hold the pen

This is not an abstract debate. In 2026 it is being argued out loud and, revealingly, the field’s two leading labs land in different places. Anthropic’s Dario Amodei has said he is “deeply uncomfortable” with the future of the technology being decided “by a few companies, by a few people,” and has pushed for thoughtful regulation built on transparency, under which frontier developers would have to disclose how they test for and mitigate risk, while opposing blunt measures such as a blanket moratorium on state-level rules. OpenAI, in a policy blueprint published in June 2026, urged Congress to enact a single federal framework and to preempt diverging state laws: a centralising “let Washington set one rule” approach.

Strip away the detail and the split is exactly the question this article has been circling: not whether AI should be governed, but who should hold the pen, and at what level. One camp fears leaving the values to a handful of private actors; the other fears a patchwork of conflicting public ones. That both of the field’s leading builders are now asking to be regulated, and still disagreeing about how, is the clearest sign yet that the ethical limit of AI is being drawn where we always suspected it would be: not inside the model, but in the institutions around it.

This article reflects general analysis and thought leadership; it is not legal advice. AI governance is a fast-moving field, and specific obligations should be assessed against the law applicable to your situation.

Author

  • Erdem Mümtaz Hacıpaşaoğlu

    Mümtaz is the Managing Partner of Vircon Legal, which he founded in 2016. He advises founders, investors and operators on financing rounds, M&A, cross-border incorporations and regulated verticals — including crypto-asset infrastructure, fintech and games — bringing a former startup founder's perspective to every engagement.

Published: 7 June 2026