What is Responsible AI in the agent era? It is the practice of designing, testing, deploying, and governing AI systems so they are useful, safe, fair, accountable, transparent, and fit for purpose. In the agent era, it also means controlling what systems can access, which actions they can take, and how humans remain responsible.

TL;DR: Responsible AI is moving from principles and output review to infrastructure for AI agents: runtime controls, policy-as-tests, monitoring, accountability, and human oversight placed inside the system. As agents act through tools and workflows, safety must move closer to action.

For a long time, Responsible AI sounded like something from a theory field, important but safely distant from the actual race. Almost boring. But now with agents that act, write code, touch tools, and cross organizational boundaries, Responsible AI becomes part of infrastructure. And much more exciting. It is about who gets access, where the system is allowed to act, how policies become tests, and how humans stay responsible when machines move faster than our old review processes. And here is another important shift: this work cannot be done in isolation. One company, or even one country, can create its own rules for safety and alignment, but agents will live in a connected world. So Responsible AI has to become a shared effort, and we need to talk about it more to encourage all players to build that safety net together.

In this article, we’ll look at why Responsible AI changed now, what it meant before agents, why Microsoft is turning Responsible AI into developer infrastructure, why Google DeepMind’s latest announcement treats agent safety as a security problem, and what all of this means for the next phase of AI systems.

What’s in today’s episode?

  • Why Responsible AI matters more in the age of AI agents

  • So, what is Responsible AI actually trying to do?

  • Before agents, what was Responsible AI built around?

  • First, stop pretending trust is universal

  • Who is responsible when everyone is involved?

  • Microsoft’s Responsible AI approach: from policy to runtime

  • Google DeepMind’s AI control roadmap: alignment is not enough

  • Where does the human go when work moves at machine speed?

  • What should not be delegated to agents?

  • Is Responsible AI a silver bullet?

  • What does Responsible AI unlock?

  • Concluding thoughts: What does Responsible AI unlock?

Ultimately humans should be responsible, right? It's about making AI that is trustworthy. And the ‘worthy’ part is really important there.

Sarah Bird, Chief Product Officer of Responsible AI at Microsoft

Why Responsible AI matters more in the age of AI agents

When AI was mostly a chat interface, the model generated an answer and the human decided what to do with it. There was still risk, especially in high-stakes contexts, but there was also a built-in pause. It was manageable: a person could read the output, compare it with other sources, reject it, edit it, or ignore it. Many people did not do that carefully enough, but the interaction still had a forgiving shape.

Agents that act on your behalf change that shape. Once an AI system can call tools, cross into files, operate through APIs, and perform multi-step workflows, the output may become action. That is a fundamental difference.

Sarah Bird, Chief Product Officer of Responsible AI at Microsoft, who I chatted with recently, described this shift through pace and workflow. She says that the first thing that changed is the speed of capability jumps. “Every month or a couple months, we’re seeing kind of a major leap in capability,” she told me. That is exciting because it opens new applications, but it also means that Responsible AI has to keep developing tools for new surfaces of risk. The example most on her mind was agentic coding, because it touches the software development lifecycle that much of Responsible AI practice was built around. If agents write code and agents review code, then a traditional human review step that takes three days can feel absurd in a workflow where the work itself took two hours. The need for validation remains, but the old placement of validation starts to break.

This is the moment Responsible AI became a question of control around action.

Google DeepMind’s latest announcement points in the same direction from another angle. On June 18, 2026, DeepMind published “Securing the future of AI agents,” introducing an AI Control Roadmap for internal agents. In that post, Google DeepMind explicitly says this approach goes beyond traditional model alignment by adding system-level security, so there is still assurance even when alignment is imperfect. In plain language: do not assume that a well-trained model will always understand the goal, preserve the right boundary, resist manipulation, or behave safely once it has access to tools and permissions. Build as if the agent may go wrong.

That is the new phase. Responsible AI has to consider whether the whole system acts safely, not only whether the model does. And it’s brutally hard.

Recently, I chatted with Sarah Bird, who is CPO of Responsible AI at Microsoft. It was insightful (don’t worry, there is no corporate fluff), and if you are interested in the topic, I encourage you to check out this interview →

The article below has a wider perspective including recent news from Google →

So, what is Responsible AI actually trying to do?

At its simplest, Responsible AI is the attempt to make AI systems useful without making them reckless, unfair, opaque, unsafe, or impossible to hold accountable. But that is a definition from an ideal world, and the actual field is messier. Increasingly, Responsible AI has to borrow from AI ethics, safety, security, governance, human rights, product design, law, risk management, philosophy, and ordinary software engineering. It is not a single discipline. It is a collision zone.

The older vocabulary was built around principles: fairness, reliability, safety, privacy, security, inclusiveness, transparency, and accountability. They gave companies, researchers, and regulators a way to name the harms that AI systems could create or amplify. Then came attempts to operationalize them through risk assessments, launch reviews, documentation, red-teaming, model cards, evals, incident response, and regulation.

NIST’s AI Risk Management Framework, released in January 2023, framed this movement in terms of managing risks for organizations that design, develop, deploy, or use AI systems, and it was intended to promote trustworthy and responsible AI development and use. The EU AI Act, which entered into force on August 1, 2024, made the regulatory version more concrete through a risk-based framework for developers and deployers of AI systems in Europe.

Those frameworks are necessary, but in the agentic era they begin to show their limits. Principles define the goal, regulations define the obligations, and reviews check whether a system appears ready. But none of these, by themselves, can interrupt a bad tool call, notice that an agent is overreaching, or prevent a workflow from crossing a boundary it should not cross.

Responsible AI is moving from principles and output review to agent controls, runtime monitoring, and system-level accountability.

So the field is becoming more technical. The goal is no longer only to state responsible behavior; now you have to make that behavior testable, enforceable, observable, and adjustable.
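Here is what that can look like in miniature. This is only a sketch in Python, not anyone's actual product: the names `ToolCall`, `Policy`, `guard`, and `run_tool`, as well as the tool and scope labels, are hypothetical and invented for illustration. The idea is that one small piece of policy is written as code, so it can be enforced before a tool call runs, logged for later review, changed as data, and checked with an ordinary test.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolCall:
    """One action an agent asks to perform (hypothetical shape)."""
    tool: str   # e.g. "read_file"
    scope: str  # e.g. "docs" or "billing"


@dataclass
class Policy:
    """Adjustable: the rules are plain data, so they can change without a rewrite."""
    allowed: set                                   # permitted (tool, scope) pairs
    audit_log: list = field(default_factory=list)  # observable: every decision is kept

    def guard(self, call: ToolCall) -> bool:
        """Enforceable: return True only if this exact (tool, scope) pair is permitted."""
        ok = (call.tool, call.scope) in self.allowed
        self.audit_log.append((call.tool, call.scope, ok))
        return ok


def run_tool(policy: Policy, call: ToolCall) -> str:
    """Refuse the call before it happens instead of reviewing the output afterwards."""
    if not policy.guard(call):
        return f"blocked: {call.tool} on {call.scope}"
    return f"ran: {call.tool} on {call.scope}"  # a real system would dispatch here


policy = Policy(allowed={("read_file", "docs")})
print(run_tool(policy, ToolCall("read_file", "docs")))    # ran: read_file on docs
print(run_tool(policy, ToolCall("delete_file", "docs")))  # blocked: delete_file on docs
```

The "testable" part is then just an assertion that a forbidden call is blocked, which can run in the same pipeline as every other test. A real deployment would need far more than an allowlist, but the placement is the point: the check sits in front of the action.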

That was my little revelation: Responsible AI, which had always sounded more like a slogan to me, suddenly stopped being abstract. It became a stack, a layer of infrastructure.

Before agents, what was Responsible AI built around?

Before agents, much of Responsible AI was built around the assumption that humans and software moved at human and software speeds. A product team planned a feature, engineers built it, reviewers assessed it, safety teams tested it, lawyers and policy teams weighed in, and the system moved through some version of a launch process. The process was imperfect, but it had a cadence.

Generative AI already strained that cadence because model behavior was probabilistic and hard to inspect. Agents strain it further because they do not only generate outputs; they create workflows. An agent can call a tool, receive a result, update its plan, call another tool, and continue the chain. Each step may look reasonable in isolation while the whole sequence drifts away from the intended goal.
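A small Python sketch of why per-step checks are not enough. Everything here is an assumption made for illustration: the `SequenceMonitor` class, its budgets, and the scope labels are hypothetical, and counting steps and scopes is just one simple stand-in for "drift." The point it shows is that a monitor can judge the chain as a whole, even when no single step looks wrong.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceMonitor:
    """Tracks a whole chain of tool calls, not just the latest one."""
    max_steps: int   # budget for the length of the chain
    max_scopes: int  # budget for how many distinct areas the chain may touch
    steps: int = 0
    scopes: set = field(default_factory=set)

    def record(self, tool: str, scope: str) -> bool:
        """Record one step; return False once the chain as a whole is over budget."""
        self.steps += 1
        self.scopes.add(scope)
        return self.steps <= self.max_steps and len(self.scopes) <= self.max_scopes


monitor = SequenceMonitor(max_steps=5, max_scopes=2)
print(monitor.record("read_file", "docs"))     # True
print(monitor.record("read_file", "src"))      # True
print(monitor.record("read_file", "billing"))  # False: third distinct scope
```

Each call is the same harmless-looking read. Only the sequence shows that the agent has wandered into a third area it was not expected to touch.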

In our conversation, Sarah said that at one level, the Responsible AI work has not changed. Teams are still testing, still building guardrails, still thinking about human oversight, still trying to understand risk. At another level, everything has changed because the software development lifecycle itself is being rewritten by AI. Her team now spends more time than expected on automated risk detection and scanning, using coding tools and code-understanding tools to inspect systems directly instead of asking engineers to fill out forms describing what is happening.

That last part matters. Forms were always a weak interface between governance and reality. The code knows more. The traces know more. Tool calls know more. Runtime behavior knows more. If Responsible AI needs to operate at machine speed, it has to read the machine, not only the questionnaire.

Sarah described Responsible AI work as co-innovation across many domains: model training, post-training, low-latency systems, large-scale engineering, applied science, linguistics, legal, and policy. “You can’t just say, oh, we’ll throw technology at this problem and solve it, or we’ll just make a policy and that solves the problem,” she said. That is the whole point. Responsible AI can no longer be the policy department waiting outside the engineering room. It has to be inside the system design.

First, stop pretending trust is universal

The most important moment in my conversation with Sarah came when I asked the question that bothered me: is it even possible to make AI responsible?

She immediately corrected the frame. “It’s actually not about making AI responsible,” she said. “Ultimately humans should be responsible.” The work, in her view, is about making AI trustworthy, and even that word needs care. “You don’t just make an AI system generically trustworthy,” she said. You might trust a system to generate a paragraph that you will edit, but you might not trust it to make clinical healthcare decisions. Trust depends on how the system was built, where it is used, what role the human plays, and whether the tool is actually fit for purpose.

This sounds obvious until you look at how people actually use AI. A benchmark score or a brand name often becomes a vague permission slip. The model feels smart, so the user trusts it. The interface looks polished, so the organization assumes it is safe. The tool works in a demo, so someone quietly moves it into a workflow where the stakes are higher than the demo ever admitted.

Responsible AI has to break that spell. The serious question is not “Can I trust AI?” but “Can I trust this system, for this task, with this access, in this context, under these consequences?”

And context includes you too: your expertise, your attention, your incentives, and yes, even the level of alcohol in your blood at that moment. Basically, the same questions you should ask before using a chainsaw. It is a power tool. So is AI. The fact that it can help you build faster does not mean you should pick it up casually, distracted, overconfident, or without understanding what it can cut.

That is why “fit for purpose” is not a boring compliance phrase. It is the practical core of trust. A model can be useful for drafting and unsafe for diagnosis. It can be appropriate for rapid prototyping and inappropriate for silent production deployment. It can be fine with public information and wrong for confidential enterprise data. It can suggest an action without being allowed to execute it.
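One way to picture "fit for purpose" is as a lookup that takes the task and the kind of data into account, and that separates suggesting from executing. The Python sketch below is illustrative only: the `Permission` levels, the `FIT_FOR_PURPOSE` table, and its entries are hypothetical examples, not recommendations from Microsoft or anyone else.

```python
from enum import Enum


class Permission(Enum):
    DENY = 0     # the system should not be used for this at all
    SUGGEST = 1  # the system may propose an action; a human executes it
    EXECUTE = 2  # the system may act on its own


# Hypothetical table: trust is granted per (task, data class), never globally.
FIT_FOR_PURPOSE = {
    ("drafting", "public"): Permission.EXECUTE,
    ("prototyping", "public"): Permission.EXECUTE,
    ("production_deploy", "public"): Permission.SUGGEST,
    ("drafting", "confidential"): Permission.DENY,
}


def permission_for(task: str, data_class: str) -> Permission:
    """Anything not explicitly listed is denied, so new uses start untrusted."""
    return FIT_FOR_PURPOSE.get((task, data_class), Permission.DENY)


print(permission_for("drafting", "public"))           # Permission.EXECUTE
print(permission_for("production_deploy", "public"))  # Permission.SUGGEST
print(permission_for("diagnosis", "confidential"))    # Permission.DENY
```

The default matters more than any single row. A task nobody thought about, like diagnosis here, gets no access until someone decides it should.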

Sarah made this point when talking about ordinary users. People need to understand whether a tool is appropriate for the job, what guarantees the provider gives, and whether those guarantees match the data and context. A consumer tool that uses data to train future models may be useful for some personal experiments and completely inappropriate for enterprise data. Another tool may provide stronger privacy guarantees and be appropriate for a different setting.

This is also why the instinct to “just fix it in the model” is tempting, but incomplete. General-purpose models are valuable precisely because the same capability may be legitimate in one context and dangerous in another. Sarah gave a telling example: her own team generates harmful content to train monitoring systems and guardrails, which means that a capability many people would prefer to remove entirely can also be necessary for defensive and safety work.

This is where Responsible AI becomes a stack rather than a slogan.

Who is responsible when everyone is involved?

Well, everyone has a role, but not the same role. Sorry if that is less emotionally satisfying than blaming one company, one model, one user, or one regulator, but it is closer to how AI systems actually reach the world.

The model provider may be responsible for testing whether the model introduces novel dangerous capabilities. The platform provider may need to provide usable controls. The application developer has the specific context to test whether an AI system works safely in a banking app, a healthcare product, an education tool, or a coding workflow. The deploying organization has to decide whether the system is appropriate for its data, people, and stakes. The user still has responsibility for understanding what kind of tool they are using, especially around privacy, verification, and over-delegation. Regulators codify what society is not willing to leave to private judgment.

This stack is not neat, and it will be contested. But it is the only realistic alternative to the fantasy that responsibility can be solved at one layer. If you fix only the model, you miss the application context. If you fix only the application, you inherit model-level risks. If you rely only on regulation, you move too slowly for the current pace of system design. If you rely only on users, you turn every person into an unpaid safety engineer.

And another problem: agents will not stay inside one company’s neat little garden. They will cross apps, workflows, platforms, and organizations. Google DeepMind made the same point in its June multi-agent safety call: millions of agents from different builders may soon communicate, negotiate, and transact across digital environments. Safety cannot depend on one vendor’s internal policy.

If agents become the new interface between organizations, trust has to travel with them. That gets much less attention than AGI talk, but it is exactly what will decide whether agentic systems work in the real economy.

Let’s see what two giants offer as their solution (or part of it) →
