🎙️When Will We Fully Trust AI to Lead?
An Inference with Eric Boyd, CVP of AI Platform
Hi everyone – hope the weekend's treating you well. Turing Post now has a proper YouTube channel, and you can listen to our podcasts on Spotify or Apple.
At Microsoft Build, I sat down with Eric Boyd, Corporate Vice President leading engineering for Microsoft’s AI platform, to talk about what it really means to build AI infrastructure that companies can trust – not just to assist, but to act. We get into the messy reality of enterprise adoption, why trust is still the bottleneck, and what it will take to move from copilots to fully autonomous agents.
We discussed:
When we'll trust AI to run businesses
What Microsoft learned from early agent deployments
How AI makes life easier
The architecture behind GitHub agents (and why guardrails matter)
Why developer interviews should include AI tools
Agentic Web, NLWeb, and the new AI-native internet
Teaching kids (and enterprises) how to use powerful AI safely
Eric’s take on AGI vs “just really useful tools”
If you care about deploying AI agents responsibly, this one’s for you. Eric unpacks the real challenges of enterprise trust, agent guardrails, and how GitHub’s agents actually work. We talk infrastructure, AGI, how AI changed processes in his team, parenting in the AI era – and why recruiters should stop banning AI tools in interviews. Eric brings clarity, realism, and just enough dad humor. I loved this one. Watch it now →
This is a free edition. Upgrade if you want to receive our deep dives directly in your inbox. If you want to support us without getting a subscription – do it here.
The transcript (edited for clarity, brevity, and sanity. Always better to watch the full video though) ⬇️
Ksenia Se: Hi Eric, thank you for joining me today. Let me just start with a big question right away: “When will we trust AI enough to run our businesses, to run our companies?”
Eric Boyd: Already we see just tremendous impact from AI across so many of our customers. There are lots of places where people are using AI – everywhere from customer support systems to developers using it to write code. And you know, in all of those scenarios, people are using it generally to make a person more productive.
Like: “Here's a suggestion. Here's how you should do your job. Here's some help for you.”
I take your question to mean: at what point are we at the place where I don't need that human looking over the AI’s shoulder? I think it will depend entirely on the particular scenario you're looking at. Already there are some low-risk scenarios where you'll let the AI go and try something, and if it does a poor job, you kind of don't care that much.
And there are high-risk scenarios that I think we'll never want to give over to AI.
So it's just going to be continually learning – where is the place where AI is performing at a level that we feel confident enough to say: "All right, yeah, I think this task can be delegated to you."
Ksenia: Where do you think this level of trust will never be achieved?
Eric: The highest-stakes things – medical procedures, say, or launching nuclear missiles – those we're never going to hand over to AI. But I think there are so many more where we're starting to see – hey, AI is actually doing a better job at this.
It's instructive just to look through the developer lens, right? We've, in just the past couple years, gone from having AI suggesting a completion to you, to now we're asking it: go and resolve some issues from my GitHub backlog – with the agent that we announced at Microsoft Build.
You know, again, you're still going to review that code and make sure that it works, but you've given a lot more responsibility and a lot more tools to the agent to go and do those things – which I think is super interesting progress there.
Ksenia: What is your favorite example from your internal work with agents, and from the client side?
Eric: We call it "dogfooding" – and it’s just learning how to use the product. The GitHub software agent, for example, has been pretty amazing for a lot of people. One of the early demos we did, just to see – hey, is this even possible? – was upgrading a codebase from Java 8 to Java 17. For those who aren't deep in the guts of development, it’s a mindless, thankless, not fun, and really pretty difficult job, right? You just keep running into another corner and another dead end – “this didn’t work.”
So we just watched the agent do it. And it did exactly what a person would do:
“I changed all these APIs... oh, I compiled it... but now I forgot I need this dependency... so I went and installed some stuff...
oh, that pulled these other dependencies...”
It just kept working through the problem the same way a person would. It completed this pretty thankless, pretty unfun – but still pretty important – task. That's a trend we’ll actually see a lot of.
When I go to the doctors in my area, they're all using Nuance DAX. It’s a technology that lets a doctor have a conversation with a patient while DAX sits unobtrusively and listens, then produces the medical record. Doctors have gone from having to type everything up to having the medical record produced while they have a conversation – focused on you.
The trend is: we’re going to have this menial work that people don’t want to do – upgrading Java 8 to Java 17, producing the medical record – those things are going to be done by AI. We're pretty much at that world today. That’s, I think, great improvement.
Ksenia: I agree. AI is very powerful. But trust in many cases is still a bottleneck. In your view, is it harder to build trustworthy AI, or to convince the stakeholders to trust it?
Eric: That’s a really hard question. I think you need both sides to sort of work together.
We focus a lot on building trustworthy AI – starting all the way from the data that we use to train the models, the way that we train, the way that we post-train the models to follow instructions and answer questions a particular way. Then the Azure content safety system that we put on top of it – that gives developers the control to make sure they’re going to be able to use it in the right way.
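[Editor's sketch: as a generic illustration of that layered setup – this is not the actual Azure AI Content Safety API – a safety system that sits "on top of" a model typically wraps every call with an input check and an output check. All names here are hypothetical:]

```python
# Generic sketch of a content-safety layer wrapped around a model call.
# Real systems (e.g. Azure AI Content Safety) expose this as a managed
# service; this inline version just shows the shape of the idea.

BLOCKED_CATEGORIES = {"violence", "self_harm", "hate"}  # illustrative taxonomy

def classify(text: str) -> set[str]:
    """Stand-in for a trained safety classifier returning flagged categories."""
    return set()  # stub: nothing flagged

def call_model(prompt: str) -> str:
    """Stand-in for the underlying LLM call."""
    return "model response"

def safe_completion(prompt: str) -> str:
    # Layer 1: screen the input before it ever reaches the model.
    if classify(prompt) & BLOCKED_CATEGORIES:
        return "[input blocked by safety system]"
    output = call_model(prompt)
    # Layer 2: screen the model's output before it reaches the user.
    if classify(output) & BLOCKED_CATEGORIES:
        return "[output blocked by safety system]"
    return output
```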
But then there’s the other side of it – as you start to give agency to these models,
you know, this is a question that comes up in a lot of AI research areas: if the AI performs and makes mistakes at the same level as humans – is that okay?
It feels like the answer today, in many cases, is no. We have sort of a higher bar. But I think that’ll be something we have to work toward.
It’s just sort of hard to accept – people make mistakes. People are fallible.
Ksenia: People also hallucinate.
Eric: People also hallucinate. Like I’ve never, ever possibly said something incorrect, right? So yeah – how do you live with AI that has some of those same characteristics? That’s a pretty big societal question to work through.
Ksenia: What lessons has Microsoft learned from early enterprise agent deployments? What breaks the trust? What builds it?
Eric: You know, we settled on this metaphor of “copilot.” And I think that’s been a pretty important metaphor. The system – this model – that is helping a person perform better,
it’s sort of looking over the shoulder, looking at the code you write.
It’s able to go through – like on Microsoft 365 Copilot – it understands your emails and your contacts and your documents and the relationships that you have, and it’s able to answer questions in a better way because of that.
But that was the first phase.
This next phase, as we move into agents, I think is going to start to build different muscles. It’s going to work differently – where you’re actually giving more control to an agent to go and perform certain actions, to perform tasks. People will work carefully with that – to set guardrails on it.
There’s a company I spoke to that’s a two-sided marketplace. They use agents to resolve disputes. The agent has got some thresholds. It can offer refunds up to a certain dollar amount. And it’s using its judgment to decide how to balance the two sides. They’ve given some guardrails to it and some restrictions,
but they’re also starting to open up a bit – how much do we start to let the agent make some of these decisions?
In their case, sometimes the call is just: look, it’s better to have two happy customers, and the agent can make that happen. Those are interesting scenarios we’re starting to see.
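[Editor's sketch: a minimal, hypothetical version of the guardrailed refund logic Eric describes. The names, the threshold, and the llm_judgment stub are invented for illustration – not the company's actual system:]

```python
# Hypothetical sketch of a guardrailed dispute-resolution agent for a
# two-sided marketplace. All names and thresholds are illustrative.
from dataclasses import dataclass

MAX_AUTO_REFUND = 50.00  # hard dollar threshold set by the business

@dataclass
class Dispute:
    order_id: str
    amount: float          # order value in dollars
    buyer_claim: str
    seller_response: str

def llm_judgment(dispute: Dispute) -> float:
    """Stand-in for the model's judgment call: the refund it believes
    best balances the two sides. Stubbed with a fixed heuristic here."""
    return dispute.amount * 0.5

def resolve(dispute: Dispute) -> str:
    proposed = llm_judgment(dispute)
    # Guardrail 1: never refund more than the order is worth.
    proposed = min(proposed, dispute.amount)
    # Guardrail 2: above the threshold, a human decides, not the agent.
    if proposed > MAX_AUTO_REFUND:
        return f"escalate:{dispute.order_id}"
    return f"refund:{dispute.order_id}:{proposed:.2f}"
```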
Ksenia: If we’re talking about the new metaphor – about the agentic web – this is the new story that Microsoft announced at Microsoft Build. How would you describe it? What is this story for you?
Eric: The agentic web is just this recognition that the world is really changing.
We’re moving from a place where, most of the time, I would go to a search engine, look for a web page, and consume information that way – to a world where now I’m asking AI to do a lot more for me.
Some of the research capabilities that we’ve announced through M365 Copilot –
“Go perform some research. Look at stuff in my documents, look at stuff on the web, and just produce a report based on that” – it changes the interactions quite a lot, and changes the way we think about what it means to use the web.
People who produce content on the web are going to think about that too. What are the ways they should be incorporating agents into their websites?
And you saw things like NLWeb, and how we’ve made that simpler to integrate. Everyone is going to need to grapple with – hey, we have new technology, and things are going to work differently. How is that going to show up? In some ways, it’s not that different from the emergence of the web. I remember in the mid-’90s when every commercial on TV suddenly ended in “.com” and we laughed about that. Now you don’t think it’s weird at all. Of course I want to go to a website to find more information about something.
Ksenia: Everything is with AI now.
Eric: It’ll come to: “What do you mean this website doesn’t have an AI agent that can help me figure out the answer to my question immediately?” That’ll seem silly – like they just haven’t caught up or updated.
Ksenia: Do you have children?
Eric: I do.
Ksenia: There was a lot of conversation that day – from Kevin Scott and from the stage – about how children use AI. What’s your strategy with your kids? How do you talk with them about AI? What do they do? I have five children, and I need to understand how to, you know, teach them about it.
Eric: I don’t know that I have all the answers. I’m definitely not the world’s perfect parent. But I think we’ve done a pretty good job. We have an 18-year-old and a 14-year-old – two boys. They’ve been comfortable online and comfortable with AI. My favorite line: when ChatGPT first launched, my son said, “This has blown up on TikTok faster than anything in my whole life!!!” And I was like, “On TikTok? In your entire life?”
But yes, it’s the first thing he saw explode like that. It was suddenly something that they got, that they interacted with. And I think that’s been super important.
There are a lot of concerns – how’s this going to impact education? Will people still know how to write? I think it’s important to think of these as new tools.
You know, showing my age: when I was in high school, they were very worried about us using calculators. “You have to learn how to do the math by hand!” But now – I mean, I don’t carry a calculator anymore, I carry a phone. Anything hard? I don’t do long division – I just pull out the calculator. It’ll be the same thing. Like: “I have this tool that helps me do things in a better way. How do I use it?”
That’s going to be the important thing for kids as they’re navigating this AI world. And for everyone trying to instruct them – we have new tools, they’re very powerful, you can use them in very powerful and productive ways.
You can probably use them in some harmful ways too. So you need to understand those things.
As a parent, I still think the most important things are your character and your judgment. Those are the dominant traits we need, as a society, to work on. But the tools they have access to – they’re going to need to learn how to use.
Ksenia: Everything we’ve talked about so far is very practical. What’s AGI, in your opinion?
Eric: Yeah, AGI… it’s such a great science fiction topic. And ASI, artificial superintelligence – all those things. It becomes almost a philosophical conversation, with lots of people, like: “What is it?” “How do we know we’ve actually reached it?” Maybe, yes, over a nice glass of wine, you can indulge in that kind of conversation.
I tend to be much more pragmatically focused: We have this AI... is it AGI? I don’t know. I don’t think so. It’s pretty helpful though. It’s a very useful tool. I really like being able to use it. So I focus on that.
The things people worry about with AGI – like how jobs change, and the potential disruption that comes with that – those are very real concerns. But those are real concerns with the tools we have today. They’ve been real concerns since the industrial revolution, or the invention of the PC. That’s hard for societies to work through – but they do work through them.
So when I think about AGI, I hear the debate really being about: “Hey, the world as we know it is going to work differently. What does that mean for me?”
That’s the important question for everyone to work through.
Ksenia: Are we close to understanding that? Or just generating more questions?
Eric: There are certainly some areas where we’ve seen real change. I think it’s interesting how developers – programmers – are going to be one of the first professions to really grapple with this. That role is going to change. The ability to remember some arcane API, or to quickly search for it – that’s less important. The ability to use the AI tool is suddenly much more important.
It’s funny – the calculator analogy again. Some companies are interviewing developers and trying to prevent them from using AI in interviews. I was talking to another company that said: “That’s totally the wrong approach.” They should be using the tool.
I want them using the most advanced tools, and showing that they’re comfortable and fluent in it – because that’s where the developer world is moving.
They’re going to have to work through: how does the developer job change? But being comfortable and fluent with the tools is going to be a key part of that.
Ksenia: So in the last year, how did it change for your team?
Eric: We use GitHub Copilot extensively. The place we’re at now is: developers largely have the same job, but they’re faster at doing a lot of the menial tasks. As we start moving into the agent space, we’re just starting to see that shift. Increasingly, it’s going to look like: “I now have even a fleet of agents at my disposal to go and get my tasks done.”
And being effective at getting them to do that work – I think that’s going to be an important role. But we haven’t gotten too far down that path yet.
So the main change has been: you have to learn how to use Copilot – how to prompt it to go and do work for you, to complete this particular function or whatever. And it makes you more productive, if you’re good at that. That’s been the main change that we’ve seen.
Ksenia: How long will it take for everyone to be more productive, more knowledgeable about agents – and for agents to work properly?
Eric: Gosh, that’s a hard question to answer. I mean, the space has moved really quickly.
That’s definitely been one of the defining characteristics of AI over the last couple years – it's moving fast. So it’ll probably be quick. But what is “quick”? Is quick 6 months? Is quick 5 years? Both of those feel pretty quick to me. It’s probably going to be more on the order of a couple of years to fully take effect.
I think about it this way: If we didn’t produce another model – if nothing else ever came out – we’d still have at least 5 years of adoption work with the existing tools, just to figure out how to use them in their best ways. But we are releasing new models.
So there’s going to be so much more development and applicability to come.
Ksenia: So you think – from two to five years?
Eric: For infrastructure to get to a place where people are very frequently using agents in a large portion of their professional life? Yes, I think that’s probably the right timeline.
Ksenia: So when you think ahead – two to five years – what excites you the most, and what are your concerns?
Eric: It's pretty exciting just being able to get these tools to do the types of things we want them to do anyhow, right?
Like, my son wanted to go backcountry skiing, and I wanted a guide to show us where to go – so we didn’t die in an avalanche, because I don’t know how to do that.
What are the guide services? Where can I look them up? Where should we go? What are the different costs? That’s just hours of research.
I was able to ask an AI agent: “Go and fill this out and come back with a report.”
And it did it. In five minutes. Taking that kind of tedium out of so many aspects of life – I think that's really quite exciting. We’re starting to see that expand into so many professional fields.
I talked about doctors: the tedium of having to produce medical records.
My wife’s a lawyer: the tedium of doing document review or research on some of these things. Being able to focus on the interesting part – I want to go backcountry skiing, I don’t want to research how to go do it. That’s really exciting.
You asked what I’m concerned about.
You know, this is very powerful technology, and we’re constantly working through:
What are the right guardrails we need to put on it? It’s a fine balance.
Some of the tools we’ve got – if we just completely unconstrain them – they could do more interesting things. But they could also do things we’re not comfortable with.
So finding that right balance of: “I’m going to constrain and set the problem up in a way that’s safe – but still preserve the wow... ‘That’s incredible, I didn’t know it could actually do that!’” – that’s the key.
You see that in something like the GitHub agent. It can go and solve issues from your GitHub backlog, understand the issue, test it out, write code, run it – all those things. But it can’t check it in and deploy it. We’ve given it some guardrails. It’s very constrained in the environment. It can’t email the code out to someone else or anything like that. So that balance – where it’s still very much a wow experience, but it has the right controls – I think that’s going to be key to navigate.
That’s something we, at Microsoft – as creators, as technologists – take very seriously.
We think it’s super important to think through that.
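[Editor's sketch: one hedged way to shape guardrails like these – deny by default, with an explicit action allowlist. Illustrative only, not GitHub's actual implementation:]

```python
# Hypothetical sketch of action-level guardrails in the spirit of the
# GitHub agent Eric describes: the agent can read, edit, and test code,
# but actions that cross the trust boundary simply aren't available.
ALLOWED_ACTIONS = {"read_file", "edit_file", "run_tests", "open_pull_request"}

def execute(action: str, payload: dict) -> dict:
    # Deny by default: anything not explicitly allowed is refused,
    # including merge, deploy, and sending code outside the sandbox.
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"agent may not perform '{action}'")
    return run_in_sandbox(action, payload)

def run_in_sandbox(action: str, payload: dict) -> dict:
    """Stand-in for an isolated environment with no outbound network
    access and no deployment credentials."""
    return {"action": action, "status": "ok"}
```

The deny-by-default shape is the point: the agent keeps the wow capabilities (it can still write and run code), while the irreversible steps stay with a human.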
Ksenia: From the clients – what do you hear from them? What are their questions and concerns?
Eric: We certainly hear from a lot of our customers: “How am I going to manage this across my enterprise? I've got every corner of the enterprise really excited about this,
but I don’t know the rules I need to put in place. I want to make sure I don’t leak out information in ways I’m not supposed to.”
Some of the work we announced – work that probably won’t get as much press – is the integration of Entra ID. Now you can give an ID to each agent, you can give it governance and control over what access it has, and you can really manage that in a centralized way.
Those are going to be super important for adoption.
Now, how to make this cool technology work in an enterprise environment – it’s hard. These types of controls, and the enterprise-grade work Microsoft has spent decades building – that’s going to be super important to help companies get it right.
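[Editor's sketch: Entra ID is a real Microsoft service, but the snippet below is a generic illustration of per-agent identity with centrally governed scopes – not the Entra API. All names are hypothetical:]

```python
# Illustrative only: a central directory maps each agent to an identity
# with explicitly granted scopes, so access can be governed and audited
# in one place rather than per application.
AGENT_DIRECTORY = {
    "support-triage-agent": {"scopes": {"tickets:read", "tickets:comment"}},
    "code-review-agent":    {"scopes": {"repo:read", "pr:comment"}},
}

def authorize(agent_id: str, scope: str) -> bool:
    """Deny by default: access is granted only if the agent's identity
    exists and explicitly holds the requested scope."""
    entry = AGENT_DIRECTORY.get(agent_id)
    return entry is not None and scope in entry["scopes"]

# Every action is attributable to a specific agent identity:
assert authorize("code-review-agent", "repo:read")
assert not authorize("code-review-agent", "tickets:read")
```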
Ksenia: My last question. What is a book you go back to often – something that helps you? It can be about your job or something completely different. But what book shapes you – or shaped you?
Eric: Wow, that’s an interesting question. There are books about how to interact with people – that I’ve found really important. There’s a book called Crucial Conversations,
which is about how to have those hard conversations – the tough, emotionally charged topics. It’s one of those books where you're reading it and thinking: “This is going to change the way I talk to my wife.” Knowing: How do I bring up the subject? How do I have that interaction? I think that’s a super important skill. From a professional life – that’s one I go back to a lot.
On the fiction side – I tend to gravitate toward science fiction. The high school book – my son just read it – Brave New World is one I always go back to. Dystopian societies are always fun and entertaining, but also thought-provoking: “What if society went in a different direction?” You play these scenarios out to their logical conclusion. I tend to like all those kinds of books.
Ksenia: But sci-fi lately... it feels like we’ve run out of some scenarios. We’re at this moment with AI – AI is so powerful, it’s kind of limited our imagination about dystopias.
Eric: Most dystopias need some triggering event to ruin the world. Like the Silo series I just read, where people live in vertical nuclear silos buried underground. There’s always some catastrophic event. The Hunger Games is another – big war, terrible government. Most dystopias need something really bad to happen. It’s interesting how few are truly AI-focused.
Ksenia: Thank you. I hope education will help us avoid that dystopia. It was great to talk with you.
Do leave a comment