I pioneered machine teaching at Microsoft. Building AI agents is like building a basketball team, not drafting a player
We shouldn’t ask how much knowledge an agent can retain, but rather if it has had the opportunity to develop expertise by practicing as humans do.
Salesforce’s latest agent-building and testing tools, and Jeff Bezos’s new AI venture focused on practical industrial applications, show that enterprises are inching toward autonomous systems. That is meaningful progress, because robust guardrails, testing, and evaluation are the foundation of agentic AI. But the step that’s still largely missing is practice: giving teams of agents repeated, structured experience. As the pioneer of Machine Teaching, a methodology for training autonomous systems that has been deployed across several Fortune 500 companies, I’ve seen the impact of agent practice firsthand while building and deploying over 200 autonomous multi-agent systems, first at Microsoft and now at AMESA, for enterprises around the globe.
Every CEO investing in AI faces the same problem: billions spent on pilots that may or may not deliver real autonomy. Agents excel in demos but stall when real-world complexity hits. As a result, business leaders don’t trust AI to act independently on billion-dollar machinery or workflows. What they’re searching for is the next phase of AI’s capability: true enterprise expertise.
The Testing Illusion
Just as human teams develop expertise through repetition, feedback and clear roles, AI agents must develop skills inside realistic practice environments with structured orchestration. Practice is what turns intelligence into reliable, autonomous performance.
Many enterprise leaders still assume that a few major LLM companies will develop models powerful enough, and data sets massive enough, to manage complex enterprise operations end-to-end via “Artificial General Intelligence.”
But that isn’t how enterprises work.
No critical process, whether supply chain planning or energy optimization, is run by one person with one skill set. Think of a basketball team: each player works on individual skills, whether dribbling or the jump shot, but each player also has a role on the team. A center’s purpose is different from a point guard’s. Teams succeed through defined roles, expertise, and responsibilities. AI needs that same structure.
Even if you did create the perfect model, or reached AGI, I’d predict the agents would still fail in production, because they have never encountered the variability, drift, anomalies, or subtle signals that humans navigate every day. They haven’t differentiated their skill sets or learned when to act and when to pause. And they haven’t been exposed to the expert feedback loops that shape real judgment.