Amit Kothari CEO of Tallyfy, AI advisor at Blue Sheen

AI team structure: the optimal setup

In brief

Most organizations build AI teams backward, hiring specialists before defining what they need. The 2025 AI Index from Fei-Fei Li’s Stanford HAI found 78% of organizations deploy AI, yet only a small fraction see real returns. An effective university AI lab starts with three core functions, cloud infrastructure, and a hybrid model that scales.

Nov 4, 2025 · AI

The pattern at university AI labs is almost scripted. Someone gets funding. Job postings go up. Suddenly there’s a ten-specialist hiring spree before anyone defines what the team is actually supposed to build.

This fails. Consistently.

Institutions spend months assembling dream teams that never ship because nobody defined the underlying functions first. AI teams need data scientists, ML engineers, and AI architects working alongside business domain experts. Most organizations confuse roles with functions and end up with expensive, messy overlap and zero accountability. Meanwhile, 87% of tech leaders already struggle to find the skilled workers they need.

Why most AI teams fail before they start

The problem isn’t talent. It’s structure.

The 2025 AI Index from Fei-Fei Li’s Stanford HAI found 78% of organizations now deploy AI in at least one function. Yet only a small fraction see real returns. That gap should alarm anyone planning an AI lab from scratch. Understanding why AI projects fail shows that team structure matters as much as talent.

Recent research on agentic organizations paints a different picture: small, outcome-focused teams can now orchestrate large fleets of specialized AI agents running end-to-end processes. Yet universities keep building teams like it’s 2018. Massive. Centralized. Disconnected from actual use cases.

When Princeton built their AI Lab, they didn’t start with dozens of researchers. They created proper shared infrastructure first: 300 H100 GPUs, administrative support, research software engineers. Then specific projects attracted the right specialists.

The University of Tokyo went further. Their Matsuo-Iwasawa Laboratory equipped actual hardware environments including robot arms, mobile manipulators, simulators, and VR devices. They grew from core faculty to 50 members through a research community model that attracted talent to problems, not positions.

Start with infrastructure and clear functions.

Talent follows.

[Figure: AI team structure with three core roles, operating models, and internal skill building]

The three roles that actually matter

Forget the ten-specialist fantasy. A working university AI lab needs three core roles that map to actual work.

Research engineers who experiment and prototype. These are the people testing hypotheses, exploring new approaches, figuring out what’s actually possible with current technology. Not pure theorists. Not production engineers. Researchers who code.

ML engineers who move prototypes into production. These engineers focus on transitioning models from research to systems that operate in real environments. Talent500’s job trends analysis makes the case: the majority of enterprise AI initiatives struggle without dedicated operational support, which is why MLOps skills are now a minimum requirement, not a differentiator.

Infrastructure specialists who keep systems running. Data engineers construct and maintain the data pipelines that make AI development possible. AI certifications like Google ML Engineer and AWS ML Specialty are linked to 20-25% salary premiums for data engineers. Without solid infrastructure, both research and production grind to a halt.

Everything else, like data scientists, ethicists, NLP specialists, and security officers, maps to these three functions or gets added when specific projects demand it. The IT skills shortage is projected to cause trillions in cumulative losses according to IDC. You won’t hire your way into ten distinct roles. You’ll just burn budget.

Build the three core functions first. Specialists emerge from project needs.

Cloud versus on-premise for university labs

On-premise infrastructure requires massive upfront investment: hardware, cooling, power, maintenance staff, physical security. The cost math is unfavorable for most labs, because on-premise AI workloads need substantial initial capital plus ongoing costs for power, cooling, space, and maintenance. On-premise can be more cost-effective over time for organizations running AI workloads continuously, but most university labs don’t fit that profile.

Universities don’t run AI workloads continuously. That’s the part most lab planners miss.

Classes happen in bursts. Research projects ramp up and wind down. Student projects spike during semesters then disappear. Does any of that justify paying for continuous hardware capacity? No. And yet universities still make this mistake.
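The utilization argument can be made concrete with a rough break-even calculation. The sketch below is illustrative only: the class and function names are hypothetical, and every dollar figure is a placeholder chosen to keep the arithmetic readable, not a quote from any vendor or university budget. It assumes on-premise cost is fixed per year while cloud cost scales with the fraction of hours actually used.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class OnPremCost:
    """Hypothetical on-premise cost inputs for one GPU (illustrative only)."""
    capex_per_gpu: float        # purchase price, spread evenly over lifetime_years
    lifetime_years: float
    annual_opex_per_gpu: float  # power, cooling, space, maintenance share

    def annual_cost(self) -> float:
        # On-premise cost is paid whether or not the GPU is busy.
        return self.capex_per_gpu / self.lifetime_years + self.annual_opex_per_gpu


def cloud_annual_cost(hourly_rate: float, utilization: float) -> float:
    """Cloud cost scales with the fraction of the year the GPU is in use."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * HOURS_PER_YEAR * utilization


def break_even_utilization(on_prem: OnPremCost, hourly_rate: float) -> float:
    """Utilization above which on-premise becomes cheaper than cloud."""
    return on_prem.annual_cost() / (hourly_rate * HOURS_PER_YEAR)


# Placeholder numbers, assumed purely for illustration.
on_prem = OnPremCost(capex_per_gpu=30_000, lifetime_years=4, annual_opex_per_gpu=5_000)
threshold = break_even_utilization(on_prem, hourly_rate=3.0)
print(f"Break-even utilization: {threshold:.0%}")
```

With these placeholder inputs, a lab that keeps its GPUs busy for less than roughly half the year pays less in the cloud. A result above 100% would mean cloud is cheaper at any utilization. Plug in your own quotes before drawing conclusions, since real pricing, discounts, and staffing costs shift the threshold considerably.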

CloudLabs and similar platforms solve this by providing cloud-based, customizable learning environments. Students get dedicated access to Big Data Analytics, Deep Learning, and NLP labs hosted on AWS, Azure, and GCP. When class ends, you’re not paying for idle GPUs gathering dust.

The Minnesota Supercomputing Institute took a different approach. They built shared on-premise HPC clusters that individual departments can access without each one buying its own hardware. Researchers run large-scale experiments concurrently on shared infrastructure. This avoids per-department capital spending, even though the model itself is on-premise.

For teaching and research that varies by semester, cloud wins on economics and student experience. Students learn the same platforms they’ll use professionally. Universities avoid hardware refresh cycles and maintenance overhead. Reserve on-premise for the rare cases where sustained, predictable workloads actually justify the capital investment.

Hybrid models beat pure centralization

The debate shouldn’t be centralized versus decentralized. It should be about which elements belong in each category.

AWS published a useful piece on generative AI operating models that recommends centralizing foundations, specifically infrastructure, data governance, and security standards, while distributing innovation across business domains. This hybrid approach keeps AI governance solid while letting teams move fast on delivery.

Pure centralization creates bottlenecks. Every department waits for the central AI team to get around to their project. TDWI’s research on AI team structures backs this up: mid-size organizations tend to fully centralize, but this sacrifices speed and domain alignment as they grow.

Pure decentralization fragments everything. Each department builds its own solutions that don’t talk to each other. Everyone reinvents the wheel on infrastructure and governance. The few companies that pull ahead tend to do the opposite: they share ownership of AI between business and IT rather than leaving each department to fend for itself.

The hybrid or federated model, sometimes called hub-and-spoke, centralizes infrastructure, security, and standards while embedding AI specialists in department teams. For a university AI lab, this setup maintains consistent data quality and security while letting departments move fast on domain-specific problems.
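One way to keep the hub-and-spoke split honest is to write it down as data and check it. The following is a minimal sketch, not a prescribed tool: the `Spoke` class, the `find_overlaps` helper, and the department names and responsibilities are all hypothetical. Only the three hub-owned areas come from the description above.

```python
from dataclasses import dataclass, field

# Responsibilities the hub-and-spoke model assigns to the central hub.
HUB_OWNED = frozenset({"infrastructure", "security", "standards"})


@dataclass
class Spoke:
    """A department team with embedded AI specialists (hypothetical example)."""
    department: str
    responsibilities: set[str] = field(default_factory=set)


def find_overlaps(spokes: list[Spoke]) -> dict[str, set[str]]:
    """Return, per department, any hub-owned responsibility a spoke duplicates."""
    return {
        s.department: s.responsibilities & HUB_OWNED
        for s in spokes
        if s.responsibilities & HUB_OWNED
    }


# Made-up departments, for illustration only.
spokes = [
    Spoke("biology", {"protein-structure models", "domain data labeling"}),
    Spoke("economics", {"forecasting models", "security"}),  # duplicates hub work
]
print(find_overlaps(spokes))
```

Here the economics spoke is flagged because it has taken on security, which the hub is supposed to own. A check like this is only a conversation starter, but it makes the "reinventing the wheel" problem visible early.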

Airbnb learned this through experience. They transitioned from fully centralized data science to a hybrid model as they grew. The data science team stayed together for career development and standards but split into sub-teams aligned with specific product areas.

Build your hub first. Will every department be ready from the start? No. Grow spokes as departments prove they’re ready.

Building skills instead of buying talent

The math doesn’t work on hiring.

The WEF’s Future of Jobs Report 2025 puts a number on it: 63% of employers cite the skills gap as the key barrier to business transformation. Nearly 40% of global jobs are exposed to AI-driven change, and skill demands are evolving at a much faster clip in AI-exposed roles.

You can’t compete with tech companies offering equity and unlimited budgets. I think most university leaders already know this, but badly underestimate how much it limits their options. The alternative is developing internal talent.

According to the WEF, 85% of employers now plan to offer upskilling, and 77% provide AI training. This works because AI expertise builds on existing domain knowledge. Your biology faculty who understand the research problems just need the technical tools, not another PhD.

The key skills aren’t mysterious. Hiring managers surveyed recently ranked on-the-job training, industry certifications, and university coursework as the top pathways into AI roles. Real-world projects and applied skills matter most. A CS degree isn’t the only entry point anymore.

Universities have real advantages here. Cloud-based ML certifications like AWS Machine Learning Specialty are linked to roughly 20% salary boosts in existing data and engineering roles. AWS leads cloud market share for ML workloads, and 73% of organizations actively prioritize AI-certified talent.

The infrastructure to train your own people exists. Use it before burning budget on hiring battles you’ll lose.

Stop planning the perfect team. Give three people who want to learn some cloud credits and real problems, then grow from there. I am oversimplifying, but not by much. The organizations that do well with AI don’t have the biggest teams or the most PhDs. They have clear functions, appropriate infrastructure, and people who learn by shipping real things.

In two years, the labs that started small and shipped fast will have lapped the ones still writing hiring plans. That gap only widens.

About the Author

Amit Kothari is an experienced consultant, advisor, coach, and educator specializing in AI and operations for executives and their companies. With 25+ years of experience, he is the Co-Founder & CEO of Tallyfy® (raised $3.6m, the Workflow Made Easy® platform) and Partner at Blue Sheen, an AI advisory firm for mid-size companies. He helps companies identify, plan, and implement practical AI solutions that actually work. Originally British and now based in St. Louis, MO, Amit combines deep technical expertise with real-world business understanding.

Disclaimer: The content in this article represents personal opinions based on extensive research and practical experience. While every effort has been made to ensure accuracy through data analysis and source verification, this should not be considered professional advice. Always consult with qualified professionals for decisions specific to your situation.
