The B2B Podcast Index
Digital Value Creation

DVC107 - The AI Divide - Success Beyond AI Pilots

Digital Value Creation · 2025-09-22 · 32 min

Substance score

39 / 100

Five dimensions, 20 points each

Insight Density: 10 / 20
Originality: 7 / 20
Guest Caliber: 5 / 20
Specificity & Evidence: 12 / 20
Conversational Craft: 5 / 20

What our scoring noted

Our reviewer’s read on each dimension, with quotes from the episode.

Insight Density

10 / 20

The episode surfaces a handful of genuinely useful data-backed observations (true developer uplift being 5-10% not 30-50%, internal builds succeeding only a third of the time, process reengineering doubling agentic impact vs. legacy wrapping), but these are interspersed with lengthy recaps, repeated transitions, and surface-level commentary that dilutes the useful material.

Stanford found that while there's a 30 to 50% increase in output, a lot of that portion with vibe coding, with AI coding, led to significant rework. So the true uplift was really more like a 5 to 10% in productivity.
when a company took a step back, looked at the end-to-end process and re-engineered it with agentic interactions in mind, that's when they saw more than doubling the impact

Originality

7 / 20

The hosts are largely repackaging findings from MIT, BCG, Stanford, and McKinsey reports rather than advancing original arguments; the framing (shadow AI, pilot-to-production chasm, vanity metrics) belongs to those third-party studies, and the hosts add only light practitioner color without genuine contrarian or first-principles thinking.

we grouped our insights from this report in 10 segments and we'll weave into our own um, uh, experience
And the MIT report, I think called it the pilot to production chasm

Guest Caliber

5 / 20

There are no guests; the two co-hosts decline to name their employers, titles, or specific track records, making it impossible to assess their practitioner depth beyond vague references to peer conversations and unnamed AI leaders.

I'm, um, Tomas, and I work at an AI-focused software company.
I'm Arpad, I work for an AI-focused hardware company.

Specificity & Evidence

12 / 20

The episode references multiple real studies (MIT, BCG 10,000-person survey, Stanford 100,000-developer study, McKinsey) and provides concrete percentages throughout, though citations are informal and loosely attributed, and named company examples stay at the headline level (Microsoft, Salesforce) without deeper case detail.

BCG they had a very comprehensive survey including over 10,000 people
only 40% of the companies that they talk to rolled out GenAI tools to their employees. Yet 90% of those employees use GenAI tools every day

Conversational Craft

5 / 20

The format is a co-hosted summary session rather than an interview; the hosts trade pre-arranged segments without any probing follow-ups, pushback, or productive disagreement, and transitional prompts are purely ceremonial rather than intellectually challenging.

What do you think about this, Arpad?
Arpad, you talked about some of them.

Conversation analysis

Computed from the transcript - who did the talking, and the verbal tics along the way.

Share of words spoken

  • Speaker A: 51%
  • Speaker B: 49%

Filler words

  • so: 82
  • um: 40
  • uh: 40
  • right: 21
  • actually: 16
  • like: 12
  • I mean: 10
  • kind of: 7
  • you know: 3
  • basically: 2
  • er: 1
  • sort of: 1

Episode notes

Success Beyond AI Pilots - Discover how AI Leaders deliver value from their projects - in 30 minutes

Most companies are racing to adopt GenAI - but only a small minority are seeing measurable P&L impact. In Episode 107 we unpack the “GenAI divide”: why enterprise rollouts stall while employees quietly get value from consumer tools; how to kill vanity metrics and track real outcomes; when to buy vs. build; and what it takes to scale agentic AI safely (orchestration, MCP, human-in-the-loop at the hard edges, full observability, reversibility). We close with an industry reality check and a concrete playbook of where value is landing first.

Why the GenAI Divide Matters: Discussing the hype vs. value problem and what “real impact” means for the P&L.

Watch for: outcome KPIs over adoption stats; greenfield + workflow-embedded agents; partner leverage where you don’t have scale; governance that enables speed without sacrificing control.

Chapters:

1️⃣ Consumer vs. Enterprise AI: The Paradox & Risk - 1:45
Employees get value from consumer tools while enterprise access lags; the rise of unsanctioned use.

Full transcript

32 min

Transcribed and scored by The B2B Podcast Index.

Speaker A: Welcome back to Digital Value Creation. I'm, um, Tomas, and I work at an AI-focused software company.

Speaker B: And I'm Arpad, I work for an AI-focused hardware company. And between the two of us we cover both the software and hardware sides of the AI revolution. And we are seeing the same patterns everywhere. Massive hype, massive spend, very little measurable impact so far.

Speaker A: So in today's episode we're going to break that down: what MIT calls the GenAI divide, why 95% get no value and there's this 5% that got some value out of the projects. And that may be our excuse why we haven't posted a video in months, because not much value was created, right? So we grouped our insights from this report in 10 segments and we'll weave into our own, um, uh, experience, um, and hopefully that makes it come alive.

Speaker B: And you know, when you don't see value, that gets you curious. We don't only want to focus on what doesn't work; we really want to bring what we find working. As we compare the MIT findings, we will bring in some other studies from BCG and Stanford, because everybody is trying to find an answer. We also want to bring in some of our on-the-ground experience. We had many discussions with our peers, other AI leaders, and really, uh, we want to discuss what the 5% is doing differently and how you and I and our companies can learn from it and apply it. So let's get into this. So first, the difference between consumer tools and enterprise tools. That's the first thing we find.

Speaker A: And this is the paradox. I remember one statistic from the MIT study saying that only 40% of the companies that they talk to rolled out GenAI tools to their employees. Yet 90% of those employees use GenAI tools every day. So it means that the consumer tools, they're using GPT or their own Copilot licenses, Claude, are giving them value as individuals. But when they go to work, only 40% have access to company tools. So there is this paradox between, uh, what companies roll out and what consumers, uh, use. Uh, and also there's a paradox of what productivity they get. So oftentimes we don't get into it. Employees complain that they don't get too much value out of the enterprise versions of the GenAI tools. But when they go home they're getting all this value of productivity.

Speaker B: And as Tomas mentioned, everybody uses GenAI tools one way or another. The question is, are they using official, validated, legally approved tool sets, or are we at the point where employees seem to be ahead of the curve? They are kind of on the right side of this GenAI divide and the companies are still stuck.

Speaker A: And that would be very risky, right? So when, uh, people are copying and pasting contracts, customer data, medical notes, depending on your industry, into consumer tools with no data governance to get some productivity gains, that comes at the expense of security, which, if you have a CISO, and all of us do, you hear about at work, right?

Speaker B: And, uh, it creates what MIT called in their study the shadow AI economy. And it's really thriving because either companies aren't moving fast enough or the sanctioned tools are cumbersome to use or not as integrated, and employees just find a way to fill the gaps.

Speaker A: So what is the lesson? Right, so "let's not fight it" would be what companies need to do. So I hear this, and you talk to a lot of, uh, your peers. I talk to some strategy, um, leaders and they're basically deciding they shouldn't be fighting it. And that's a really difficult balance. But if companies don't relax, uh, their rules and guardrails and get smarter, then, um, they'll force their employees underground, to the shadow economy if you will. And then we lose both control and competitiveness. So that brings us to our second topic, second segment, which is how you measure the metrics around your AI, uh, deployment. Productivity does not equal business value, and there's a lot of vanity metrics. So what do you see out there, Arpad?

Speaker B: I see many companies I talk to really focus on metrics like, oh, we had 2 million code commits with our AI-enabled development tools, or we rolled, uh, out Copilot licenses for tens of thousands of people. I mean, that all sounds great, but it was very insightful when we talked to BCG. They had a very comprehensive survey including over 10,000 people and, uh, confirmed what MIT saw as well, that 90% of them use these GenAI tools every day. So there is definitely a boost in productivity. But when you ask, did any process change or improve significantly, did we see any cost drop, was revenue increased, or did we retire any legacy system, the answer is usually no. So hey, maybe it's good because we help replied balance of teams. We contributed to some AI evolution, but we are not transforming the business just yet.

Speaker A: And, uh, the MIT report nailed this one too. So despite, and let this sink in, tens of billions of dollars we collectively spend, 95% of the companies have no measurable ROI. And that's because, like we're saying, most of these tools like Copilot boost individual productivity, like writing emails faster. They will summarize meetings; I don't know if you and I read our summaries. They help with training, but they don't move the P&L, right? And, um, I don't know about you, but boards of directors don't reward productivity stories, they reward business impact. So unless AI helps reduce BPO contracts, supplier risk, grow sales, you're not actually crossing this GenAI divide that the MIT report talks about.

Speaker B: By the way, in this discussion we reference a lot of these studies, so we will add the links to them in our chat because it's worthwhile to read them. There's a lot of deep insights, but it's also interesting how they reinforce each other, because what we see on the MIT side in general, Stanford did the most comprehensive developer study, with a hundred thousand developers, uh, and not just interviewed them. They looked at actual repositories, how they are working. Same picture, very, very similar. And that goes back to the vanity metrics. Most companies measure how AI uplifted developer output, seemingly a great measure, but what Stanford found is that while there's a 30 to 50% increase in output, a lot of that portion with vibe coding, with AI coding, led to significant rework. So the true uplift was really more like a 5 to 10% in productivity.
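
The gap between gross output and true uplift described in this turn can be sketched as simple arithmetic. This is only an illustration, not the Stanford methodology: the function name `net_uplift` and the rework shares are assumptions, chosen so that a 30 to 50% gross gain lands in the quoted 5 to 10% range.

```python
def net_uplift(gross_gain: float, rework_share: float) -> float:
    """Net productivity gain after discounting output that has to be redone.

    gross_gain:   fractional increase in raw output (0.30 means +30%).
    rework_share: ASSUMED fraction of the total output that is later reworked.
    """
    useful_output = (1.0 + gross_gain) * (1.0 - rework_share)
    return useful_output - 1.0


# Hypothetical inputs: the rework shares below are illustrative, not measured.
for gross, rework in [(0.30, 0.18), (0.50, 0.28)]:
    net = net_uplift(gross, rework)
    print(f"gross {gross:+.0%}, rework {rework:.0%} -> net {net:+.1%}")
```

With these assumed rework shares, a large headline gain shrinks to single digits, which is why counting commits or generated lines overstates the benefit.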

Speaker A: Ah, and this is why vanity adoption metrics are dangerous. And by the way, "vanity" is our term, so, uh, not MIT's or Stanford's, um, but it gives leaders a false sense of progress. So when they say "we are an AI-first company because 90% of our employees are using GenAI," at the same time their competitors, the 5%, are embedding AI into their actual workflows. They're starting to see structural advantages, competitive advantages.

Speaker B: There's a 5% that succeed and what sets them apart is focus. Focus on the real results that drive benefits and measuring real problem solving, and also applying agentic AI or augmented code development in areas where AI really has a significant advantage. For example, instead of rearchitecting legacy systems where a large code base might limit the effectiveness, focus on greenfield projects, areas that IT never got to and areas that are highly manual today, where, uh, GenAI tools are more effective. Actually, according to sample studies, this is when you see the 50% or even higher uplift. So the difference between the leaders and the rest: the leaders are focusing on solving real problems instead of playing for the headlines. And, um, instead of focusing on what looks good and checks the boxes, focusing on what makes you win. Which leads us to another interesting insight. Uh, with these GenAI tools, especially with agentic coding, it's so compelling to just build new stuff. And there's a lot of insights in that segment about build versus buy. What is this internal development trap that came clearly out of these studies?

Speaker A: So this one hits close to home for me, of course, being at a software company. But imagine a world where you are a car manufacturer. All of a sudden all your customers say, hey, I can buy the tires, I can build the engine, let me just build my own cars. But that's what's happening around software, right? So we've seen large enterprises with armies of IT developers decide, hey, we'll just build our own AI stacks. So I'm sure somebody at Google is wondering what exactly are you guys up to? So they spin up labs, recruit talent, launch ambitious pilots and then, and then they stall. The timelines stretch, the pilots never scale. And by the time they ship something, the big tech vendor market has moved three steps ahead.

Speaker B: And the MIT data was blunt on this: internal builds succeed only one third of the time. And by contrast, external partnerships have more than twice the success rate.

Speaker A: And this makes sense. Competing with dedicated AI software companies on talent, on compute, on release cycles is a losing game. You'll always be a generation behind.

Speaker B: And it's not just about the tech. I mean, it's almost economics fundamentals. Vendors can amortize their R&D investment over thousands of clients. They have an ability to attract and retain some of the top talent.

Speaker A: And, uh, MIT's conclusion is clear. Stop trying to be a model company if you're not. Buy where you don't have the leverage and reserve your scarce development talent for true differentiated solutions that move the needle. Right, um, and all of this takes us to segment four, which is comparative advantage. So this is what we learned at school. So Arpad, take us away to this concept.

Speaker B: Kind of build on the previous theme, but that's one of my favorites, which is, it's so compelling right now to try to throw AI at anything and everything because it can solve great problems. But really this competitive advantage means focus on what you can be really better at than anyone else.

Speaker A: I had a roundtable of various industry, um, uh, technology leaders and, uh, there were banks, who typically have a lot of resources. They're saying, hey, we're creating fine-tuned LLMs, we're creating, um, entire code bases. Hey, we're going to rewrite Oracle. Um, and well, we know Elon Musk is creating, um, um, a Microsoft competitor just from scratch. We'll wait and see how that plays out. That's a very, very common conversation if you are in very resource-intensive industries. But I was with a pharmaceutical executive who said, help me understand this. Why would I move, say, a thousand resources in my company to focus on AI development when the same thousand people could be focused on developing new drugs, which would make me more competitive than trying to build a better AI tool than, than big tech? So this is really a, um, deceptively simple, uh, concept, but with very important, um, um, impacts, right? So when you divert resources from your core strength into vanity AI pilots, you don't just waste money, you compromise your main competitive mission.

Speaker B: And really that was the key lesson learned both from the peer conversations as well as from the studies. The real winners are those who really double down and narrowly focus on their strength and partner smartly for the rest of AI. And I think that takes us to the next segment: why pilots fail. And let's be honest, most AI pilots fail. I mean, if I look at two studies, MIT was the most extreme. What they found was that while the generic AI solutions can roll out, and maybe they have a good 30, 40% adoption, the custom tools, which should be the unique ones, only 5% of them are actually able to scale at enterprise level. And those are crazy numbers. So they make cool demos right away. And this is one great thing about GenAI solutions: zero to demo is very, very fast. But somehow these pilots don't scale, they don't stick.

Speaker A: And it's interesting why: because they're brittle, they're static, they're not integrated oftentimes. And you may see it in your companies, um, the AI pilot teams, dev teams, AI labs are actually separate from engineering or the IT development team. They do their own thing and they're not working on the core system. So they end up with a solution, a use case that is not really working in the common workflows. And, um, and that's exactly what employees say. I can't use this new thing in my real workflow. This is not how I work. And it doesn't remember what I do, it doesn't adapt to what I do. So that's a failure.

Speaker B: And we see the same story everywhere, that pilots get somehow stuck in the labs. Demos impress executives. Once you try to scale, the system breaks down because it was just not designed for day-to-day operation. They did not look at the whole end-to-end process and see what part can I really solve in real life.

Speaker A: And the MIT report, I think called it the pilot to production chasm. And I love chasms. So enterprises lead in pilot count. They have hundreds of pilots, but they lag in scale-ups, like how many AI pilots turned into production systems. So, uh, the good news is that mid-market companies and AI leaders actually scale faster because they don't over-engineer, they focus on fewer, uh, things, they have narrow resources, they find smaller projects that build on the strength of the current AI tools, and they make it to production in less than three months and then they keep iterating and improving. So that's sort of what the takeaway is from, from, um, this section of pilots to production. Um, so that takes us to segment six, which is, uh, missing the learning flywheel. So we hear about employees complaining, listen, this thing doesn't learn. Many of us experience this in enterprise systems. It's limiting in terms of what kind of systems it has access to. So I have to keep prompting the same way, maybe I have to bring the same RAGs or vector database connectors to my next query, because it doesn't adapt to my workflow needs. And that's a problem. There's no flywheel. What do you think about this, Arpad?

Speaker B: I believe that's the real killer, really, this learning gap. And I'm not talking about employees learning, I'm talking about the systems continuously learning and adapting. And before we jump into, oh hey, there's an agentic AI for that: it's true, and we'll talk about that. But most traditional GenAI systems were designed to solve a specific problem, and a lot of times continuous improvement, operations and tuning is just not part of it. And a lot of these systems are not designed necessarily to remember and improve over time. That's why a lot of times pilots just stall and don't scale.

Speaker A: Yeah, so what you'll find in the MIT report, which again we'll link here, is, um, that users love GPT. We all love GPT for first drafts, brainstorming, but we don't trust it for the final answer, for the final contract, for the final deliverable, because it forgets things. We're no longer talking about hallucination, but it still does, even GPT-5. So 90% of users prefer humans for high-stakes complex work, um, and not standard GenAI tools.

Speaker B: And yet this is where I'm very excited. And that's where we see success. That's why we see some of the AI leaders who really leaned in early with agentic AI solutions address that. Because agentic AI systems have memory, they adapt to context, they improve with feedback. So it seems that when, um, we get this right, we create this flywheel, because it's no longer just about intelligence, it's about adoption. And a system that can learn can really cross that divide. And let's go a little bit deeper. That takes us to the next segment, because agentic AI is super exciting, but there are some lessons learned there as well.

Speaker A: Yeah, it wouldn't be an episode without us covering the most important tip of the Gartner hype curve, which is agentic AI. So welcome. Um, so the potential is huge. You're hearing this everywhere. Um, so agentic tools, which we'll get into, what that means, address a lot of the limitations of traditional GenAI systems. Because you set a goal, they're goal seeking, they're trying to solve a problem and they're iterating until they do. They have access to a set of tools and they're using these tools, and they determine which tools to use in order to solve a problem. So therefore, by definition they're working around the limitations of not learning or not using full context. And they are designed to be integrating as much as possible with existing enterprise systems and emerging standards. So there's a lot of excitement about what these can do.

Speaker B: And in addition to the peak of the Gartner hype curve, it was interesting that in the big BCG survey, which had over 10,000 people, 70% of the employees see and, um, believe agentic AI, uh, solutions will be transformational. What was shocking, though: only one third of them felt that they have a proper understanding of what agentic solutions are. It sounds a little crazy, but that's really what contributes to low adoption. Only 13% say that they do have AI agents that are fully integrated into a key business workflow.

Speaker A: Yeah. And it's not just about the education gap for AI agents. Um, um, I think many of you know what we're talking about, and those of you interested, you can dig into conversations like the difference between an API and an MCP protocol, MCP server. But what's cool about AI agents at the level we're talking about is, instead of you knowing how exactly the systems you're trying to interact with work, the system can describe what it can do. So that's what these MCP servers or protocols are for. So many companies are now building these MCP connectors, which are better than API connectors, which require you to do a lot of coding. So they build these MCP servers, and you can query these servers, say, hey, what can you do for me? What kind of problems can you solve? What kind of tools can I use? What kind of resources can I have? So if you're building an agent, which is just an advanced AI workflow, it can query these servers and then it can actually discover the kind of problems it can solve. So you can expect transformational results from all this agentic activity, also called orchestration.
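
The "what can you do for me?" discovery step described in this turn can be sketched as follows. This is a toy in-process stand-in, not a real MCP server or SDK: the class `ToyToolServer` and the tool `lookup_invoice` are hypothetical, and the result payloads are simplified. The only details borrowed from MCP are the JSON-RPC 2.0 message envelope and the `tools/list` and `tools/call` method names.

```python
class ToyToolServer:
    """In-process stand-in for a tool server; NOT a real MCP implementation."""

    def __init__(self) -> None:
        # Hypothetical tool registry: name -> description and handler.
        self._tools = {
            "lookup_invoice": {
                "description": "Return the status of an invoice by id.",
                "handler": lambda args: {"invoice_id": args["invoice_id"], "status": "paid"},
            },
        }

    def handle(self, request: dict) -> dict:
        """Answer a JSON-RPC 2.0 style request given as a plain dict."""
        method = request["method"]
        if method == "tools/list":
            # Discovery: the server describes what it can do.
            result = {
                "tools": [
                    {"name": name, "description": tool["description"]}
                    for name, tool in self._tools.items()
                ]
            }
        elif method == "tools/call":
            params = request["params"]
            result = self._tools[params["name"]]["handler"](params["arguments"])
        else:
            return {
                "jsonrpc": "2.0",
                "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"},
            }
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}


server = ToyToolServer()

# Step 1: the agent asks what tools exist instead of being hard-coded against an API.
listing = server.handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
tool_names = [tool["name"] for tool in listing["result"]["tools"]]

# Step 2: the agent calls a tool it has just discovered.
call = server.handle({
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": tool_names[0], "arguments": {"invoice_id": "INV-1"}},
})
print(tool_names, call["result"])
```

The design point is that the caller never needs prior knowledge of `lookup_invoice`; it learns the name and purpose at run time from the listing.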

Speaker B: And for me that's when another perspective was eye opening. Um, McKinsey had a very focused study; they called it seizing the agentic advantage. Because what they found is that many companies jumped ahead, as Tomas mentioned, and built these MCP wrappers so this agent ecosystem can interact with the legacy environment. But what they found is that the bounce, the productivity bounce and improvement, was very muted, maybe 10, 20%. Because when you open up an agentic system to interact with legacy systems, a lot of times they are limited in what they can do, just based on inefficient processes that happen mainly for human interaction and maybe limited by capabilities of legacy systems. But they also found the opposite. And this is where AI leaders right now jump ahead: when a company took a step back, looked at the end-to-end process and re-engineered it with agentic interactions in mind, that's when they saw more than doubling the impact, and a lot of times multiple times the impact. And this is where real transformation happens.

Speaker A: So in addition to focusing on these end-to-end processes and closing the education gap, the third key element is managing the value versus risk versus cost equation. So agents are practically just goal-seeking reasoning models running in a loop. They're going to find an answer as long as they have access to the right tools and can adapt to the feedback. But there's a question on how to ensure that they actually create more value when they succeed than the money we're spending spinning them up and keeping them running. Um, and how do we minimize the cost of potential failure?
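
The value versus risk versus cost equation raised in this turn can be written as a one-line expected-value check. This is a minimal sketch: the function name `expected_net_value` and every number in the example are hypothetical, not figures from the episode or the studies.

```python
def expected_net_value(
    p_success: float,
    value_of_success: float,
    cost_of_failure: float,
    run_cost: float,
) -> float:
    """Expected value of one agent run, in the same currency unit throughout.

    p_success:        probability the agent reaches its goal correctly.
    value_of_success: business value created when it does.
    cost_of_failure:  damage or cleanup cost when it does not.
    run_cost:         cost of spinning the agent up and keeping it running.
    """
    expected_gain = p_success * value_of_success
    expected_loss = (1.0 - p_success) * cost_of_failure
    return expected_gain - expected_loss - run_cost


# Hypothetical numbers: same agent, but the second task has an expensive failure mode.
cheap_mistakes = expected_net_value(0.90, 100.0, 50.0, 20.0)
costly_mistakes = expected_net_value(0.90, 100.0, 1000.0, 20.0)
print(cheap_mistakes, costly_mistakes)
```

Under these assumptions the same 90% success rate pays off when a mistake is cheap and loses money when a mistake is expensive, which is why capping the cost of failure matters as much as raising the success rate.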

Speaker B: Exactly. Because sometimes when you have this almost autonomous system, I mean, we are not quite there yet, but you can actually set up agents and give them enough, um, leeway and, um, parameters that they can run; besides just the operating cost, they can take actions that might cause some damage. So the answer is human in the loop. Human in the loop. But that's really not the ultimate answer, because if you have human checkpoints everywhere, then that's not a scalable solution. We have not really solved the problem. So based on all the discussions, I took away three things that companies successfully focused on to really unlock the value from agentic systems. One is, yeah, human in the loop, but only, only on the hard edges: when there's a critical approval, or irreversible solutions, or there is some escalation and validating exceptions. The second, to compensate for the fact that you don't want humans everywhere: how can you have full observability, how can you have evals, maybe other, uh, other models overseeing the performance? So you can reduce the risk but also make sure that you have an audit trail of what happens every step of the way. Which leads us to the third area, especially with the early stage of agentic solutions: how can we make sure that everything an agent does is reversible? So that goes back to what Tomas said, to make sure that the cost of a mistake is less than the value created by success. Which then takes us back to: great, with all these insights, what industries are truly transforming? The next segment is, let's do an industry reality check and let's talk about who is actually winning. What came out clearly is that tech and media are really leading the charge. They are seeing the disruption. With all the companies we talked to, AI leaders really emerged clearly in those two fields.
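
The three guardrails listed in this turn (human approval only on the hard edges, an audit trail for every step, and reversible actions) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not a production pattern from the episode: `Action`, `GuardedRunner`, and the example actions are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    name: str
    do: Callable[[], None]
    undo: Callable[[], None]   # every action must ship with a way to reverse it
    hard_edge: bool = False    # e.g. a critical approval or an escalation


@dataclass
class GuardedRunner:
    approve: Callable[[Action], bool]                  # the human checkpoint
    audit_log: List[str] = field(default_factory=list)
    completed: List[Action] = field(default_factory=list, init=False)

    def run(self, action: Action) -> bool:
        # A human is asked only when the action sits on a hard edge.
        if action.hard_edge and not self.approve(action):
            self.audit_log.append(f"blocked:{action.name}")
            return False
        action.do()
        self.completed.append(action)
        self.audit_log.append(f"done:{action.name}")
        return True

    def rollback(self) -> None:
        # Undo completed actions in reverse order, logging each step.
        while self.completed:
            action = self.completed.pop()
            action.undo()
            self.audit_log.append(f"undone:{action.name}")


ledger: List[str] = []
runner = GuardedRunner(approve=lambda action: False)  # this reviewer rejects everything

runner.run(Action("draft_reply", lambda: ledger.append("draft"), lambda: ledger.remove("draft")))
runner.run(Action("issue_refund", lambda: ledger.append("refund"),
                  lambda: ledger.remove("refund"), hard_edge=True))
runner.rollback()
print(ledger, runner.audit_log)
```

Routine actions run without a human, the hard-edge action is stopped at the checkpoint, and the log records both outcomes plus the rollback, so the cost of a mistake stays bounded.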

Speaker A: Uh, yeah, but it is interesting to dig into that, right? So technology and media sectors, they're seeing AI everywhere, they're seeing this at scale. So you see companies, I mean, even Microsoft going 20, 30% of their code development now is using GenAI tools. There's Salesforce, which now is going to be agentic everything. So you're seeing all of this; it's both, um, scale, uh, but it's also hugely disrupting to others. So the industry is seeing both. Same with media: there's a lot of content generation now, and in fact there are agreements and laws preventing disruption, uh, in the media industry because it's so easy to create content. But those are the two out of nine sectors that MIT saw a, uh, structural change in. Then you look at the other industries like healthcare, finance, manufacturing, oil and gas, energy, and they're basically not changing, which is leading to some funny quotes. Arpad, you talked about some of them.

Speaker B: And in addition to just industries, you see even functions don't change equally. I mean, of course customer service and marketing were impacted first, and then some engineering. But I mean, I had some discussions with CEOs and the quotes are blue dog. I mean, even in the MIT study one CEO told them that, hey, LinkedIn says everything has changed; in operations, nothing has shifted yet.

Speaker A: Uh, and that's the reality, right? So the hype cycle is running far ahead of the actual transformations, because industries that love the hype, tech and media, are actually transforming. So therefore you're seeing all these changes in those industries, and therefore you think it's universal. But many are behind; however, their employees are experimenting. So even if you are in a traditional industry and say, um, I'm in manufacturing, I'm not changing much, your employees may be changing. And that leads us to segment nine, which is the rise of shadow AI. So a lot of employees take GenAI development underground. They're trying some of the same tools that the tech and media industries are using and they're saying, hey, I could use it in my workflow, I could use it in my, um, um, own work. And that creates this shadow AI economy, because they're purchasing software solutions, maybe sometimes with their own money.

Speaker B: And if you think about it, it's not surprising. The real splash with AI came from direct-to-consumer targeting. ChatGPT was a consumer tool. Claude was actually the one that focused more on enterprise. And the same happened with the orchestration tools like anything IO, which was actually initially targeting a lot of influencers who wanted to automate their marketing campaigns, and now it's becoming an enterprise tool. Actually MIT quantified it; they have the numbers saying that almost every knowledge worker today is using ChatGPT, Copilot, Gemini or Co op personally, while official AI many times still sits in pilot mode.

Speaker A: So to prevent shadow AI, the smart move is not to ban it, but to learn from it. Identify your employees that are really on the cutting edge, study what they do and bring those tools and their insights inside the enterprise safely.

Speaker B: Right, yeah, because, uh, otherwise the gap between sanctioned and unsanctioned AI would just widen. And, um, that takes us to a closing segment about where success actually lives. I mean, the opportunities are amazing and you see companies already succeeding. So where are the real wins and P&L impact? As this channel is about digital value, uh, creation. And the theme is pretty consistent: narrow, focused, differentiated high-value use cases that are deeply embedded in workflows, built on the current strength of GenAI tools and agentic tools. That's it, that's a fit. Not surprisingly, initially some use cases like legal translation, the legal field, which was all text based when the models were not multimodal yet, had great success. That's why coding was an early use case. So check where your current toolset is and what the sweet spot is. As we discussed in the studies, greenfield development was one of those.

Speaker A: Yeah, so, um, the MIT study summarized some of these areas that, that were covered. Uh, in general sales and marketing were the early, uh, use cases that actually got to scale using generative AI tools and now agentic. There's a lot of agentic development around sales and marketing. Um, but there's other areas like contract reviews, accounts payable automation, cost summarization, code generation. So not super flashy. They're not, uh, interactive agents that are going to transform your life, but they're fundamentally shifting how business gets done and how, uh, effectively it gets done when it's embedded in the current processes.

Speaker B: So the pattern is clear. Start small, embed deeply in real operations, real processes, and create measurable ROI, not vanity metrics, and find solutions that can scale.

Speaker A: And that's what the key is to digital value creation, to this channel. Not AI everywhere, but AI exactly where it matters. The right AI for the right problem at the right value. Right, so that's it. So the GenAI divide that MIT talks about is real. There's high adoption of AI tools, but very low transformation. Let's change it collectively. Arpad, what do you think?

Speaker B: Yeah, let's cross that divide. Stop building what you can buy or partner for. Focus on workflows, not demos. Design agent, deep learning systems that improve over time and invest energy in training so people are not just excited, but actually can be focused and make a difference.

Speaker A: That's the only way to move up from hype to true value. Arpad, uh, bring us home.

Speaker B: Thank you for joining today. I mean, as you know, we are passionate about really unlocking the value for AI. So if your company is experimenting with AI, let us know. Share what side of the divide you are on and how have you crossed it.

Speaker A: And, uh, until next time, keep creating real value, not just vanity projects.
