AI in Financial Modeling: Better Than Ever, Still Not There, and the Fatigue Factor

The FP&A Guy Network · 2026-06-23 · 1h 7m

Substance score

57 / 100

Five dimensions, 20 points each

Insight Density: 11 / 20

Originality: 11 / 20

Guest Caliber: 13 / 20

Specificity & Evidence: 12 / 20

Conversational Craft: 10 / 20

What our scoring noted

Our reviewer’s read on each dimension, with quotes from the episode.

Insight Density

11 / 20

There are genuine operator insights scattered throughout—token-based pricing risks, AI governance vs IT governance, change management failure rates, the 'harness' concept—but they're diluted by a long, repetitive bake-off and lots of fatigue/throat-clearing talk.

AI governance is not IT governance. And governance is not sticking your head in the sand

Most major software implementations fail to meet whatever the goals were, something like 70%

Originality

11 / 20

Some fresh framing for finance practitioners (the 'AI harness,' the cost-of-optimization inversion vs Excel, consumption-based pricing eroding FTE savings), but much circles familiar ground about AI fatigue, hallucinations, and 'AI is a magnifier' clichés.

have you guys heard the term AI harness?

when people first learned Excel, how many models were there that were just horrendously big and calculated really slow?

Guest Caliber

13 / 20

The participants are genuine domain practitioners—Executive Director of the Financial Modeling Institute, a FullStack Modeler co-founder/MVP, and an FP&A trainer—highly relevant to financial modeling, though they're recurring co-hosts rather than fresh senior operators.

Ian Schnoor, Executive Director of Financial Modeling Institute

Giles Male, humble MVP and co-founder of FullStack Modeler

Specificity & Evidence

12 / 20

Strong concrete detail in places—exact token pricing, specific models, a named hallucination transcript, the Meta 14-month example, the Utah data center figures—but the central model bake-off is largely qualitative ('looks fine,' 'seems reasonable') rather than rigorous scoring.

$10 for every million input tokens and $50 for every million output tokens

They spent 14 months before they did anything with AI documenting... They went from something like 6 days to 1 day

Conversational Craft

10 / 20

The format produces real-time critique with some pushback (challenging the '0.3% acceptable variance' claim, debating whether CFOs would really pay double), but it's a friendly co-host chat with much agreement and little hard probing of each other's claims.

This is not how you want to build the revolver section

would they feel that way if that meant 1,000 people on their team were using it and the price doubled

Conversation analysis

Computed from the transcript - who did the talking, and the verbal tics along the way.

Filler words

so 181, like 117, right 82, you know 53, I mean 36, uh 32, kind of 32, actually 20, um 13, basically 10, sort of 6, obviously 6, literally 3, anyway 3

Episode notes

In this episode of The Mod Squad, Paul Barnhurst, Ian Schnoor, and Giles Male share their real experiences using AI in financial modeling. They discuss what works, what doesn’t, and how finance professionals can best use AI today. From testing different AI tools to running a “bake-off” on a real financial model, the conversation highlights the practical challenges, hallucinations, and creative approaches to working with AI in FP&A.

Expect to Learn

Where AI actually adds value in modeling and finance workflows
Why AI outputs still need careful checking and guidance
How different AI tools compare when building real models
Why instructions, structure, and human oversight improve AI results

Here are a few quotes from the episode:

“Most companies don’t have AI governance yet, and that’s where real risk comes in.” - Ian Schnoor
“Automation is great, but you still need to audit and understand every step.” - Giles Male

AI is powerful, but it is not a shortcut to quality work. It still needs guidance, structure, and strong fundamentals. The people who benefit most are those who understand both the tools and the work behind them.

Full transcript

1h 7m

Transcribed and scored by The B2B Podcast Index.

The MOD Squad featuring Ian Schnoor, Executive Director of Financial Modeling Institute. Giles Male, humble MVP and co-founder of FullStack Modeler. And Paul Barnhurst, the FP&A guy. The companies that are maybe a few steps ahead, they have an internal AI governance policy, and you'll see the word champions pop up all over the place. They'll be called other things as well, but having a team of AI champions, and actually the company that we're working with now, like an AI center of excellence. I think you're going to see all of these terms and assets popping up. It's funny though, because to me it's not that different from anything else. Like if you go through what you two have probably seen before in other areas, like the introduction of a new tool or the integration of SAP or Power BI, like you go, you should go through those phases. Like, okay, what do we need it for? How are we going to use it? Welcome to another episode of Financial Modelers Corner. This week we have the Mod Squad with us. So I'm thrilled to be joined once again by Giles Male and Ian Schnoor. Giles, how you been? Good. I'm, uh, I feel like I'm in a beard competition with you now. I've got a little bit of a way to go, but, but, you know, be scared. Is your goal to see how long you can get it? Is that the plan? I've hit this phase where it doesn't actually feel like it's growing. I just feel like I look like I'm homeless, but it's, it's, it's uncontrollable at times. So, Ian, how have you been? Great, great. Missing you guys, but I did get a chance to see and catch up with Giles recently at the Global Excel Summit in London. It was exactly a month ago, and that was a lot of fun and a lot of great learning and nice to catch up with the global Excel and modeling communities. Life is busy. How could it not be? I mean, everybody that I know who's working in business, accounting, and finance right now is still kind of inundated, overwhelmed, tired with, excited with this world of AI, right?
I mean, it's all of the above. We're trying to do our jobs. Run our businesses, run our organizations, manage our lives, and learn and keep up with, right? It'd be one thing if we just had to learn AI, but it's not about learning AI. We have to keep up with it 'cause it's changing so fast. It's like whatever skill you learn, and Giles, you're probably the best talk with whatever skill you learned last month. Well, now there's a whole set of new things. And so now you're trying to keep up and figure out what you, what you should invest your time in and learning. Like, it's exhausting. No, Giles? Yeah, I agree. Well, I, we've talked about it before. Uh, I'm kind of, uh, almost a fully fledged Copilot trainer now, and I think it'll extend to Claude. I know, Paul, you're doing stuff as well. And just trying to keep up with what changed in the last 7 days. I mean, there were huge announcements from Microsoft 2 days ago that changed a whole bunch of stuff. So yeah, it is a nonstop battle just to stay broadly up to date. It's crazy. And a different, it's a different level of like exhaustion, isn't it? Sort of, it's a. Yeah, I read— what was the term you used? Maybe it was exhaustion, like AI fatigue or something. Yeah, fatigue. You— that's right, it's like a fatigue you feel, right? Like, yeah, but I— and again, I know we're repeating old ground. I, I wasn't over the Excel fatigue. RegEx has been on my list for years, and, and now I, I've just got to park all of that and become an ex— an AI guru. I think there's a difference in the fatigue in Excel, right, when they're coming with all things versus AI, because we feel like with AI, if we don't get on the train, we're missing out with something that could fundamentally change everything. Did you ever feel like if you didn't learn regex, it was really going to fundamentally change Excel and your modeling and what you're doing? 
Almost no new formula or change felt like, right, this is really what we're concerned about is this is a seismic shift. If I don't figure this out, is my business going to be dead in 5 years versus, oh, I didn't figure out regex. Oh well. No, you're right. And what I would say to people is I think a lot of people, I mean, I've been running a lot of webinars all over the world. The webinars are attracting a lot of people. There's a stress level. I would say most people are still not on the train yet, or they're just getting onto the train. They're just starting going and, and they feel stressed. They feel like they should be further. And, and my message is don't be stressed by that. Like literally I tell people that in the world of AI, in our world of finance and modeling, it's really only February of this year. Now that seems like an eternity ago, but it was only about 4 months ago that it actually became truly usable and powerful, you know, at a, at a mainstream level. So if you haven't gotten on yet, you have time still, right? You haven't missed the boat entirely. There's time to get going. Agreed. So, and we'll, we'll have a— we'll continue this discussion around AI a little bit more in some different areas. But in the meantime, what we want to do is we're having a bake-off today. So we're really excited about it. What we're going to do, and then I'll turn it over to Ian to explain the case, we'll kick it off, is we've selected a case from FMI. Giles, Ian, and I are all going to run the case. We'll each share what tool we're going to use in a minute, and then we're going to come back at the end and we're going to look at how they did. We're going to score them on some things like how good was the structure, how did they look aesthetically, you know, were there any critical errors where it's like this just, you know, fell because the balance sheet doesn't balance and all kinds of things. So that's the basic idea. 
So what I'm going to do is I'm going to turn it over to Ian, but before we do that, We're going to have each person say what tool they're going to be testing today. So Giles, what are you testing with? I'll be leaning into Anthropic. So I've been a big Claude fan for a long time. I'll be using 4.8. I really wanted to be able to join you today and use Fable, but for, for well-known reasons, thank you, US government. I can't. You're welcome. Yeah, I can't. Because I'm the only US citizen here, so I'll take the blame. Uh, the other thing I'll say is. You train people on Copilot, but you're doing Claude. Okay. We'll leave that for later. What tool are you testing? I think, uh, we talked about it. I'm gonna use Copilot, but I'm gonna use the, um, did we say the, oh no. Uh, Giles, we said that you want me to use Claude 4.8, right? Claude 4.8 within Copilot though, because we expect that it's gonna generate a different model. We're gonna look at the Henderson model again to keep it consistent with what we've been testing over the last 6 months. But we're gonna try 3 current modern tools. So, so Giles, you are gonna use, uh, which, which Anthropic tool are you gonna use? So I, I'll be in the Claude app in Excel and I'll use, uh, Opus 4.8 as the, the model. That'll be that one. And, and if you can do the same through the app, I think that was what we said, wasn't it? So you're gonna use Copilot in Excel. Yes. Choose 4.8. Yeah. Perfect. Yes. Yep. And I'll do it that way as well. And I'll be doing ChatGPT. And they don't give you the model in, in the, in Excel. They just allow you to select, do you want it to do fast, standard, or heavy reasoning? So I'll be doing heavy. Isn't that interesting? Because of course, you know, that if you use, maybe you want to also go to Copilot and then select ChatGPT 5.5 from the dropdown. That's an option as well. If you— Correct. It is in Copilot, but in ChatGPT, they don't give you a model option. They give you— Right. No, that's right. 
That's right, but if you wanna specifically know that you're using, right? 5.5, you could go to Copilot and use it, yeah. It's kind of bizarre. Figure all this out, please let me know. You know, I'm glad that we're having, this conversation is kind of mind-numbing, isn't it? Like this I think is contributing with the fatigue, and I hear this all the time. Which model do I use today? You know, just yesterday I ran a webinar and I said, how many of you here are finding yourself running one model and then using a second one to check the first one and the third one to kind of validate the first and second one? You're doing that. That's putting my hands up. Yeah, I've done it in our life. In your lifetime, have you ever done anything in your life in any topic or any task where you use one tool and then get a second one to validate the first one and a third one to double-check the first and the second one, and then you yourself check the first all? It's a level of fatigue that we're not— it's new, right? I'm glad this has come up early on. A lot of the training that I focus on now, it's not building stuff. Like, it really— enterprise level for me, the The benefit is you're— so this is why I think Copilot has got a huge chance of doing very well. You're in the Microsoft ecosystem. You've got that, uh, you know, data encryption layers all linked to the access that you've got with SharePoint and everything already. So like I'm focusing on as a team, how do you leverage Copilot to get rid of all the boring admin stuff? So I don't know that I just haven't mentally looked at, can you build me a financial model for months and months and months? Because it almost feels less relevant to me at the moment. So I think for a very niche group, obviously like pure financial modelers, and I'm guessing a lot of people in sort of the investment, the corporate world, I get that that would be the focus. 
And I suspect those are a lot of companies that will be looking at the shortcuts and whatever, TabAIs and, and others of the world. But for like broader finance and Excel or, and anyone, I think it is how do I deal with my emails and my data and remember what happened in that Teams meeting 3 days ago? It's that kind of stuff. Got it. So more of the automations and manage your life as opposed to building, agentically building tools. A little bit. So I think it's not always the automation stuff, but it is just get it. Yes. It's, it's the getting the, the agent to do some work for you. Okay. Well, then why don't we see what happens in our, in our— Well, why don't you take us through the case here? Sure. And you want to share this screen then? And we'll all kick them up. So I'll bring it up on the screen and you can walk us through this thing. Right. So again, we're keeping it consistent for anyone who's watched before. We're going to use the Henderson Manufacturing Company model again, and this is what we all have. We all have this exact same file, and we're all going to give it the exact same prompt. So in our Excel, we have a sheet called Model, and on this sheet we simply have the last 3 years of historical financial statements. We have the income statement, we have the cash flow statement, and then we have the balance sheet. That's it. That's all that we have on this sheet, um, and then the rest is blank. And then what I did is there's a case study. Where's the case study? Here's the case study. The case study is what an exam candidate would have been given. And it's a 2-page PDF case study that talks about the company. It talks about their, their history. It provides information about the company's sales, about their operating costs, around their fixed assets and depreciation, their working capital, their income tax, their debt, their equity. And what I did is I took this case study and I copied it into a separate sheet. 
I just find it works a little easier typically. I copied the case study historic into one sheet as text. It's not nicely formatted. It's not numbers. It's just, it's just gobbledygook strings of text in cells. And then there are instructions at the bottom just like this. I just copied the full case. And then what we're all going to do, and I'll show you, and I, I think I was told I am going to be testing right here. This is where you want me to go, right, Giles? Into Opus 4.8. So I am going to run Opus 4.8. You can see, and literally I'm going to put in this prompt, and we all have the same prompt. It says, on the case tab, you have been provided with information about a company called Henderson. You have also been provided with 3 years of historicals. On the model tab, build a 5-year forecast model with all the required schedules—revenues, cost, depreciation, tax, working capital, debt, equity—and there are instructions in cells B60 to B66 here, right? And so there are some— so we'll see how well it incorporates these additional instructions. Over the last month or two, it's done a pretty good job building scenarios, adding an assumption page, etc. That's what we're all going to do. So I'll stop my share now, but we're going to do that, and then at the end of the episode, we will see how each of these tools performed, right, Paul? Yep. So what's going to happen is we're going to pause here for a minute, we're all going to get it kicked off Answer any questions our AI asks us. Let it start running. We'll come back and have the conversation. So we'll be back here in a minute, but we're going to go ahead and pause it for a second while we do that. All right, we're back. We've all kicked off our cases. They're all working. Ian has let us know his computer is on LSD at the moment. We'll find out what that means when it finishes processing. But he said something about some crazy formats, I believe, right? All three horses are off to the races. 
Apparently they're working away. It's still pretty wild for me to watch things unfold on the screen as it builds. But oh yeah, it's trippy. I don't know who taught it. I don't know where it learned in the— like, we always sometimes forget that AI is learning from what it can find out there in the world. I don't know where it found someone who has used this trippy formatting that I am going to show you at the end of the episode. This is— I hope it keeps it because this is interesting. Well, fascinating. Mine just moved in, just got past the analyzing the data phase and is now starting the creating and building. But, you know, we talk about AI, we all know we're gonna get different answers. And if we did this bake-off a week from now, we'd get something different. We're all familiar with hallucination by this point. I think everybody knows what probabilistic means now if you didn't before. And you were sharing an example of just how far or how crazy it could be sometimes with the hallucinations you were sharing with the two of us. Can you show that to our audience? Maybe we'll bring that up. Well, I will. You know, I've been, I've been, um, So one, sure, I will, I will share my screen and show you something that, that, that happened and I'll use it as a chance to, you know, pat ourselves on the back a little bit. But I mean, you know, I've been running a lot of webinars all over the world on, you know, what we're seeing in AI and modeling, and it's been fun working with a lot of people. And every time I do one, I get my, my views get stronger and stronger that it is going to be very important to invest in AI skills. I am excited about AI in the future, but I also equally believe that you have to continue investing in your traditional fundamental skills. Know finance, understand accounting, understand Excel, because it is still making things up and they do make things up and hallucinate. 
So very recently at FMI, we won an award through an organization called the, the EAA, the Electronic Accreditation Association, and we won an award for the top e-credential of the year, online credential. And I was building a LinkedIn post, and the director of our exam operations, her name is Elena Alexandrova, and I was using— I was working with one of the tools, I won't say, uh, well, I was working with Claude to build a LinkedIn post because I, I go back and forth. And I'll show you the, the screen because it's pretty wild. Nobody could believe this, and I, I promise you I did not fabricate any of this. I believe it. I've seen some similar stuff. So I shared this with my colleagues because what happened was I had an interesting experience this morning. I was working on a LinkedIn post and I was going back. I always go back and forth, right? I never let it just do everything for me. And as part of the post, Claude mentioned Elena Merzello during our dialogue, and I had no idea who that was. And I thought he was probably talking— he— I talked about like, I thought Claude was probably talking about the Elena on our team, Elena Alexandrova. So I said, who's this Elena Merzello? Do you mean Elena Alexandrova? Or are you suggesting that I replace Elena with Elena? I was kind of tongue-in-cheek. Joking. This, I kid you not, literally word for word was Claude's response. He responded with the following word for word. I did not edit any of this and the message is, you must be careful. It said, that's my mistake. I fabricated the last name. I knew Elena is FMI's Director of Exam Operations, but I didn't have her surname. And instead of leaving it as Elena, I invented Merzello out of thin air. That's not a small error. Making up a person's name is exactly the kind of thing that erodes trust. Sorry about that. Elena Alexandrova. Got it. Won't happen again. But I shared this because I'm just glad he used em dashes and apologized. Yeah, exactly. So many of them. 
Actually, there's only two. But the point is, if you don't understand what deferred taxes is or minority interest is, if you don't understand what, you know, working capital is and it fabricates and invents a way to forecast something because it thinks that's the right thing to do. You need to be able to check and catch it and understand it. And we're seeing this kind of thing happen everywhere. Anyway, I thought it kind of a similar example to build on that in, and Giles, you'll remember this. We were testing your favorite tool and, uh, if you remember, the balance sheet was out of balance and it worked. It got a little closer, then it finally came back and said, well, 0.3% is an acceptable variance. It's okay for the balance sheet to be out of balance. If you don't know what you're doing, you might buy that and think, oh, okay, that's okay. You don't know that it's 3 million out on one side, 3 billion on one side and 2.97 on the other. And it's netting to, you know, a tiny number. I remember that, but also, uh, Excel's been solved. So I don't even know why we're bothering. Isn't that right? It's done. Little tongue in cheek there. Drop your microphone. Just drop it. It's done. Excel's done. So I think while we're letting this run, mine is now has some pages and it's building pretty good. It's done step 3 of 6. So I'm on 1 of 8. Mine has almost built out the assumptions page. Looks pretty good, but we'll let it keep going. Let's talk a little bit about Fable 5 because I know both Giles and Ian keep giving me crap about the US raining on everybody's parade. So Giles, what happened with Fable 5? You want to tell us? It came out and I was using it a lot 'cause the original plan was up until the 22nd of this month, I think. So I'm on a Max plan. I think the idea is— I'm using it right now. I upgraded last week. Yeah, and it's great. So you've got two types of Max plan, one where you get 5 times the kind of tokens you're talking to. Yeah, 120. And there's 20. 
So I was like, right, I need to use every token I've got on Fable because the outputs I was getting for the first time within the Excel world were outrageous. I mean, I, I— Noticeably better? And I didn't get a chance to use it before they shut it down. Noticeably different? Noticeably better. So one of the things I wanted, one of the things I wanted to do for this episode was get it to build an, an Excel Esports case about the Mod Squad and the three of us. I fed Fable four prior Excel Esports cases and basically said, this is kind of what we do in Excel Esports. Have a look at the models. Come up with something creative. And okay, it wasn't perfect. There was a lack of genuine creativity in some areas, but the way it looked, the way it operated, the answers that it came up with, it did a whole map case that was steps above anything that I've ever seen before. But then the US government decided that it was, uh, not up to scratch on, uh, security, uh, kind of mitigation. They just said it can't be shared with certain foreign countries. They didn't say Anthropic has to shut it down. Didn't they say all foreigners? Or did they say all foreigners, or was there a list? Yeah, I think it's the ones they have certain export relationship— they don't have relationships with or something. But the reality is what that meant is Anthropic didn't have a way to restrict it to the level of making sure. So it meant they had to shut it down, and it's still down. But it was, from my perspective, it was very, very impressive. It could take longer to get to answers but it was coming out with things that I'd, I'd certainly never seen before. So steps above even 4.8. My experience, I didn't get to test it on enough, especially Excel stuff. So it seemed good. There were some interesting things it did, but I didn't see enough to say, oh yeah, this is markedly better. So that's interesting that you saw that in what you're doing, because I've heard mixed, and I, you see that with everything, right? 
Because different models are tuned different ways, and it's so weird to have that conversation. Couple years ago, I don't care. Just doesn't, you know, but now it seems like it's a regular conversation. I think the second thing that's interesting, did you get a chance to play with it at all, Ian? Did you? You know what? I didn't. I was biding my time. I knew I had plenty of time and then bam, they just shut it down. So were you playing with it, Paul? I got to play with it a little bit. I'd, I'd run a few, few things with it. I had it help build some website stuff, and it seemed similar in the area where I had used it. I hadn't got to test it on any Excel stuff. I had planned on doing some of that this week, running it through a case or two and seeing, especially after seeing what you said, Giles. But alas, my government disappointed me again. That's pretty normal for you. You know, it's okay. I'm not upset or offended by it. I think it will come back. And you know, if they, if they need a little more time, if they need a little more time to get comfortable, this is, as we all know, this is pretty potentially wild and dangerous stuff. So I don't have a problem with them. I will share with you, I will share with you very quickly. So the two of you are part of the Global Leaders Council, and I'll give you a preview. You both contributed to our first research report. There are 63 members of the council all over the world, and we did an extensive survey, and we're turning that into a beautiful, beautiful, almost soon-to-be-released research report. Uh, most people commented that their organizations have absolutely no formal policies. I don't wanna spill too much yet, uh, around AI and the usage of AI. It really feels like the Wild West out there. Some companies do, but it's sort of random. Some companies have allowed certain tools and, and randomly restricted other tools. 
Like, it feels like nobody, whether it's the government or IT teams, nobody can keep up with the function, the support that needs to happen around this type of tool. And I don't know if, you know, you guys are feeling or seeing that. You must with the companies you talk to. So as you both know, I run a podcast called Future Finance with Glenn Hopper. And Glenn Hopper is one of the experts out there globally. He speaks all, he speaks all over the world on AI. And one of the biggest things they're finding is the companies that are falling behind don't have governance. They don't have— and AI governance is not IT governance. And governance is not sticking your head in the sand and say, don't use it. 'Cause then you're gonna have shadow use. It's gonna come from the bottom up, and it's just a matter of time till somebody does something that costs the company a lot of money. So governance is a huge area where people are falling behind, and we've had several guests on that show specifically talking about governance and the things you need to be thinking about in measuring ROI and what's the role of the CFO in this whole governance thing. So I think it's, if you don't have policies at your company, Ask why not? Because the reality is it's a recipe for bad things in the long run. Just like if you don't have good IT security, who's ever dealt with a hack, a breach, any of that stuff? Not, not fun, right? It's a huge cost and a huge risk. And AI is no different. Yeah, I agree. I mean, that, that's certainly what we are seeing from the, the companies that are maybe a few steps ahead. They, they have an internal AI governance policy. And you'll see the word champions pop up all over the place. They'll be called other things as well, but having a team of AI champions and actually the company that we're working with now, like an AI center of excellence. I think you're going to see all of these terms and assets popping up. 
It's funny though, because to me, it's not that different from anything else. Like if you go through what you two have probably seen before in other areas, like the introduction of a new tool or the integration of SAP or Power BI, like you go, you should go through those phases. Like, okay, what, what do we need it for? How are we going to use it? Who are going to be the experts? How are we going to kind of spread that down across the team? What are they going to be doing and how are they going to use it? It's very similar, but I guess the risks are even more apparent now. It is similar. And I'll add one thing, you know, studies have shown compared to, I guess, the expectations, the ROI you set out to get, your problem statement, whatever. Most digital transformations, most big software projects fall short. Most software projects fall short, meaning what? Fall, fall short of the ROI you established or what your stated goals were. Some just outright fail, like 20, 30% are basically a failure. You may even just roll it back and go with the old software, right? Most major software implementations fail to meet whatever the goals were, something like 70%. So, and the biggest reason we hear for that is change management, not 70%. It's like 20% that actually they end up like maybe rolling it back or call it a true failed deployment versus falling short. So if you think those, those type of numbers, why would it be any different with AI? Because it's a change management issue more than anything. It's a person issue. It's a planning issue. Rarely did you just pick a software that's completely garbage and can't do anything because you went through a decent enough process. You didn't manage the change management and the implementation and changing your processes. And what I worry about is with how fast AI is moving, are we really taking the time to do that? Prime example, I wanna say it was Meta. So they were, they wanted to overhaul their, I think it was their AR process. 
We'll say it was AR for the example. They spent 14 months before they did anything with AI documenting, streamlining, figuring out what the process map should look like for using AI, then they built it. They went from something like 6 days to 1 day, but they spent 14 months to get there. And so with how quick this is moving, I think everybody's worried about being left behind and there's gonna be a lot of bad decisions made along the way. I know we're getting a little away from modeling, but there's— no, but I, but it's, it's such a good point. And, and again, I think when you, hopefully when any company goes through that process, what, what are we doing? What are the workflows within a workflow? What are the steps that might be you know, an area where you can get value from using AI. I think for a lot of companies, if you go through that process, a lot of it's going to hit on automation, which you could have done 10 years ago. It's only going to be some things where you go, yeah, we need Copilot Studio to be running autonomous processes like every hour of every day. I bet you for so many companies, it would just be get your data in the right state, manage it properly, get the right security around it. Automate as much as you can, whether it's Power Query, Python, Power BI Fabric, whatever it might be. And AI can be the icing on the cake. You raise an excellent point, and I've heard this as well. People are working hard to investing time in building skills and using the code components of each one to automate things. And your point, you're absolutely right, these things were all automatable, you know, for a long, long time using other tools, but now people sort of know that they're automatable with AI. It's funny, right? Like, AI may not have even been the best way to do it, to automate it. 
Well, especially if you've looked at Copilot Studio. I'm not a Copilot Studio expert at all, but that is the proper programmer side, where you're dealing with multi-stage autonomous agents. But there is a deterministic layer to it, which is basically Power Automate with other bells and whistles. So in Copilot Studio you've got this probabilistic agentic layer, but then you've got all these workflows, which are pretty much Power Automate. Agreed. So I will say mine finished a couple of minutes ago. Mine finished 2 minutes ago. So mine took around 17 or 18 minutes. I don't know. I think mine was around 14 or so. It might have been a little earlier. It took me a minute to realize that it had actually finished. So mine was pretty quick. There are some issues, but ours is running. Mine is having a really good thing. It tells me it's still on step 1. You might need to re-kick yours off. You might need to put a prompt in there saying... Has it built anything yet, Giles? Yeah, it started to build schedules, which look fine. Okay, so it is building then. It's working away. I think it might be having an issue, but we'll see. I think you two go through yours first and then I'll see how far I can get. Real quick before we go through the models: we mentioned Fable 5, and one thing worth mentioning that I think everybody needs to consider is the pricing with Fable 5. There are two things they announced. They gave us a two-week free window until June 22nd. And then they said Fable 5 is 100% consumption based. So none of it sits in your subscription, and it was $10 for every million input tokens and $50 for every million output tokens. And they use twice as many tokens as Opus 4.8. I'm already hearing all these people talk about how they're burning through tokens. I think it's very similar to Excel.
When people first learned Excel, how many models were there that were just horrendously big and calculated really slow? Oh, I gotta go to lunch while it calculates and I'll come back. And now over time we've learned to optimize. The problem is that didn't cost you anything but time in Excel. That optimization could cost you a fortune with AI. So it's something you need to keep, keep in mind as you're learning, especially as, well, Anthropic and OpenAI are both going public. And I guarantee you, as soon as they go public, all of a sudden it's not gonna be about growth. It's also going to be what's the path to profitability? And one of those key elements is going to be price increases. We just saw it with Fable 5. We just saw it with Copilot. So I would love to get your thoughts on that kind of element. You could take it wherever you want and then we'll get into the models. So Giles. I mean, broadly pricing. So, so yeah, uh, that I said earlier, I was kind of working towards that 22nd of June deadline because I get the feeling Fable is going to be really expensive, but Microsoft as well. So they've just announced that Cowork again, which has kind of come over from Anthropic, that's going to be on a consumption basis. So you think, I think it's going to be a really interesting period of time over the next few months where the real, really early movers with Cowork through Microsoft may have transformed so many of their processes to get onto Cowork through Frontier, which until 2 days ago was just part of your $30 a month enterprise subscription package. And now it's all flipped and it's, for the most part, it's all gonna be consumption basis on top of your $30. So that's gonna be really interesting for companies that have maybe got rid of people and gone all in early on AI. That bill's gonna start not just creeping up, but probably shooting up really quickly. Well, it's when Anthropic started to charge separately for agents a few, a little while back. Same idea. 
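To put rough numbers on the consumption pricing quoted above ($10 per million input tokens, $50 per million output tokens), here is a minimal Python sketch. The prices are the ones mentioned in the conversation; the token counts per run, the number of runs per month, and the function names are hypothetical and only there for illustration.

```python
# Prices as quoted in the conversation (USD per million tokens).
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one run under pure consumption pricing."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def monthly_cost(input_tokens: int, output_tokens: int,
                 runs_per_month: int, token_multiplier: float = 1.0) -> float:
    """USD cost per month. token_multiplier scales usage (2.0 = twice the tokens)."""
    return run_cost(input_tokens, output_tokens) * token_multiplier * runs_per_month


if __name__ == "__main__":
    # Hypothetical workload: 200k input and 50k output tokens per model build.
    print(run_cost(200_000, 50_000))
    # Same workload, 100 builds a month, on a model that burns twice the tokens.
    print(monthly_cost(200_000, 50_000, runs_per_month=100, token_multiplier=2.0))
```

The point of the multiplier is the one made in the discussion: an unoptimized workflow in Excel only cost you time, while an unoptimized prompt or agent loop scales the bill directly.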
And like, I can actually, I can understand it from the other side where this is expensive. So if you just imagine, I never quite understood how you could have things like Cowork and even a lot of the other declarative agent stuff, which still, you know, costs token effort. How can you just have almost limitless usage of that? So I think it was inevitable this was gonna happen. I think it's just the start of prices going very, very high. I'll let you go in. I'll share one other example. I'll spend 30 seconds there. There's a data center they're building here, and this just talks about the cost. Just think the energy cost, and this is just for military, for the US, that they want to build here in Utah. And the energy for it is twice the energy used by the entire state of Utah, which is nearly $3 billion. An AI data center? They're looking— it's a military AI. It's gonna be used for the US military, but it's an AI data center, 10,000 acres. It will use 9 gigawatts of data, of, uh, energy, which is twice the entire state of Utah's energy usage, just for one data center for the US military. And how are they, did they talk about how they're, how, where are they getting energy from? How are they powering it? Uh, yeah, there, there's a lot of ongoing debate about that. They're tapping into a gas line coming from our good friends to the north, Canada. And, uh, they also said they're gonna use some renewable energy. I, there's a lot of concerns I have about it. We could get, I won't go there, but just think of the cost, right? That's one data center, just the US military, and it could power roughly 6 million people or almost 2 million houses. I don't think there's any particular economic model that would point to where the pricing will cap, like where, where will the pricing, right? That's unusual to have a product where. 
I mean, get there, there is competition obviously, but yet pricing, you could be diving into a product that for which pricing continues to increase and then your only option will be to stop using it and hire a person instead, which it may be a lot cheaper. Uh, and actually people are starting to talk about that. Yeah, exactly. There's some cases where it's cheaper to hire somebody is what, is what people are going to start finding. So they'll have to settle into that balance. You're going to say something, Giles. I also, I, I do think there is, there's a lot of headway for, um, I hope Microsoft don't listen to this, but I think there's a lot of headway for prices to increase because even if you are, you know, let's say that the output from your AI tool genuinely saved 3 or 4 people's worth of FTE effort. I mean, you're getting into very significant 6-figure sums that you are saving that could go into those kind of token costs and subscription costs. So I don't know. Because I guess every company is going to end up looking at going, all right, how much do I get from this? How much do I get from humans? Probably need a bit of both. Where's the, where's the balance? Yeah, there, there's a CFO who recently said Anthropic could double the cost on everything and I wouldn't even hesitate to pay it. Yeah, I've said something similar because the value they're getting. So everybody's gonna have to make that decision. Yeah, I don't think everyone will feel that way. Like on our team, you know, we got, we got certain low-level licenses on one of them for everybody. If they doubled the costs We would def— if they doubled or tripled or quadrupled the cost, no, we would look at, I don't know what we would do. We would look at only certain people. We would, we, we're not like a lot of companies. We're not going to just double or triple or quadruple our AI costs. Um, every time a firm says, yeah, we're just doubling the price. That's not going to work. 
So we will have sort of a, there is a, there is still an, some elasticity somehow, somewhere where it is on the curve. There will be— everybody has a limit. I'm out. Yeah, I'm out. I can't, I can't, you know, will that CFO, the CFO who said that he, you know, that they'd be happy to double their price, would they feel that way if that meant 1,000 people on their team were using it and the price doubled and then doubled again and then doubled again? I know, would they, would, wouldn't everybody have a threshold at some point? That's, that's the risk. At some point it's cheaper to hire people. But also the people cost is to a certain extent more predictable and in your control. Way more predictable. There's a reason I don't have a robot doing all my meals upstairs, even though you can buy some, right? Can't afford it. I'm pleased to announce Claude is on step 8 of 8, so we have made progress in the last 5 minutes. It's unbelievable. My, my, so, you know, I use Opus 4.8, but through Copilot it ran 23. It took 17 or 18 minutes. Uh, it's done a nice job. Um, it And it did 23 steps along the way. Uh, no, 29 steps. It did 29 steps to get a model and it didn't say, so I don't know, but yours is still working, Giles. It hasn't given up on you. It's nearly done. It's on the print setup and checks. It must have read some FMI documentation about the importance of print setup. Mine was, I think it said it had 8 steps, but it's no longer, I can't see how many steps it did now. It doesn't show it. That's kind of disappointing. I'd want to see the steps when it's all done. Mine, I have to go to the very top of the, and it tells me up there what it did, how many steps, 29 steps. Yeah, no, I know it. I'm not seeing it in here. I thought it told me, but I don't. Interesting. Let me add to stage. So I've made no changes. So you can see right now, and this is something I actually did with Copilot's new personalization. 
One of the instructions I put in is please auto-fit the width of each column when you're done. Because it does this all the time. I don't know if you guys have noticed this, but yeah, right? So tiny thing. Let's— let me just expand it out. That's a beauty. I encourage you to hand that in to your boss that way with all those. You know, you could see it said, but it did mention it fixed the print ranges. Nice. So supposedly it will print. You can't read it, but it will print. So first of all, some decent points for formatting. Not bad. I mean, it's, it's try— it's made an effort, right, to try and do something presentable. Mm-hmm. Let me make it not quite that big. Let's go there. So what— how they laid it out, we can see 26, 27, 28. It did a summary. It didn't do it on the model sheet. I would have liked to see— well, did it— it did bring it in here as well. It looks like very interesting that it, that it kind of collapsed the columns and made them. So it looks like interesting enough, it did it here. This is its summary. Okay, so Let's go. I don't like the order. It's model, assumptions, summary. Not at all. It's summary, assumptions, model. Exactly. It went opposite for some reason. Okay, so I'm just going to switch that around real quick just as we read through this, and I'll put the case at the back. All right, so let's go ahead. It provided a summary, clear enough. It did, I would say, decent. Okay, EBITDA margin, EBIT, net income, ending cash, total debt, net Capacity utilization. Let's see. Can you click on one of the best case numbers? Go down and click on some of the best. I wonder what it did. Click. It looks like all the numbers are coming from below. So it put a summary up here. Well, I'm very curious to know how it's linking to— That's what I'm trying to get to. A best and a worst at the same— it was in row 80. So let me take one. Let's just— Oh, 41 and then 81. B41. Control bracket. Give me a second. All right. So like net revenue here. 
So did the— what? That doesn't make any sense. Am I— oh, so hold on. B41. Let's go to B41. And yeah, so it did a calculation here for net revenue. B38. So units sold times gross price minus freight divided by 1,000. Yeah. You don't really want to see any calculations like this on a summary sheet. Correct. And so Then this one. So it looks like— where's that coming from? What it did here is it did the calculate. So this says capacity utilization, but it's where it did the calculations for the base case. Then it did the calculations for the best case here. So it's, it's, but it did them off the assumption. It's built 3. No, because it's basically built, it's basically built 3 separate models. It didn't know how to run a base, best, and worst off one model with scenario management or, or sensitivity tables or, or switches. It basically reran all the calculations 3 different ways to get a base. That's what it looks like it did here, but it linked into the model. Okay. So it's given you a switch on the model. Has it? No, no, no, not a switch. What I'm talking— so like this beginning revolver links to the model. The model doesn't have switches. It's decided all 3 of them, the bank debt— well, that makes sense. The bank debt revolver from 25, your beginning is always going to be the same because it's an actual. But what's it done on the forecast? It's a terrible way to try to trace it out. So what's it done on the forecast? How is it picking one of the 3 models with the 3 outputs? It just linked it to the schedules it's built on this page. So it did the work on this page of one model, then it recreated 3 for each of the cases. Are they consistent? Not sure yet. Okay. Your luck. Done a decent job on the formatting. I, I guess presentable. Right. So what, so what if we go through this, let's just run through its main model. And kind of flow. Okay, do you mind zooming in a little bit just for my eyes? Yeah, yeah, I'll zoom, zoom in a little bit. How's that? Yeah, great. 
Okay, so what we have is we got our income statement, that's all linked below, that all seems to make sense. I'm not going to go through numbers, right? I just want to see— it all appears to be linked. Oh, you've got some interesting current tax numbers going on there. Yeah, we'll, we'll look at the schedules. I think it did this right in the sense of You got your cash flow statement. It's all linked below. You have your balance sheet. It's all linked below. It's using the wrong color coding, but it's okay. Balance sheet doesn't balance. Oh my gosh. Oh, it does. Yeah, I thought it didn't for a minute. I was like, okay, so it balances. Does it go up a little bit? What? It's just going up a little bit just to the cash. What was the cash line? Yeah, give me once. Let me go up to that first. You want to see cash here? I just— it's always the first thing that I used to notice. Sometimes you had numbers on cash and the revolver and God knows what else. So the cash goes to zero and stays there, which is a problem for all 5 years. Leaning on the revolver. That's the check that I wanted to see. So, so you never go cash positive. You don't have a minimum balance. Yeah, I mean, you should be able to— you should be paying it off, or— but yeah. Get it. So what it's done, which I do like, is this is one of the few times I've seen where it did not do any calculations on these schedules. Now, it did create 3 calculation details for each scenario, which is bad. Sorry, it did not. It did not build. I don't see any calculations taking place outside of the— and the structural. Yeah, that's good. It builds schedules. That's good. Where the schedules are down below. Now we're into the schedule. So what I was going to do is I'm just going to hide these columns so we can I can bring this over a little bit and kind of freeze the pane maybe here just as we're moving. All right, so that's done. 
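The cash-and-revolver behavior being checked here (cash sitting at zero while the model leans on the revolver, and no minimum balance) can be sketched as a simple step-by-step calculation. This is a minimal Python illustration under stated assumptions, not the logic the AI-built workbook actually used: the function names, the `min_cash` parameter, and the sample cash flows are all hypothetical.

```python
def revolver_step(opening_cash, cash_flow_before_revolver, opening_revolver, min_cash=0.0):
    """One period of a simple revolver: draw to cover a shortfall, repay from any surplus."""
    # Helper value: cash available before any revolver movement.
    cash_before_revolver = opening_cash + cash_flow_before_revolver
    surplus = cash_before_revolver - min_cash
    if surplus < 0:
        # Shortfall against the minimum balance: draw exactly enough to cover it.
        draw, repayment = -surplus, 0.0
    else:
        # Surplus: repay as much of the outstanding revolver as the surplus allows.
        draw, repayment = 0.0, min(surplus, opening_revolver)
    closing_revolver = opening_revolver + draw - repayment
    closing_cash = cash_before_revolver + draw - repayment
    return closing_cash, closing_revolver


def run_forecast(opening_cash, opening_revolver, cash_flows, min_cash=0.0):
    """Roll the revolver forward over a list of per-period cash flows."""
    rows = []
    cash, revolver = opening_cash, opening_revolver
    for cash_flow in cash_flows:
        cash, revolver = revolver_step(cash, cash_flow, revolver, min_cash)
        rows.append((cash, revolver))
    return rows


if __name__ == "__main__":
    # Hypothetical cash flows: cash stays at the minimum until the revolver is repaid.
    for cash, revolver in run_forecast(0.0, 100.0, [-50.0, 30.0, 200.0]):
        print(cash, revolver)
```

With a zero minimum balance, cash only turns positive once the revolver is fully repaid, which is why a cash line stuck at zero for all five years is the first thing worth checking.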
So, all right, it went off the assumptions and did the math there for units sold, which I think there was a capacity thing. So that seems to look roughly right. The revenue, okay, goes down. That looks like freight and warehousing. As I look at this first part, it goes up to 100% capacity utilization and stops itself. It feels right. That's what I would have expected for this case. Yeah. So I think the revenue schedule, just at a high level without going through every row, looks fine, right? I don't see any real issues. Now let's go to the cost schedule. It's doing a nice job. Raw materials. Yeah. What did it do here? All right. It's doing slicing. It seems reasonable at a high level. I mean, costs are going up as revenue does. Yeah, I think it's fine. You've got a single-cell inflation assumption somewhere that I think it's just linking everything to. Yeah, we can look at that. So if we look at the assumptions, let's just take a minute and see. They did the assumptions for base, best, and worst. They did gross price multiplier, volume growth, annual growth thereafter, I'm not sure. Okay. They did 2026 and then inflation for every year after, and they made assumptions for each case. They made assumptions for DSO, DIO, and DPO. That's a reasonable set of assumptions to vary. So that all seems fine. Annual case inputs: these were the inputs that were given in the case. I like that it gives you the notes. Case B11. It gives me the cells that it pulled from so I can validate it. So I like the assumption page there. Same with the operating assumptions. It says, hey, I took them all from there. So it tells me where each one came from, for example an assumption based on run rate. And it goes through and gives all its assumptions. This page looks fine.
Oh, I mean, you could be pedantic and say it's got calculations on the inputs. Yeah, I don't love that. Historical driver calculations, I wouldn't have done that. Okay, that pulls that right out of the case. The 594 was in the case study. It just hardcoded it in. That's not very good. Yeah. So that's what we have. Anything else from you guys there? You can see that. Do you know what it looks like? It's doing an index. Oh, it's just pulling it from above. It looks pretty okay. I mean, probably a bit of an improvement, well, actually a lot of improvement, on what we saw last year. Still some of the same minor issues, I guess. Some of the formatting, some of the choices to hardcode numbers pulled in different directions. Not perfect, but you've got a balancing balance sheet and the numbers looked sensible. Yep. Just as I was scrolling down, how did they do their depreciation? Oh, that's interesting. They've got a semi-fixed range. That's a reasonably intermediate modeling technique, to semi-fix a range to do depreciation over time. Yeah, I mean, I would like to see a scale, but it's not bad. Where's the new asset depreciation? All they did is they basically did a sum of the 2 years and divided it by an assumption. Yeah, but look at the next column over. Go to the next column over to the right. Now it's 3 years of CapEx. Yeah, go over to the right and then hit F2. That's what I'm saying. That's impressive that it can do a modeling technique with the same exact thing. Yeah, I get what you're saying. What it did is roll it forward each year. It's a very subtle technique to lock the first cell reference and not the second one. So you can... Yeah, yeah, yeah. It was smart enough. So yeah. That's good. All right. Great start. Giles, do you want to share your screen? I can go next.
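The "semi-fixed range" described above (locking the first cell reference and not the second, as in a spreadsheet formula along the lines of `=SUM($C$10:C10)/useful_life`, where the cell addresses are hypothetical) amounts to dividing cumulative CapEx to date by a useful-life assumption. A minimal Python sketch, assuming straight-line depreciation and a forecast horizon shorter than the useful life, so no asset becomes fully depreciated:

```python
from itertools import accumulate


def new_asset_depreciation(capex_by_year, useful_life_years):
    """Depreciation on new assets: cumulative CapEx to date divided by useful life.

    Mirrors an expanding range such as SUM($C$10:C10): the start of the range is
    locked and the end moves one column right each year.
    """
    cumulative_capex = accumulate(capex_by_year)
    return [total / useful_life_years for total in cumulative_capex]


if __name__ == "__main__":
    # Hypothetical CapEx plan and a 10-year useful life.
    print(new_asset_depreciation([100.0, 200.0, 300.0], 10))
```

The shortcut breaks down once the horizon exceeds the useful life, because early CapEx would keep depreciating forever. That is presumably why a more detailed schedule by vintage year is preferred in the discussion.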
I think, Paul, you need to... I need to stop sharing and share you, which I believe is this one here. Bear with me. I need to share my... Real quick, while you're getting set up to share: overall, if you were handed this, I mean, we haven't gone through it in detail, but just initial thoughts, kind of final thoughts? I think for me, a couple of things. I really liked that it at least did individual schedules. That's an improvement over a lot of what we saw last year. I hate the way it did scenarios, with detailed case calculations and linking them. So there are linking issues. I would have done a little more detail on the depreciation. So there are some messy things, but overall it's pretty good, right? In the hands of the right modeler, you could take this and finish it up. I'm not surprised. It's done an amazing job in 15 or 20 minutes. And yet I beg people not to ever hand that in to your client or your boss. And people will. It is not going to be a good day when you hand in something like that. It's off to a great start. And now you'll probably spend an hour or 2 hours learning it, understanding what it did so you can answer the questions about what's going on and how it is built. You're going to find issues you want to improve, formatting issues. You will spend time, and you can either iterate with the tool to have the tool do it or you'll just do it yourself. But you need to know what to look for and you need to know where there are challenges. So, yeah, I mean, I'm not surprised it looks... And one last thing I do like is that it now has a check section. There are a few more I could add, but it has a cash roll forward check and a debt schedule tie. Click on one of the balance sheet checks in the third year, let's say. Is it, uh, yeah, great. Okay. Giles, were you saying something?
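The check section mentioned above (a balance sheet check, a cash roll forward check, and a debt schedule tie) can be sketched as three small tests run for every forecast year. This is a minimal illustration; the field names, the tolerance, and the sample figures are hypothetical and are not taken from the workbook shown in the episode.

```python
TOLERANCE = 0.01  # hypothetical rounding tolerance


def balance_sheet_check(year):
    """Assets must equal liabilities plus equity."""
    return abs(year["assets"] - (year["liabilities"] + year["equity"])) <= TOLERANCE


def cash_roll_forward_check(year):
    """Opening cash plus net change in cash must equal balance sheet cash."""
    return abs(year["opening_cash"] + year["net_change_in_cash"] - year["bs_cash"]) <= TOLERANCE


def debt_schedule_tie(year):
    """Closing debt on the debt schedule must equal debt on the balance sheet."""
    return abs(year["schedule_debt"] - year["bs_debt"]) <= TOLERANCE


def failed_checks(years):
    """Return (year_index, check_name) for every check that fails."""
    checks = (balance_sheet_check, cash_roll_forward_check, debt_schedule_tie)
    return [(i, check.__name__)
            for i, year in enumerate(years)
            for check in checks
            if not check(year)]


if __name__ == "__main__":
    sample = [
        {"assets": 500.0, "liabilities": 300.0, "equity": 200.0,
         "opening_cash": 10.0, "net_change_in_cash": -10.0, "bs_cash": 0.0,
         "schedule_debt": 150.0, "bs_debt": 150.0},
    ]
    print(failed_checks(sample))
```

An empty result means every check passed in every year. In a workbook the same idea is usually a row of zero-or-one flags summed into a single master check cell.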
In all of their defenses, in terms of formatting, we're not throwing full capabilities at this, because you do have custom instructions and skills files and all of this stuff that you can add. So, I mean, 20 minutes, like you said, you could improve this quite a bit with skills. Do you know what, that would be a really interesting next episode. We go away, we do the same thing, but we put everything on it. We get our custom skills and instructions in there and better prompting. That'd be interesting. We should do that at some point. I've still got to build out those skills. And that takes a lot more learning, right? Correct. It's one thing to know how to just put in a pretty basic prompt like we did. It's a whole other skill set to know how to go in and build skills and have a library and a repo. Anyway, that's great. Giles, how did yours do? As an example, I know someone who built 81 skills for a project they're doing, and others who built 19. That's a lot of time. All right, we're gonna turn it over to Giles now. So I haven't touched it. I haven't changed the order of anything, so I'll tell you what, why don't we start? So I have summary. So you got summary last as well? I got summary last. I actually think the order is okay because it's basically... Can you enlarge it a little bit for my old eyes? It's treated the original input tab as the output, essentially. So actually you do have assumptions, schedules, output, and summary. I think that's fine. It's given me a nice little summary: 5-year cumulative numbers, and then the final year, which is fine. Just at a high level, this is my 5-year forecast here. Formatted nicely, I think. Let me just go to... sorry about that. I just wanted to see, if you look here on the right, it's all formula driven, although it used a LET formula in the model.
That seems a little bit... Where did you see a LET formula? But you know what, that's for the cash. So, if you were going to do that, I'm curious. Where's... probably here. Oh my gosh. Holy. That's what it did on yours, wow. Honestly, that's not all that bad if you understand it. No, well, it just completely proves and makes my point every single time. This is not how you want to build the revolver section. There are so many issues with this. First of all, that should not be right on the financing section of a cash flow statement. And second of all, are you really going to use a LET function, one of the newer, more complex functions in Excel, when your client or boss might just want to see a simple step-by-step add-up of what you're doing? This is going to be hard to audit and understand. I'm not liking it, even though it's probably right. Giles, were you going to say something? Well, I'd say it's really interesting. Obviously, I agree this is not how I would model this at all. What we saw almost consistently last year was that a lot of that final cash balance or revolver balance stuff was solved in the financial statements. So you could argue, if you're going to go down that route, at least a LET is trying to make the formula a bit more readable, but yeah, I wouldn't do this. Wait, it has a note there. What does the note say? It put a note on that cell. You didn't put that there, right? No, that's a flag. Oh, you added it. Okay. I thought it was like... Was there an explanation? No. I was going to be impressed. You got excited there, Paul, for a second. I did. I was like, wow, it's now giving me explanations. So other than that, obviously that is a big area of concern, I think. It's pulling from the schedules. I do.
Wait, didn't we just hardcode a 0 up there instead of pulling from a schedule? And that's where there's all zeros. I thought that was a hardcode. Balancing up for a second. Keep going right there. One of those looked like it was all zeros. Well, that's okay. I think it's the common shares issuance that was all zeros. I mean, if you look at the output, this broadly aligns with what Paul's model showed. So you don't have a cash balance, and you are drawing on the revolver year after year. Little bits look sensible: the common shares don't change, the retained earnings is going up. You can see the amortization on the senior debt, so that's coming down every period. That looks good. I think that's good. And then schedules-wise. Yeah. Okay. Again, I'm a FAST kind of modeler, so I would much rather see flags and better blocks than this. It's pulling a lot directly from the assumptions, which again is very similar to what Paul saw, which is not my preference, but on the face of it, it looks fine. Let's go back. You could, with skills, tell it to use the FAST Standard. Yeah, you absolutely could, but that all looks pretty consistent. And what do my assumptions look like? So here's a big improvement on yours. If this does work, I've got a scenario switch. I'm assuming if I change that to 2, it goes to best. How's it doing that? Exactly how I would do it. INDEX. You could use CHOOSE, INDEX, SWITCH, whatever you like. Probably not SWITCH, but CHOOSE. Yeah, you could. I wouldn't use SWITCH. You could use a nested IF, Giles. You could. But I think that's impressive. So it's got a proper scenario tool. I haven't got time to check everything. It's doing the same sort of thing with inflation here, with various things linking back to an inflation rate.
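The scenario switch praised here (one model, with a single selector cell that picks the base, best, or worst assumption through something like CHOOSE or INDEX) can be sketched in a few lines. This is a minimal Python illustration under stated assumptions: the assumption names, the values, and the toy revenue formula are hypothetical and are not taken from the case study.

```python
SCENARIO_NAMES = ("base", "best", "worst")  # switch values 1, 2, 3

# Hypothetical assumptions, one value per scenario in the order above.
ASSUMPTIONS = {
    "volume_growth": (0.03, 0.05, 0.00),
    "price_multiplier": (1.00, 1.05, 0.95),
}


def choose(switch, *options):
    """Mimic a spreadsheet CHOOSE: 1-based selection from the options."""
    if not 1 <= switch <= len(options):
        raise ValueError(f"switch must be between 1 and {len(options)}")
    return options[switch - 1]


def live_assumptions(switch):
    """The single 'live' column of assumptions that the model reads from."""
    return {name: choose(switch, *values) for name, values in ASSUMPTIONS.items()}


def revenue_forecast(base_units, base_price, years, switch):
    """One set of calculations; only the selected assumptions change."""
    live = live_assumptions(switch)
    price = base_price * live["price_multiplier"]
    units, revenue = base_units, []
    for _ in range(years):
        units *= 1 + live["volume_growth"]
        revenue.append(units * price)
    return revenue


if __name__ == "__main__":
    for switch, name in enumerate(SCENARIO_NAMES, start=1):
        print(name, revenue_forecast(1_000, 10.0, years=3, switch=switch))
```

The design choice is the one debated in the episode: the calculations exist once and the switch changes the inputs, instead of three copies of the model or typed-in "dead" numbers for best and worst.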
Formatted, this is very much in line with how half of the Excel financial modeling world formats. It's always either blue fill or yellow font, I think. I'm pretty happy with that. It took a lot longer than the two models you ran, but I think that's looking pretty good. So what I like is that it did a better job with its base, best, and worst case, no question. I think they were both similar otherwise. One area I do like on the assumptions page that mine did is it told me where it pulled it from the document, and I didn't ask it to do that. Like, it came from this cell, or we used averages, so I could at least trace it all out. On the whole, I think this one followed some best practices a little more. I don't know, did it give us a check section like the other one did? I haven't seen one. And again, I think that's a huge plus point for yours. I mean, I do have the balance sheet check. Yeah, you have the one. I don't have anything else, I don't think. I probably lean slightly toward yours so far, but there are pluses and minuses in both of them. I think because of the scenario... exactly. No, I'm not trying to defend my tool because it's my tool. No, no, no, I agree the scenario thing is probably a big thing, because that's kind of fundamental to how you actually build scenarios in a financial model. Agree. If there's a flaw in the one I did, it's that the way it did the scenarios is just a mess. That's D-minus type work, right? Just bad. 2 out of 2 balancing balance sheets. Even that's a step up from when we started. Do you remember when we started, months ago? They couldn't even build anything. I mean, it was awful. It was horrible 6 months ago. I remember one time Ian said, are we really going to show this? I'm embarrassed to show this. That tool has shut down, so we could name it now. We won't. But to where we're at now, it's a huge improvement.
All right. Why don't you show us what you got? We'll spend a few minutes there and then we'll wrap up. So mine is definitely formatted the worst out of the three, but there's something similar. Again, I used Opus 4.8 right in Copilot, and there's something similar feeling about our two, right, Giles? There's a switch over here. Well, the switch is actually only going into sales prices and sales volumes. So here they are. And it uses the CHOOSE function. It's okay. I like that. Very nice. I looked at these formulas. They're mostly fine. I mean, this is 2026 and they're not using color differences. It's okay. Not super clear. If I go to the schedules, there's a sheet of schedules, not very long. It just has some very simple schedules, with pricing, some cost assumptions, long formulas here. Look what it did for depreciation. It's okay. Let's see here. I mean, this is all good. Working capital, depreciation schedule. Did yours do one? I think that's the first one that actually did a schedule by year. Did yours, Giles? Uh, mine built a summary sheet. I don't really understand it. It doesn't look very good. It built a summary sheet for 5 years under these 5 items. So anyway, the best and worst cases are just dead numbers. So we need to realize that they're not actually adapting. They're dead numbers that were typed in. I don't know. Listen, I think it's off to a nice start. The financial statements look fine. They're not quite client ready yet, but they seem like they're okay. All these cells are links. I looked at it quickly before. After we ran it, I was looking at it, and it does look like it's built some pretty good formulas. I like the cash available before revolver, like a kind of helper cell. That's right.
But I haven't found any big errors. I have only been looking at it for 10 minutes, but I haven't found any big errors. It's off to a nice start. Here it's got the base, best, and worst, and it's typed in dead numbers. There are no formulas here. So I don't know. I don't like that. It is a base case. So I just have to trust that base, best, and worst are working properly. I don't know. I would be happy to use this as my starting point. Sorry, Giles. Isn't that fascinating though, which was kind of why we did this deliberately, that you and I have both used the same underlying LLM, Opus 4.8, but mine is directly through the Claude add-in and yours is through the Copilot add-in, and yet the results differ? It must just be the additional context and whatever else is going on in that processing step before and after. I think more than that, a lot of it also comes back to, one, what skills it's using. So what skills does Copilot have inside? What tools is it using? Yeah. How is it pulling the data? But the other thing is it speaks to the fact that this is probabilistic in nature. And that's why, have you guys heard the term AI harness? Giving AI a harness is basically everything beyond the prompt: all the context you're giving it, the files, and the Claude skills. Because, right, if you're climbing or whatever you're doing, a harness restrains you and protects you. And it's kind of similar with AI. It needs that additional structure and guidance, or we end up with these three very different versions. I'm sure if we all built a lot of skills, we could get much closer to each other between these three models. There'd still be ones that would be better, and differences. Yeah, we could close some of that gap. Yeah, yeah, yeah. That's a whole new thing people are expected to learn now.
You gotta know how to model really well, but now go ahead and figure out how to do AI really well, and then figure out how to be a good auditor. Now instead of someone else auditing it, you gotta audit the whole thing. Not that you didn't audit before, but in a different way. And you know, auditing someone else's work is very different, right? Like for our brains, for many of us that have been building things, it's a very different learning process in your brain to build and to fight. The struggle and the fight are important, right? When you're creating, that's the creative process. And a lot of people's learning is absorbed and it happens through osmosis during the struggle, during the fight of figuring it out. And that's important. So I know for me, that's how I often learn things, by fighting and struggling and learning through the wins and the losses to get there. And then I'm very, very smart about a topic at the end of it. For me, if you gave me a fixed model and you just said, learn it at the same level of intimacy and just review it, I don't know that my brain can actually get to the same level of confidence with it, or that I can have the same level of deep understanding just by reviewing it. Now I'm going to have to, because that's the way the world's going, and we will. But we all have to find a way, you're right, to look at this and still rip it apart, understand it, know what it's doing right, what it's doing wrong, and be able to defend it as if it's our own. And that's a bit of a big mind shift. I don't know if you agree with that, Giles, but that's a shift for me. Yeah, I do agree. And the more we look at this, I think we've all been on a journey since... is this like the wrap-up session? Is it worth me sharing? I feel like we're in the wrap-up session, Paul. Yeah. All right. That's it. Let's share our closing thoughts.
Um, I think back to when we started and it was longer than 6 months ago. And at that time, a year ago now, is it really? Yeah. 9, 10 months. First episode we did was October. Really? Okay. Which means we recorded it in September, which probably means we first discussed it in August, late August. Yeah, we were planning... yeah, I think back to when we started it and this whole concept of agents that you can draw on within an app, and it wasn't there. And that's why I think we started with third-party tools, because they were the only ones that had kind of put that layer on. So the world has completely changed. I think we can see clear improvement. The fact that I think we just all had balancing balance sheets. Awesome. And, you know, the more we get into this, the more I'm convinced that focusing, and this is not a criticism of what we've just done, but the focus on "can it build me a financial model so that I don't have to do any of the work" just feels like it's not the pot of gold at the end of the rainbow for me. And that's partly now because I'm looking at that space and going, God, you've got teams up to their eyeballs in all this other stuff. Do you remember Ian Bennett talking about all the value you could get from all of the other areas of a modeling project? And I think I'm probably even more convinced that that is where the value is at the moment, at least. And you know what, figure all that stuff out if you're a big company, and then by the time you've done all of that, we'll be on Fable, we'll be on Mythos version 9, and you won't have to get out of bed, right? You'll think it, you'll just have Alpha, no name, just think something and it will develop and you'll, you know... Yep. Yeah. That's a scary thought if it can read my thoughts. Any last thoughts from you on this, Ian? I think we got Giles' dissertation. Let's move to you. No, I agree. I agree with Giles.
I think this is a different world than 9 months ago when we started. It was only at the end of the year, remember, it was at the end of last year when the LLMs announced that they were starting to incorporate. It wasn't even considered in the fall. We were looking at all these other tools and I haven't kept up with how they're all performing, but I suspect it's a challenge for many of them, you know, where they need to differentiate themselves, obviously, if they're gonna survive. Now, people are desperately learning, going deeper, as you said, into AI. They're trying to code, they're trying to learn skills, they're trying to kind of get under, you know, and think about it in terms of automating their day and their monthly close and their life and their emails and connecting it all together. We weren't talking about anything like that 6, 9 months ago. And from what we're seeing today, obviously these 3 models that we built today, I don't know if there's a real winner. I mean, they're all similar, I would say, but they are much better than what we looked at 9 months ago. But compared to the models from February, March, yeah, maybe a little bit. The balance sheets balance, which is nice, but you're still, we're still good with all of them. None of these is ready to hand in. And that's even scary if people think that they have a model ready, because you're missing the point of the learning process, I think. And yeah, I don't think there's a marked improvement, like you said, from Opus 4.7 to what we did today. But I'd be really intrigued if you two are up for it. I think we should do a, whether it's the next episode, whatever, I think we should do it like, okay, let's give it our best shot. Like, let's actually put in all of the custom instructions we would think of, and skills files and whatever else, because I'd be really intrigued to see, with our best efforts, how good could we get it straight away?
But that takes a lot of skill on our part to know modeling, to know accounting, and to know how to really be a strong, you know, a musician with this tool, to really work our models and get it right, right? That's going to take a lot of time and effort to get there. And I'm not saying we shouldn't, we should, but that's not free. Like, that comes with investment, right? That's a real investment. We're just invoicing Paul for our time, aren't we? Is that correct? Right? Yeah, yeah. Hours and hours. I'm billing it to Anthropic. Yeah, let's get Nico to pay for it. Uh, what are your closing thoughts, Paul? I think it's similar to the two of you. One thing I want to say is, do you remember when we tested Copilot online, right? That was an episode. Think how far we've come from that. Like a lot of people have badmouthed Copilot. I've been guilty of it in general. I think Claude's a better tool right now, and I think most people do. That's pretty clear, but how far it's come, that's what I keep telling people: you can get value out of it. I don't care which one you're using. One message: you can get value out of all of them. Two, none of them on their own are production ready. Could you potentially get there with skills for certain tasks, for certain things? No doubt. Full financial model? Maybe, depending on what you're doing. But even if you do, you still have to know how to audit it. You still have to know how to check it. I'm just like the two of you. I say AI is a magnifier. The better you know what you're doing, the more you can get out of it. The one thing I'm so sick of is the AI slop we're seeing all over the place of "do this in 5 minutes." And I will repeat what I've been saying all along. Go ahead and build the model in 5 minutes, get fired in 10. You know, it's just stupid if that's the way you think these tools work. No, there's not a magic easy answer to anything.
And that's true with modeling, but you can get huge benefit if you're willing to pay the price, I think is the message. Yep. I agree with that. Gents, always great to see you. Good to see all of you. That's a wrap. Uh, we'll do this again next month. So thank you for joining us, everybody. We hope you made the whole hour with us. The Mod Squad, we are the Mod Squad.
