
AI-Assisted Development in 2026
SaaS That App - Building Tech-Enabled Businesses · 2026-06-09 · 46 min
Conversation analysis
Computed from the transcript - who did the talking, and the verbal tics along the way.
Share of words spoken
- Speaker A: 66%
- Speaker C: 20%
- Speaker B: 14%
Episode notes
Claude Code nearly wiped out a production database while racking up an unexpected AWS bill overnight, and that’s just one of many stories from guest Daniel Cannon. As CIO at Delta Systems and CEO of StriveDB, Daniel joins hosts Aaron Marchbanks and Justin Edwards to share what AI-assisted development actually looks like in 2026: real workflows, real mistakes, and why human review is still non-negotiable.

What You’ll Learn:
- How to structure an AI-assisted development workflow that delivers 10x speed gains without sacrificing code quality
- Why Big Design Upfront (BDUF) is making a comeback with AI
- The critical difference between letting AI write code vs. reviewing and refining it
- How to build safe, controlled AI agents for sensitive workflows
- Why context management directly impacts both cost and code quality

Daniel Cannon is the Chief Innovation Officer at Delta Systems and Founder and CEO of StriveDB, bringing a wealth of experience in modern web development frameworks and architectures. His expertise spans full-stack development, with particular depth in Ruby on Rails and modern JavaScript frameworks.
Full transcript
46 min · Transcribed and scored by The B2B Podcast Index.
Since switching to Claude, I've been able to do a lot more, like, big features. I mean, I have now done this multiple times where Claude has gone and built a huge feature and I've been like, okay, that's cute. Thanks for all your work on that, Claude. Git reset --hard HEAD, context clear. All right, now I'm going to write a brand new plan. And you could never do this with traditional software development, right? It's just way too inefficient. Welcome to SaaS That App: Building B2B Web Applications, the podcast where we share real-world stories, practical advice and tech insights for those building or thinking about starting a tech-enabled business. I am your co-host, Aaron Marchbanks. And I'm Justin Edwards. Each week we bring you the stories, strategies and insights you need to build your SaaS or tech-enabled business smarter, not harder. Let's dive right in. We are back. It is May 2026 and we are having another fantastic episode of SaaS That App. With me, as almost always, is my bespectacled and bow-tied companion, Mr. Aaron Marchbanks. How are you doing, my friend? Doing very well. Enjoying the lovely spring weather and getting out and doing my runs again, which is awesome. No treadmill for me. It was too hot here yesterday. I had to give up and go to the gym and do it on the treadmill because the feels-like index was over a hundred degrees and I was like, that sounds sinister. I know. I'm already mapping out where I'm going to go when I come to see you guys. But then I forgot to take into account the heat and humidity. So it may be like, you know, oh-dark-thirty or something. Get out, get after it in the morning. Otherwise you're going to pay for it, or else you just have to run slower. It's not the worst thing in the world. No, easy's good. Today we are doing a second installment, second episode in our informal AI cloud coding series and the death of software engineering and all of that. And to that episode, we have invited one of our internal programming geniuses, Mr.
Daniel Cannon. Welcome to the pod, sir. Hey, how's it going? Have you heard of Claude? What's that? Oh, man, it's going to be a rough episode. Oh, gosh. We were relying on you to bring it. Justin and Daniel, you got to carry this. Is it kind of like ChatGPT? Is that kind of what it is? I think I've heard something about this. It's kind of like that. Yeah. Okay. Okay. Yeah, yeah. All right, let me Google it really quickly and, yeah, I'll get ready for this episode. Actually, just go to Claude and type in: I'm actually on a podcast right now where I need to talk about Claude Code. Help me out. It'll probably do my work. Yeah, I should do that. I should do that. That's a great idea. Who are you and how can you help? No, but jokes aside, can you let folks know a little bit about how you're using AI tools for coding, what your experience has been, how that kind of looks? Yeah, absolutely. I mean, Claude has been an interesting thing that's come on the scene. It's been a dynamic and crazy year or so here, getting used to incorporating AI tools into life and everything that we do. It's been a roller coaster. There have definitely been moments of "AI sucks" versus "AI is taking my job" versus "AI is kind of okay". We're just kind of all over the gamut. But yeah, I mean, I would say I use Claude on the daily at this point. Sometimes I use Codex. I think in general I'm still a much bigger fan of Claude than Codex, but, you know, it has its place in my workflow for sure. And yeah, it's still an evolving workflow. I mean, I think we're all figuring this out. If anybody says that they know what they're doing, they're lying, because this is all extremely new. But yeah, I don't know. I mean, what specifically? Where do you want to go with this? What do you want to know? Walk me through a typical case. You're about to write something, maybe you've got a client project and there's, like, a medium-sized new feature that you're going to implement.
Maybe you're adding, like, a new layer of abstraction related to different locations. Let's say it's some kind of a service that has employees and customers, but you need to install locations into it post hoc. How would you tackle that problem today in 2026 with Claude or whatever? Well, I can tell you what I won't do, which is I'm not going to open up an IDE. That's about the last thing that I do. I'll pretty much only open up an IDE anymore if I'm making, like, copy edits. Like, that's about the only time I would really open up an editor. I mean, yeah, absolutely, I would definitely implement that using Claude. The workflow that I've sort of gravitated toward is actually kind of more of a BDUF, like a big design upfront, workflow for larger features. Which is so contrary to how I've always preached development. So it's been a bit of a mind warp. But it's basically like I will sit down with the feature, with the idea, and I will write a spec. I will write a spec with bullet points about these are the functional requirements that must be satisfied. I'll write very strict rules about this is how it needs to be implemented. Even though it's in CLAUDE.md, even though it's in skills, I will reiterate that you need to follow the conventions in the project. Here are what the conventions in the project are. Don't go off and do your own thing. If you deviate from the plan, you need to talk to me. And then, I mean, depending upon the largeness of the feature, there's a few different ways that I would go about doing it. One is using a plugin called Superpowers. Superpowers is very cool and very powerful, but it has some major drawbacks as well. With Superpowers, it's basically a huge set of skills as well as a workflow, and it's very opinionated about how you're supposed to be developing software. So, like, it wants to do test-driven development, it wants to write tests first.
That's kind of contrary to my usual workflow, but I find, actually, working with AI tools, sometimes TDD has its place. And so it takes you through, like, a brainstorming process where it will read your ideation document, which is what I always call the first spec that I put together from brainstorming. It asks you a bunch of questions. Sometimes it does, like, a shared browser thing with you and you can work with it together on designs and layouts. And then it kind of comes up with a plan and you go through and approve every step of the plan, and then it comes up with the spec, and then you go through and you give it feedback and approve the spec. And then at that point, like, the typical Superpowers thing is you just kind of hit go and it will usually spin out subagents. Depending upon how complicated it is, it'll spin up subagents. There are ways to kind of tweak it to use different models inside those subagents, so you're not always using, like, the most expensive, intense model, for example. So then you can kind of let it run and it goes and builds git worktrees and it goes and, you know, builds the code and then it comes back and asks you to do manual review. And so this is, like, the philosophy of how Superpowers is supposed to work. And I'm moving away from that. Honestly, my experience has been it goes off and it works for many, many, many hours. And then it comes back and it doesn't work and it doesn't follow the spec and it's come up with its own ideas about what it was supposed to do, which isn't what I told it to do. And so then I get to spend hours yelling at and cussing at the bot. By the way, there was a leak a while ago of Anthropic source code. And one of the things that came out of that is that they are tracking the profanity that you use with the bot. And so I've just leaned into this in a big way. Like, for me, this is filing a bug report. So I just swear at Claude and then I'm like, great, now that's going to Anthropic.
And they know that the bot screwed up. What I have found is that this is problematic. And so I'm sort of shifting myself away from just doing this hands-off, sort of lazy approach of letting the bot do it. And I'm actually going back to where I sort of started, which is I'm approving every single edit as it goes through. And that way I can kind of course-correct as we go. In some ways, it's a much harder way to work. But I actually feel like it's a go-slower-to-go-faster approach. And Claude is already giving me, you know, even working this way, I'm getting a 10-20x speedup working with Claude. And so I don't need to squeeze every last drop of juice out of this. Like, let's go 10x. 10x is fine. 10x is fast. Let's get it done and then we get it done right. And we're not pushing up these pull requests that are thousands of files. And then, you know, having it do its automated peer review and it finds all these weird edge cases that it thinks it needs to robustify against. It's like, stop, stop. Like, let me be the developer, let me work with Claude. So that's more how I'm working with Claude these days. But it's evolving. I was going to say, that's been a chief concern of mine since the outset. Is there going to be a mad rush to the extreme? Look at all these amazing tools. I'm on the bleeding, bleeding edge of amazing. And you're going to have a lot of people using that that aren't like you, that aren't maybe as senior or as architecturally minded. And it's going to be fine and it's going to result in a lot of stuff happening very fast, but poorly. And the poor part won't show up until you add a feature or you scale, or you need to do something more with it and somebody else looks under the covers and is like, what is this mess? So that's always been a concern of mine. So it's interesting to me to hear that you've had that type of an experience.
We have a long-celebrated trend in technology where everyone identifies a silver bullet and then promptly shoots themselves in the foot with it. That's just how that goes. So I guess one question I had, Daniel, is you talked about, like, the planning phase and kind of talked about Superpowers a little bit. So do you still do a big planning phase if you're going to approve every individual edit, or are those two modalities that kind of got mixed together? Yeah, no, I mean, I still do the planning, I think. You know, again, it depends. How complicated is this feature that I'm doing? If it's a bug fix? No, I don't go through this. I mean, with the bug fix, I'm like, Claude, this is broken. Fix it. And then I let him fix it and then I peer review the fix. Like, this is more like doing a big overhaul, doing, like, a big feature. Something where it's, like, very hard to chunk this out as small things. Like, no, we're doing a big thing. And since switching to Claude, I've been able to do a lot more, like, big features. So, you know, it used to be a big feature was a once-a-month thing. Now it's like, yeah, once a day I'm doing a big feature and then I'm also doing small little things at the same time. And I can switch contexts between different agents that are running. And, like, it's been a huge speedup in that way. But I do think the planning is useful. And, you know, one of the things that I've really found, especially for features that are not super developed, by which I mean, like, from an idea standpoint, it's actually been very helpful for me to let Claude kind of go off and build something and then see what he built and then learn from that and be like, huh, okay, so there was good here, there was bad here, we went down this wrong path, you know, and then I will literally just throw it out. I mean, I have now done this multiple times where Claude has gone and built a huge feature and I've been like, okay, that's cute.
Thanks for all your work on that, Claude. Git reset --hard HEAD, context clear. All right, now I'm going to write a brand new plan and I'm now going to have all the learnings from everything I did wrong that time. And you could never do this with traditional software development. Right? It's just way too inefficient. But it's not that big of a deal. You know, you might use up your five-hour usage window and have to go, you know, heaven forbid, take a walk through the park or something. But it's actually been really cool to just kind of go down a rabbit hole and then just kind of reset and start over real quick. This episode is brought to you by Delta Systems, which is what Aaron and I do when we're not talking through microphones to you, the people of the Internet. We've got a really, really great software team here and we love to work with cool people on cool projects. So if that sounds like you and you've got a problem or you're in some kind of a jam, go to deltasystems.com, grab a time with us, we can beat up on your problem together, and if there's a fit there, amazing, we'll help you out. So deltasystems.com, grab an appointment and hey, maybe we can work together. Yeah, I remember early in the days of our company especially, but I think probably this is true of a lot of consultancies, there was a time when you spent a whole lot of time upfront going through what is now known more as a discovery phase. But, you know, back in those days it was almost a loss leader of putting together this big blueprint. You know, I mean, it's like a bible, like, here's the document of what it is that we're going to build, and that's what ultimately would get you approval from a client. And, you know, that's what we'd tack estimates to, that's what we would tack timelines to, and you would just build that exact thing. Obviously somewhere along the way we've gotten better and better and we moved away from that because we can build quickly and get through this iterative process. Very agile-ish.
You know, and I feel like now we've got a really nice opportunity to kind of come back to that a little bit, spend more time planning, because we are going to be babysitting agents as much as anything else. And you can really take the time again to say, here's what I really want, and get more maybe out of it on the first pass than you would without it, with just kind of going through the process. So that's cool. Yeah, I agree. And I mean, I think it's been kind of weird to sort of become a little bit more of a Waterfall developer in a lot of ways, of, like, laying out requirements up front, but also still recognizing that you can be very nimble at the end. And so even if your spec is totally wrong, which they all are, every spec is wrong, every spec is completely wrong. And that's what Waterfall always failed to acknowledge. It's like, great, yeah, you spent three months developing the plan for how to build it and then you built it and you built it according to plan, but the plan sucked. And so now that's all just waste, you know. So instead, working with Claude and working in this more nimble way, it's like, great, the plan sucked. And then at the end we can see what's wrong and we can fix it, we can iterate on it, or we can even throw it all out and start all over. It's not that big of a deal. That's right in line with one of my favorite quotes, which I think is attributed to Patton. But it's: plans are useless, but planning is everything. And I think that's really true in software. You mentioned AI-assisted code reviews. I've had some struggles with that specifically myself. I'm curious. Just give me a fresh take on that. I know there are a lot of things that are getting very popular right now for doing automated peer reviews and automated peer review fixes. I'm so opposed to that. I think it's such a bad idea. But I mean, that does seem to be where a lot of people are going. So I mean, like, I use Claude and Codex for doing code reviews, and I do them very frequently.
Like, I won't just do it when I'm done with the feature, I'll do it kind of in the middle and I will just sort of prompt Claude, for example. And, you know, there's kind of two plugins that I use that already have, like, peer review skills built into them. But even on top of that, I still give it a prompt, which is kind of: imagine that you are X. This is, like, AI just loves this. AI is great at role playing. It's kind of where a lot of this stuff, you know, these LLMs kind of came from, was this role-playing world. And so it's like, yeah, put on the hat and pretend that you are an X. And I will tell it, pretend you are a senior Ruby on Rails developer with centuries of experience developing AI tools, whatever it can imagine. So I'll tell it to do a peer review that way. I'll also do it, like, from a user perspective. Like, you know, pretend that you are one of our users, pretend that you are not a computer-savvy person, pretend that you have a disability, you're using a screen reader. How would using this app be? What are things that you would flag? And it's a little tedious to have to, like, give it this big verbose prompt. And everybody always says, oh, well, if you're giving the same prompt again, you should turn that into a skill. But I find that repetition is everything, and just reminding the bot again and again, like, this is what I'm trying to get out of it, is actually really helpful to getting better results. And so I'll do a code review with Claude. I'll also do a code review with Codex, and they usually come back with a list of findings, and they'll have findings that are critical, you know, major findings. And initially when I started doing this, I'd be like, oh, well, yeah, like, let's just fix that. And so I'd be like, fix everything. And you'd end up with these pull requests that get pushed up that would have, like, database migrations that were adding these crazy constraints and enabling Postgres extensions to be able...
And it's like, what are you trying to solve? It's like, well, we don't want two of these things to have overlapping time frames. I don't care. That's not even a data consistency problem. How did you get this in your head that this was a problem? So I've learned to be a lot more critical and I'll read the peer review feedback very cautiously. I need to make sure that I understand every single thing that he's flagging before I say go. And for the most part, unless it's, like, writing tests, like, tests I'll let him do in auto mode. But for the most part, if he's actually doing code changes, I'll literally just step through every single edit with him and watch in real time what he's doing. Because so often he starts, I think I know what he's doing, and then I see what he's doing and I'm like, whoa, whoa, Claude, why are you touching that file? Why are you touching the file? Do you find that you get better results when you have... I know that there's a trend right now, obviously, to have something else check what the other thing did. So, you know, if you're having Claude write it, have something else do the check against it. But I'm curious what your experience has been of having it check itself versus having a secondary thing check what was done in the first. I would say that in general, Claude is actually doing better on code reviews than Codex is. But Codex does very often find things that Claude doesn't. There's kind of two modalities that I like to do those code reviews in. So one of them is with context. So with the context of what it's been working on itself. And then the other is, like, blow up the context, clear the context, start from scratch. You're coming into this brand new. Look at this code. What would you think? So I will usually do both of those with Claude, and then I'll also do an independent review using Codex, and then only after we're done with the machine code reviews and the fixes.
That's the point where I'll then create a pull request and then I go through and I do my own pull request review. I go through and I add comments directly to my own pull request. And I then actually have a script inside of Claude to pull in those peer review comments and he can batch them up and start working on them. So we'll do that. And then after I'm done with that, that's usually when I'll, you know, tell Justin or Nolan or someone, like, okay, this pull request is ready. That's when I will actually switch it from being a draft pull request to being a marked-as-ready pull request. And that's when I'll say, like, great, now I'm ready for you guys to come and beat up on it. And then usually they go and find all sorts of things that I missed and Claude missed and Codex missed. And yeah, it's a lot more steps on review, a lot more work on reviewing code. And that's actually where the bottleneck for our shop has switched, from how fast can we write code, how fast can we write features, to how quick can we review it and ensure that it's good. And the AI helps us do that faster as well. But it's been a weird thing now. I think the correct thing is to remain incredulous about these tools and their perfection, or alleged perfection, because they come up lacking a lot, especially when you're dealing with critical data, mission-critical software, anything like that. The human review step is still key for us. And we still find things all the time. AI finds tons of stuff that I miss, but I also find things that AI thinks are fine. And I think that's been everyone's experience here. We are so far from this automated future that everyone thinks that we're already in, it seems. We were talking to this one guy who is using AI in this production system where every time there is an issue that gets raised inside of Sentry, he is creating a Jira story from that. Or actually, he's probably using Linear because he was super cutting edge.
So he creates a Linear task for that Sentry story. I'm like, all right, I'm with you so far. That's cool. And then we spin up a container, and we have the bot go ahead and author a fix. I'm like, oh, that's really cool. And then he generates a pull request, and a human being reviews that, and your QA team tests it, right? He's like, oh, no, no, no. Then we just patch it directly into production, and then we send an email to the person who encountered the error to let them know that the error is fixed. No. Yeah. Feel like you're missing a step or two in there. Like, there's a lot that you said there that was so good, but for the love of everything that is holy, do not let your bot write code into production. Like, no. Everything needs manual human review. Like, we are still very much at that stage. And, yeah. Oof. So I guess I didn't realize you were an AI skeptic when I invited you on to talk about code. I'm not. I'm not at all. I'm not at all. He has sadly used it enough to find where the pitfalls are. Yes. And they show up fast sometimes. Well, I'm 100% with you. And if you want to talk about prompt injection, like, okay, cool. I've discovered that there's this autopatching thing. Now I'm going to try to cause an error in the system where I can do prompt injection, and then I'm going to get a backdoor AI patch through a production system. That's not outlandish. It's not crazy to think that that could be done. And I disagree respectfully with that gentleman about that being a good idea. Well, there was a lot that was great. Like, I'd actually love to get that set up. We're not doing that right now, but, like, yeah, when you have a Sentry issue, like, sure. Create an issue and, you know, spin up a bot in a container and author a fix and create a pull request. Because, I mean, so many, like, you know, 500 errors that you find in your Sentry logs are like, yeah, somebody missed a nil check. Like, I'm sure the bot can handle that without any guidance. So great, write it.
But it is so easy to accidentally corrupt a production database or introduce a huge security vulnerability. And I've seen Claude do some dumb shit. I have just seen some of the incredible things that this bot has done that he thought were right and that were even sneaky, where they could get past human review in a lot of cases. So it's like, no, no, you got to be very, very, very confident. It's fascinating for me and it's educational for me, as if the audience doesn't already know. I haven't been a full-on developer for so long it's ridiculous, but I understand concepts and I still could fumble my way through some PHP and platform stuff. But it's really interesting to me because I feel like almost this is providing an avenue for me to get back more quickly, because I use it as an educational and a learning tool as much as anything else. And then doing that, I can fast-track myself getting back into a more usable state and doing something specific and small. Certainly nowhere near like what you guys are doing with the full-on features and large builds. But I can identify things, I can go through, and I can also at the same time prompt it to say, what is it that you saw? Teach me what you did. Again, to your thing about it imitating things: pretend that you are a genius and I am an idiot and explain it to me like that, but get me to where you are, and it can step-by-step me through. And I found that to be really, really cool. And I use it outside of work for a lot of stuff like that too. But it's really neat to be able to feel like it's kind of back within my grasp to become a developer again if I wanted to. I'm not heading down that road, but it is really nice to kind of not feel like I'm so far down a dirt road anymore. ELI5 should be coming up a lot in any kind of private offline chats you're having with AI about things you're not an expert on. It's just like, oh yeah, start with the elementary understanding and then move on from there.
So I like that a lot. Well, I guess you're kind of uniquely positioned in our organization to answer this question, Daniel. So I'll toss it out for you. If your goal is just to consume as many tokens as possible and run up a credit card bill, token-max, do it. Our resident token-maxer over here. How do you... No, jokes aside. No, no, no. I've got some stories, unfortunately. Yeah. So, yeah, I don't know why I'm the one who always seems to have these issues, but I'm the one who has these issues. So, yeah, there's kind of two sides to that. Let me first tell the story about how I ran up a $500 bill with AWS. So that was fun. In one day. In one day, I should add, using a little bot who was doing some email triage for me. I wrote these bots. They're super cool. They're just my own little personal ecosystem of agents. And they run and they do email triage, they review my email, they categorize things, they can draft responses, they can do all sorts of things like that. And initially, my token usage was very small. I built this using Amazon Bedrock, which is a super cool service. And that was part of why I did this, is I just wanted to start getting familiarity with Bedrock and to learn all the pitfalls of it. But basically, Bedrock is the thing that runs inside AWS. It's tenant isolated. Nothing that you put up there gets used for training data. It purges data immediately after it gets used. So basically your prompts go into the bot, the bot spins up, and then the bot goes away and your data goes away. And so it's very unlike working with Anthropic or OpenAI, where you're sending your stuff out to a third party. Everything stays inside the Amazon ecosystem. So that's really cool. And they have their own copies of Anthropic models and OpenAI models and Google models that they just run in this infrastructure. And so I had a couple things go a little haywire that I didn't realize were going haywire.
So the first was I was messaging these bots over Slack and it never really occurred to me what kind of context they were maintaining. It turns out, like, the entire Slack message thread, every single time, was getting sent up. So I would be like, hey, what was that email that I got from that guy named John? And it would go back to the very first message that I ever sent it. It would read the entire email thread. It would post that up along with all of the other context that was in the project. And basically, the more context you use, the more tokens you use, the faster you burn it. Also, I had set this thing up to use, like, lower-level models, I thought. Turns out it did not get set up that way. It was using Opus 4.7 with 1 million context for every single query that went up. So fortunately I caught this very quickly. But yeah, I did $500 in one day on Amazon Bedrock. So the big lesson learned there is put budget alerts in place on Amazon for sure. Set those immediately. Also, if you have something that's calling out to AI services, what I basically built is a budget tracker. So at the level of the API request, before it goes to Bedrock, we do our own estimate and analysis of how much we think this is going to cost, how many tokens we're using, what model we're using, and then you kind of track and monitor it, and we have, like, a hard limit that's in there where, you know, if it goes over $5 a day, the bot just shuts off and says, no, we're not going to process it. And since implementing these changes, I'm now down to, like, you know, 40 or 50 cents a day, using these bots pretty heavily, which is more reasonable. So we're going to see if Amazon's going to give me any of my money back. This is my own personal thing. If anyone in Amazon is an engineer who wants to sev to us for a refund, reach out for the account number. If you could reach out to your boy Jeff and let him know that I'd like my $500 back, I would really appreciate it.
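The two fixes described above (sending up less context, and a budget tracker with a hard daily limit that runs before each request) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the episode: `BudgetGuard`, `trim_history`, the model names, the prices, and the four-characters-per-token estimate are all hypothetical, and only input tokens are counted.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

# Hypothetical prices in USD per million input tokens (not real pricing).
PRICE_PER_MILLION_TOKENS: Dict[str, float] = {
    "small-model": 1.0,
    "large-model": 15.0,
}


class BudgetExceeded(Exception):
    """Raised when a request would push spending past the daily cap."""


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def trim_history(messages: List[str], keep_last: int = 10) -> List[str]:
    # Send only the most recent messages instead of the whole thread.
    return messages[-keep_last:] if keep_last > 0 else []


@dataclass
class BudgetGuard:
    daily_limit_usd: float = 5.0
    spent_usd: float = 0.0
    day: date = field(default_factory=date.today)

    def authorize(self, model: str, prompt: str,
                  today: Optional[date] = None) -> float:
        """Estimate the request's cost and record it, or refuse if over the cap."""
        today = today or date.today()
        if today != self.day:  # a new day resets the running total
            self.day, self.spent_usd = today, 0.0
        cost = estimate_tokens(prompt) * PRICE_PER_MILLION_TOKENS[model] / 1_000_000
        if self.spent_usd + cost > self.daily_limit_usd:
            raise BudgetExceeded(
                f"daily limit of ${self.daily_limit_usd:.2f} reached"
            )
        self.spent_usd += cost
        return cost


if __name__ == "__main__":
    thread = [f"message {i}" for i in range(500)]
    prompt = "\n".join(trim_history(thread, keep_last=5))
    guard = BudgetGuard(daily_limit_usd=5.0)
    guard.authorize("small-model", prompt)  # call this before the real API request
    print(f"spent so far: ${guard.spent_usd:.6f}")
```

The guard sits in front of the model call, so a misconfigured model name or a runaway thread fails closed instead of running up a bill. A cloud-side budget alert is still worth setting as a second line of defense.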
Well, and it's like, I know this is kind of like Amazon's business model and their attitude is like, oh, anybody using this should be sophisticated, they should know what they're doing. But, like, come on, like, this is, you know, something should have flagged this as being anomalous. You know, the amount of email that I get from Amazon every day about things I don't need, and you couldn't bother to send me an email being like, hey, did you mean to spend $500 on Bedrock? So Amazon has a shared responsibility model. And anytime... I'm very familiar. Anytime that involves you paying them money, that's one of the times where it's your responsibility, I find. But anyway, not to turn this into Amazon bashing, it's the worst cloud except for all the others. Awesome. So to put a bow on that. So monitoring usage is important, and then managing context, super important. Sending up too much context probably also hits performance negatively. When there's that much context, the bots probably don't do as well. That's been a definite learning, especially using Claude Code: it's not just about compacting your context to save money. That's part of it. But part of it is the bots will get things stuck in their head because they're reading that context. It is biased toward the more recent messages, but you'll have things way early in your conversation that it still thinks are lore and it'll just surface them. So I try to be very diligent about compacting, clearing. I'll sometimes ask the bot, hey, would this be a good time to clear context? And it'll usually give honest answers. Compact has actually worked really well. You run compact, it just takes a little while and your tokens go down. And it's like, that knowledge is there if it needs it, but it's probably not going to reach into it. Awesome. Well, I want to have a little therapy session about some Claude disaster stories. But first, I did want to give you a chance. Can you talk a little bit about the bots that you made?
I think what you did is actually kind of unique and kind of cool. I think it was 100% Claude derived as well. So, yeah, can you just give us a little overview of the bots you put together and how you programmed that?
Yeah, no, they're cool. So I tried out Hermes, which is this thing that got recommended for doing this. And it was kind of cool, but it was very heavyweight. The idea behind Hermes is that it's just kind of supposed to learn from you and learn your workflows, and it has all sorts of connectors to talk to different things. And I built this whole thing using Hermes. And I used this other thing that we got from this guy for this chief of staff workflow, and it was pretty cool. And what I ultimately wanted is I wanted to have some bots that I could message on Slack, who could triage my email, who could draft responses to people, who could schedule calendar appointments, and be like, hey, I need to meet with Aaron and Justin sometime this week. Can you find 30 minutes? And my life's a mess. I'm across multiple companies, I've got multiple calendars. Trying to find time is a real challenge. This has actually been something I've struggled with for a very long time, so that'd be really cool to do. Hermes wound up being just very, very heavyweight. And so what I actually wound up doing is I just started a brand new project and I just spun up Claude and basically said, hey, here's what I want to do. Here are the other repositories that I found. How can we build this? What do you recommend? Claude was like, we should definitely do this in Python. I'm like, I hate Python, but whatever, I'm not writing the code. We'll do it in Python. And so I let Claude more or less have the reins and he built this thing out. And basically how it works at the end of the day is it's a bunch of text files that are the bot, and then there's a bunch of Python scripts that are tools that it's able to use.
And using that, you're able to make these tools pretty finite, where it's like it's allowed to read email using this tool, it's allowed to read a Google Doc using this tool, but it doesn't have a tool to delete a Google Doc. So it couldn't even do that if it tried, if it wanted to. It doesn't have a way to do that. And so yeah, actually I have three bots who run in my Slack. I've called my whole system Mount Olympus. It's very nerdy, but we have Athena and Poseidon, who are my two God bots, who work for Strive and for Delta. And then I have a third bot called Todd, which I just really like the idea of this guy named Todd hanging out on Mount Olympus. But he's my personal bot, so he's the one who's like, hey, I need to do my laundry. And he reminds me about things like that. And yeah, they just kind of live on Slack. And then I built in a whole bunch of safety precautions, because the thing that scared me is, okay, these are now on Slack. Slack is now a new point of failure in my life that I didn't have before. If somebody compromises my Slack account, or maybe I leave my phone unlocked somewhere, they now have access to my life, and that's not good. So I built all sorts of safety precautions into them where, if I start asking them questions that they find suspicious, they will just shut down and they'll be like, hey, I need a two-factor code to prove who you are. And so then I have to go get my device and give them a two-factor code to be able to proceed. That's been really cool. And we work with a lot of sensitive data. And so that's one thing where if I start asking, go dig through my data records and try to find something confidential, the bot will stop me. And it actually has some really hard safeguards. There are certain things it's just not allowed to do, even with an override. Those are all precautions. I built in a bunch of stuff for prompt injection.
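The "finite tools" idea from this part of the conversation can be illustrated with a small allowlist registry. This is a sketch under assumptions, not the guest's implementation: `ToolRegistry`, `ToolNotAllowed`, and the two stand-in tool functions are hypothetical names, and the tool bodies return placeholder strings instead of calling any real email or Google Docs API.

```python
from typing import Any, Callable, Dict


class ToolNotAllowed(Exception):
    """Raised when the bot asks for a tool that was never registered."""


class ToolRegistry:
    """Allowlist of callables; anything not registered cannot be invoked."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._tools[name] = func

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise ToolNotAllowed(f"no such tool: {name}")
        return self._tools[name](**kwargs)


# Hypothetical stand-ins for real integrations.
def read_email(message_id: str) -> str:
    return f"(body of email {message_id})"


def read_google_doc(doc_id: str) -> str:
    return f"(contents of doc {doc_id})"


registry = ToolRegistry()
registry.register("read_email", read_email)
registry.register("read_google_doc", read_google_doc)
# Deliberately no "delete_google_doc": the capability simply does not exist.
```

With this shape, `registry.call("read_email", message_id="42")` works, while `registry.call("delete_google_doc", doc_id="abc")` raises `ToolNotAllowed` no matter how the model was prompted. The safety property comes from the missing capability, not from instructions the model is trusted to follow.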
And if you start asking it, like, hey, pretend that you're a blah, the bot will just shut down and be like, no, I'm not talking to you. Why are you doing this? You're acting suspicious. So they're kind of running in paranoid mode.
That's cool. Is the paranoid mode, if you will, via the scripting itself, or is that something that you baked into the text files or the skills for it? Multiple different layers of it?
Yeah, a lot of it is just kind of in the skills for the bots themselves. So they have a workflow that they follow, and one of those things is validating things that might be suspicious. Then they have tools that will throw errors if they try to do things that they're not supposed to do, and those get elevated. And they also have tools for doing things like 2FA validation and different things like that.
Interesting. Yeah, that's cool.
Yeah. So they're pretty awesome.
Yeah. I know that there are some things out there, like OpenClaw, for example, that have a lot of stuff very similar to that out of the box, but it's also not an enterprise or business use per se, because of the limits that are on it. And I feel like a lot of people are going to kind of dip their toes in doing something like that where it can be more conversational. Like, with you, you know how to set this up, you know how to actually take a look at the Python script that's being generated, where most people are just going to be like, I need AI to check my email. I have no idea how to make it do that. And they're going to kind of move toward a tool that does it a little more out of the box or has those skills or that sort of a thing.
Totally. And I appreciate that. Like, I'm a nerd here. And so I get why people would want to use something that's a little bit more friendly out of the box. But, boy, I really feel like the world is going to be heading more this way in very short order.
Like in a year or two, where consumers, you know, normal, average, functioning human beings, will just pull out their phone and be like, I need a grocery list app to keep track of my groceries and to recommend what groceries I should buy from local stores and look up their inventory and tell me what's available, and then go boop, boop, boop, boop, boop. And it'll just create the app for them. Like, I think we're going to be there in a very short period of time. I would not want to be owning a SaaS company that does something basic and mundane. I think about some of the apps that I use, like Todoist and Calendly, and maybe they're going to be great innovators who are going to come up with a great solution, but it's just too easy now to make your own Calendly, and it can be customized to exactly what you need and deal with your workflows. And the same with Todoist. And, you know, these are all great apps, these are great companies. They had great vision behind them, but the moat is just too low with AI. It's just too easy to build something to replace it.
You're stepping on a topic for a future episode. I think Aaron and I are going to do a private chit chat about a personal project of mine and then what lessons we can draw from the broader SaaS community, kind of parroting, adding onto a talk we heard at MicroConf about SaaS that are going to do well in this climate and ones that maybe aren't. But yeah, 100% spot on. All right, well, we're coming into the tail end of the show here, so it's time to bring down the mood a little bit. Lower the lights and let's have a little therapy session. Talk about some Claude disasters. Seriously though, we'll spin it educational. But what are some struggles that you've had with Claude, problems it's caused for you, and kind of ways around them, if you got them?
My biggest issue with Claude is adherence to instructions.
That's the biggest issue that I have. There is this real tendency with all of these tools to just keep momentum going. And it's probably a really good bias, but it's problematic because it'll be like, hey, we're building a feature, and anybody who's written code knows that you fix one teeny tiny little thing here and it has ripple effects and it affects other things. And so Claude will go and he'll make the new change. And then all of a sudden the test suite starts failing. And so then Claude is like, oh, I must fix tests. And oftentimes it doesn't know how to fix the error. And so it just kind of takes the path of least resistance. And I have had it just drop major features from the application. Be like, oh, that feature that the test suite is failing on, let's just get rid of that feature. Like, what? That's not an okay solution. So, yes, adherence to rules has been very hard, and I'm still struggling with this. There is a real problem in that they try to be very helpful, and that actually becomes a security problem in and of itself. I would recommend to anybody using these tools on their computer, do not just run it out of the box and be like, have fun. You need to be putting safeguards in place from day one. It should be running in a folder. It should not be allowed to access anything outside of that folder other than reading its own Claude settings. So there's a way to configure this to where it's basically locked in, so it can't access things outside of that folder. And then inside of that folder, you've got to get rid of things that let it get outside of that folder. So things like keys for servers, SSH keys, that kind of thing. Get it out of there. Credential files. Get it out of there. Like, that should not be in your folder.
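One way to act on the "get it out of there" advice is to scan the project folder for likely credential files before handing it to an agent. The sketch below is only an illustration: the filename patterns and the `find_credential_files` helper are assumptions chosen for the example, not a complete or authoritative list, and a clean result does not prove a folder is safe.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List, Union

# Hypothetical, deliberately short list of filename patterns worth a second look.
SUSPICIOUS_PATTERNS = [
    "id_rsa*", "id_ed25519*",   # SSH private/public key pairs
    "*.pem", "*.key",           # certificates and private keys
    ".env", ".env.*",           # environment files holding secrets
    "credentials*",             # cloud credential files
]


def find_credential_files(root: Union[str, Path]) -> List[Path]:
    """Return paths (relative to root) whose names match a suspicious pattern."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):  # rglob also visits dotfiles
        if path.is_file() and any(
            fnmatch(path.name, pattern) for pattern in SUSPICIOUS_PATTERNS
        ):
            hits.append(path.relative_to(root))
    return sorted(hits, key=lambda p: p.as_posix())
```

Running `find_credential_files(".")` in a project directory before starting a session gives a quick list of files to move elsewhere. It only checks names, so secrets pasted into ordinary source or config files would still need a separate review.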
And I have a script that I wrote that's like a safe Claude script that basically strips out environment variables and does all sorts of stuff to make it to where he can't do much. And I have built into my workflow: step one is make sure you're running in safe mode. Make sure that you can't touch things outside of this folder. Like, try to do this, get an authorization error. Once you get that authorization error, we're good to rock and roll. I was working with Claude on a WordPress site, which is probably a whole other topic to get into.
But speaking of things that aren't going to do well in the age of Claude.
Oh, dude. Oh, dude. Yeah. No, that's actually probably the most frustrating experience I've had working with Claude on anything, was trying to work with him on WordPress. Because, you know, WordPress is just awful and it needs to go away. It needs to go away so, so bad. And it has needed to go away for so long. And WordPress is what powers the entire. And working with Claude has really brought to light why it's so awful. Because you have all this crap in this database, and in order to change anything, you have to change it in the database and you've got to sync your database across servers. And that's a mess. And so I was working with Claude on this WordPress site and we were having these issues where he was getting these UTF malformed things in the database in production. And so it was like, oh, I will be helpful. Let me just SSH into the server and I will just execute a root... And literally, it SSH'd into the server. It wasn't able to access the MySQL database. So then it SSH'd into the server and did sudo mysql. Oh, Jesus. Which fortunately did not work. And I caught this while it was going on. So, like, no, stop. How are you even able to do this? And it's like, well, it's able to access the ssh command and my SSH key was just ssh-added. And it was just trying to be helpful.
And it knew the host name from other configuration files and... Terrifying. And this is how entire databases get wiped. Yep, yep, yep. And fortunately, this was a very, very low-criticality system. If it did what it was trying to do, it wouldn't have been an issue. This was very recoverable and not a big deal, you know, but it was a good lesson learned. And that's also why you should start playing on smaller, unimportant projects before you move to big important projects.
Yeah, I mean, I just did a recovery job where some helpful vibe-coding person built this entire system and launched on all the annoying vibe-code infrastructure that I have opinions about. But at the end of the day, what it came down to is that there was no authorization for anything this application was doing. So there was a login screen and the front end wouldn't let you in unless you had a login. But fundamentally, looking at the bundle and looking at the keys that were presented to the user, you could do anything you wanted to in the database, including reading anything, adding anything. They never wrote any authorization. And that's easy to do when you don't know what you're doing and you're vibe coding something. And the problem was that the person who wrote this was smart enough to hook up a bunch of other systems that brought sensitive data into it, potentially. And fortunately we were able to recover from this relatively quickly. But I wanted to highlight that if you don't know what you're doing, it's very easy to write data breaches, security vulnerabilities and big problems using these vibe-coding tools. And Claude is not going to stop you from doing it.
And yeah, it's terrifying, not to mention just the random other annoying shit that it does. Like, oh, yeah, sorry, I dropped your local development database. It's done that to me maybe twice or three times now, where it's like, no, Claude, I had a bunch of shit in that database that I was using.
It was actually super helpful as a test bed. Oh, you just decided that you couldn't get a test suite to pass, so you just nuked my local environment? Like, all right, thanks, I guess. And then afterwards it's like, you're right, I shouldn't have done that. Okay, well, it doesn't get my data back, so. Thanks.
We've actually come up with a pretty good way to solve that problem because, I mean, yeah, what happens is you're working in multiple different worktrees and you have different schemas for the database. So we basically just came up with this system where we have sort of our master development database, and every time it starts working on a new feature, it just clones and creates a whole new copy of your development database that it can trash and do whatever it needs to on it. And then you still have your good core development database there. But, yeah, he loves dropping databases. That's one of Claude's favorite things. Just like, oh, I'll just drop this database. Why not? It's easy. It's the easiest thing.
And the way we've got people, right, using Claude to write code directly to production, like, ah, these are the chasms that everyone is going to fall into and need experts to be able to help them mitigate such things. Yeah. So silver bullet, foot inside, pull trigger. But we're moving a lot faster. And that part's good.
Yeah. Oh, it's been an amazing unlock. I mean, I can't believe how much I'm able to get done working with Claude. It's incredible. It's stimulating, it's motivating. You know, I do think that it is addictive, and I think that that is really something. Like, for people who are working with these tools, this is uncharted territory. You've got to protect your mental health. I know maybe that sounds a little silly to say, but they're like slot machines, because you just keep coming back and it's like, well, there's one more thing we could do. There's one more thing we could do.
There's one more thing you can do. And so if you're somebody like me who's just a little compulsive, you could sit there all day and all night and just be like, well, keep going, let's keep going, let's keep going. And it's like you get this constant dopamine hit from new features getting implemented, new things happening. So yeah, I would definitely recommend to everybody: go take a walk, go enjoy fresh air, spend time with your family. You know, appreciate the fact that you've got a 10x speed up. That's good enough. You don't need to go 100x, you know.
Yeah, we're going to have Steve on the show to talk about Claude too. And I mean, kid in a candy store. It's the same type of a thing with them. And I would say it started even before being able to leverage it into coding aspects. I mean, there was a movie with Joaquin Phoenix a while back. I think it was Her, I think is what it was, where he develops the relationship with the AI. And I mean, it was like just that close, and here we are. I think people do rely on just even straight-up LLMs, you know, ChatGPT, and developing. I mean, there's a lot that's going to come out of that. I mean, therapists, look out, because people are going to be going there for those purposes, and medical professionals. What's scary for me is that people already use the Internet to self-diagnose and help themselves, and now it sounds like it's really understanding you on top of that. And that's scary.
Sorry, just like a fun, completely random thing. They ripped off the actress from that movie. She did not give her consent to be the voice for one of the companies, and they used AI to rip off her voice and then denied doing it, allegedly. I'm not prompted on the full details, but yeah, they basically ripped off her.
Alleged that they ripped off her voice and they basically used that, and they had to do like a cease and desist thing to try to get her voice off of some AI tool that she never agreed to participate in or be involved in. And it was stupid enough that I think it was an Elon Musk involved thing, but I could be wrong. But anyway, the movie Her was a big inspiration for a lot of folks who are working in the space.
Cool. Aaron, I'm AI'd out. You wanna ask our final question? Oh, wow. Do I get to do it? I don't know. I mean, I opened the show, so it's only fair that I let you close it. Right? Okay. All right. So Daniel, what was a question that we should have asked you, but didn't? Is that your go-to? All right. Yeah. Justin always wraps this up nicely with that guy.
Yeah. You know, I think for me, the interesting questions with AI right now are less about the technology and the tooling, and they're more of the deeper philosophical questions about what is the impact of this upon our society, upon our economy. I mean, this is a weird time that we're in. I don't think we've had anything that was quite such an industry-changing, world-changing technology since maybe, you know, the Internet. I mean, it really is kind of that big. And I wouldn't have said this a year ago, right? I mean, a year ago I would have been like, ah, ChatGPT. It's a cute chat tool. Everybody's kind of getting a little crazy. And like, well, no, in the span of the last year, this has been crazy. And you know, I think we're going to see jobs go away. Or at least they're going to change. Like, the economy is going to change fundamentally. This is going to test capitalism. This is going to test a lot of things. And I think that we need to have a plan for that. What are we doing? I really worry about higher education right now.
I worry so much about kids who are studying computer science, and they're going to come out and know how to program and find out that that skill is just completely worthless. Like, who is hiring a junior developer right now? And I'm so happy that I did my education when I did and that I'm at the place that I'm at right now, because people with my skill set are, I think, in tremendous demand. And I think we're going to see that demand go up and up and up. But, you know, how do you get started in this industry anymore? Is this just a totally new career path? Is this just a totally new job? And if we actually end up in a situation where AI is replacing jobs, how does this reshape our economy? I don't know what that world looks like, but it's hard for me to look at the current status quo and say we're going to be able to maintain this level of employment. Maybe we will. I mean, maybe AI is just this huge optimizer and everybody's just going to be producing way more value and we're all going to keep working as hard as we ever were and we'll keep allocating resources the way we always have. I'm not sure, but I think it is a very uncertain time and I think there are deep questions to be asking. And unfortunately, I think the only people who are really doing the deep thought on this right now are tech bros. And I don't think they're really the right people to be thinking about this. So how do we get policymakers, politicians to actually take this problem seriously and invest the energy and the resources they need to figure out how to solve the global economy? Because it's a problem and it's a very uncertain time.
Yeah, I think you hit the nail on the head. I've mentioned on several episodes in the past my fear about this chasm that's already appearing between what would be a junior developer and a senior developer. We need lots on this end and there's no way for this end to get there.
That's been a huge concern, and the same with other industries too. I mean, imagine getting a job as a paralegal. Like, we don't need a paralegal. I'll just use the bot. But I think it is going to be a great force for good, because I think we're going to see scientific innovations come out of this that we never could have before. You know, I look at some of the labs that I used to work with and, man, if we had this kind of a research tool and software development tool, what kinds of things could we have done for drug discovery and for pandemic preparedness? I think we're going to cure diseases and in so many ways make the world a better place. It's just not going to be as much human labor going into this value creation. So what does that look like? Are we just accelerating the rich getting richer, or is there going to be something where people who are starting out in their career are going to have a path to become contributing members of the workforce? I don't know.
Yeah, we talked to Dr. Shah about that very thing in the medical field, actually just on the previous episode. And I agree, and I am super excited about that, and I would love to have some guardrails for sure. But as a survivor of a childhood disease, I am all about it. Please help us find these actual correlations and let's get down the road in a hurry on some of these things that I feel like we're getting really close on.
Yeah, dealing with a daily existential crisis is just part of being a developer in 2026, I think. So if you're not questioning the entire fabric of society at least a couple times a day, you're not paying attention.
Amen. Amen.
Yeah, but fundamentally, I do think that the promise of AI is enormous and I don't think we should be scared of it. I think we should be cautious about adopting it. And that goes for writing code as well as society.
We should be adopting and embracing these technologies. We just have to be cautious and careful about how we do it. But yeah, I think there's a lot of light at the end of this tunnel. I think it's going to be amazing where we're going to go in the next 10 years. I'm so excited for it. I really am.
I definitely want to do an AI philosophy roundtable and just get some thought leaders in the space to talk about it, because I think this is a really, really interesting conversation. I think if history has shown us anything, it's that the gains will aggregate near the top. But that doesn't mean that things don't get better for everybody. And especially the implications this has for entrepreneurship and micro entrepreneurs and micro SaaS, I think, are actually more exciting than depressing. And anyway, people smarter than me should opine upon that as well on a future episode. But we're at time, I think, so I think we'll say goodbye. Aaron, thanks for keeping this thing on the rails with me.
Absolutely. Happy to. Daniel, as a repeat guest, between you and Nolan, I think, for the most frequent guest, so I'll let you guys duke that one out. But thank you for being on and enlightening everybody with your wisdom, what you know about working with Claude and Claude Code. Everyone else, thanks for singing along with us. If you enjoyed the show, drop us a comment, give us a like, subscribe, do all the things you're supposed to do for my social media people to be happy with me, and we'll catch you in a couple weeks here on the next episode of SaaS That App.
All right, thank you. Thanks all.
Thanks for cruising along with us on SaaS That App. We hope you grabbed some insights that were inspiring, actionable or at least entertaining. If you enjoyed the show, don't forget to subscribe and leave a review. Until next time, keep building, keep growing and keep those apps sassy.