The B2B Podcast Index
The TWIML AI Podcast

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

The TWIML AI Podcast · 2026-06-16 · 56 min

Substance score

57 / 100

Five dimensions, 20 points each

Insight Density: 11 / 20
Originality: 10 / 20
Guest Caliber: 12 / 20
Specificity & Evidence: 12 / 20
Conversational Craft: 12 / 20

What our scoring noted

Our reviewer’s read on each dimension, with quotes from the episode.

Insight Density

11 / 20

A handful of genuinely useful ideas surface - SLMs outperforming frontier models on constrained classification, agents circumventing text-based blocks via mouse-click automation, and the human-in-loop throughput math - but the episode is structurally a sponsored vendor pitch, which means a large fraction of runtime is product description and reassurance rather than transferable insight. The insight-per-minute rate is uneven.

the argument that the engineer was making back is, hey, is this actually becoming less secure because I don't have the opportunity? I'm signing off on this almost without having a good appreciation for, like, what exactly I'm doing at every point
when we benchmark our SLM versus, at the time I think we benchmarked GPT 5.2 as an example, we found that not only were we an order of magnitude faster and cheaper, but we are also actually more accurate on being able to make a binary classification

Originality

10 / 20

The maturity-ladder taxonomy (do nothing → deterministic rules → prompt guardrails → external AI judge) is a useful framing, and the SLM-beats-frontier-model-for-constrained-tasks argument pushes against the prevailing consensus. But 'use AI to secure AI' is already a circulating slogan, and most of the structural arguments - agents are nondeterministic, human review doesn't scale, zero trust needs updating - are ideas the AI-security discourse already holds.

step zero is like, do nothing essentially. Um, step one is like, I have a bunch of deterministic rules...step two is like, I know that I need to use AI. I'm going to put these in the prompt of my model. And then step three is you have an external system policing the inputs and the outputs
for open ended like for open domain tasks you're 100% right, use a large frontier model. That's why the orchestrator, the planner should really be like a larger model. But if the more constrained you get into tasks the better and better you actually tend to see performance

Guest Caliber

12 / 20

Rishi is a legitimate practitioner - founder of Predibase, now GM of AI at Rubrik - with firsthand deployment data and a published paper (LoRA Land). However, this is a sponsored episode featuring the sponsor's own executive, which structurally limits candor; he cannot critique Rubrik's approach, quantify failure rates honestly, or name enterprise customers, capping the epistemic value of his testimony.

we process trillions of tokens, uh, you know, inside of SAGE
we did an audit. Guess how many agents were deployed?...nope, it was 250

Specificity & Evidence

12 / 20

There are several genuinely concrete incidents - Claude Code using mouse coordinates to post to a public GitHub gist, the Pocketos production-database deletion, an enterprise discovering 250 shadow agents vs. an expected 3-4 - that ground the argument usefully. Benchmarking claims ('order of magnitude faster and cheaper, more accurate') and token volume ('trillions') are asserted without hard numbers, and no enterprise customers are named, limiting verifiability.

in pocketos, for example, there was this incident where um, you know, a startup went viral because a coding agent went in and decided to delete the production database
And we noticed that one of the coordinates actually was for a public gist. Uh, and so we were able to go ahead and catch and stop something like that as well

Conversational Craft

12 / 20

Sam Charrington adds genuine value - the 'security theater' framing is his own, and he raises a sharp adversarial scenario (the agent attacking the SLM to get requests through) and a substantive challenge (can SLMs really keep up? isn't recovery heavyweight?). The conversation never devolves into a pure infomercial. That said, product claims about accuracy benchmarks and token volumes go unchallenged, and the dynamic is collegial rather than probing throughout.

I'm envisioning a scenario where your agent starts trying to hack your SLM to get its request through. And it. I'm imagining that it could probably be pretty good at that if it really figured out what was going on
you realize how much of it is kind of security theater in the sense of, like, first of all, it's a super long command line and you can't really see all the command line

Conversation analysis

Computed from the transcript - who did the talking, and the verbal tics along the way.

Share of words spoken

  • Speaker B: 78%
  • Speaker A: 22%

Filler words

like 374 · so 120 · you know 118 · uh 85 · um 73 · actually 42 · kind of 30 · right 27 · er 8 · basically 7 · sort of 3 · honestly 3 · obviously 1

Episode notes

In this episode, Sam talks with Dev Rishi, GM of AI at Rubrik, about what happens when agents move beyond answering questions and start taking action across tools, systems, and business processes. We explore why the enterprise playbook of static guardrails plus human approval starts to break down in the agent era. Agents are useful because they can plan, call tools, update systems, write code, send messages, and operate across workflows at machine speed, but those same capabilities make them difficult to govern with rules written in advance or approval prompts reviewed one at a time. Dev explains why tool access increases blast radius, why agents can route around controls in surprising ways, and why human-in-the-loop review can become security theater when agents operate at scale. We also discuss what enterprises need instead: better visibility, runtime enforcement, policy-aware governance, agent observability, and recovery mechanisms for when something goes wrong. Along the way, we dig into MCP and tool sprawl, small language models for policy enforcement, defense in depth, agent rewind, and why AI may be needed to help secure AI.

Full transcript

56 min

Transcribed and scored by The B2B Podcast Index.

Speaker A: I'd like to send a big thank you to our friends at Rubrik for supporting the podcast and sponsoring today's episode. AI agents are transforming how work gets done. They analyze data, trigger workflows, and automate decisions. But that same speed that automates productivity can automate mistakes. Most teams have no visibility into agent behavior. Rubrik changes that. Rubrik Agent Cloud unifies visibility, control and recovery. So you can unleash agents, not risk. With Rubrik Agent Cloud, organizations can monitor agent actions in real time, govern behavior with policy-based guardrails, and rewind mistakes before they cascade. Learn more at rubrik.com. That's R-U-B-R-I-K dot com. I recently attended a major enterprise tech conference, speaking on a couple of panels about scaling AI agents. Naturally, the topic of risk came up a lot, and the default answer was usually some combination of static guardrails and human approval. In theory, this sounds simple enough. Block the dangerous stuff, and when something looks risky, put a human in the loop. But agents put pressure on both sides of this model. Static rules are hard because agents are creative. They don't just follow a fixed path through software. They plan, improvise, call tools, and find workarounds. And human approval is hard because agents can operate much faster than we can. So the question isn't whether we need guardrails and oversight. Of course we do. The question is what that should look like when agents are operating at scale across high stakes tools, databases and workflows. I spoke about this with Dev Rishi, GM of AI at Rubrik, where he and his team are building infrastructure to secure and govern agents in enterprise environments. He shared an example from his personal experience that gets at why this is such a tricky problem.

Speaker B: But then we saw some more sophisticated types of things that were going wrong. One thing we noticed was that Claude Code was really trying to post internal source code to a public repo rather than private. Uh. And so we saw this kind of relatively frequently. And there were even instances where, like, if we looked back through the audit logs, you saw like the check, check, check. And so technically it felt that it had gotten the human in the loop approval for it. And we even saw one crazy instance where Claude Code tried to get around this, like, blocking we were doing of, like, GitHub public gists. And rather than like doing this as a text in, text out system where it was like, all right, I'm posting to this URL, it spun up a browser window and we just started to see mouse clicks on certain coordinates. And we noticed that one of the coordinates actually was for a public gist.

Speaker A: I'm Sam Charrington and this is the TWIML AI podcast. For over a decade I've been exploring the ideas and innovations shaping the future of AI through conversations like this one that help you understand what's real, what's next and what matters. Let's jump in. Is it a, uh, learning challenge fundamentally or is it an expertise challenge? Yeah, there's lots of ways to think of the challenges, but it does seem to be to a large degree, kind of mindset and a mindset shift that's required.

Speaker B: You know Sam, I do think like a lot of other tech changes there is like there's a cultural component, there's a learning component, but I actually don't think that's the biggest piece. I actually think the biggest piece after speaking with a lot of the organizations now is the approach and how to manage risk. Like if I thought about what's different from uh, a fast moving AI native organization and what's different from let's say a top 10 global bank or a top 20 healthcare company? The way that the healthcare company and the bank have really been brought up is that they have to make sure that the ways and the systems that they have are deterministic, that they have guardrails, that there's real clear downside protection. If you've used an agent like Claude Code or Codex before, doesn't exactly feel like there's a ton of downside protection as you're using it all the time.

Speaker A: Right.

Speaker B: It's incredible. Like I had one, um, Global CIO describe it as like it feels like a fast car with no brakes, like I'm moving really, really quickly and you know, who knows what exactly is happening in the background. And I sympathize with that. I think the biggest thing is that these tools have come out with an incredible capacity to, uh, be tools, but they didn't really come out with like a perfect way to secure and govern them. And they're really operating on legacy IT infrastructure. Honestly the biggest difference I see between the startups and these like Global 2000 enterprises, the Global 2000 enterprise has maybe a bit more of a legacy mindset, but really more so, just more to lose as they kind of take this approach, and that has been like the number one thing that's hindered the AI adoption. So if you're an AI startup, you can use one of these harnesses off the shelf and start going on it. If you're an enterprise, you're going to probably bog it down with some, um, AI, ah, governance committee meetings. You know, essentially let's meet this week to define a framework. Three months later to come up with V2 of the framework and so forth. And that's what ends up, I think, uh, delaying the cycle.

Speaker A: And I imagine that those conversations and that observation are what led you and the team to narrow your focus a little bit more on the agentic side of things.

Speaker B: Yeah, that's exactly right. So we saw both the conversation happening externally, but we also saw it happening internally. Rubrik is an interesting company. It was a startup that was just born 11 and a half, 12 years ago. It's also a public company in data security today, backing up the data for some of the most, uh, important Global 2000 enterprises. So it's like, got a little bit of both in the DNA. And, um, what we were noticing was as we were developing AI and agents, we actually ran into these same types of security and governance bottlenecks ourselves, and we found them to be quite frustrating, actually. Now I find it helpful to define agents for, uh, you know, a quick minute. We define agents as really LLMs, or models with access to tools. So you can think about this as like, models that can take action on behalf of the user. Well, we.

Speaker A: And it's the tools that increases the blast radius significantly.

Speaker B: Exactly. I know two years ago everyone was afraid of, like, what if I send the wrong data over to ChatGPT? And I think that's like a legitimate. But really, I think, like, what if ChatGPT gets access to my Salesforce, like my system of record for all financial data, and then starts hallucinating numbers? Like, that's the much bigger concern that exists. Right. But at the same time, it's also what makes AI magical. Like, we're never gonna get to, I think the level of, like, economic productivity we're looking for without giving these models, like, access to do work internally. Um, and so A, it's gonna be necessary, but B, I don't know, like, Sam, if you've been using these agent harnesses, like, frequently, but I'm using them and like, Claude is coming back to me with like a bunch of requests and I'm just like, yep, accept, accept, accept. Like, it's kind of scary because it's moving so quickly and I don't have the ability to really dissect every single thing that it's doing and, and kind of check over. So the first time I was using it, I remember I had a thought which was, I wish somebody was watching over my shoulder just to make sure I didn't do anything that screwed up. And then I wish if I did screw up, that I had a way to be able to, you know, reverse that change. And so that's exactly what we built basically, uh, at Rubrik, uh, we built something we call the Rubrik Agent Cloud to help organizations secure and govern their agentic rollouts.

Speaker A: You know, I also have had that experience where you're like, accept, accept, accept. And you realize how much of it is kind of security theater in the sense of, like, first of all, it's a super long command line and you can't really see all the command line. And you're like, okay, yeah, that looks fine. Uh, and then it's like you're approving, you know, the agent acting or running some file that the agent can control and put whatever it wants in it. And it's like, what am I really accomplishing here?

Speaker B: Earlier when we were talking about why Predibase, uh, and Rubrik joined forces, I mentioned I saw an opportunity to attack a part of the market that no one else was attacking. And, like, the specific thing that I think I saw was like, the legacy way that we approach security was never going to be appropriate for agents. And you called it, I think, a little bit of like the security theater. We had a spirited debate on one of our internal Slack channels where someone, one of the engineers, was like, hey, do I have to go ahead and hit yes, every single time? Can I just go ahead and like, say, look for most tasks, just automatically run them? And the security person chimed in, and their point of view was, listen, it's important that this agent is acting on your behalf as a user. So you need to understand every action that the agent's taking. You need to authorize. Logically, makes sense. But in practice, if you see how quickly these things are operating, the value proposition of the agent is that it can operate 10 times faster than I can. If it can operate 10 times faster than I can, I can't realistically do 10 times the level of review. And so the argument that the engineer was making back is, hey, is this actually becoming less secure because I don't have the opportunity? I'm signing off on this almost without having a good appreciation for, like, what exactly I'm doing at every point, because it's like the iTunes, like, um, you know, terms of service, essentially. Now, I for one, read that diligently line by line, but not everyone will. And so I think that the, uh, you know, the trick is how to be able to manage that. My fundamental view is we can't use the legacy approaches for this. We can't rely on rules based systems. And human in the loop is like something that feels good but it's not actually going to work at the pace that we're going. So my view is we actually need to. Look, I'm an AI person by background. 
We started an AI infrastructure company. What did I arrive at? My view is we should use AI and throw AI at the problem. So go from human in the loop systems to AI in the loop systems. I talked about how I wish someone was watching over my shoulder. I think essentially that needs to be a really smart and highly specialized trained, domain specific cybersecurity agent. And that's what we've been building internally.
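A minimal sketch of the review arithmetic behind this point. The only number taken from the conversation is the "10 times faster" figure; the function name and the assumption that a reviewer can check actions at roughly the pace they would perform them are illustrative, not anything Rubrik has described.

```python
def review_coverage(agent_speedup: float, human_review_rate: float = 1.0) -> float:
    """Fraction of agent actions a human can genuinely review.

    agent_speedup: how many times faster the agent acts than the human works.
    human_review_rate: actions the human can review per unit of their own
        working pace (assumed to be 1.0, i.e. review is as slow as doing).
    """
    if agent_speedup <= 0:
        raise ValueError("agent_speedup must be positive")
    # Coverage cannot exceed 100% even if the agent is slower than the human.
    return min(1.0, human_review_rate / agent_speedup)


# With the 10x figure from the episode, only about one action in ten
# gets real scrutiny; the rest are approved on reflex.
print(f"{review_coverage(10.0):.0%}")  # prints "10%"
```

Under these assumptions, every approval prompt beyond that fraction is the "accept, accept, accept" pattern both speakers describe.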

Speaker A: Since you're a security person now, I'll ask you this. Like in, in the, you know, the days when we were just worried about distributed systems and um, you know, connecting systems to the Internet and things like that, we came up with this, you know, this term and set of practices, you know, called zero trust, which is like in the older days, you know, you would establish trust with the other system and then connect it fewer, you know, gates between because you accept, you uh, assume that the systems were trustworthy. And then we moved to this model where yeah, let's just not trust anything and enforce policies and things like that, you know, around the things that we care about. And I'm paraphrasing because I'm not a security person, uh, but it strikes me that the world you're describing is, you know, one of, you know, not only am I going to not trust external things, but I'm not going to trust this agent that's here sitting on my desk, you know, on my computer or wherever, working on my, my behalf. It's, it's um, you know, I'm wondering if that resonates with you and if you, you know, think or talk about this idea of zero trust extending to agents and what, what are the implications of that?

Speaker B: Yeah, I, I like to think of myself as an AI infra person now masquerading, uh, and looking at all challenges and security. But um, I spend a lot of my time now thinking about the security implications for AI infra. And I do think that there's these um, principles that exist from legacy security. Zero trust, secure by design and others that I think directionally have the right principle. But I actually do think the way that they've been applied, it's a little bit different when it comes towards AI. The main reason is that a lot of historical Security principles were baked into static and deterministic systems and policies. So zero trust would mean, like, I don't have any trust by default. Maybe I'm doing like a just in time authorization for the action that you're looking to take. And that as a whole principle, I think, is like, relatively appropriate. But one thing that I think is difficult is, like, even in, um, a lot of the secure by design, like, security infrastructure, usually they always made some assessment, like they were securing the software that humans were using. But I actually think of agents as a lot more similar to humans than like, the software that legacy, uh, security solutions were actually securing.

Speaker A: Elaborate on that.

Speaker B: Yeah. So, okay, if I gave you an example of like, my Claude Code or my, uh, Cowork instance that's running on my laptop. So my Cowork instance has access to Salesforce. I use it to summarize opportunities. It also has access to my email. I use it to send out emails. Now, in a, like, pure security design standpoint, I'd be like, all right, check, it can do these things in Salesforce. Check, it can do these things over email. But the really tricky thing is that now it's like one agent harness that has multiple different permissions. So what's really supposed to stop it from, like, for example, taking sensitive data from Salesforce and then writing it out in an email to another customer? Like those conventional guardrails that you would have that say, like, I've secured each system individually, don't really work in this agent future. And the second thing is, like, I'm not actually telling Claude Cowork or Claude Code what are the steps it should go through. I'm just giving it a task and it's coming up with its own execution plan and then it's executing the plan, which is a lot more similar to how a human might operate. So, like, you know, I gave you one example where data can kind of like, I think, be used across identity and permission boundaries that conventional identity systems would not solve. Right. There isn't like, one unified way to think about how do I govern access on data from Salesforce going into email. It's like, never been something like, usually you'd have, like, again, a very deterministic flow. Now you don't. The second example that I think is very present is, like, these models are very good at circumventing the rules that we put on them. And I think, like, as soon as I say that, like, everyone starts laughing because they know that they've run into this before too. I asked Claude to write me a document and I had the Google Drive MCP connector disabled. So what did Claude say? 
It's like, looks like the Drive MCP connector is disabled. No problem. Let me try this workaround. Opened up a browser window, typed in drive.google.com, used the mouse click button, hit upload file. And it's just like, it's creative in the same way that a human might be, much more than a static and deterministic software system might be.
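The Salesforce-to-email example can be sketched as follows. This is a toy illustration of the gap between per-system permissions and cross-system data flow, not a description of any real product: the tool names, the `sensitive` flag, and the `SessionFlowGuard` class are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-system allowlist: each tool is approved in isolation.
ALLOWED_TOOLS = {"salesforce.read", "email.send"}


@dataclass
class ToolCall:
    tool: str                # hypothetical tool identifier
    sensitive: bool = False  # assumed label: payload contains sensitive data


def per_tool_check(call: ToolCall) -> bool:
    """Conventional check: is this tool permitted at all?"""
    return call.tool in ALLOWED_TOOLS


@dataclass
class SessionFlowGuard:
    """Tracks whether sensitive data has entered the agent session."""
    tainted: bool = False

    def check(self, call: ToolCall) -> bool:
        if not per_tool_check(call):
            return False
        if call.tool == "salesforce.read" and call.sensitive:
            self.tainted = True   # sensitive data is now in the agent's context
            return True
        if call.tool == "email.send" and self.tainted:
            return False          # block the outbound hop across the boundary
        return True


read = ToolCall("salesforce.read", sensitive=True)
send = ToolCall("email.send")

# Each call passes when judged on its own system's permissions...
print(per_tool_check(read), per_tool_check(send))  # prints "True True"

# ...but the sequence is blocked once the session is tracked as a whole.
guard = SessionFlowGuard()
print(guard.check(read), guard.check(send))        # prints "True False"
```

Note that this guard is itself a static rule, so it would be open to the same workarounds described in the conversation; it only shows why securing each system individually misses the combined flow.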

Speaker A: I'm laughing because I recently started playing around with Codex, the goal feature. And I thought I would try to get it to act like DSPy and like optimize a prompt. So I gave it a document with a bunch of URLs and it was supposed to identify, uh, the URLs in the document that I would be interested in. Kind of a recommender type of system or a classifier is probably a better way to think about it. And I was like, okay, you know, here's the goal. You know, iterate on this prompt until you can, you know, uh, it was actually doing very well, but it had a lot of false positives. So try to like, get, you know, reduce the number of false positives. And ultimately it was like, okay, I did it. And then I went and looked at the prompt and it was like, if the URL contains this or this or this or this or this, uh, you know, then it's good. Otherwise it's not like, exactly.

Speaker B: And now think about if you're trying to write a rule that was like, oh, do this, don't do that, it would be impossible. It's like a game of whack a mole essentially, to try and prevent every single action it's going to take. And so I think, um, look, in my view, that's sometimes like a little bit either like, funny or even a little bit frustrating when it comes to like, you're developing something. Like it sounds like you're doing like a ranking problem as an example, or if you're, you know, if you're building a system, it's like, ah, uh, it's funny it came up with this little side workaround, but if you're a security person, that's like, terrifying. Like, this system that, like, I had thought I'd put every best practice into place is now finding unique and novel ways to circumvent it. And I'm on the hook for it because ultimately we're taking this bit of a posture that's like, you know, the tool is just acting on behalf of the user. That's I think the terrifying gap, that's really the thing slowing down AI adoption of the enterprise today.

Speaker A: So you need AI to secure AI. What is your proposed approach for injecting AI into, you know, this landscape to affect the solution?

Speaker B: Totally. So I think honestly for a good solution you need three things. And if you're like building your own internal governance and security solution, I'd, you know, recommend these three things are things you think about. The first is I think you need some cross platform visibility. Um, and so agents are running roughly everywhere now. So like they're running in the cloud, they're running on the endpoint. So you need some way to be able to see what's going on, what kind of access they have. But a lot of times people get very stuck on this visibility point and then they finally solve visibility and they realize visibility is sort of useless without the ability to do something about it when something goes wrong. So I think visibility is just like the base layer. The second thing then you need is a way to be able to do dynamic runtime security. And my view is that you need to be able to do this with an AI in the loop system. So we built a system, uh, that we call SAGE. SAGE stands for Semantic AI Governance Engine. It's basically our own agent that uses a small language model at its core. And what SAGE does is it runs over every prompt, response and tool call that's going through an agent system, everything you put into an agent, everything the agent's about to do, every tool it's going to call, and the parameters of the tool, it looks at every single one of those. And SAGE has a lot of the cybersecurity best practices that we know firsthand through our security research you need to be able to do. So we make sure to prevent all the obvious things that you would want to make sure but maybe like haven't written down somewhere, like prevent data exfil, prevent dangerous and destructive actions, all of these different things SAGE is looking out for. But then the real trick is to customize it to your organization. There's two levels of customization. The first is you want to customize it to your policies. 
So we allow organizations to like bring your own policies, like, uh, a doc upload, or write your own in natural language directly. And so SAGE can understand you're a financial services institution. That means you should not, you know, be giving financial advice to end customers via an agent or something along those lines, or you're healthcare. You need to be very careful about PHI. So we allow organizations to customize. And the second thing we do is we enrich SAGE with data and identity context. Because Rubrik is a, uh, data security company that backs up data and identity systems. We know things like where sensitive data exists in your organization, what identity's been compromised, or others. So all of this enriches this AI in the loop system that we call SAGE. And it's critical to use a small language model, which is where Predibase's infrastructure is instrumental, at its core. Because if I told you the way we're going to secure and govern AI is by doubling your token count and by doubling your bill and your latency, you'd tell me, no thanks, I'll take the insecure version. Uh, and so we need to be able to do it very fast and at a low footprint. So that's the second component. The first component is visibility. The second component is secure with an AI in the loop system that inspects all traffic, determines whether or not to permit it. And then the third component is, look, Rubrik as a company's had this mentality of assume breach, which means at some point something's going to go wrong. This is even more true with agents today. You need some sort of undo button when something goes wrong. And the undo button we've built and thought through, and I'd recommend at least other folks think through, is tie your observability with whatever you're using for business resilience and recovery. So if you have like a data backup system somewhere, tie your observability there. 
So that way if you notice an agent take a destructive action, like, so in Pocketos, for example, there was this incident where, um, you know, a startup went viral because a coding agent went in and decided to delete the production database. Yeah, exactly. So what I think about is like tie your agent observability with your recoverability story. So if you notice from your observability stack the agent took some destructive action, dropped a prod database or so forth, you can also then immediately create a, uh, one click recovery plan that looks through your previous snapshots, determines what's the snapshot that was right before the agent took this destructive action and can rehydrate that system back from the previous healthy snapshot. And so we call this capability like Agent Rewind. There's a number of different ways I think you could probably think about referring to it, but the three key capabilities I think are monitor what's happening, have a system to be able to constantly like run and enforce your policies, have a way to make sure that you can recover when things go wrong.
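The snapshot-selection step described here ("the snapshot that was right before the agent took this destructive action") could be sketched like this. It assumes a sorted list of snapshot timestamps and an incident time taken from the agent's observability logs; the function name and data shapes are hypothetical, and nothing here reflects how Agent Rewind is actually implemented.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Optional, Sequence


def snapshot_before(snapshot_times: Sequence[datetime],
                    incident_time: datetime) -> Optional[datetime]:
    """Return the latest snapshot taken strictly before the incident.

    snapshot_times must be sorted in ascending order. Returns None when
    no snapshot predates the incident.
    """
    # bisect_left gives the index of the first snapshot at or after the
    # incident, so the element just before it is the last healthy one.
    i = bisect_left(snapshot_times, incident_time)
    return snapshot_times[i - 1] if i > 0 else None


snapshots = [
    datetime(2026, 1, 1, 0, 0),
    datetime(2026, 1, 1, 6, 0),
    datetime(2026, 1, 1, 12, 0),
]
# The agent's destructive action was logged at 09:00, so restore from 06:00.
print(snapshot_before(snapshots, datetime(2026, 1, 1, 9, 0)))
# prints "2026-01-01 06:00:00"
```

A snapshot stamped at exactly the incident time is deliberately skipped, on the assumption that it may already contain the damage.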

Speaker A: Bunch of thoughts there on that second point, uh, I think you answered this. I was going to ask, you know, where does enforcement happen? Or what's the kind of form factor of Sage? Is it like an agent? Is it a shim that is like programmatically inserted? It sounds like it's something that's running over the wire.

Speaker B: You know, the reality is it's actually a number of different things depending on where your runtime is hooked in. So this comes back to, like, this idea that agents are running all over the place. And so, um, what we see is in a lot of cases, SAGE is in line with the request. So, for example, kind of like a reverse proxy, if you're going into making calls to OpenAI or Claude, we can sit right in the middle of that and we can actually go and determine, you know, whether or not to allow a certain action. But sometimes people want different integration modes. So there's a number of tools that have also exposed things like pre tool call API hooks. So if you're building the agent in Microsoft's Copilot Studio or you're using something like Claude Code, these agent harnesses themselves have an ability to, like, phone home to a verifier service like SAGE and be like, should I allow this action or not? And the brilliant thing is these things can run in parallel. You know, the request can be like, starting to be transacted while the SAGE system determines whether or not it actually should be fulfilled and can block it altogether. And so, um, it can be, you know, done in line. It can be done via, like, these, um, different instrumentation hooks. Once you connect your different agent runtimes in, we determine the system in the way that makes the most sense for the different agent runtime you've connected.
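A rough sketch of the "run in parallel, then block or allow" pattern described here, using Python's asyncio. Everything in it is an assumption for illustration: `verify` stands in for a call to an external verifier service, its keyword test is a placeholder for a model's judgment, and `prepare` stands in for whatever side-effect-free staging a runtime can do before committing a tool call. It does not reflect SAGE's API or any harness's real hook interface.

```python
import asyncio


async def verify(tool: str, params: dict) -> bool:
    """Hypothetical verifier call; returns True if the action is allowed."""
    await asyncio.sleep(0.01)  # simulated network round trip
    # Placeholder rule standing in for a model's allow/deny decision.
    return not (tool == "db.execute" and "drop" in params.get("sql", "").lower())


async def prepare(tool: str, params: dict) -> str:
    """Hypothetical staging work that has no side effects yet."""
    await asyncio.sleep(0.01)
    return f"prepared {tool}"


async def guarded_call(tool: str, params: dict) -> str:
    # Staging and verification overlap, so the check adds little latency.
    allowed, staged = await asyncio.gather(
        verify(tool, params), prepare(tool, params)
    )
    if not allowed:
        return "blocked"       # staged work is discarded, nothing is committed
    return f"committed: {staged}"


print(asyncio.run(guarded_call("db.execute", {"sql": "SELECT 1"})))
# prints "committed: prepared db.execute"
print(asyncio.run(guarded_call("db.execute", {"sql": "DROP TABLE users"})))
# prints "blocked"
```

The design point is that the commit step waits on the verdict while everything before it proceeds concurrently, which is one way to read "the request can be starting to be transacted" without letting an unapproved action take effect.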

Speaker A: And then on the, uh. What was my question on your third point? Remind me of the third point.

Speaker B: Recovery and resilience. Tie observability to something that allows you to recover quickly so you're not dead in the water once something does go wrong.

Speaker A: It strikes me that that in and of itself is a big ask and potential impediment for organizations that are trying to do this. Like, how many of them really have a recovery and backup system in place, let alone one that can be automatable and tied into an agent? Am I imagining a level of immaturity that, uh, has been surpassed, or is it as grim out there as I imagine?

Speaker B: I think a lot of organizations, especially large enterprises, do have a data backup and recovery solution in place because it was mandated in a lot of ways. And the reason why was that, um, it used to be that you needed data backup for business continuity in case of like natural disaster, fire, flood. So think like a couple decades ago everyone started buying data backup because it was like if a flood hits the data center, how are you going to come back into business? But, um, those solutions were, I'm not sure what the right word is. Let's just say those solutions were like relatively basic at the time. It was like an insurance check mark that you hope you never had to use.

Speaker A: I think also I'm feeling like heavyweight. Like I'm imagining a world that you're talking about where, you know, you said agents are everywhere. They're like constantly churning through things. If you're getting, uh, you know, if you're getting kind of, you know, these alerts or triggers or whatever you would call them, where, you know, hey, the agent did something, you know, not so right. Here you're using an SLM, you know, so maybe not the smartest model, like it's going to let some through. Uh, when I think of, when I think of, you know, backup and recovery and even snapshotting, it's like this is a heavyweight process. It's not something that like, oh, the agent did it. Let's, you know, roll back. Oh, let's roll back. Maybe for, uh, you know, a small individual database.

Speaker B: Most backup and recovery for a long time has been like this heavyweight piece of software that you buy and you hope you never have to use. And you're like, if something goes wrong, you know, once every however often, we'll, like, blow off the dust and, like, you know, figure out how to be able to plug it in. Um, I think the core observation that we have is, like, I think agents are going to change that game. I don't think that now the need for recovery and resilience is just going to be the once-in-a, you know, hopefully-never ransomware attack or otherwise that you end up having. I think it's going to be much more frequent, both from external malicious AI-driven attacks and internal inadvertent AI mistakes. The brilliance would be if you can tie a really good recovery system with a really good AI agent monitoring system. To your point, I don't think many people have that today, because you ideally want to bundle and buy this together. Slight plug for Rubrik, which is actually doing this. But I do think that, um, you know, to your point, the core is like, can you find a system that basically architects both of these things together? Because you probably have something for data backup and recovery for, like, other compliance reasons or other. Is it possible to connect it into your observability? And I will actually say one misconception is that the SLM is not as good at enforcement as the LLM. Actually we find that for domain-specific tasks, small language models that are tuned for a very specific task tend to outperform a generically prompt-engineered larger language, uh, model. So like when we benchmark our SLM versus, at the time I think we benchmarked GPT 5.2 as an example, we found that not only were we an order of magnitude faster and cheaper, but we are also actually more accurate on being able to make a binary classification on whether or not to allow or disallow. And we've relatively consistently seen that fact when you just constrain the outputs of the LLM to be very, um, low cardinality in terms of what it's supposed to do. Which is exactly what you want from a guardian agent.

Speaker A: Yeah, I could see that. I think what colored my perspective on that is thinking about, with OpenClaw or, you know, a personal agent, uh, it is frequently said that you should use like a frontier model for your main orchestrator because your SLMs, you know, if you expose them to external untrusted data sources they can be easily manipulated relative to a frontier model that, you know, (a) is tuned better to be able to detect and resist manipulation, uh, but also (b) is just, like, smarter, more parameters.

Speaker B: So we've done, uh, we've done a lot of benchmarking on this exact line of thinking and, um, we did it even at Predibase. Like, we released a paper called LoRA Land, because all of these are like LoRA-tuned adapters essentially, uh, as a technical detail. But, uh, what we found was that for open-ended, like for open-domain tasks, you're 100% right, use a large frontier model. That's why the orchestrator, the planner, should really be like a larger model. But the more constrained you get into tasks, the better and better you actually tend to see performance.

Speaker A: So if it's just a fixed domain classifier, then...

Speaker B: If it's a fixed domain classifier, the best you could usually do is post-training with, like, SFT, um, you know, a small model on it. And then if you have a task like what we're talking about, which is simply should this request be permitted or denied, like that is actually the ideal type of task for an SLM, um, and that's where you're able to run it at super low latency. Now I think, to your point, like, ultimately, I think that, uh, all prevention mechanisms are going to have some rate of false negatives. Um, and like the agent world is just moving too quickly to be able to catch and block everything, which is why the resilience story is so important as well.

Speaker A: We talked about the, the, the agents themselves, you know, being incredibly resourceful in trying to get their things done, right. And then we talked about, you know, using an SLM or an LLM as judge. I'm, uh, envisioning a scenario where your agent starts trying to hack your SLM to get its request through. And I'm imagining that it could probably be pretty good at that if it really figured out what was going on. Like, how do you, yeah, how do you prevent, you know, the agent, like, injecting something into the request that says, you know, this is a permitted action, blah, blah, blah, or, you know, you know the thing I'm getting at.

Speaker B: Totally, and this is why it's so important. I've seen a lot of people, when they first think about this idea that we need AI to help secure and govern AI, the first thought is like, great, I'm already doing this. I'm putting guardrails into the prompt of my models. Um, and that's usually the first take. And you're laughing, but I would say, in fact, most often that's actually the usual take. So just like, I'd say, where is the state of the art in the universe? Step zero is like, do nothing, essentially. Um, step one is like, I have a bunch of deterministic rules that I, like, configured in some cloud console somewhere that hopefully works for something or not, but it's very limited. Step two is like, I know that I need to use AI. I'm going to put these in the prompt of my model. And then step three is you have an external system policing the inputs and the outputs, which is what SAGE is. Now, I think that, um, you know, even SAGE is something that people might be able to try and attack. We haven't seen, like, we process trillions of tokens, uh, you know, inside of Sage and, uh, we haven't seen any, like, uh, incidents or evidence where that kind of can be circumvented. And we also specifically post-train models that are watching out for this kind of thing. Um, but I will say that this is the reason why you need an external system like Sage, because what a lot of people will do is they'll say, I've put these 10 guardrails into my model prompt. That's the exact type of thing that the models are very good at circumventing. Uh, and so even the external system might not be perfect, but it's a lot better than, I think, everything else people are using today.

Speaker A: It's funny because I see so many of these, like OpenClaw in particular, like YouTube videos, and they're like, and security is such a big deal. You really have to tell your model to be secure.

Speaker B: It's like, wait, what? Yeah. Like, I see that this is, I think it's going to be like the clouds and the hyperscalers in some way, which is like, I think that the labs and other folks will build in some security processes within it, but I think you're going to need someone else to police the infrastructure. Like, you aren't going to want the infrastructure to be policing itself. Uh, and this idea of like, hopefully the model will do the right thing. It probably will 95% of the time. Um, but that's a huge blast radius for things to go wrong.

Speaker A: So you mentioned that Sage, you already have trillions of tokens flowing through this. How long has it been around? And is it like a SaaS offering? Is it open source? What's the kind of packaging for it?

Speaker B: We went GA with the Rubrik Agent Cloud and Sage as the agent security, uh, the agent for agent security inside of the Rubrik Agent Cloud. We went GA with that product in February this year. So just a few months ago, actually. The reason we're already processing trillions of tokens is if you work with certain organizations that are starting to use these agent harnesses, you're starting to notice usage go like this. Uh, right. And we saw it even internally here at Rubrik. Um, I think the, um, question around, like, how do we actually package it? We offer it as a Rubrik-hosted version, so we can do it directly as something that we host, or we can deploy inside of the customer's environment, which tends to be important for some of our customers that are in more heavily regulated, uh, industries. So whether you want a hosted version or something that you host yourself, like, we actually have both offerings.

Speaker A: Okay, talk a little bit about when you've deployed this. Like, uh, you kind of just mentioned like, you know, if, if it's, you know, finds 95% of things, that's still a huge blast radius. Like, I guess I'm trying to get a sense of, uh, like when you have this running in the wild, like, you know, either anecdotally or percentage wise, like, how often are you seeing things that Otherwise would have just been shocking. But, you know, you're able to identify and stop those things. Is it rare or is it like.

Speaker B: Realistically, it's all the time. Like, different deployments at scale inside of an organization. Um, so maybe just like a couple anecdotes from, like, our deployment here at Rubrik, because I feel that we could, more likely, speak to those. Um, you know, one thing we noticed pretty early on was we were using Claude Code very heavily. And there's a bunch of things that you saw that you're like, yep, I bet people were doing this, and now I see it and actually, you know, catch it in practice. For example, people putting, like, raw credentials in, like, requests and responses, right? So, like, the literal text is just there. It's like the thing that you're supposed to not do in Security 101. You see it there and you're like, okay, I want to block these types of actions in the future, so you can do that. But then we saw some more sophisticated types of things that were going wrong. Uh, and so Claude Code is a very common use case for what we're securing. And of course, those coding agents have access to GitHub and others. One thing we noticed was that, um, Claude Code was really trying to post things that were internal, and this happened, like, multiple times over the month, internal source code to the wrong repository. And the problem with posting it to the wrong repository was it was posting it to a public repo rather than private. So it's actually taking, like, parts of source code and trying to package it up and put it into, like, these public GitHub gists. Uh, and so we saw this kind of relatively frequently. And there were even instances where, like, if we looked back through the audit logs, I think, you know, and I don't want to misquote, but I think this is one of those things where, like, if you saw, like, the check, check, check, like, at some point, the way that Claude Code had communicated what it was requesting was pretty confusing. And so, like, technically it felt that it had gotten the human-in-the-loop approval for it. Um, but I think, like, the reason we were able to catch it is we were looking at the full context. We were looking at, like, what is the thing that it's actually looking to post? What is the destination? We have a policy that says anything that looks proprietary and internal shouldn't go to a public source. Let's go ahead and catch it. And we even saw one crazy instance where Claude Code tried to get around this, like, blocking we were doing of, like, GitHub public gists. Rather than, like, doing this as a text-in, text-out system, where it was like, all right, I'm posting to this URL, it spun up a browser window and we just started to see mouse clicks on certain coordinates. And we noticed that one of the coordinates actually was for a public gist. Uh, and so we were able to go ahead and catch and stop something like that as well. So this is obviously like one line of, I think, interesting things we saw, which was like, more in the sensitive data exfiltration standpoint. We see other things, like in terms of credentials and others, like, pretty, like, I don't want to say like all the time, but it feels like it's all the time. And I think in general it kind of reflects what I hear when I speak with leaders at large enterprise organizations, which is they're always surprised when they get, like, an audit report of what's actually happening with AI in their ecosystem. I spoke with somebody who told me, like, you know, last year, I don't think I need the agent security governance thing because I think right now we have like three or four agents that are deployed. And I got dinner with them a few months ago, and they're like, you know what? I think I was wrong. Um, we did an audit. Guess how many agents were deployed? And I was like, it wasn't three or four, was it? And he was like, nope, it was 250. And I said, okay, got it. Like, it's just surprising, I think, the rate at which these tools can get adopted internally.
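
The "full context" check described in this answer (what is being posted, and where it ends up) can be sketched roughly like this. It is a toy illustration, not the Sage policy engine: the `Action` type, the marker strings, and the host list are all hypothetical, and a real system would use a trained classifier rather than substring matching. The point it shows is that the decision keys on the content and the resolved destination, so it gives the same answer whether the agent posts over HTTP or drives a browser with mouse clicks.

```python
from dataclasses import dataclass

# Hypothetical markers and hosts, purely for illustration.
PROPRIETARY_MARKERS = ("INTERNAL ONLY", "Copyright Example Corp")
PUBLIC_HOSTS = {"gist.github.com"}


@dataclass
class Action:
    channel: str      # how the agent acts, e.g. "http" or "browser_click"
    destination: str  # host the action ultimately targets
    content: str      # what would be posted


def decide(action: Action) -> str:
    """Deny when internal-looking content is headed to a public host."""
    looks_internal = any(m in action.content for m in PROPRIETARY_MARKERS)
    goes_public = action.destination in PUBLIC_HOSTS
    # The channel is deliberately ignored: only content and destination matter.
    return "deny" if looks_internal and goes_public else "allow"


if __name__ == "__main__":
    secret = "# INTERNAL ONLY\ndef billing(): ..."
    print(decide(Action("http", "gist.github.com", secret)))
    print(decide(Action("browser_click", "gist.github.com", secret)))
    print(decide(Action("http", "git.internal.example", secret)))
```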

Speaker A: And when they reported that number, was that like, was that two or 250 people using Claude Code? Or are these agents that are like, you know, 24/7, living on some infrastructure in the organization, or a mix of them?

Speaker B: But it was a lot of, like, agents that were actually basically autonomous background agents people had built, like in the cloud, like on Copilot Studio as one example, as a way to be able to just start to run tasks.

Speaker A: I'm trying to form a question around it. I think that's essentially like pinging or pushing back on, like, you know, AI only as a. As a means of securing, you know, these agent interactions. And, you know, uh, you know, I could ask it from like an enterprise perspective, like, uh, you know, are they gonna. Is an enterprise gonna feel like they have enough control, you know, or even from my perspective, like, you know, I'm a little bit old school and I feel like, you know, I want to say okay. And in fact it surprises me. I was looking at Cowork, you know, and trying to connect a tool to cowork just the other day, and I was like, why doesn't it just give me the ability to say read only and not, you know, you know, everything? And I. It's hard to come to terms with, you know, not having that degree of control for people of a certain age.

Speaker B: No, no, I think it's hard to come to terms with this for honestly everyone, which is like, I'm going to end up trusting. You're telling me I'm going to end up trusting an AI model to help me do my security and governance posture. What I think is, like, one of the things you pick up in AI infra and security is there's this concept of defense in depth, which is you're, uh, usually going to layer in multiple of these solutions. My view is that the link that's missing today is that AI-in-the-loop system. We have plenty of good rules-based systems. We have plenty of good, like, other configurations. A lot of people like, you know, I'm not saying.

Speaker A: So you're not saying that it needs to be just AI or that AI is sufficient, necessarily. It's just that if you have anything today, it's probably insufficient at keeping up with the volume and you need AI for.

Speaker B: I'm saying two things, right? Like one of them is what you have is, um, good, but you're missing this. The second thing that I actually believe is I actually think that this thing that you're missing is the most important piece of the puzzle. Because yes, you want some of those deterministic rules, no doubt about it. But, um, the example you gave, which was like, well, can I just go ahead and put this in read mode? Totally, you can put it in read mode. And a lot of the ways that I've seen people secure certain infrastructure is they look at my example of Claude Code connecting to Salesforce and email, and they're like, don't worry, Dev, that would never happen to us. We disable all access to Salesforce. I'm sitting there and I'm like, how often can you just disable access as, like, the end solution if you're also getting board-level pressure to go and adopt AI to enhance productivity? So my point of view, Sam, is like, there's a lot of instances where you might end up saying, I never want AI to touch this system. But I think the majority of the instances are, I actually do need AI to do some of the work. I need it to have that right permission, as scary as that is. But I want it to be able to do it in the context of very secure runtime guardrails. And that's why I think the AI-in-the-loop system is the most important part. Ultimately, if your organization can continue by saying we can block access for, like, everything, like the security posture will be blocking for everything, my point of view is you're just going to massively compress the ROI on AI that you get. I think the correct solution is going to be there's a class of things that are fully blocked and you're like, I can't think of a single reason to do this. That can be deterministic, in rules. And there's going to be a large, gnarly class of things that are going to be, well, it kind of depends; I need policy enforcement, and it's based on the intent and the context. And that's where I think the AI-in-the-loop system comes in.

Speaker A: I'm curious, to what degree do some of the emerging protocols like MCP and A2A and the many, many other agent-to-agent and the like, uh, protocols change all this?

Speaker B: Yeah, I think that they just introduce, like, a new surface area for where the problem basically shows up. So if I think about MCP or A2A, they're really good, like, interconnectivity protocols. Um, and like, they help, for example, MCP, Model Context Protocol, introduced by Anthropic, like, helps give agents sort of like an API. They can understand the different applications that you might be running inside of your ecosystem. So I mentioned, for example, my Claude Code uses an MCP connector to Google Drive. Um, the thing about MCPs is that what we saw internally was, like, Claude Code was approved for some subset of MCPs, but if you actually looked at the MCPs that were connected to Claude Code, it was decently larger than the subset that were approved.

Speaker A: Meaning? Exactly.

Speaker B: Meaning that people were connecting Claude Code to custom MCPs and, you know, connecting it via, like, MCP servers to ones that maybe weren't on that initial.

Speaker A: Uh, so you had a centralized org that approves some MCPs, but you had MCP, uh, sprawl within the organization beyond that.

Speaker B: MCP sprawl is a great way to put it. And you see some approaches to try and solve this, like an MCP gateway. And then you have to question, like, you know, can I get all the traffic through the gateway or not? But there's, again, like, layers to this. I think MCP does really help, if it's executed perfectly, with the centralization of access. It still doesn't solve this: even if, like, I have the legitimate MCP connected to Salesforce, I have the legitimate MCP connected to my email, MCP is not preventing me from exfiltrating Salesforce data to email. So it's not so much there, but it's helping me understand what applications should be authorized or not for my given agent. So I find it like a very helpful protocol, um, similar with A2A for, like, agent-to-agent communication. I think that, um, all of these are very useful for, like, structuring and making the schema a bit more consistent. It also makes some of our job in terms of the patterns to look for a bit easier as well. I do think that as, like, the evolution happens, there's an open question on, like, will MCP continue to be the thing or will these agents just start to use, like, command line tools with parameters that they find directly in the documentation? So we see a mix of both, to be honest, today. And so we've expanded our definition from what was just MCP tools to just all tools, which can include, like, direct API access in addition to MCP.
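
The gap between approved and actually connected MCP servers described in this exchange comes down to a set difference. A minimal sketch, assuming hypothetical server names and that both inventories are already available as sets (how you collect them is outside this sketch):

```python
def unapproved_servers(connected: set[str], approved: set[str]) -> list[str]:
    """Return connected MCP servers that are not on the approved list."""
    return sorted(connected - approved)


if __name__ == "__main__":
    # Hypothetical inventories for illustration only.
    approved = {"google-drive", "salesforce"}
    connected = {"google-drive", "salesforce", "custom-email", "scratch-db"}
    print(unapproved_servers(connected, approved))
```

As the answer above points out, a clean result here still says nothing about what an agent does across two approved connectors.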

Speaker A: But if you're injecting yourself over the wire or kind of in that wire conversation, doesn't really matter to you because ultimately whether you are on one side, you know, curling a request or using a CLI or using an MCP at some point is over the wire.

Speaker B: Yeah, exactly. Doesn't really matter to us. I think, um, you know, with the, with the slight caveat that sometimes organizations have policies defined for their MCPs they may not have defined for the other tools. So we want to help people enforce the policies that they may have, but we're able to see the traffic regardless.

Speaker A: Got it, got it. Um, and then observability is a big ah, conversation with folks that are building agents and being able to review traces and things like that. Um, that's often using a, ah, you know, similar kind of approach, reverse proxy or something kind of between the agent and the other things that it's interacting with. Um, you know, there the noise is even greater than you know, your tool approval box. Uh, but do you find that that is, you know, is it valuable? Does it help you do your job? Um, you know, how do you think about observability as a field relating to uh, securing agents?

Speaker B: It's funny because observability can be used really for two purposes, and I think there's distinct tools for, you know, the two purposes. So observability can be helpful for the people that manage the infra or the developers, where they're basically, like, tracking all their traces, they're doing evals over those, and they're trying to get a sense of, like, are my agents, you know, useful, are they doing well or are they working? And then there's observability for security, which is a little bit more of the area where we're, uh, oriented towards, which is like, are dangerous things happening? And you care about slightly different things in the two different contexts. Right. I think to the extent that an organization is already thinking about agent observability, it's really good for us as a place to embed, because I think about the Rubrik Agent Cloud as, again, doing three things: like, monitoring and observability, kind of the consistent runtime enforcement with Sage, our small language model harness, and the, um, resilience and rewind capabilities. When I talk to a customer I usually tell them the second and the third I view as differentiated. The first I view as, like, over time, just commoditized. Like, if you have an observability solution, that's great for us, because what we want to do is, like, I view observability and visibility as just kind of the base layer of the cake. And what you end up doing is actually building out more of your security and governance practices on top of what you have. Uh, and so most organizations tend to only have, like, really good observability for, like, a small portion of their agent stack. So like, to give you an example, a lot of organizations are like, yeah, we're building custom agents with LangChain on our cloud. We're using Copilot Studio, we're rolling out Claude Code. And by the way, like, we just introduced Glean and Agentforce from Salesforce. They've got everything under the sun, typically. My guess is, like, that organization, that

Speaker A: first party agent has any kind of observability.

Speaker B: Yeah, probably those LangChain agents have a good observability stack and everything else is unaddressed. And then I'm standing there and I'm like, wait a second. But like, Claude Code usage is like over here in your organization, the LangChain agent might be like down here, but, you know, the observability stack is only in one. So we want to unify all of those into one place. But I think, um, things like OTel, for example, like OpenTelemetry and others that are standardizing observability metrics, have been massively helpful for us. And I think, like, will continue to be a tailwind as we go forward.

Speaker A: Because you consume them, or because the enterprise has thought about certain things, you know, through the process of getting that level of observability in place, that lends itself to this?

Speaker B: Both. First, because we consume logs in, like, an OTel-compliant format. So if you've already had that system and process running, it becomes very easy then to start to direct those towards us. And then second, like, now you're familiar with this idea: I'm going to have, like, a system processing through this, um, data and I'll be able to make sure that I have kind of this immutable trace somewhere.

Speaker A: When we talk about agents kind of producing so many actions and decisions that they kind of overwhelm the human-in-the-loop-oriented approach, to what degree is that really primarily an issue only for, you know, developers and folks using Claude Code, versus, like, people using Cowork, that, you know, tend to be more interactive? Like, do you see that, or do you see that, um, you know, independent of use case, like, people just can't keep up?

Speaker B: It's definitely more present in developer workflows today, but I don't think that's going to be the terminal state. I think the reason it's more present in developer workflows today is because coding agents are (a) the most heavily used, but (b) they also tend to have the most broad levels of access. Oftentimes if I think about, like, a realistic Claude Cowork use case, for example, it's like, help me make my deck or slides look better, like, run this quick analysis in Excel. Um, and today I think that's where Cowork is, because it's newer than Claude Code and others. I think what's going to happen over time is you start to give this a greater and greater level of access. So it's not just, like, making you a deck, but it's, like, running through your reports in Salesforce, like, similar to how in engineering we're not just like, hey, make the button blue, but, like, design the entire interface. I think as you start to do that and you're looking at creating, you know, full-stack plans, starting from Salesforce into Tableau, writing an email, having it, like, go ahead and update something in your underlying system, uh, of record, I think things are actually going to start to introduce that same level of overhead. Because the fundamental reason for the coding agent asking permission and the cowork agent asking permission are really the same. It's like, it's taking an action or operation on your behalf. So right now developers probably have, like, 10 times as much permission and access on average than the conventional cowork agent might. Um, I don't know if the cowork agents will get all the way to the same page, but I do think they're directly on that trajectory too.

Speaker A: Do you find that the SLMs need to be tuned to use case or you know, even more narrowly customer or do you ship them generically and they work the same?

Speaker B: We find that it's helpful to tune the SLMs towards use cases being like policy enforcement so like you know, understanding session data, being able to arbitrate decisions on whether or not to allow or deny actions. We find that post training the SLM for um, the organizational specific context hasn't been as critical so far versus like inference time customizations that we're able to do by being able to embed certain context from the organization at the time that the SLM is making the arbitration. So um, you know, put simply today I think like post trained SLMs for use case. Yes. Ship the same one across customers and do an inference time customization uh, for each customer.
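
The split described here (one shared post-trained model, with per-customer context supplied at inference time) can be sketched as prompt assembly. This is an assumption-laden illustration only: `build_verifier_prompt`, the prompt wording, and the ALLOW/DENY label set are hypothetical and are not taken from Sage. It shows organization policies being embedded at arbitration time while the requested output stays low-cardinality.

```python
def build_verifier_prompt(org_policies: list[str], action: str) -> str:
    """Assemble a per-customer prompt for a shared allow/deny model."""
    policy_block = "\n".join(f"- {p}" for p in org_policies)
    return (
        "Decide whether the action is permitted. Answer ALLOW or DENY.\n"
        f"Organization policies:\n{policy_block}\n"
        f"Action:\n{action}\n"
    )


if __name__ == "__main__":
    prompt = build_verifier_prompt(
        ["Do not put sensitive data in email."],
        "send_email(to='partner@example.com', body='<customer export>')",
    )
    print(prompt)
```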

Speaker A: I think that's consistent with where the question came from. Which is, an SLM that's really focused on Claude Code probably needs a, uh, you know, different, you know, way of looking at the data than one that's focused on, like, you know, things that are permissible in email conversations and, like, business user conversations.

Speaker B: Yeah, and even higher level than that. Like, an SLM in general is going to be, like, trained to generate all the tokens in the universe still. But being able to go ahead and say, like, look, what you're really looking to be able to do is be very good at the decision boundary between risky and non-risky actions, I think, is an area where you get, like, significant lift right out of the box.

Speaker A: And we glossed over this earlier but the broader Rubrik Agent cloud, what is that doing?

Speaker B: Yes, so the broader, uh, so Rubrik as a company has really two core offerings. Rubrik Security Cloud, which is all about data and cyber resilience. So think about backing up data across all the organizations, backing up identity systems, helping people recover very quickly. That's one. The Rubrik Agent Cloud is the view of, like, uh, it's essentially, like, the parallel product now. And the Rubrik Agent Cloud is all about making sure that organizations are resilient for what is now an AI-driven future. And that's the product that has the three core pillars, from observability, runtime security with Sage, and then, uh, resilience and rewind.

Speaker A: And what does the deployment process typically look like?

Speaker B: This is the thing that I think I'm happiest about with the Rubrik Agent Cloud, and one of the startup learnings, actually, I'll share. It's like, how do you, like, optimize for time to value? Right. So one of the challenges with, like, I would say, post-training as a process is that it's not immediate time to value. You need to collect your data set and then you get to, like, train a model, then you eval it. Um, one of the great things with Rubrik Agent Cloud is you can actually just plug it into one of your agent runtimes very easily and you start to see things relatively immediately. So if you want to hook into something like Copilot Studio or, you know, ChatGPT Enterprise, um, we just launched an integration with Anthropic and the compliance API. Uh, you can actually, like, basically do an API-level integration if you have the right permissions. It can take a few minutes to set up and then the traffic is flowing. You're seeing all this runtime observability and you can add on more integrations as time goes on. Right. Like, we hook into a number of different sources. We hook into the mobile device management suite, or the MDM, for an organization. We can hook in at the gateway level. But to get started, what we usually recommend is like, let's just pick one, uh, authorize it, and an API-level integration is pretty fast, and then you can start to see what the value looks like in that agent runtime, and then you can cover more of your stack as you go.

Speaker A: And so when you take that first step and you do the API level integration, is the result of that primarily an observability tool or is that also all of the decision making that we've been talking about? Like do you have to do more to get to decisioning or is it just there?

Speaker B: It's funny, the world moves quickly. Uh, when we first rolled out Rubrik Agent Cloud, it was predominantly observability, and then you would need to configure your policies, and then we could show you what's violating a policy or not. And people love the demo because I would show them like, look, you connect into, like, your system and now I can write a policy, and I'd show them, like, I'm just typing out a policy in natural language. So something like, don't take sensitive data and put it in email, and now I can go and, like, run it. But then when we were deploying to customers, we realized a lot of organizations don't know all the policies they want to enforce out of the box. Instead what they want is like, give me 80 or 90% of the answers

Speaker A: catalog or something like that.

Speaker B: Yeah, give me 80 or 90% of the answer out of the box and then let me go ahead and pick and choose and customize those I want. And so what we launched relatively recently is a component where you hook us into the runtime and we are like, you're good. We are running immediately and we are showing you, you know, we've just naively called it insights, but we're automatically showing you like what Sage is picking up as traffic, the runtime enforcement, what's dangerous. We're suggesting remediations for you, like create a policy to prevent this in the future and then you can do the additional work to customize it further. Um, but our goal has been like fast time to value and really uh, solving most of the challenge out of the box. And so once you've connected it, that's really all you need to do to be able to get a large chunk of value upfront.

Speaker A: So given that the space is moving very quickly, where do you see it all going? What's next, what's coming?

Speaker B: It's funny because earlier in the show I think you did ask about, um, hey, if I'm old-fashioned and I'm worried about this idea of just deferring all control to, uh, an AI system or a harness, how should I rationalize that? We kind of took this bet, um, you know, last year that we wanted to do more and more AI in the loop, and I actually think last year when we made the bet, there was a part of me that was nervous, and I used to hedge the answer a little bit more and say, like, you're really going to want both. You're going to want non-deterministic and deterministic. I do see this world increasingly now just moving to AI systems that are running these processes more end to end. And so, um, I think where the world is headed is, I think right now a lot of agents are still stuck in read mode. I think that's going to change. I think agents are going to graduate from read mode and go on to write, delete, all the CRUD operations, because I think they're essentially going to be doing the types of work that humans are doing. I think token and inference spend is going to continue to scale up very, very quickly. I think we are going to see a series of large-scale, um, I don't know what we want to call them, mistakes, incidents. I think that we're going to see a massive amount of productivity, and I think we're also going to see some inadvertent, um, incidents where things have gone wrong. Uh, and I think that what we're going to end up centralizing on is that there isn't another good way to be able to solve this problem. The rules and the human in the loop will still be kept for certain applications, because they also help us feel good and because they're necessary for some applications. But I think what we're really going to center around is, like, we want these workflows to become increasingly AI-first.
Uh, and that we're going to use the same type of technology that's introducing the risk to also be a part of that solution. So, um, the world's moving very quickly. I think agent harnesses have, like, absolutely exploded earlier this year. I anticipate that to continue. I anticipate coding to still be the dominant use case through this year, but by the end of the year I anticipate the same thing we're seeing for coding is going to be happening for many other types of, um, you know, sophisticated knowledge work. Uh, and I think we're going to start to see the same type of requests that we need now for monitoring coding agents starting to extend across the agent stack. And I can't see another way to do it without AI being in the loop.

Speaker A: Yeah, it's, it's interesting. When I think about that, I tend to agree. I wonder about, like, of the tools that I use, you know, for example, like the, you know, G Suite stuff, um, does it have an ability to do snapshots? Like, can you hook in for that third pillar of yours? Like, do you have a way to say, hey, this agent just blasted this user's calendar, can you restore it for me?

Speaker B: So, uh, you know, I do think that a large part of, like, Google Workspace is something that Rubrik does have integrations to be able to back up.

Speaker A: Oh, really?

Speaker B: Yeah, absolutely.

Speaker A: So this is tying over to the like traditional core product of the core business.

Speaker B: Yeah, exactly. Um, so, you know, I think there must be some instances where we don't back up, but certainly a large chunk of where important information is tends to be the areas that we like to be able to back up. Uh, and Rubrik has built a large business on the back of that. Uh, and so I think that it's a good question. Like, look, to feel comfortable giving your agent full write and delete access to your calendar, you'd probably feel a little bit better if you knew you could rewind that action in case it went wrong. Uh, and ultimately that's why I think that this is, like, the third pillar. The third pillar, and also the way we approach the second pillar, I think those are the two unique things we're doing, because frankly a lot of people are going to be in the agent observability space. And, um, if all you need is an observability solution, there are going to be multiple vendors there. I'd have my point of view for maybe why us, but there are going to be other good vendors. But what we really think about as differentiating us is, like, Sage and Rewind.

Speaker A: Well, Dev, thanks so much for jumping on and sharing a bit about the way you're thinking about agentic security.

Speaker B: Of course. Thanks so much for having me, Sam. It was a lot of fun in the conversation.

Speaker A: A lot of fun. Thank you.
