AI labs are racing to ship and monetize while the safety questions grow bigger each week.
Paul Roetzer and Mike Kaput unpack the latest news: Matt Shumer’s viral warning about AI's impact on cognitive work, Anthropic’s Sabotage Risk Report for Claude Opus 4.6, and the mounting safety-vs-revenue tension playing out in new ChatGPT ads and xAI’s SpaceX reshuffle.
Listen or watch below, and scroll down for show notes and the transcript.
This Week's AI Pulse
Each week on The Artificial Intelligence Show with Paul Roetzer and Mike Kaput, we ask our audience questions about the hottest topics in AI via our weekly AI Pulse, a survey consisting of just a few questions to help us learn more about our audience and their perspectives on AI.
If you contribute, your input will be used to fuel one-of-a-kind research into AI that helps knowledge workers everywhere move their companies and careers forward.
Click here to take this week's AI Pulse.
Listen Now
Watch the Video
Timestamps
00:00:00 — Intro
00:05:51 — AI Pulse Survey
00:07:58 — Something Big Is Happening
00:27:06 — Claude Safety Risks
- Sabotage Risk Report: Claude Opus 4.6 - Anthropic
- Anthropic's Responsible Scaling Policy - Anthropic
- Three Sketches of ASL-4 Safety Case Components - Anthropic
- Activating AI Safety Level 3 protections - Anthropic
- Responsible Scaling Policy - Anthropic
- System Card Claude Opus 4.5
- Anthropic's Responsible Scaling Policy
00:46:37 — Academy Success Score
01:03:33 — High Profile AI Resignations
- Opinion | I Left My Job at OpenAI. Putting Ads on ChatGPT Was the Last Straw. - The New York Times
- Testing ads in ChatGPT - OpenAI
- X Post from Mrinank Sharma
- Musk's xAI loses second co-founder in two days - CNBC
- X Post from Hang Gao
- X Post from Jimmy Ba
- Elon Musk Wants to Build an A.I. Satellite Factory on the Moon - The New York Times
01:06:55 — OpenAI’s Changing Hardware Plans
01:09:17 — Does AI Actually Intensify Work?
This week’s episode is sponsored by our 2026 State of AI Report.
This year, we’re going beyond marketing-specific research to uncover how AI is being adopted and utilized across the organization, and we need your help to create the most comprehensive report yet.
It’s a quick seven-minute lift. In return, you’ll get the full report for free when it drops, plus a chance to win or extend a 12-month SmarterX AI Mastery Membership. Go to smarterx.ai/survey to share your input. That’s smarterx.ai/survey
Read the Transcription
Disclaimer: This transcription was written by AI, thanks to Descript, and has not been edited for content.
[00:00:00] Paul Roetzer: This is like giving someone the internet back in 2000 and the only thing they knew to use it for was sending and receiving emails like they're blissfully unaware of how they were living in the early days of search and e-commerce and social media being completely invented and redefining business and society.
[00:00:20] Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable. My name is Paul Roetzer. I'm the founder and CEO of SmarterX and Marketing AI Institute, and I'm your host. Each week I'm joined by my co-host and SmarterX chief Content Officer, Mike Kaput.
[00:00:40] As we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career, join us as we accelerate AI literacy for all.
[00:00:56] Welcome to episode 197 of the Artificial [00:01:00] Intelligence Show. Easy enough to say. I'm your host, Paul Roetzer, with my co-host Mike Kaput. So this is the episode that almost didn't happen, Mike. A little context here, because I think this is important backstory to why we're having this conversation today.
[00:01:16] So if you listened to episode 196, which dropped on February 10th, and the dates here are, like, bear with me. So it dropped on February 10th, and at the end of that episode I said, because I realized in that moment that I was gonna be traveling when we would be recording this next one. So normally we record the weekly on Monday mornings, and then the team produces it and we drop it on Tuesday morning.
[00:01:42] So it's like an 18-hour turnaround, basically, between recording and go. Mike and I do these things in one shot. We don't stop, we don't edit, nothing. We just go. So as I was ending episode 196, I'm glancing at the calendar and realizing, oh my gosh, I'm gonna be out of [00:02:00] town Friday through Monday.
[00:02:01] So Monday is our normal recording day. Friday is our backup day. Sometimes we'll record on Friday if, like, travel is just gonna get in the way. So then I was like, oh wait, as soon as I get back on Monday night, I turn around and leave again on Tuesday and I'm gone for the week again for a speaking event.
[00:02:18] So as soon as it ended, we had team meetings on Monday, and so we meet as a team, we're talking about this, and we're like, there's just literally no feasible way to do this episode. So our plan was to skip a week. We decided, you know, on Monday, whatever the ninth that was, that we just weren't gonna do the episode that would be dropping on, what, February 17th, I guess, when you're gonna be listening to this.
[00:02:39] So the plan was to skip a week. I didn't really love the idea, Mike didn't love the idea. There's too much going on to skip a week, but we just accepted it. So then Tuesday, something big happened, and an essay on X from Matt Shumer that we're gonna talk about went viral and basically broke the X algorithm on the For You page, at least for me.
[00:02:59] Mike, I don't know about you, [00:03:00] but,
[00:03:00] Mike Kaput: yeah, I can't escape it.
[00:03:01] Paul Roetzer: Literally, like, 90% of the posts on my For You page were people resharing this post. Yeah. So that, combined with a series of other conversations very early in the week, just led me to the decision that we had to find a way to do this. So I messaged Mike on Wednesday morning and I'm like, what do you think about doing this Thursday morning?
[00:03:22] So February 12th is when this would've been, and it's actually when we're now recording this. The problem was our AI for Agency Summit is, or was by the time you're listening to this, on February 12th, and we have over 3,000 attendees registered for this virtual summit that's happening from noon to 5:00 PM Eastern time on the 12th.
[00:03:44] So I'm looking at the schedule, like, literally Mike, the only way we do this is if you and I both basically get up at like 5:00 AM on Thursday, prep for this podcast, and we do this thing. And he's like, let's do it. Like, we gotta get... lo
[00:03:55] Mike Kaput: and behold, here we are.
[00:03:56] Paul Roetzer: Yeah. Yeah. So that's basically how it happened is we decided we had to [00:04:00] thread the needle.
[00:04:00] We figured we had to do this. Mike and I basically just agreed that there were, yeah, too many, and it's just a few things, but really big, weighty topics that people were reaching out to us about. Like, what do you think about this? Are you gonna comment on this? Are you gonna talk about it on the podcast?
[00:04:16] And I just didn't wanna wait 10 days to do it. I felt like I was gonna probably lose some of the things that were running through my mind, and I was gonna be in a different mental place in 10 days. And so I was like, let's just figure this out. Let's go. So here we are. So Mike and I are recording this at a very unusual time for us, which is Thursday morning, February 12th.
[00:04:34] You will be hearing this, you know, on Tuesday the 17th, again, if I'm getting the dates right in my head, so just know that context. We are doing this right before our AI for Agency Summit that Mike and I have multiple presentations to do for. So I did my best, in the stupor of late night and early morning, to try and organize my thoughts, but we're just gonna go, 'cause that's what this podcast is all about.
[00:04:57] Alright, so this episode is brought to you by our AI for Department [00:05:00] webinar series. This is a new thing we're doing. We're basically taking three new blueprints, AI for Marketing, AI for Sales, and AI for Customer Success, and we're releasing those blueprint documents. They're gonna be ungated, so we're releasing the PDFs that day,
[00:05:15] and we're gonna do webinars for each of them. So it's basically gonna be an AI for Departments week. You can register for one of them, two of them, all of them, whatever you wanna do. You just go to smarterx.ai/webinars and you can register. Again, these are free events to attend. The blueprints are gonna be free and ungated.
[00:05:34] This is all about the AI literacy project, trying to accelerate, you know, AI literacy and capabilities across departments. As we'll talk about today, it's becoming more and more important that we have a sense of urgency for this, so that is why we're making it all free and just putting it all out there.
[00:05:50] Okay.
[00:05:51] AI Pulse Survey
[00:05:51] Paul Roetzer: AI Pulse. So if you're new to the show, every week we do a two-question survey that our listeners can participate in. It's just smarterx.ai/pulse. You can go and participate in this week's, and we'll give you this week's questions at the end. But what we do is an informal poll. So again, this is just with our listeners and viewers on YouTube, and they go in and they tell us kind of how they feel.
[00:06:13] So we basically pick two topics based on the things we're gonna talk about in today's episode, and then we do it. So last week, and mind you again, we were on a short week, we were doing this like three days after this came out, but we asked: how concerned are you that AI will disrupt your company's core software tools in the next 12 months?
[00:06:29] 48% said somewhat concerned, it's on our radar but not urgent. 42% said not concerned, our tools will adapt. And 10% said very concerned. So, I don't know. I mean, that's probably about what I would expect, Mike. Yeah, nothing shocking in those findings.
[00:06:44] Mike Kaput: No.
[00:06:44] Paul Roetzer: And then the other one was, has a recent experience with AI made you rethink the value of a skill you've built over your career?
[00:06:50] Okay. Well, this one's pretty relevant today. So 67% of people said yes, it's already doing something I used to do well.
[00:06:59] Mike Kaput: that's wild.
[00:06:59] Paul Roetzer: [00:07:00] Yeah. 27% said somewhat, I can see it coming, but it's not here yet. Okay, so that's 94%, quick math, Mike, if I'm doing it right. Yeah. So 94% of people, when asked, has a recent experience with AI made you rethink the value of a skill you've built over your career,
[00:07:15] 94% of people said yes or somewhat. Wow. Okay. So keep that one in the back of your mind as we get into these main topics today. All right. So as I alluded to upfront, something big happened on Tuesday. It took over X, and we'll start there, Mike.
[00:07:34] Mike Kaput: All right Paul. So a post about AI has now been viewed.
[00:07:39] You know, I had to update this even since this morning. It's now been viewed 72 million times on X, because it has gone mega-viral and has essentially broken the X algorithm.
[00:07:50] Paul Roetzer: Now keep in mind, X only has like 200 and some million users,
[00:07:55] Mike Kaput: so they've
[00:07:56] Paul Roetzer: made sure everyone on X is seeing this thing.
[00:07:58] Something Big Is Happening
[00:07:58] Mike Kaput: So this is an [00:08:00] essay posted on X from Matt Shumer called Something Big Is Happening.
[00:08:04] Matt Shumer is the CEO of OthersideAI. He's a six-year AI startup founder and investor. He published this roughly 5,000-word essay titled Something Big Is Happening, and in it he wrote that he has historically given the people in his life, quote, the polite version, the cocktail party version of what is happening in AI, end quote, because the honest version sounds like I've lost my mind. And this essay is
[00:08:31] his attempt to say, look, I can't do that anymore. We've reached a tipping point and I need to tell you what's going on as I see it in AI. So in the essay, he kind of compares where we're at in AI to the moment in February 2020 when most people had not yet registered that this little-known virus spreading overseas was about to rearrange their lives.
[00:08:52] He wrote that we are in the "this seems overblown" phase of something much, much [00:09:00] bigger, in his words, than COVID. He described the new models that came out from OpenAI and Anthropic on February 5th of this year as really the eye-opening moment and the precursor to what's about to come.
[00:09:14] So he talks about in this essay how these models are so powerful now that he is no longer needed for the technical work of his job. Instead of coding, he now just describes what he wants built in plain English, walks away for hours at a time, and returns to finished work requiring no corrections. He wrote that the latest models display something that feels, quote, for the first time, like judgment, and that the distinction between AI capability and human expertise, quote, is starting not to matter.
[00:09:45] So we'll dive into the specifics here, but he is very much giving a wake up call to people that are not paying attention, that in his view we are about to face some very, very serious disruption. He even offers some steps to potentially [00:10:00] take to navigate that. And overall, Paul, this is just basically like this wake up call to the world from Matt.
[00:10:08] And you know, I did feel like this resonated a bit, with him saying, like, it sounds like I've lost my mind, but I promise you I haven't. We've been talking about some version of this for a while now in our world, like,
[00:10:20] Paul Roetzer: yeah.
[00:10:21] Mike Kaput: Maybe break this down for me. How much of this is hype? How much of this is worth paying attention to?
[00:10:26] Yeah,
[00:10:26] Paul Roetzer: I mean, a lot of the talking points definitely reiterated a lot of the things that we've been saying and trying to drive, like, a sense of urgency around on the show for a couple years. So just contextually, like, we've known Matt, I mean Mike, you and I both have known Matt for years.
[00:10:41] Yeah. They were one of the early players. And if you go back to our book in 2022, I'm pretty sure HyperWrite is one of the companies, I think, featured in, like, what happens when AI can write like humans. Matt at times has had early access to models that, you know, he and I have had conversations about.
[00:10:56] So, like, Matt knows his stuff. He's been on the frontiers of this. [00:11:00] He has been known at times, like other people in this space, to overhype AI advancements. That's not uncommon. That being said, I think most of this post is directionally closer to reality than most people's current understanding of the state of AI.
[00:11:17] So I'm gonna call out a few excerpts here. Mike, I might add a little commentary to them, but what I was thinking of doing is just try and lay this out. It's a really long post. Yeah. So if you gave up reading it partway through and you're like, okay, I get it, I'm with you. Like, the first time, I actually stopped halfway through it. I was like, okay, I get it.
[00:11:33] And then I did go back and kind of reread the whole thing this morning as we were prepping. So I'm just gonna call out a few of the things that I thought jumped out, that maybe the average person who wasn't aware of how quick things were moving, these are the kinds of things that might resonate with them more.
[00:11:50] So, he said, now, I've spent six years building an AI startup, investing in this space. I live in this world, and I'm writing this for people in my [00:12:00] life who don't, my family, my friends, the people I care about who keep asking me. So what's the deal with AI and getting an answer that doesn't do justice to what's actually happening?
[00:12:09] I keep giving them the polite version, the cocktail party version. Because the honest version sounds like I lost my mind as you were saying Mike, and for a while I told myself that was good enough reason to keep what's truly happening to myself. But the gap between what I've been saying and what is actually happening has gotten far too big.
[00:12:26] The people I care about deserve to hear what is coming, even if it sounds crazy. This alluded back to, Mike, like, I remember for years on the podcast I would avoid talking about AGI, and I definitely would avoid putting it on LinkedIn, because I was like, people don't even know what to do with a chatbot.
[00:12:42] Like, for us to start talking about AGI is just... So I totally get what he's saying here about, like, you're just filtering what you're saying based on who you're talking to and what they're prepared for, what they're actually ready to hear. So he said, the future's being shaped by a remarkably small number of people, a few hundred researchers at a [00:13:00] handful of companies: OpenAI, Anthropic, DeepMind, and others.
[00:13:03] A single training run, managed by a small team over a few months, can produce an AI system that shifts the entire trajectory of the technology. Most of us who work in AI are building on top of foundations we didn't lay. We're watching this unfold the same as you. We just happen to be close enough to feel the ground shake first.
[00:13:20] I definitely feel that. But it's time now, not in an "eventually we should talk about this" way, in a "this is happening right now and I need you to understand" way. Here's the thing nobody outside of tech quite understands yet: the reason so many people in the industry are sounding the alarm right now is because this already happened to us.
[00:13:39] We're not making predictions. We're telling you what already occurred in our jobs and warning you that you are next. This is definitely what we've been saying on the podcast a lot lately. Like, we are living in a parallel universe to most people, and the people who listen to this podcast are living in that parallel universe.
[00:13:55] Like, you're seeing it, right, but your friends and coworkers aren't seeing it. So [00:14:00] I totally get what he's saying here. He said, I'm no longer needed for the actual technical work. As you said, Mike, he just kind of gives it plain English and it does the job better than he would do it, over long horizons. He said, I've always been early to adopt AI tools, but the last few months have shocked me.
[00:14:14] These new AI models aren't incremental improvements. This is a different thing entirely. That echoed back to, you know, the episode we did at the first of the year, Mike, right after the holidays, where we were like, something changed, man. Like, Claude Code just took off, like everybody was looking at it differently.
[00:14:28] So he said, my job started changing before yours, not because they were targeting software engineers. It was just a side effect of where they chose to aim first. They've now done it and they're moving on to everything else. Part of the problem is that most people are using the free version of AI tools, which is true. The free version is over a year behind what paying users have access to.
[00:14:47] Judging AI based on free-tier ChatGPT is like evaluating the state of smartphones by using a flip phone. The people paying for the best tools and actually using them daily for real work know what's coming. I don't know about you, Mike, but I'm always shocked by [00:15:00] the number of business users I talk to where I hear, oh yeah, ChatGPT, like, yeah, the free version.
[00:15:04] I'm like, the free version?
[00:15:05] Mike Kaput: Unreal.
[00:15:07] Paul Roetzer: You have no idea what's going on. He then said, this is different from every previous wave of automation, and I need you to understand why. AI isn't replacing one specific skill. It's a general substitute for cognitive work. It gets better at everything simultaneously.
[00:15:23] I think the honest answer is that nothing that can be done on a computer is safe in the medium term. We'll talk about this with the Anthropic topic next. If your job happens on a screen, then AI is coming for significant parts of it. The timeline isn't someday, it already started. Now, this is the part where I thought, and I'll explain it in a moment,
[00:15:39] I think he's not accounting for human friction here, but again, you know, you gotta have the context that he's in the tech world. He said, I'm not writing this to make you feel helpless. I'm writing this because I think the single biggest advantage you have right now is simply being early. Early to understand, early to use, early to adapt.
[00:15:53] I agree a hundred percent with that. I know the next two to five years are going to be disorienting in ways most people aren't prepared for. This is already happening [00:16:00] in my world; it's coming to yours. I agree a hundred percent with that. Hmm. I know the people who will come out of this best are the ones who start engaging now.
[00:16:06] Not in fear, but with curiosity. So, a few personal thoughts. I was just kind of jotting notes down this morning, so this is more of a stream of consciousness that I'm just gonna share with everybody. So, as I mentioned, for years I've held back on the full story with my family and friends.
[00:16:20] I do it to this day. Go out for a drink with buddies, you know, playing basketball on Thursday nights, what's going on, what's happening in the AI world, and you're just filtering. You're just like, hey, you know, things are moving pretty quick, like this new model's pretty good. And I always then say, well, how are you using it?
[00:16:35] And so I always actually try and figure out how to talk to them about it based on what their understanding of it is, what they're doing with it, and then I can, like, change the tone of the conversation. It all seems so abstract and hard to believe, and I think that's the key. So we talk openly and honestly on this podcast about the impact on jobs, the economy, educational systems, government, society, and humanity.
[00:16:57] Because you, the listeners, are choosing to learn and [00:17:00] understand. Like, you want this information. You're choosing to advance yourself, become a change agent, and hopefully have a positive influence on the responsible adoption of AI in your companies and communities. So you're seeking information, in many cases, in spite of your own fears, anxieties, and uncertainties.
[00:17:17] The vast majority of society is not there yet. They are free ChatGPT users, if that, or they have Copilot licenses that are neutered beyond belief compared to what they're actually able to do. So for some people who don't choose to seek this information, it might be because it's too abstract. Maybe they find it threatening to their way of life, their way of work, or their concept of where technology ends and humanity begins.
[00:17:42] Maybe they fear the environmental or societal impacts or the issues around intellectual property and copyright, or maybe because they're too busy with their regular responsibilities and they don't feel a sense of urgency to understand and use AI to its full potential. I see that all the time. Like, I get it.
[00:17:59] I understand it's really [00:18:00] important, but I'm gonna get to it, like, next quarter, it's gonna be a priority. So, as we've talked about in recent episodes, something has definitely changed. The models are getting smarter, faster. They are performing more tasks across more industries and roles with greater levels of autonomy and reliability.
[00:18:16] Maybe not to the level Matt's claiming within his own work. Mm-hmm. But they are definitely getting better. So when GPT-4 came out in spring 2023, it held the title of top model for nearly two years. So if you were a listener back then, you remember we would talk a lot, like, hey, I wonder if OpenAI's just got something the other labs don't. Like, no one can seem to catch up.
[00:18:37] And for that two-year run, that was the debate: is OpenAI just different than everybody else? And then things changed and all of a sudden they weren't the state of the art, and Gemini sort of took that throne, and then Claude gets into the conversation, and xAI shows up in 2023 and starts spending billions and tens of billions of dollars.
[00:18:55] And, like, all of a sudden they're building viable models. And so then you start realizing that [00:19:00] rather than these 18-to-24-month runs where we had a state-of-the-art model that was stable as the state of the art, now every three to four months you basically have a new model. Hmm. So those closest to the tech are seeing and feeling it and more regularly voicing their hopes and their concerns.
[00:19:16] Government officials on both sides of the aisle are now getting involved. The economy is becoming increasingly dependent on AI investments and AI powered growth. And organizations are starting to realize the full power and potential of AI as it exists today. Not even like looking on the frontiers of where it goes, which is what I'm always trying to encourage people to do.
[00:19:37] So as I said earlier, some of us, like you and me, Mike, like people who listen to the show, are living in this parallel universe in which AI is a collaborator and a coworker. It's infused into our workflows. It's driving massive gains in efficiency and productivity, and it's accelerating innovation and growth.
[00:19:53] And we can't figure out like, how are other people not seeing this? So no matter what you read in the media headlines, that is not the [00:20:00] norm. The truth is very far from it. Like most organizations are not doing this. The vast majority of workers and leaders have no idea what the full capabilities of today's models are.
[00:20:09] They still think of and use ChatGPT and other gen AI platforms as answer engines and writing assistants. Mm-hmm. They use it to help with emails, maybe summarize meetings, ask questions, brainstorm ideas. They have no idea how to conduct deep research projects, how to build GPTs and Gems, how to leverage tools like NotebookLM to their full extent. Like, shit, our team's been working on NotebookLM for months, and we're still uncovering all these capabilities every week.
[00:20:35] Like, we find new things we can do with it. They don't know how to craft prompts for images and videos. They've never explored agent mode. They don't know what computer use capabilities are. They've never created an AI avatar, wouldn't even know where to go to do one. They don't know vibe coding an app is a thing for non-technical people, that you can just use a series of prompts and build an app to do things.
[00:20:55] So I liked this analogy of the flip phone, but to me, the thing I was thinking about was, [00:21:00] this is like giving someone the internet back in 2000 and the only thing they knew to use it for was sending and receiving emails. Like, they're blissfully unaware of how they were living in the early days of search and e-commerce and social media being completely invented and redefining business and society.
[00:21:18] Like they're just totally unaware it's happening. So overall, I think you should read the post. Everything in it is true to some degree, but we definitely will not experience what he's saying on the same timelines. For most organizations, they are still just getting started on their road to like true AI adoption and transformation.
[00:21:38] There is enormous resistance to change that is not going away. This is human nature, the friction that is gonna slow adoption down. So some organizations are gonna have the will and maybe the mandate to force change quickly through significant staff turnover. They're basically just gonna say, okay, these 30% of people aren't on board, get rid of them.
[00:21:58] A lot of organizations are gonna [00:22:00] take a more strategic, more methodical, and more human-centered approach, which means it's gonna go slower. So I'll end this, Mike, how I end a lot of my talks, and then I'll see if you have any additional thoughts. So here is what I will often say at the end of my keynotes: the future is unknown.
[00:22:16] The models are getting smarter, faster. Your greatest chance to thrive through the disruption and uncertainty ahead is to become AI forward. Now, we define AI forward as someone who embraces AI even though you have fears and anxieties about it, adheres to responsible AI principles, and applies it every day, every chance you get, to accelerate efficiency, productivity, creativity, innovation, and performance.
[00:22:40] The future of all work, and this is what we think about when we hire, it's what we think about when I advise other companies on how to think about their staff, comes down to two fundamental things: you have to be able to work with the AI, and you have to know what questions to ask of it and what to do with the answers. Then you have to know how to talk to it, collaborate with it, and learn from it.
[00:22:57] This is not a static thing. This is a [00:23:00] dynamic intelligence that you can work with. So the professionals who understand, embrace, and apply it in their jobs are going to have superpowers. They have superpowers right now. Mm-hmm. Maybe not to the degree Matt is, you know, describing in his own life, but you absolutely have superpowers.
[00:23:13] You're gonna be able to outperform your peers. You're going to at least 10x your efficiency and productivity. You're gonna be more creative, more innovative. You're gonna become a catalyst for growth in the organization, and you're absolutely gonna have the highest value and earning power and job stability.
[00:23:27] So that to me is the main message here. And the fact that 72 million people, now they're all on X, but I imagine this is gonna roll over into the mainstream media by the time you listen to this on Tuesday, February 17th. I can't fathom that Matt isn't doing interviews on, like, CNBC and stuff like that, right?
[00:23:44] Like, this has sort of crossed the chasm to where now it's just gonna become a conversation. And that is great. If the outcome of this is that more people become aware that things are moving way faster than they know about, then this essay did its [00:24:00] job. And as annoying as it is that the X algorithm showed it to me a million times,
[00:24:04] If that's what comes out of it, great. 'cause that is what we've been trying to do for five years on this podcast.
[00:24:09] Mike Kaput: Yeah, I couldn't agree more. I think what really struck me about this, Paul, is what you mentioned, which is how much it kind of confirmed what we already knew, and how much of a parallel universe we're in.
[00:24:22] Because I read the reactions to this post from some people on X, and they're all like, oh my gosh, he revealed everything. And I'm like, we've been talking about this for years; it seemed very mundane to me. But good for Matt for writing it. This is awesome. I was just like, oh yeah, of course.
[00:24:37] Like, that's roughly what I'm seeing, I think. I think what also struck me was his steps for what to do about it. Because, I confess, I actually have a kind of evolving document myself of, like, what is the break-glass-in-case-of-emergency plan for when AI hits, or AGI hits rather. Because it's a very real thing that I think eventually we're gonna have to deal with, some pretty serious disruption, and
[00:24:59] I was [00:25:00] just kind of nodding along as he said, here's what you should do about it, because all these kinds of things are in my personal plan. And one jumped out to me that I'll note and then we'll move on. But I do love the really practical advice he gave of get your financial house in order.
[00:25:16] He's like, I'm not trying to scare you, but if you believe even partially that the next few years could bring real disruption to your industry, then basic financial resilience matters more than it did a year ago. This is where I arrived at in my own plan. I was like, oh, step one, gotta increase my own personal runway and reduce my own personal burn rate.
[00:25:33] Because you don't know what's coming. You don't know how it's gonna affect you. So what you need is optionality and time if the worst case scenarios happen. So I realize everyone's in a different position, but I would just encourage folks, especially the ones who are in this nice little parallel universe ahead of everyone.
[00:25:50] To start thinking about these practical steps about how you can position yourself for maximum optionality moving forward.
[00:25:56] Paul Roetzer: Yeah. and you know, that starts to spin into like the crazy stuff [00:26:00] like Elon Musk thinking, money won't matter. Everybody's gonna have universal high income and
[00:26:04] Mike Kaput: Yeah. Yeah.
[00:26:05] Paul Roetzer: So I would not, I would not wait around for that.
[00:26:07] I, yeah, yeah. You know, I do think that, again, it's a really good essay to read. If you've been listening to this podcast for a while, there's nothing in there that you haven't been hearing for the last year. It took advantage of timing. It just hit the perfect time, because, like, three weeks ago X changed the algorithm. They want more people putting their articles online.
[00:26:26] I actually assume it's to train Grok. Like, they want people publishing so they can take the IP and train the next version of Grok. So they are purposely featuring articles that are written on X in the algorithm, which Matt knew, I'm sure, and took advantage of. He puts it out there, and then, you know, never in his wildest dreams would he have imagined it would get to 70-plus million views.
[00:26:50] But I think, with everything else going on in the world and the, you know, increasing awareness around AI and the risks and the concerns, it just was the right essay at the right [00:27:00] time. And it, you know, might end up being a bit of a tipping point that gets the conversation going, which is what we needed.
[00:27:06] Claude Safety Risks
[00:27:06] Mike Kaput: All right, so next up we have a new safety report from Anthropic that is drawing attention in AI circles because it starts to reveal some interesting things about the behaviors of their most capable model. So this is called a sabotage risk report, and they did it for Claude Opus 4.6, their most powerful recent model.
[00:27:27] And this document is something they have committed to producing now for all future frontier models as they move forward. What they do is this kind of internal evaluation on Opus 4.6, and they found that it is, quote, significantly stronger than prior models at subtly completing suspicious side tasks in the course of normal workflows without attracting attention.
[00:27:53] They also found that the model provided limited assistance when they pushed it towards contributing to [00:28:00] chemical weapons development and then changed its behavior when it detected it was being evaluated. So basically they're doing these kind of risk assessments to see what Claude Opus 4.6 is capable of, and it sounds like it is capable of certain types of sabotage, of sandbagging, of deception and more.
[00:28:21] So Anthropic, however, concluded that the overall risk of this model is very low, but not negligible, and that the model does not appear to possess dangerous misaligned goals. But instead, I think they had argued that these behaviors kind of happened, you know, through innocent intentions of trying to be helpful and trying to do what it was tasked with, not as some kind of evil master plan here.
[00:28:46] But Paul, the fact we're even talking about this at all is extremely sci-fi. I think, you know, I couldn't help but when I was reading through this, like the report says this model is not misaligned, so it doesn't have this like secret [00:29:00] master plan, but I couldn't help thinking like, it keeps exhibiting deception, sabotage, unauthorized actions.
[00:29:06] It changes its behavior when it knows it's being evaluated. Like if it does all those things not on purpose, like does this distinction actually matter? How do we know we're actually able to evaluate this in the right way?
[00:29:17] Paul Roetzer: They took an informal poll of 16 employees and they decided it wasn't, I'm not even joking.
[00:29:25] Okay. So a little context here, I think, is really important for people who aren't familiar with the backstory. So, going back to the roots: Anthropic was formed when roughly 10% of OpenAI's staff left, including Dario Amodei and his sister. They leave OpenAI in 2021. They form Anthropic. They claim their main focus is safety.
[00:29:46] Some internal messaging at the time suggested it was really more about, like, this is the opportunity to go build a massive company. But with those origins, they definitely have had more of a safety slant to the company. And so since [00:30:00] 2023 in particular, they have been pretty aggressive about being more conscious of the safety behind the models and the alignment of the models,
[00:30:09] doing things like mechanistic interpretability, where they're trying to understand how the models think, stuff like that. So in September 2023, they published v1 of what is called the Responsible Scaling Policy. That post, and we'll put the links to all these things I'm about to mention in the show notes,
[00:30:24] said: as AI models become more capable, Anthropic believes they will create major economic and social value, but will also present increasingly severe risks. With this document, we are making a public commitment to a concrete framework for managing these risks, one that will evolve over time, but that seeks to establish clear expectations and accountability in its initial form.
[00:30:46] So they define these ASLs, the AI Safety Levels. Smaller models were level one; present large models were level two. So fall 2023, we were at level two. Level three is significantly higher risk. And level four, they [00:31:00] called speculative. In September 2023, so this is just, you know, two years ago, they said: for each ASL, the framework considers two broad classes of risks. Deployment risks, which are risks that arise from active use of powerful AI models, and containment risks, which are risks that arise from merely possessing a powerful AI model.
[00:31:21] At that time, they chose not to even define ASL-4. So two years ago, they didn't even really know how to put a definition to ASL-4. They said it is an iterative commitment: we commit to define ASL-4 evaluations before we first train ASL-3 models. So basically, we were at level two. September 2023, we're at level two.
[00:31:44] Once we think we're at level three, we will define level four, is pretty much how they put it. Yeah. So for early thoughts on ASL-4, they said it is too early to define the capabilities, containment measures, or deployment measures with any confidence, since they will likely change based on practical experience with level-two and [00:32:00] level-three models.
[00:32:01] But they look at critical catastrophic misuse risk, autonomous replication in the real world, and autonomous AI research. Now, that last one's important to, you know, put in the back of your mind for a second. So on that one, it says a model for which the weights would be a massive boost to a malicious AI development program,
[00:32:17] that would get them to ASL-4. So in short, an ASL-4 system is more capable than the best humans in some key areas of concern, while still not being so across the board, and lacking some features needed to survive in the world in the long term and in the face of concentrated human resistance. Hmm. So again, we talked about this stuff when it first came out; this was real sci-fi stuff back then.
[00:32:38] Yeah. So then in November 2024, so fast forward about 13 months, they published a blog post called Three Sketches of ASL-4 Safety Case Components. So again, November 2024, just over a year ago, they said Anthropic has not yet defined ASL-4, but [00:33:00] has committed to do so by the time a model triggers ASL-3.
[00:33:03] However, the appendix to our RSP, the Responsible Scaling Policy, speculates about three criteria that could be used. This, again, goes into the autonomous research, the catastrophic misuse, and the capability of autonomous replication. Then they end it: all of these criteria suggest a high degree of agency and complex planning, meaning the model has agency and complex planning capabilities.
[00:33:24] For models with such agentic capabilities, one also needs to address the possibility that they would intentionally try to undermine the evaluations or procedures used to ensure safety. Following recent work, we grouped such concerns into the category of sabotage. So that gives us the origin of the thing we're gonna talk about.
[00:33:47] Then in May of 2025, so less than a year ago, they achieved level three. They said they have activated the ASL-3 deployment and security standards described in the [00:34:00] Responsible Scaling Policy in conjunction with the launch of Claude Opus 4. So when Claude Opus 4 comes out in May of 2025, we now have them saying, we're there.
[00:34:11] So they're deploying it with the ASL-3 measures as a precautionary and provisional action. To be clear, they say, we have not determined whether Opus 4 has definitively passed the capabilities threshold that requires ASL-3 protections; rather, they're taking precautions by doing this.
[00:34:31] They then simultaneously released version 2.2 of the Responsible Scaling Policy in May of 2025, and then they released Claude 4.5 in November 2025. That is the moment we talked about on the podcast at the beginning of this year, when something fundamentally changed. They then updated their policy on February 10th, 2026.
[00:34:52] So this is this week. This is why we're now talking about this. So they put a blog post up that explained the updates. They said the RSP [00:35:00] requires that once models cross the AI R&D-4 capability threshold, so that's that autonomous thing, we develop an affirmative case identifying the most immediate and relevant misalignment risks from models pursuing misaligned goals and explaining how we mitigated them.
[00:35:15] Our determination is that Claude Opus 4.6 does not cross this threshold. So they're saying the new 4.6 that they just released isn't there, but then they get into the uncertainty around it. They say: however, as we noted in the 4.5 system card and the 4.6 system card, which is, like, the technical documentation that comes with these models, confidently ruling out this threshold is becoming increasingly difficult,
[00:35:42] and doing so requires assessments that are more subjective than we would like. That's very important context. Rather than rely solely on those assessments, we committed during the launch of 4.5 to writing sabotage risk reports for all future frontier models clearly exceeding 4.5's [00:36:00] capabilities. So every model moving forward. Today, consistent with that commitment, we are publishing the external-facing version of the sabotage risk report. Meaning they have a version that lives internally that has more details, which brings us to the sabotage risk report that you led off with, Mike. So they say this risk report argues that Claude Opus 4.6 does not present a significant risk of autonomous actions that contribute significantly to later catastrophic outcomes, which we refer to as sabotage risk.
[00:36:29] We limit our scope here to threats caused largely by model actions, and we do not address threat models where intentional, harmful actions by humans play a central role, meaning the model itself won't do these things.
[00:36:43] Paul Roetzer: We are not telling you that a human with the model couldn't do these things, is in essence what they're saying.
[00:36:49] Then it does go in, and this is a great one to, like, throw into NotebookLM and say, hey, tell me about the key points here, explain it to me, whatever. But here's a few of the things. You mentioned a couple of them, like sandbagging. So one [00:37:00] of the most counterintuitive risks discussed is sandbagging, the idea that the AI might deliberately perform poorly on a task to hide its true capabilities or to avoid being disempowered.
[00:37:11] Paul Roetzer: In other words, it knows what you're doing and it purposely screws something up so you don't think it's as powerful as it actually is. That's pretty effed up. Okay. So then they actually go in and explain it. Then they have steganographic reasoning. The report explores steganography, where a model might hide secret reasoning within its visible output.
[00:37:35] Mm-hmm. So it uses filler tokens, like a simple sequence like 1, 2, 3, to secretly process complex thoughts, or reasons without the human seeing what it's doing. Self-exfiltration: the report analyzes the idea of a self-exfiltration pathway. This is the classic sci-fi trope of AI escaping the lab to go live on the internet.
[00:37:57] Now, keep in mind, these are all things [00:38:00] they think it might have the ability to do, so they're testing for it. This is not like, hey, five years from now, my dude. They think it could have it right now. Collusion and bribery: we've seen early actual examples of this. The report considers whether different AI models might work together against humans.
[00:38:16] AI sleeper agents: the report discusses the risk of poisoning, where an AI works fine 99% of the time but has a trigger phrase that turns it bad, and then it goes and exfiltrates information from your computer. Hmm. Decision sabotage in governments: so if it knows it's being used for government purposes. Which, I just saw something yesterday that said the DOD, Department of Defense, or War, whatever they call it now,
[00:38:38] is thinking about infusing chatbots into government systems.
[00:38:42] Mike Kaput: Yep.
[00:38:43] Paul Roetzer: This is a real deal. So the risk is basically that another government could sabotage a chat agent and get it to do things, or lead other governments down different paths, in essence gaslight governments. Hmm. And then the ASL-4 threshold, and this is where the really important part happens.
[00:38:59] So the [00:39:00] report mentions the ASL-4 level for safety around autonomy. This is the one we talked about. It says this is the ability to fully automate the work of an entry-level, remote-only researcher at Anthropic. So the basic premise is, if we have created, in essence, an AI researcher that doesn't need a human, that thing can go do all kinds of crazy stuff.
[00:39:17] The important thing to know here is every lab has this as a north star right now. They're all trying to create this thing. Hmm. So here's what it says: for AI R&D capabilities, we found that Claude Opus 4.6 has saturated most of our automated evaluations, meaning they no longer provide useful evidence for ruling out this level of autonomy.
[00:39:38] In other words, we can't test it in an effective way. It says: we report them for completeness and we will likely discontinue them going forward. We're giving up on trying to actually do this. Our determination, and this is the part that I mentioned earlier, this is the crazy and terrifying part.
[00:39:55] Our determination of whether or not they're at ASL-4 [00:40:00] rests primarily on an internal survey of Anthropic staff, in which zero of 16 participants believed the model could be made into a drop-in replacement for an entry-level researcher with scaffolding and tooling improvements within three months.
[00:40:14] So they're saying, okay, we think we're safe for three months.
[00:40:17] Mike Kaput: Yeah.
[00:40:17] Paul Roetzer: However, those same 16 people reported productivity uplift estimates ranging from 30% to 700% with a mean of 152% and a median of 100%. So in using the tools themselves as AI researchers, some people reported a 700% increase in their productivity.
[00:40:41] Oh my God. And yet they still didn't think it was at ASL-4 in terms of automation. Go back to Matt's post about, I go away and I come back four hours later and the work is done.
[00:40:50] Mike Kaput: Hmm.
[00:40:50] Paul Roetzer: So it said staff identified persistent gaps in two key competencies: self-managing week-long tasks, so days is fine, like it can do days of [00:41:00] work, but it's not at weeks,
[00:41:00] so we're still not there, and dealing with typical ambiguity and understanding organizational priorities when making trade-offs. On one evaluation, kernel optimization, Opus 4.6 achieved a 427x speedup using a novel scaffold, far exceeding the 300x threshold for 40 human expert hours of work.
[00:41:27] All I'm saying, and the reason we're talking about something like this that is so technical, is you have to understand how advanced what these labs are doing is now. Again, most of what they're doing is applying this to coding, to AI research. It is not being a lawyer or doing HR work or doing marketing or being a CEO, but they are making advancements that are literally impossible for the human mind to comprehend.
[00:41:59] We cannot think in [00:42:00] exponentials. We think in linear terms. These numbers mean nothing. It's like us saying OpenAI is gonna raise $1.4 trillion. It's like, oh, that's cute, that sounds like a lot of money. No, that's a shit ton of money. Like, that's what this is like. It is so beyond the ability to understand. This is where I go back to say, like,
[00:42:16] if you brought this up to friends or family, like, you're just sitting around like, hey, let me tell you about this report I was reading, I heard about it on this podcast, their brains would literally explode. Right? This is not stuff you can just have a conversation with people about. So the point of this, and why, again, we wanted to just do this episode, we didn't wanna wait:
[00:42:36] people have to understand how fast things are moving in these labs, and the thresholds that they're providing. Like, Anthropic saying, we don't think it can replace an AI researcher in the next three months. Mm-hmm. Okay. What about June of 2026? Like,
[00:42:56] Mike Kaput: right.
[00:42:57] Paul Roetzer: So the whole [00:43:00] thing is just wild.
[00:43:01] But I also want people to take away from this how little these labs know about how the things they're creating work. So whenever I say this on stage, like, hey, they don't really know how the language models work, I think some people think I'm making that up. Go read this. Like, they have no idea what these things are capable of, what emergent capabilities are gonna come out when they train a more powerful thing.
[00:43:28] And then when they do find out, they probably bury a lot of it. Yeah. Or they nerf it out of the system and say, like, oh shit, we gotta get rid of that capability, we can't put that into the world, we'll blow past our ASL-4 ranking. So it is, again, living in a parallel universe. Like, if you understand this stuff, you know what's going on in these labs.
[00:43:48] Like, you are living years into the future of where most people would ever be and probably will never get to. Because it's like that movie, what's it called? Don't Look Up. Yeah,
[00:43:59] Mike Kaput: [00:44:00] yeah.
[00:44:00] Paul Roetzer: Like it is literally like that right now. My gosh. Like,
[00:44:02] Mike Kaput: yeah,
[00:44:02] Paul Roetzer: yeah. Like you're in the know, you know the asteroids coming, like Yeah.
[00:44:05] And I'm not saying this is an asteroid and, like, it's gonna destroy humanity. I'm saying, like, the concept that there are people who scientifically, based on fact, know the world has fundamentally changed, and everybody else is just going about their business, thinking their job is safe and they're gonna keep doing what they've done for 20 years and everything's gonna work out great.
[00:44:25] And software stocks are just gonna keep going up. And increasingly, every day, I honestly just feel like we're the crazy ones. Yeah. It's just, like, I sometimes have a hard time believing myself how much things are changing.
[00:44:44] Mike Kaput: Yeah.
[00:44:44] Paul Roetzer: And how unaware most of the world is to that.
[00:44:47] Mike Kaput: Yeah. It's absolutely a wild experience to be thinking you're crazy all the time, but seeing this just clear as day. And you know, this is also just terrifying to read, because we know [00:45:00] how humans deceive themselves and others. Like, no shit these researchers are saying, of course it can't do their job yet.
[00:45:07] Like, I get that. And also, are you gonna tell me that, Mike,
[00:45:10] Paul Roetzer: I'm gonna take a survey internally, whether or not we need content creators.
[00:45:14] Mike Kaput: Right,
[00:45:14] Paul Roetzer: Mike's gonna be like, yeah, I'm out, man. You don't need me anymore. Just replace me. No. Like master the people are gonna be replaced.
[00:45:21] Mike Kaput: Yeah, exactly. And on top of that, too,
[00:45:25] I don't wanna be a conspiracy theorist, but the moment one of these labs says we have ASL-4 and it's this dangerous is the moment, like, the government starts knocking on your door trying to nationalize you. Like, this is, if not nuclear technology, something close to it. This is incredible.
[00:45:44] Paul Roetzer: So yeah, they've already knocked, and Anthropic has not opened the door yet.
[00:45:47] They're like the only one that hasn't opened the door yet.
[00:45:49] Mike Kaput: Right.
[00:45:50] Paul Roetzer: But I mean, Anthropic is closing a $20 billion round this week, right? You can't close a $20 billion round and plan for an IPO this fall and [00:46:00] tell people we might have to shut down training in June. No matter how safety-focused you are, no matter how much you're focused on, like, alignment, you're done.
[00:46:10] Like, the second you admit we have to stop training, you're cooked. So you're basically just buying yourself time to fine-tune these models in post-training so they're safe enough to put out into the world.
[00:46:20] Mike Kaput: Hmm.
[00:46:21] Paul Roetzer: But the models they have internally, trust me, they are already too dangerous to put into the world.
[00:46:25] That's why they have to do all this stuff before they release them.
[00:46:30] Mike Kaput: Alright, so let's move on to some hopefully more positive fare here.
[00:46:37] Academy Success Score
[00:46:37] Mike Kaput: For our third big topic this week, we are going to do what's become more of a recurring segment, where we talk about AI in action, which is specifically how we are using AI in our own business to achieve the kinds of results we've been talking about.
[00:46:50] So Paul, I know you've been doing a lot of work behind the scenes on our AI Academy, and also working with AI to [00:47:00] determine what you call kind of a success score
[00:47:02] Mike Kaput: That is integral to the future of AI Academy. Do you wanna maybe tee this up for us? Let us know what you've been working on here and how AI plays a role.
[00:47:10] Paul Roetzer: Yeah. So I thought this would be a cool one to share. This is pretty real time; I just did this last week. But we always talk about the importance of using these tools as strategic thought partners, as, like, experts in things maybe you're not an expert in, but where you have enough domain knowledge to know if the output is good and how to work with the output.
[00:47:29] So again, this is pretty real time. I actually just had a meeting with the team yesterday where I went through this, so, I mean, literally real-time stuff. And sometimes I worry I'm just sharing too much, but I dunno, I feel like it's just for the good of everyone to hear these things.
[00:47:45] So we'll just do it. Okay, so basic premise: our AI Academy. So we launched AI Academy in 2020. Online education, very basic at that point. A couple of courses, a couple of certificate series. And it was predominantly for [00:48:00] individual users. So then we sort of evolved the company. I shifted our focus in fall of 2024 to, like, build out a scalable version of this for enterprises, basically for businesses, so that they could educate their teams.
[00:48:14] So we did a soft rollout of what we call business accounts in summer of 2025. It was in August, and then we officially rolled out with a new AI-powered learning management system in November 2025. So since then, I mean, we're about four months in, maybe. We've brought on more than 150 companies, 150 business accounts that buy licenses for their employees to learn AI.
[00:48:37] So our goal is to build out a world-class customer success team, but what we've realized is it needs to be staffed more like a consulting firm with expertise in business strategy and change management. So for us, AI Academy isn't about selling courses. Like I've said this before in the past, you can get amazing courses at LinkedIn and Coursera and Academy, and directly from OpenAI and Google, like everybody's got AI courses and a lot of 'em [00:49:00] are amazing.
[00:49:00] You can go direct to other, you know, AI thought leaders and get stuff. So we are not trying to just sell courses. For us, we're trying to provide an AI education system that delivers personalized learning journeys based on departments, roles, business types, industries, and more importantly by meeting individual learners where they are in their understanding and competency with AI.
[00:49:23] So I shared this idea a week or two ago on the podcast, but let's take an example enterprise that wants like a hundred licenses for their marketing teams. So someone comes and says, hey, we want to upskill our marketing team. Let's go. We're ready to buy a hundred licenses. My directive to our sales team is: do not sell them those licenses until we know who their point person is, until we have a plan to make sure they're going to actually use those licenses.
[00:49:47] I don't want this to be like, go buy a hundred Copilot licenses, give it to people, and nobody uses it. So just to set the frame, let's assume a hundred employees are all gonna have access to a gen AI platform. So let's say this company provides [00:50:00] Copilot, Gemini, Claude, ChatGPT to their team. And they're now gonna provide AI Academy to them.
[00:50:06] So we'll assume 25% of that hundred are all in. They're daily active users of gen AI. They can't live without it. They would be like AI champions, the power users. Then let's say there's 25% that are curious. They experiment with AI. They are not power users and haven't figured out how to integrate it into their daily workflows.
[00:50:24] If you ask them if they're seeing an ROI, they would say no, or I'm not sure, I don't even know how you'd measure it. 25% use it passively when it's baked into their work. They might not even know they're doing it, but like, say, email suggestions and meeting summaries, things like that. And then 25% hate it.
[00:50:41] They want nothing to do with it. They don't use Copilot. They don't want the AI training, like, nothing. Well, that's 25% of a hundred. That's a big, big waste. So up until now, like I said, we've focused on these individuals, and those individuals were choosing to buy licenses. So they are the AI-forward professionals and leaders who are seeking out training.
[00:50:59] They [00:51:00] want to be the change agents. They're early adopters, innovators. In the business account environment, AI education is a requirement, not a choice. So that changes the dynamic of what success looks like. So we need, as SmarterX, to think about a success score that monitors health of accounts and then helps the admin, like our client contact, manage adoption, engagement, and transformation success, whatever that looks like to them.
[00:51:25] And then for us, we have to be able to predict expansion, churn, and renewal. So I mean, literally, like 5:30 this morning, I'm like just making these notes. So I was like, okay, what's the quickest way I can summarize this? Now I'll start on the how-I-used-AI part. So I go to ProblemsGPT, we'll drop the note in, and if you've never used it, it's a free custom GPT that I built.
[00:51:42] and so I took that kind of outline. I just kind of went through and I dropped it in and I said, can you help me turn this into a clear problem statement? And then I pasted that narrative in. So this is step one in the AI process. Here's what it wrote. Problem: AI Academy has successfully onboarded 150-plus business [00:52:00] accounts since the official launch of
[00:52:02] business accounts in November 2025, but we do not yet have a defined success score or operating model to measure, manage, and predict enterprise adoption across highly varied AI maturity levels, which it called AI champions, curious users, passive users, and AI-resistant employees. That was pretty good.
[00:52:18] Without a structured success framework, we risk low engagement, stalled transformation, unclear ROI, and an inability to accurately predict renewals, churn, and expansion. Better problem statement than I could have written. That would've taken me an hour to summarize it that way. And then I've trained ProblemsGPT to associate a value statement.
[00:52:36] And again, it's just like, it's kind of guessing, but it gives you, so it said: Value: if even 20% of our 150 business accounts, 30 companies, fail to renew due to poor adoption or unclear ROI, and assuming an average contract value of $25,000 per year, that's a made-up number, I didn't give it that, that represents $750,000 in at-risk ARR, not including expansion revenue.[00:53:00]
[00:53:00] Conversely, increasing adoption and measurable ROI could unlock significant expansion revenue across departments and drive multimillion-dollar enterprise growth. Okay, lesson one. If you are thinking about using AI in an innovation way, in a way that is additive to the organization, not just about efficiency and productivity, having problem statements and value statements is one of the best ways to do it.
[00:53:24] Hmm. Identify things you are trying to solve and then use AI to help you solve them. So I'm kind of backing into this where I'm like defining this problem now for you all, but in my mind, I knew the problem we were living through. Okay, so that's the context of what we did. So last week, which when you're listening to this would've been two weeks ago, I'm on a trip for a talk, and I have basically one evening to myself, and then like four hours the next morning before I have to catch my flight, and I decide,
[00:53:53] Maybe I can get the success score built. I don't know what, what it is. I've got a couple notes, but like, maybe I can do this in this like two day window. [00:54:00] Basically it's like 18 hour window that I've got. So having built lead scores before, so this comes into the domain expertise. Like I have done things like this, I have manually created lead scoring systems for my agency for clients when I owned my agency.
[00:54:14] And I have done this for years for SmarterX. So I knew the general workflow I would need to go through to identify variables and then the weights of those variables to create a score. And I knew roughly how we would build it in HubSpot, which is, you know our CRM. But that process takes dozens of hours.
[00:54:32] it is a very manual data-driven process. So I'm like, alright, let's see what ChatGPT, and Gemini can do. Now, when I'm working on a high value problem, I like to use multiple models. I'll give the same prompt to both of 'em. I'll kind of like iterate, iterate, and I'm like, okay, Gemini's, just better at this one.
[00:54:47] Let's go. And then I'll like focus in. So here's the exact prompt I used. And keep in mind, like, I intentionally kept this pretty basic. So I said, I wanna build a success score that our customer success team can use to monitor Academy business [00:55:00] account adoption, predict renewals, expansions, churn, and prioritize engagement.
[00:55:04] I envision a simple model to start that we would build in HubSpot based on factors such as weekly active users, percentage of member first logins, certificates earned, courses completed, and percentage of members who have completed a course series and earned a certificate. And then I said, what variables do you think we should include in v1 of the success score?
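For illustration, here is a minimal sketch of what a v1 weighted success score built from factors like these might look like. The variable names, weights, and example values below are hypothetical assumptions for this sketch, not the actual model or HubSpot properties discussed in the episode.

```python
# Hypothetical v1 success score sketch. Variable names, weights, and example
# values are illustrative assumptions, not the actual model from the episode.

# Weights sum to 1.0 so that 0-100 inputs produce a 0-100 account score.
WEIGHTS = {
    "weekly_active_pct": 0.30,      # % of licensed members active each week
    "first_login_pct": 0.20,        # % of members who have logged in at least once
    "courses_completed_pct": 0.20,  # % of assigned courses completed
    "certificate_pct": 0.30,        # % of members with a completed series and certificate
}

def success_score(metrics: dict[str, float]) -> float:
    """Blend 0-100 adoption metrics into a single 0-100 account score."""
    return round(sum(WEIGHTS[key] * metrics[key] for key in WEIGHTS), 1)

# Example account: strong first logins, weak course and certificate completion.
print(success_score({
    "weekly_active_pct": 40,
    "first_login_pct": 85,
    "courses_completed_pct": 30,
    "certificate_pct": 15,
}))  # -> 39.5
```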
[00:55:23] So the first thing I wanted was, here's some general ideas, like, you tell me. So I put that into ChatGPT, and I put it into Google Gemini. I then took the outputs of both of those, the variables they recommended. I put them into a sandbox doc in a Google Doc. And then I began editing and curating the recommendations.
[00:55:41] And I was like, this is pretty good. Like, I immediately, I was like, wow, I might actually be able to do this tonight. Like this is, I'm, this is way further than I thought. This was like eight o'clock at night. So I'm, I'm sitting there and I'm like, all right, let's go. So once I had the V one model I was happy with, which took maybe an hour or two, I went into Claude and this was the, this was the [00:56:00] unique thing.
[00:56:00] I don't use Claude very often. So with Claude, I didn't give it this starting point. I didn't give it the draft I had done. I didn't even give it all that context. This is the prompt I gave Claude. I said, go to this webpage and learn about AI Academy by SmarterX. And then I pasted in the page about AI Academy.
[00:56:17] I said, once you've reviewed the page and understand the brand and offering, I'll let you know what to do next. It then went and wrote this crazy good summary that made it obvious it understood what we were doing, what the plan was, the pricing model, all the stuff. So then I came back and I said, great, I want you to build a predictive scoring model for the customer success team to monitor the health of business accounts and predict expansion and churn.
[00:56:39] What variables should we prioritize? I want you to keep it simple to start. So my thinking here with going to a third model was, I wanted an objective take. So in ChatGPT, I used my Co-CEO GPT that's trained on our company history, revenue model, and roadmap, plus it has a lot of memory of things I've done. In Gemini [00:57:00] I used my AI teaching assistant, AITA, which is trained on all of our Academy roadmap and instructional design principles.
[00:57:06] So Claude was basically an objective outsider, and Claude crushed it. This was 4.5; it was like the day before 4.6 came out. Yeah. So I took the variables Claude created, I revised the model, and then I put it back into Claude. It then, without me asking for it, produced this incredible workbook. And you've seen this thing, Mike, like we went through this yesterday.
[00:57:25] It has a scoring model with weights and scoring criteria and recommended properties to use in HubSpot. Mm. It has health tiers based on scores with recommended actions and outreach cadence. It has a HubSpot implementation guide for the properties and conditions to set. It had score calculators for manual testing. This is each tab it created in a workbook. And then it offered lifecycle weighting considerations based on the adoption phases.
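As a companion to the scoring sketch above, here is one hypothetical way health tiers like the ones described here could map a 0-100 score to a recommended action and outreach cadence. The thresholds, tier names, and actions are assumptions for illustration only, not the contents of the workbook Claude produced.

```python
# Hypothetical health tiers for a 0-100 account score. Thresholds, tier names,
# and recommended actions are illustrative assumptions only.

HEALTH_TIERS = [
    # (minimum score, tier name, recommended action / outreach cadence)
    (75, "Healthy",  "Quarterly check-in; look for expansion opportunities"),
    (50, "Watch",    "Monthly check-in; nudge the admin on inactive licenses"),
    (25, "At Risk",  "Biweekly outreach; re-onboarding plan with the point person"),
    (0,  "Critical", "Immediate intervention; executive sponsor conversation"),
]

def health_tier(score: float) -> tuple[str, str]:
    """Map a 0-100 account score to a (tier, recommended action) pair."""
    for minimum, tier, action in HEALTH_TIERS:
        if score >= minimum:
            return tier, action
    return HEALTH_TIERS[-1][1], HEALTH_TIERS[-1][2]

print(health_tier(39.5))  # -> ('At Risk', 'Biweekly outreach; re-onboarding plan with the point person')
```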
[00:57:51] I took this, I edited it. It was, honest to god, top-level, senior-level strategist work, like as good as anything I've [00:58:00] ever gotten from a senior strategist in our company or in an agency. I then said to it, excellent, I wanna share this with my team. Can you help me build out a strategy brief for each of the tabs?
[00:58:11] I'd like to include an introduction for each tab that explains that, those tabs and provides a bit more context and details. Can you write the draft? I spent another hour at the airport taking that output, editing it. 'cause I get to the airport, I'm like, I got like one hour. I don't have time. Once I get back, like, I'm not gonna come back to this.
[00:58:27] I have to try and finish this. And so I sit at the airport for an hour. Luckily I had a flight delay. I edit this draft, I send it to the team. I put a, a meeting on the schedule for what would've been February 11th. I say, Hey, we're gonna go through this. We're gonna meet, we're gonna talk through these. We meet, and this is the real important part.
[00:58:42] So we now meet as humans. We have this AI-assisted thing we've created. The team had gone through and added comments. The goal for the meeting was to arrive at a consensus on the variables and the weights of the, you know, hundred-point-scale health score. We actually came up with an MVP approach [00:59:00] that Claude, ChatGPT, and Gemini hadn't thought of, like a faster way to actually get this in use within like two weeks.
[00:59:08] And a project that easily would've taken me 50 to a hundred hours, like no joke, easily, was completed in three to five hours while traveling, sitting in an airport, sitting on a patio at a hotel, and the whole thing was done. And it is going to be operationalized within two to three weeks on the team, and it'll become the foundation to manage relationships with these business accounts.
[00:59:32] That is reality. So like, forget the sci-fi stuff. This is why, Mike, you and I say this all the time. Like, again, I assume most of our listeners know this stuff is possible. Like, this isn't news to you, right? But if you have people in your company that don't get this, just clip this segment. We clip every segment on YouTube.
[00:59:54] Like you could literally just go grab this segment of 12 minutes, whatever I'm talking for, just send them this segment. Like [01:00:00] listen to this.
[01:00:00] Mike Kaput: Yeah.
[01:00:01] Paul Roetzer: This is practical stuff that anyone can do. Any leader in a company who has domain expertise, who has done a thing before can just work with the models to do it better and faster.
[01:00:13] If I did not have these models. This success score would've taken three more months to do. Like I literally, my schedule between now and April is booked solid. I would not have had time to build this. And instead it's built, it'll be activated, and in theory, it'll be worth millions of dollars to the company over the next couple years.
[01:00:33] And more importantly, hopefully worth millions of dollars to the business accounts, who will now get greater value out of their licenses because we built the success score and use it to drive customer success.
[01:00:46] Mike Kaput: That is incredible. I love that, that we should do even more, talk even more about this as a case study, as it evolves.
[01:00:52] just to really emphasize for people too, to be very blatant about connecting the dots here. Even if [01:01:00] you have nothing to do with a success score, customer success, any type of education business, I want you to really think about the steps that Paul went through here. It is not just using AI as a search engine.
[01:01:13] It is not going back and forth a couple times. It's using AI to create a problem statement in whatever domain you're working in, including your domain expertise and context, with multiple models. Having multiple models play off each other, check each other's work, give different perspectives.
[01:01:28] Synthesize those models. You're using custom GPTs and Gems that are customized to different use cases and contexts. There's elements of personalization and memory in here. Look at how all these things work together to create something that is exponentially more valuable than just using a single model alone.
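To make this multi-model workflow concrete, here is a rough sketch: the same problem statement goes to two models, a human curates the merged drafts, and a third model with no prior context critiques the result. The ask_* functions are hypothetical placeholders standing in for whichever chat tools or APIs you actually use; this is not a specific vendor integration.

```python
# Sketch of a multi-model cross-check workflow. The ask_* functions are
# hypothetical stubs; replace them with calls to whatever models you use.

def ask_chatgpt(prompt: str) -> str:
    return "ChatGPT draft (placeholder)"

def ask_gemini(prompt: str) -> str:
    return "Gemini draft (placeholder)"

def ask_claude(prompt: str) -> str:
    return "Claude critique (placeholder)"

def draft_with_cross_check(problem_statement: str) -> str:
    prompt = f"Recommend v1 variables and weights for this problem:\n{problem_statement}"

    # Step 1: get independent takes from two models on the same prompt.
    drafts = [ask_chatgpt(prompt), ask_gemini(prompt)]

    # Step 2: a human curates the drafts (the sandbox-doc step in the episode).
    merged = "\n\n".join(drafts)

    # Step 3: ask a third model, given no prior context, to critique the merge.
    critique_prompt = (
        "You have no prior context on this project. Critique this draft, "
        "flag what is missing, and keep it simple:\n" + merged
    )
    return ask_claude(critique_prompt)

print(draft_with_cross_check("Define a success score for business accounts."))
```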
[01:01:47] Paul Roetzer: And you, if you've never done anything like this, trust me, you can do this. This is not complex stuff. It is just what Mike said, what he just outlined. And you can apply it. Then you just pick the next problem and [01:02:00] solve it. Like, trust me, we're already moving on to the individual one.
[01:02:02] It's, we have thousands of individual members. It's like, okay, let's do the same thing for individuals and let's, you know, drive that, provide value creation for them the same way. So it's just like, boom, boom. Like, just as I said on a podcast a few weeks ago, it's like I just can't do enough stuff. Like, there's so many things that are now achievable that I just find myself every day, like, I just want to tackle the next thing.
[01:02:22] Yeah. Like it just, it's so fun to be able to do these things that would've taken me a month before and now I can just do 'em in a couple days and you're just trying to find these windows to do this stuff.
[01:02:32] Mike Kaput: Alright, before we dive into some rapid fire, Paul, just another reminder, this episode is also brought to us by our 2026 State of AI for Business survey and report.
[01:02:42] So we are currently in survey mode as we are expanding our popular State of Marketing AI report that we do every year. So this year we're actually going beyond marketing-specific research to uncover how AI is being adopted and utilized across organizations. So to do that, we're hopefully looking to survey literally [01:03:00] thousands of business professionals across all industries and functions.
[01:03:04] We would love for you to be one of them. So the survey that we have running right now literally takes only about five to seven minutes to complete. If you complete it, you will get a full copy of the report when it drops. You'll also have a chance to win or extend a 12 month SmarterX AI mastery membership as part of AI Academy.
[01:03:23] So if you go to SmarterX.ai/survey, you can go take the survey there. We would love to get your input if you have a few minutes.
[01:03:33] High Profile AI Resignations
[01:03:33] Mike Kaput: Alright, Paul, let's dive into some rapid fire for this week. First up, a wave of high-profile departures hit the AI industry this past week, with senior figures leaving three of the biggest companies in this space within days of each other.
[01:03:45] So first, at OpenAI, economist and researcher Zoë Hitzig resigned on the same day the company began testing ads in ChatGPT. And how about this: not just an employee resignation, but she also wrote a New York Times opinion [01:04:00] essay where she wrote that users shared deeply personal information with the system and warned that OpenAI risks repeating Facebook's trajectory of gradually eroding user trust with the conflicts that ads create.
[01:04:13] At the same time, at Anthropic, Mrinank Sharma, who led the safeguards research team, published a kind of vague resignation letter saying, quote, the world is in peril, and that employees, quote, constantly face pressures to set aside what matters most in developing AI at Anthropic, which informed his decision to leave. Then, right around the same time at xAI,
[01:04:36] half of the company's 12 original co-founders have now departed, all within the same week. Jimmy Ba, Hang Gao, and Tony Wu all left. This was shortly after SpaceX acquired xAI in an all-stock transaction ahead of a planned IPO. So Paul, three high-profile departures from three of the major labs. Are these connected in any way, or is this just a coincidence of [01:05:00] timing?
[01:05:00] Paul Roetzer: Seems like some themes. I mean, there definitely is more people leaving due to concerns around safety and alignment and being public about it. That's not new. That's been going on for a couple years. People sort of seeing where these models are going and thinking more work needs to be done and the labs used to be the place to do that work.
[01:05:16] And places like OpenAI have definitely prioritized commercialization over safety and alignment. Not to say they're not doing safety and alignment, but I think it's harder and harder to have a voice and to have the compute power you need to do the safety and alignment when you're trying to run ads and do all the other stuff.
[01:05:33] I think some, based on what people are publishing, is a bit of soul searching. Like I'm seeing the AGI, feeling the AGI. Things have changed, and I don't think this is the best use of my talents to be here doing this. Like, I think I gotta go figure out what's going on in the world. And then the XAI stuff is just Elon Musk and founder mode, man.
[01:05:51] He just, yeah, chopped heads. Like I saw one that said they cut like 50% of the staff, basically.
[01:05:56] Mike Kaput: Oh, wow.
[01:05:56] Paul Roetzer: And then he tweeted, you know, basically like, [01:06:00] as they're merging the companies, like things are changing and, you know, everybody was here before. I mean, they lost like three co-founders of XAI. It wasn't just like employees leaving.
[01:06:09] Right. and they were all pretty public about it. Everybody says what they're supposed to say, like, you know, it was great working. Best experience ever. Elon's amazing. But, I did read something that said that basically the latest version of Grok that was supposed to come out in December, early January, was not up to par.
[01:06:25] And he was very unhappy. And so I think, that one probably has more to do with when Elon's focused on something. If it's not performing up to par, he doesn't care who you are. Like you're just gone. So I, yeah, it, but again. You can feel trends happening and there definitely in the last like 10 days, there are way more public announcements of I'm leaving labs for these reasons, and I, they kind of all sort of fit within those three buckets so far that I've seen.
[01:06:55] OpenAI’s Changing Hardware Plans
[01:06:55] Mike Kaput: Alright, next up: OpenAI's plans for a consumer AI device hit a bit of a [01:07:00] setback this week. So the company abandoned its io branding, which was the name of this kind of planned hardware line. They did this after a trademark infringement lawsuit from the audio startup iyO. So io, the letters i-o, is OpenAI's name; iyO is the audio startup's.
[01:07:17] So OpenAI Vice President Peter Welinder confirmed in court filings that the company will not use the name, and they plan to announce a replacement later. The filings also revealed that the first device will not ship before late February 2027. That's roughly a year behind earlier projections. The company has not created any packaging, branding, or marketing materials for the device.
[01:07:39] This first prototype is described as a screenless device designed to sit on a desk alongside a phone and laptop. As we've talked about in the past, this is being developed in collaboration with a firm called LoveFrom, which is the design firm founded by legendary former Apple Chief Design Officer Jony Ive. So Paul, this hardware device is delayed a year.
[01:07:59] They had to [01:08:00] dump the brand name. They don't have any marketing materials, packaging. Meanwhile, they've just launched ads. They're running this enterprise push. At what point does spreading thin become a real strategic concern for OpenAI?
[01:08:13] Paul Roetzer: I mean, they're, they're definitely trying to get their hand in everything.
[01:08:16] I mean, they're also looking at robotics again and, yep, I thought Sam was doing something with space and nuclear fusion and, yeah, I mean, they're just going after it all. I don't know. I mean, the intrigue around the device continues. Who knows what it's gonna be? We've heard lots of rumors. There was supposedly a leak of an ad they were gonna run during the Super Bowl.
[01:08:34] Yeah. That was like previewing it, but they said that was not real. Total side note, I thought this was pretty fascinating: Ferrari just announced the first interior from its partnership with LoveFrom, Jony Ive's firm. They redid the inside of a Ferrari. Really? It is pretty cool. Like, ooh, there's this two-minute video and they showed it.
[01:08:57] It's, it's in essence like you could look and be like, oh, [01:09:00] so that's what the Apple car would've looked like. It is,
[01:09:02] Mike Kaput: yeah.
[01:09:02] Paul Roetzer: It's like if Jony had stayed at Apple and they had done Project Titan and brought a car to market, you can look at this and be like, okay. Like I can see what it would've been. Cool.
[01:09:10] Pretty cool. So yeah, it might be worth, like if you're into that stuff, cars or technology, it's, it's a cool video to watch.
[01:09:17] Does AI Actually Intensify Work?
[01:09:17] Mike Kaput: That's awesome. All right, our last topic this week, some new research is challenging. One of the most common promises made about AI in the workplace, the promise that it will reduce the amount of work you have to do through productivity gains.
[01:09:30] So researchers at UC Berkeley's Haas School of Business conducted an eight-month study at a US technology company with roughly 200 employees and found that AI tools, quote, consistently intensified work rather than lightening it. So they identified three patterns at play here when AI is being used.
[01:09:49] First is task expansion where employees took on work they previously outsourced. So for instance, product managers writing code themselves because they now can with ai. Second, was blurred boundaries [01:10:00] as the conversational feel of prompting apparently allowed work to spill into the evenings. It was just easy to fire up and do at random times.
[01:10:08] And third, increased multitasking that created hidden cognitive loads. So switching between different tasks with AI windows, chatbots, or agents. So the authors warned here of a self-reinforcing cycle that can happen with AI usage, where faster output raises speed expectations, which drives greater AI reliance, which broadens task scope further.
[01:10:31] So Paul, I'm curious about, are you seeing anything related to this in your own work within SmarterX, hearing it from others? are you experiencing any of these problems?
[01:10:41] Paul Roetzer: I don't know what their hypothesis was going into this research, but I can't imagine any of this was news to anybody. Like
[01:10:47] Mike Kaput: Right.
[01:10:48] Paul Roetzer: I, yeah, I mean, like of course all of this, like, yes, I don't, I've never met a professional or leader who's really good at their job, who doesn't have a sandbox of stuff that isn't getting done every day.
[01:10:59] Mike Kaput: [01:11:00] Right.
[01:11:00] Paul Roetzer: So, unless, like, a company has infused some policy where, like, we're gonna give you these AI tools and you're gonna be more productive, but you're only gonna work 35-hour weeks now.
[01:11:10] Mike Kaput: Mm-hmm.
[01:11:11] Paul Roetzer: Because we're making a 30% profit margin instead of 20, and we're just gonna be content with that. Like, who the hell's gonna do that?
[01:11:19] Like, right, right. You just, because like that could go away tomorrow. Like okay, great, we're seeing these gains, but like we gotta stay on this and we gotta drive growth. So yes, a hundred percent. I've seen expectations of growth and operating margin are increasing. Like in 2026, take software companies: there is a significant increase in expectations on the growth plus the operating margin.
[01:11:39] what used to be called the rule of 40, you know, now it's like rule of 60, rule of 70. So you're expected to grow at different rates. I will say, Mike, like just, you know, thinking out loud here, like I do find myself having to give myself more grace. Yeah. Meaning, [01:12:00] you know, like the success score is a good example.
[01:12:02] So I'm on this trip, I'm doing a talk, like it was a pretty important group, pretty important talk. I finished that talk. And there's a part of me that's like, I think I'm just gonna go swim for like two hours tonight. Mm-hmm. Just not do anything, go do a workout, go swim, and then enjoy it. And I did, like, I actually did take an hour off.
[01:12:21] I go to the gym, I go to the pool, and then there's the part of me that's like, I think that was enough for the day. Like, you should just, like, relax. I'm like, I could build the success score.
[01:12:31] Mike Kaput: Mm.
[01:12:31] Paul Roetzer: And so then like, what could have just been a relaxing night. Maybe watch some Netflix or chill. I just built a success score for three hours.
[01:12:40] Right. And then I wake up and I get up early and I work on it again. And I do in essence the equivalent of a month of work, what might be worth millions of dollars to the company. And then I get on the plane and there should be like that. Alright dude, just like, relax, like, and I put on a freaking podcast like to, [01:13:00] yeah.
[01:13:01] So I am still, I would say, myself learning to live within this world where you can create a disproportionate amount of value quickly, and to be okay that, like, I'm gonna shut down at three o'clock today, right? I'm gonna make it to the gym today.
[01:13:18] Mike Kaput: Right.
[01:13:18] Paul Roetzer: And I, and so I do feel this need to because there's so many things that can be done now and that I want to do.
[01:13:24] Like, I'm enjoying. It's not, I'm not working and like being miserable,
[01:13:28] Mike Kaput: right?
[01:13:28] Paul Roetzer: I want to do the next thing. Like, it's like I just can't do enough. But at the same time, I generally, like I've said before in the pilot, like I pick my, I take my kids to school every day when I'm home. I often pick them up from school.
[01:13:42] I don't work from five o'clock until nine o'clock. And I actually don't even work nights that much anymore. Nowhere near as much as I used to. weekends I work a little bit like usually before the kids are up, but I don't, in their eyes, I'm not always on my phone. I'm not always working right. [01:14:00] And that's like enough for me right now because I'm enjoying what I'm doing.
[01:14:04] So I just, I feel like people need to... As you become more productive, I think organizations need to allow employees to have some grace of time back. Like, give them the time back and encourage them, make them take that time back.
[01:14:18] Mike Kaput: Yeah.
[01:14:18] Paul Roetzer: And not just keep loading it in. But again, this goes back to change management and transformation.
[01:14:22] It's why we're trying to build consulting in. Like, we think of what we do as, like, change management consulting as part of your account, because I think that's what's needed. Just taking courses and getting tools isn't gonna be enough. We're just gonna keep increasing productivity and, like, never get the benefits of AI.
[01:14:38] So yeah, I totally agree. There's nothing in this that I wouldn't have assumed was true. I think it just highlights the fact that we need to do more as business leaders to make sure we're capturing some of that time back for ourselves and for our people.
[01:14:52] Mike Kaput: You know, one final note here that I just found resonated on a really practical level for me is when they were talking about this context [01:15:00] switching that happens.
[01:15:00] Yeah. Because for a very long time in my career, I have tried to engineer or architect my schedule to prioritize literally single-tasking and deep work, because it's the only way I've found to actually get anything done. I'm terrible. The moment I context switch, I'm cooked. Yeah. So that has been something where I've literally spent years, and successfully so, structuring my schedule that way, and it's been extraordinarily beneficial.
[01:15:27] But now I have to flip it, because actually agent orchestration, AI orchestration, rewards context switching. There's plenty of areas where I still need to single-task and do deep work that only a human should be doing, or only I should be doing. But I've really found myself having to rearchitect what I've spent years building.
[01:15:44] Yeah. Because I need to have periods where it's like, okay, at the beginning of the day, we're gonna set up the agents, we're gonna then do a bit of deep work on this one thing, but then I need to jump back in. And so you can really lose the plot a bit if you're not intentional about it. It's been a struggle, [01:16:00] but, you know, I think I'm getting there.
[01:16:01] But it's interesting how that changes.
[01:16:03] Paul Roetzer: I did that last night right before I went to bed. Yeah. I went into ChatGPT and I was like, Hey, can you do this analysis for me? And I don't remember what it, the prompt was. And then I wake up this morning, I was like, wait a second, did I run a project last night while I was sleeping?
[01:16:16] Like what was that? I go back in and it was like a deep research project and it actually failed for some reason. Yeah. And I was like, oh, that's funny. 'cause yeah, you just like, you have ideas, you jump in, it's like, oh, let's run this in Claude while I'm doing this other thing. And then you kind of forget you're even
[01:16:29] Mike Kaput: A hundred percent.
[01:16:29] Paul Roetzer: doing these things. So wild.
[01:16:31] Mike Kaput: It's fascinating. All right, Paul, that's all we've got this week. One quick reminder, go take this week's AI pulse survey at SmarterX.ai/pulse. This week we're asking two questions based on some of the topics we've been talking about. So the first one is based on your own experience. How would you describe the current pace of AI improvements?
[01:16:49] So things like, it's accelerating faster than I can keep up with. It's moving fast, but I'm keeping up, et cetera, et cetera. We also wanna know, has using AI tools changed the total amount of work you do? Are you getting more done [01:17:00] in less time, doing more work overall or no meaningful change to your workload, et cetera.
[01:17:05] So I'd be interested to see that. Go take this week's survey. Paul, thank you for breaking down another short but packed week in AI based on our timeline here.
[01:17:16] Paul Roetzer: Yeah, thanks for squeezing this in and I think you and I gotta go get ready for the agency summit.
[01:17:20] Mike Kaput: Yes, we do.
[01:17:21] Paul Roetzer: Alright, thanks everyone. We'll talk to you next week.
[01:17:24] Thanks for listening to the Artificial Intelligence Show. Visit SmarterX.AI to continue on your AI learning journey and join more than 100,000 professionals and business leaders who have subscribed to our weekly newsletters, downloaded AI blueprints, attended virtual and in-person events, taken online AI courses and earned professional certificates from our AI Academy, and engaged in the SmarterX Slack community.
[01:17:49] Until next time, stay curious and explore ai.
Claire Prudhomme
Claire Prudhomme is the Marketing Manager of Media and Content at the Marketing AI Institute. With a background in content marketing, video production and a deep interest in AI public policy, Claire brings a broad skill set to her role. Claire combines her skills, passion for storytelling, and dedication to lifelong learning to drive the Marketing AI Institute's mission forward.
