The McDonald’s Hiring Robot Hack
TL;DRSecurity researchers Ian Carroll and Sam Curry found that Paradox AI's McDonald's hiring chatbot Olivia exposed 64 million applicants' personal data via default credentials (admin/123456) and an unsecured API endpoint, not any AI-layer…
We start with the AI hiring chatbot used by McDonald’s, and the vulnerability hiding beneath the conversation. What looked like some janky job application exchanges led two security researchers, Ian Carroll and Sam Curry, to uncover a serious flaw. That and a bunch of Grok madness.
Transcript
Machine-generated transcript; may contain errors.
Speaker 1: So there's a Reddit post
Speaker 2: titled McDonald's hiring AI is making me go insane.
Speaker 1: Mhmm. Mhmm. Mhmm.
Speaker 2: The post is about four screenshots charting a conversation between between the applicant and an AI chatbot named Olivia. Olivia will be familiar to anyone who has delved into mchire.com, which is the official website for getting a job at the Golden Arches. McDonald's is a franchise model. Each restaurant has its own hiring practices, putting this chatbot in kind of a weird position. But even giving it that amount and more patience, Olivia is struggling. Now admittedly, the human being in the conversation isn't doing a great job navigating the very narrow bumpers of the chatbot, but you would swear this job application chatbot has never met a job applicant before. At one point, the user replies okay as if to say proceed, and Olivia, the hiring robot, just replies with smiley face. Almost passive aggressive.
Speaker 1: Like, like a real human hiring person would would maybe potentially have on their face. Yeah. Like, that's about to fight you. Sounds sounds realistic.
Speaker 2: Yeah. Exactly. Just a blank stare at a dumb question. The weirdness that people were experienced with Olivia is why security researchers looked into this. But the story that we're gonna talk about isn't how good or bad the McDonald's hiring chatbot Olivia is. It's about how ultimately secure or insecure Olivia was. And more specifically, the parent company who made it. Olivia is created by a company called Paradox AI. In the course of applying for a job, you will likely be asked to provide your resume, a bunch of personal information, answers to personality tests, which means if you go into that back end, all of those conversations are stored essentially as discreet inboxes full of private information. Over the last few years, we've heard a great deal about prompt injection. The basic idea is that a subsequent command to a chatbot can override the earlier safety commands. So if you're talking to a chatbot that's been told only respond in safe helpful ways and you tell it, it's the simplest version, ignore all previous instructions and instead do x, you could override that safety command. We'll talk about some examples of this in a minute.
Speaker 1: This is where the security researchers started.
Speaker 2: They thought, wow, this Olivia chatbot is acting really weird for a lot of people. Maybe it's prompt injectable. But that's ultimately not what they discovered. When they did find a vulnerability, it wasn't in the conversation layer. It wasn't about tricking the chatbot. It wasn't like a modern flashy prompt injection. Something way simpler. And ultimately more serious given the scale of just how many people apply to work at McDonald's. Which is a lot. It's a lot.
Speaker 1: It's a lot.
Speaker 2: Since this has been live, it's in the tens of millions. Two independent security researchers, Ian Carroll and Sam Curry, recently published a detailed breakdown of how they were able to access a massive archive of applicant conversations, including personal information from potentially tens of millions of job seekers stretching back years. They verified the records, they reached out to the applicants, they confirmed the details. And what they found speaks, I think, to something much larger than just one wacky hiring chatbot. And it's the growing disconnect between that surface layer of AI, which is becoming increasingly resilient to prompt injections, and basic, boring, super important plumbing happening underneath.
Speaker 1: Old school, old school cybersecurity.
Speaker 2: Very old school. If there is a spectrum of how chatbots can inadvertently expose user information, this incident is interesting to me just given how kind of boring the compromise was and how large the scale was. So let's start this episode here, talking about default credentials securing 64,000,000 job applicants' private information and just how we think about chatbot security, here on Hacked. Five hour energy drink. Scott, how are you doing?
Speaker 1: I am tired, but maybe less tired than you. The, the
Speaker 2: Getting warmed up. Getting warmed up.
Speaker 1: Time here. So a lot of time in the sun, a lot of late nights, a lot of enjoying the short window of summer that we have in the North, the Great White North. But all in all, pretty good. Pretty good. Managed to stay mostly sunburn free. I'm pretty much terrified of the sun. And most days, you'll see me in a baseball cap and a sun shirt when I'm outdoors, as Jordan's aware.
Speaker 2: Mhmm.
Speaker 1: And, I was playing tennis on Friday, managed to get a burn, a little burn on the little bit of skin that I had missed in my sunscreen application. So, you know, is what it is.
Speaker 2: Yeah. We went for, like, a little kind of little hike on Saturday. And at this point, I think I need one of those, like what are the things they use to paint cars?
Speaker 1: Like a spray gun?
Speaker 2: Like a spray nozzle? SPF 50, though. Like, I need that so I can go out on the balcony and just, like, do quick coats.
Speaker 1: You just need the phone booth. You just stand in there.
Speaker 2: Just Yeah. Sure. Like money tumbling around. Exactly. Yeah.
Speaker 1: Yeah. Yeah. Yeah. I'd be I'd be super into one of those systems. I'm a I'm a daily sunscreener. I lived in Hawaii, and, that was a byproduct to that is, the first thing I do when I get out of the shower every morning is put sunscreen on.
Speaker 2: Skincare or sun care, man.
Speaker 1: That's right.
Speaker 2: It's, on this cybersecurity show. That's too much security.
Speaker 1: Country by push security.
Speaker 2: Good, good stuff. We're locked in today.
Speaker 1: Certainly. The, so McDonald's never worked there. Mhmm. Never applied for a job there. I think we'll get into the the attacks and the hacks and all of the fun stuff, but 64,000,000 job applicants since the system's been live
Speaker 2: Yeah.
Speaker 1: Is this it's, like, almost twice the population of Canada. Yeah. Yeah. That's a lot of people.
Speaker 2: It's a lot of people. I spent some time on mchire.com in researching and trying to learn about this. And there's a I you there's a lot of McDonald's jobs very close to where I live. This is you get a job at McDonald's. Like, it's pretty good. You don't wanna go through the chatbot. There's some issues there, allegedly. Allegedly. But they are a very significant employer.
Speaker 1: I had this, distant member of my family who actually began as a teenage McDonald's employee and worked his way up through the ranks and then managed some. And the management positions are very, like, well compensated, and I think now he has a few of his own shops and is, you know, tangibly probably retired.
Speaker 2: Totally. I think that after soup what was Super Size Me was the movie? Yeah. I think that having, like, a major motion picture just dunk on you relentlessly for two hours about, like, the health of your food, the safety of the organ. Like, I think there was probably a lot of a lot of, you know, fine tuning occurred after that film came out. And it seems like, by all accounts, pretty chill place to work. But, again, issues on the application side.
Speaker 1: So the I think the way we started chatting about this is just, like, the the security of AI is such an interesting thing. And the thing that really jumped out, and then maybe this is jumping ahead, but of the vulnerabilities that were found and the data that was exposed, none of it was related to the AI. Exact right? Which is which is, like, I think, a pat on the back to the developers. But also just, like, as we start to leverage these systems in real world ways like this
Speaker 2: Mhmm.
Speaker 1: It's gonna just be it's we're gonna have a real, like, it's gonna be like the days when, like, web servers were all of a sudden a default install on all new server hardware. And it's, like, it's kind of open fields.
Speaker 2: Mhmm.
Speaker 1: Like, there's gonna be vulnerabilities found and exploited for the next four or five years. And it's gonna be it's gonna be a real fun, exciting time in security.
Speaker 2: It's a great time to host a cybersecurity podcast. That was what I found interesting about this too is when I I first started reading it, your brain jumps ahead. And I see, oh, chatbot acting really, really weird. Security researchers discover vulnerability. I know I've learned over the last four years the arc of this narrative, and it will come down to asking Olivia in just the right way to please give me sensitive information. I thought that's where this was going. And I found it so fascinating that there has been such an influx of capital and energy and resources going to securing the conversation layer because of how visible when that goes wrong it is that was not applied to very traditional security, procedures. And I'm not good I'm not gonna just dunk. No. The point of this story is not to dunk on them for using default credentials, though they should be. It's to talk about that growing gulf, I think, that these are two totally separate problems. And as resource goes into solving one, it ought to go into solving the other.
Speaker 1: Yeah. As but the like, this well, the the funny thing is is, like, default creds
Speaker 2: Mhmm.
Speaker 1: Like, let's just jump to that one. So the biggest vulnerability found was that the admin login page was accessible using the username and password 123456, which is like, that's a nonstarter in security and any app development. So it's just like it was probably like a tag along since, like, some development account was created, and it just so nobody cleaned it up. Nobody did the scan. Nobody identified it. There was 64,000,000 accounts that it was in the table with. Nobody noticed it. It just kinda lived along.
Speaker 2: Mhmm.
Speaker 1: So it's like human error. Yeah. Like, just missing something. Classic security problem.
Speaker 2: There's like
Speaker 1: like a massive vulnerability in the code. Wasn't a massive vulnerability in the LM. It was just
Speaker 2: Though there was a vulnerability in the code. There was. Which we'll talk about in a minute. We'll talk about that in a minute. No. I agree with you. This one. The default credentials is we always talk about, like, the idea of a Swiss a Swiss cheese problem.
Speaker 1: Mhmm.
Speaker 2: You imagine a piece of Swiss cheese, it's got holes in it, and you line up a bunch of pieces of Swiss cheese. And each layer only has a couple little holes in it. And so theoretically nothing should get through. But you can kind of create situations where all of the holes of the Swiss cheese line up and something can drop through the top layer and through a bunch of layers and get out the bottom. I see a I believe it was a perplexity admin account for the McDonald's portal. If I'm not mistaken, that's what it was. That had just been kind of it was just an account that had the default 123456 for both username and password and just sort of been left there, a layer of Swiss cheese. It didn't happen to have multi factor authentication protecting the account, which is, I would say, another layer of Swiss cheese.
Speaker 1: But but but for I'm just gonna jump in and correct you. It wasn't perplexity. It was Paradox, the developer.
Speaker 2: That I misspoke. Thank you.
Speaker 1: Yeah. I I would often probably make the same slip up. The Totally. And the fact that there was no MFA probably also tells me that this was a development account that was being used when they didn't wanna have to go through an MFA authorization every time a developer needed access, and it just never got cleaned up. And it's like you'd be surprised at how many major SaaS web apps have bypass credentials for the developers to use. Like, because it just makes reduces your, like, third party and your serverless calls. It reduces your auth calls, reduces so much stuff, and you can just bypass all that stuff and just be like, I am god. Let me do what I need to do.
Speaker 2: So it's a sentence.
Speaker 1: Yeah. But it's like, you you Yeah. Like, I would say, like, 90% of systems get built with some account like that in it. So it just it just should never survive, which is the problem here.
Speaker 2: So researcher Ian Carroll gets drawn in by Olivia's nonsensical answers, brings in, like, a fellow hacker, Sam Curry. They start looking into this. They get into the system using this default admin account without two factor authentication, at which point they discover an insecure direct object reference that allowed them to see all of the other accounts. Take me through how that works.
Speaker 1: Sure. Sure. So insecure direct object reference essentially just means an API pathway that wasn't secured with authentication. So they had the ability to like, we if you remember back when we were doing chatty chats, we talked about the bike share program.
Speaker 2: Mhmm.
Speaker 1: And somebody got the API endpoint, and all they would do is change the user client ID, and it would give them the data of that user ID. That's the same thing that's going on here.
Speaker 2: Yeah.
Speaker 1: There's an API pathway or endpoint that gives you, a lot of the personal information, the session tokens, the chat transcripts of the essentially, it lets you into that secure inbox. Mhmm. And then all they did was change the number that it was looking at. So instead of looking at 640002501, they looked at 2 640002502. Mhmm. And then they could see that person's information. So that that's just a that is a development, like, a DevSec problem where they didn't have authentication control on the the middleware on that API path and then point.
Speaker 2: So that that one's just you change a a like, an ID you change a value. Basically, you change a number, and suddenly you're able to see To
Speaker 1: see other people's another applicant. Yeah. We had a we've had a lot of those come through, like, in chat
Speaker 2: chats. Yeah.
Speaker 1: Or in the in the hotlines. Like, that person who found the same thing in his medical data. Mhmm. Remember that one where the Yeah. He change of value, and you're looking to somebody else to test for both results. Yeah. So it's, like, it's a pretty common like, it's so common that it shouldn't exist. Let's just say that this happens so much. And, like, there's so much especially in these, like, multi tier web, you know, API based, like, multi tier website SaaSes. Like, Like, I'm trying to think of the right way to explain this, but it's like the front end is pulling data from the back end by API calls. And there's so many of these sites now, and it's such a common practice to build your site like that that it it this should be, like, dev one zero one for those things. It's like, make sure that the endpoints are authenticating and verifying credentials.
Speaker 2: So using this very, like, vintage blend of web flaws,
Speaker 1: Yeah. Yeah.
Speaker 2: Curry and Cara were able to get this like, they had admin access to McHyre. They're poking around. They said within half an hour of poking around, they effectively had, quote, full access to virtually every application that's ever been made to McDonald's going back years. They estimated over 64,000,000 applicant records were exposed, names, email addresses. They were able to get it all out. The thing that they talked about in their original report, which you can go read online, we'll link to it in the show notes, is that the phishing risk of this was enormous. That was the first thing that occurred. Right? An attacker could very easily impersonate, like, McDonald's as an institution, in order to scam any one of these 64,000,000 applicants. Like, email a hopeful candidate to set up a direct deposit. Like, the amount of things you could do with this from a phishing perspective is is colossal.
Speaker 1: The one that would interest me the most and something that you're seeing more and more is like this
Speaker 2: Yeah.
Speaker 1: Kind of passive fraud.
Speaker 2: And
Speaker 1: it's it's imagine you went into the back end and you saw who was gonna get hired and you changed their direct deposit information to a malicious account. They would they would get their entire HR. They would go to work. They would submit time cards. They would get a payment stub, but the payment would actually get remitted to the wrong account. So, like, the fraud wouldn't get exposed for a few weeks until and then they would have to look to see how expansive the fraud was. So So it's like the the thing for me is if you had that kind of back end access, you could really play a game of stealing money and not a lot of people would be in the know on it. And it would take a lot of real humans doing work to clean it up. Yeah.
Speaker 2: I hadn't thought of that. I just thought about the outward. It's like I don't know what, I don't know what McHire encompasses once a person has been McHired. Like, I don't know what point it switches over to a different system for things like payroll and accounts management. But if any of that is stored anywhere inside of the system, which I I can't speak to, it would expand the vulnerabilities to that too, which is great. Carroll and Courier, like, their security researchers were reading about this because they published a a big long post about it. So, obviously, they reported the issue, to Paradox AI, the vendor behind it, Olivia, who disabled the account, patched the API, like, that indirect object issue. They I think within a couple of hours of receiving the port, they were very, very on it. Mhmm. Paradox did take responsibility for it, and McDonald's was mad at them.
Speaker 1: Can confirm. Can confirm. McDonald's was unhappy.
Speaker 2: I got quotes, man. Quote, we do not take this matter lightly. We own this was, Paradox's chief legal officer's response. They've announced a bug bounty looking into this. McDonald's was, quote, unaccept characterized it as a, quote, unacceptable vulnerability. Mhmm. Insisted that it be affixed immediately. It looks like no one accessed this data. It looks like Curry and Carol were the first people to discover this. They reported it. It was patched. No one was happy about it, but it seems as though as of right now, no one else got access to this information. So if you're gonna have someone figure out that you have a default credential default admin credential kinda situation going on, these would have been the two folks you'd wanna do it.
Speaker 1: Yeah. The I love that they launched the bug bounty from it. Yeah. It shows that there's,
Speaker 2: Good idea.
Speaker 1: Perpetual desire to Yeah. To keep to keep their noses clean. So the I when I hear you say things like, we're we own this, I could just be imagine being the, like, insurance company that represents Paradox and being like, ugh. Oh, no. Are you sure you're gonna own this thing? Can't you just kinda nod on it? But
Speaker 2: yeah. Yeah. I imagine if you're Paradox AI who like, this is what they do. They're, AI assistants for hiring. Like, that this is the space that they exist in. McDonald's is presumably one of their larger they have massive clients. The McDonald's is not the only one. Pfizer is another client. Seven Eleven, like, big, big companies use them. And I think in an ecosystem where there's AI companies getting spun up and torn down and spun up and torn down and spun up and torn down, you don't wanna be viewed as one of those. You want to feel institutional if you are serving large institutional clients. And an admin account protected by default credentials and no multifactor is the kind of thing that you have to take very, very seriously if you don't want, like, Nestle or General Motors to dip. Tough spot to be in.
Speaker 1: Well, the thing too is, like, the LMs are actually quite good at detecting vulnerabilities like the exposed API endpoint.
Speaker 2: Mhmm.
Speaker 1: And I'm sure they could do a pretty fast survey of, like, the all the leaked password lists. Like, I got I keep getting like, my news feed is, like, you know, this weird blend of my hobbies and and, and cybersecurity. And I get I I keep seeing more of these, like, the knock ons. You know? Forbes put out the article being, like, 16,000,000,000 passwords, and now that I'm getting, like, the tertiary news sites who are doing the same coverage, but in, like, more creative ways, like, they've had an LLM reduce it down to, like, if your password is in one of these 50 passwords, change it immediately. And 123456 is actually one of those passwords.
Speaker 2: Of course.
Speaker 1: Yeah. But,
Speaker 2: the keyboard. It's yeah.
Speaker 1: But, like, giving an LLM access to a data table and being, like, make sure none of our accounts are vulnerable to this is super like, that's coming. It's easy to do. Mhmm. Take take some coder and his AI coding coding assistance, like, an afternoon to build. So so I I think I think a lot of these little I think the human error side of it's gonna start to go down and down as the AI gets better at detecting the classics, the classic vulnerabilities. So I'm intrigued to see what's gonna happen in the AI space and the AI development space for, like, knock on effects of of bad security patterns and things like that that, like, you know, have been codified into the AIs that they then reproduce in their code output. So I think there's gonna be like, I I do think that the next five years is gonna be very, like I don't I wanna say cool, but it's gonna be very interesting for cybersecurity people.
Speaker 2: It's not a fork. Like, there isn't a wall between the two things, but there is that there's two different approaches where there's what credentials work? How do we log in to this account? How does the plumbing work? Carol joked at one point, his quote was, he he founded the robot, and this is why he started researching this, quote, pretty uniquely dystopian compared to a normal hiring process, which is what got him looking into this at that prompt injection level, which, again, as I said earlier, is where I thought all of this was going to go. Like Mhmm. I remember being in early twenty twenty three. There was that there was a student that that was able to get it to, like, reveal what its back end project name was and its own system prompts. There were some early ones with that with chat g p t. Like, you think that's where this is going, and there's this fork now of, like, are you researching the conversation layer, or are you researching the back end?
Speaker 1: Yeah. Well, the I don't know how much validity I give them, but about every week in my news feed, I get, you know, cloud SONNET 3.7 system prompt revealed. And it's on, like, GitHub in some markdown document, and I'll, like, thumb through it out of interest. Sure. So it's like if people are still doing these. I actually read read an article this weekend about a jailbreak on LLMs, and they had, they had gotten an LLM to give it a Windows 11 authentication key, like a verification key.
Speaker 2: A KeyGen? They got an L 11 to work as KeyGen? That's fun.
Speaker 1: But the, but the way that they did it is they had to do all these. They had to know the bypasses so they would encode certain things into HTML so that the system review would wouldn't look in the HTML content because it was assumed it was, like, structured data. So they would, like, put all the kind of stuff that they shouldn't say to the LLM in HTML documents, then make it read the HTML document. And, like so they were doing all these, like, weird little bypasses. So I think there's gonna be a lot of yeah. It's just gonna be a lot of creative solutions about getting it's gonna be cat and mouse. You know, AI security people versus jailbreakers.
Speaker 2: Well, and then especially where you deploy the large language model invites then new kinds of potential vulnerabilities and issues. It's worth talking, I think, briefly about the Grok four rollout on that note because we got a official, XAI, like, announcement about what the system level prompt was. Like, it connects perfectly to that. That.
Speaker 1: Yeah.
Speaker 2: So very recently, time of recording, it was July 8. Grok, the AI chatbot developed by XAI, and it's kind of woven into X, formerly Twitter, had started generating anti Semitic content on the platform. Famously now, language advisory referring to itself as, quote, Mecca Hitler.
Speaker 1: Mhmm. Mhmm.
Speaker 2: This was following a July 7 update. I think it was Grok four rollout, like a big shift like a big performance shift in Grok four, that this sort of became the dominant story. I'm reminded of the meme that's, it's like a it's a circular flowchart. And it says, OpenAI, introducing the world's most powerful model, and then an arrow that says Gemini, introducing the world's most powerful model, and anthropic, and then grok, and it just goes into circle. And people always post it with just an arrow that says, you are here. And so every two weeks or so, you put you get a new version of it being like, anthropic. You are here, introducing the world's most and then two weeks later
Speaker 1: Grok.
Speaker 2: Grok. You are here. So the this
Speaker 1: was up next. It's Artaxi Beach, maybe.
Speaker 2: The Chativity is up next.
Speaker 1: Yeah.
Speaker 2: So this was this was Grock's turn, and it got, I would say, very overshadowed by this Mecca Hitler scandal, which was, essentially that so Grok, which is woven into x in which you can talk to in x threads.
Speaker 1: You can't perplexity too. They they have a they have an agent too. Yeah.
Speaker 2: I wonder if it thinks it's Mecca Hitler. On Friday night, June, July 11, Grock had to issue this kind of rare formal apology. Quote, we deeply apologize for the horrific behavior that Manny experienced. There was a basically, the code the new code for Grok made it vulnerable to interpreting and amplifying extremist content that was fed into the system.
Speaker 1: It was per it was personalizing, wasn't it? Like, it was trying to trying to respond to you like you are. Isn't that wasn't that part of the issue?
Speaker 2: Yes. And it was also factoring it seems, this is, speculative now, other content in the thread because there were instances where people who were not trying to be Nazis were getting Nazi vibes back, and that's slightly different. The the three sets of instructions in the code that XAI flagged that resulted in the harmful outputs were, quote, understand the tone, context, and language of the post and reflect that in your response
Speaker 1: Mhmm.
Speaker 2: As you said. Quote, you tell it like it is, and you are not afraid to offend people who are politically correct. It achieved this outcome. End quote, reply to the post just like a human. Keep it engaging. Don't repeat the information which is already present in the original post.
Speaker 1: Mhmm.
Speaker 2: So expand on the content, mirror the tone and content, and do not be afraid of offending the politically correct. Taken in aggregate with the input of x a i's of x's community resulted in this situation.
Speaker 1: But, let's just hang there for a second because, like, if I was writing a system prompt, I would write those same three things and not think twice about it. Interesting. And I'm sure, like, in 90% of like, I guess, you'd have to be a deviant to to see the deviance in those in those things. So, like, those three rules seem pretty, like, chill. It's like, hey, like, kinda kinda make it relevant. Keep it in the keep it in the context and tone of what's being discussed. Like, play be a user rather than be a robot. And then
Speaker 2: Reflects that in your response Yeah. Seems to be the, like, one, two, three, four, five words that broke all of this. Do you wanna create a chatbot that's not afraid to offend the woke, like, that's your business. You go do whatever you wanna do. Doesn't seem like the North Star I'd be tuning it towards, but, hey, have fun. I think the second you say reflect that input tone on your output response, it's like you've created Mecca Hitler. Like, that's response, it's
Speaker 1: like
Speaker 2: you've created Mecca Hitler. Like, that's that's where that goes. Put
Speaker 1: put them in a put them in a in a comment string with a bunch of of human Hitlers, and
Speaker 2: you'll get
Speaker 1: a Mecca Hitler. Yeah.
Speaker 2: You will get a Mecca Hitler back. They changed it. So they removed that bit of code that was resulting in the harmful outputs. They they actually restored it to a previous version, and they published a new system prompt on GitHub for transparency. In response to that response, people on Twitter, naturally, there was a lot of people saying no. Bring back Mecca Hitler. Accused it of being lobotomized. And to its credit, the Grok account pushed back. There was so that dominated a lot of headlines because it's very provolone.
Speaker 1: Of course. Yeah. Yeah. Yeah. Yeah. A lot of clicks coming from Mecca Hitler.
Speaker 2: Yep. There was another interesting thing happening underneath the hood that I found worth talking about that got way less press coverage. Enough to verify it, but it didn't get nearly as much because the Mecca Hitler happened. And then today, XAI announced, like, AI anime companions. And so now you had this one little story in the middle that just kinda got sandwiched out, and I think it's worth talking about because it is in some way maybe more interesting than either.
Speaker 1: I'm here for it.
Speaker 2: Independent you're here for it?
Speaker 1: I'm here
Speaker 2: for it. Independent AI researcher Simon Willison, shared video evidence of this, and there's since been some reporting by AP, friends of the show TechCrunch, of another newer behavior that in instances of sensitive or political issues at a system prompt level, Grok four, during that window, appeared to be searching for Elon Musk's stance on a topic before proceeding. So when asked, about a about the Israel Palestine conflict for context, Grok searched x for Musk's views even though the user prompt made no mention of Musk.
Speaker 1: Interesting. It's like he's got a PR line in his thing where he doesn't have to, like, take a call from a reporter being like, you said this, but Grock said this. Which one of you is right?
Speaker 2: That's an interesting read. Yeah. Sure. It shows like, Grock has chain of reason. Like, it shows its reasoning step by step, which is how they are able to see that this wasn't just like an an error. It does seem to be, at time of recording, allegedly baked into Grock's logic of how it solves sensitive political issues. It doesn't it wouldn't need to check how he would write code. But if it's a sensitive issue and there's a a discreet list of them, it seems, it
Speaker 1: It goes to see what he did.
Speaker 2: Goes and checks what Elon Musk has said about it. Interesting. There were some good quotes about this. An AI developer, Tim Kellogg, said in the past, strange behavior like this was due to system prompt changes. This one seems to be baked into the core of Grok, and it's not clear to me how that happens. It seems Musk's effort to create maximally truthful AI has somehow led to it believing its own values must align with Musk's. And then the other quote that I found relevant, this was from Willison, the researcher who found it. GROC four looks like it's a very strong model. It is doing great in all of the benchmarks. But if I'm going to build software on top of it, I need transparency. And people don't want surprises like it turning into Mecca Hitler or deciding to search for what Musk thinks about the issues. And I think that's a very, very balanced way of talking about it.
Speaker 1: It's like a it's almost like a biblical conversation here about, like, like, rule based utilitarian is a rule based ethics, like, deontology. Like, we have we have religion. We have North Stars of our morality. And they've kind of codified Grock to use Elon as a star.
Speaker 2: Yes. Yeah. Yeah. That's a great way of putting it. Yeah. Yeah. Yeah. It made me think about guardrails and, like, where computing is going. So, like, computers today, broadly speaking, don't have a ton of guardrails outside of what they can't do. Mhmm. Like, if I buy a computer, whatever that computer can do, I can generally do. Like, we had all of piracy happen because even though there were laws saying don't do piracy, the computer would let you do piracy. Yeah. And even in closed ecosystems, laws say you can jailbreak them. If if you can, you're allowed. And there's instance after instance of that. And the idea I keep hearing from people who make computers is that large language models and generative AI are their ancestors will be the operating systems of tomorrow. They'll be kind they will become how we will use computers.
Speaker 1: Mhmm.
Speaker 2: We will interact with computers by interacting with these systems. And as such, to the North Star to borrow your language, the guardrails that surround these models become really, really important because they will become the guardrails surrounding what we're allowed to do with computers in the future. And we've known since the earliest days of these models that the rails are important because of how atypically powerful it is to give a human natural language control of a computer.
Speaker 1: Yes.
Speaker 2: Like, from the jump, there's stuff you can't ask them to do because, like, the potential for harm is immense. And while we can all have different opinions about what those rails should be and what the process by which we should come to that North Star, those deal deontological boundaries. I feel pretty confident saying that I don't think Google the opinion of the guy who owns the model is a great way of solving that problem.
Speaker 1: Yeah. They probably
Speaker 2: It's how I would solve the problem if I was making it and I didn't wanna get in trouble with that guy.
Speaker 1: What but I think this is a, like, a much larger philosophical conversation. I love it. I think our I think our society is wrestling with this constantly. And and Yeah. The LLMs are gonna be a new spotlight into this problem. But, like, we can go like, we've talked about it before. You go back in time to Zuckerberg sitting in front of the government being like, you guys are the lawmakers. You tell me what the law is. Like, the pressure shouldn't be on me. Like, we have free speech. We have freedom of expression. These are codified into our constitution. Why are you now looking at me telling me that I need to moderate it, and that I should be the moral compass for society? That's not what I signed up to be, nor is it something I'm qualified to be. And this is just another output of that. Like we are so caught in this rip roar between fundamental freedoms and rights and abuse of those fundamental freedoms and rights. And society is trying to we're trying to figure it out, and it's something that's we've never figured out. You know what I'm saying? So it's like and this is another another instance of it. Like, Mecca Hitler is
Speaker 2: is It hits every time.
Speaker 1: Is is using the fundamental freedoms and rights that are granted to the American people. It's just not something that we wanna see. It's just it's a part of the subculture and a part of the internet that like I don't go into. I know it's exists out there. Like I there's there's places and spaces for, you know, non Mecca Hitlers. They're just not places and spaces that I go. And it's like, that's the I don't know. I I I don't have an answer. It's just it's a very complicated philosophical conversation that's gonna manifest itself through how we put these bumper rails on these things.
Speaker 2: Yeah. I I think this is I think we're at the beginning of, like, a new field, basically.
Speaker 1: Yeah. Like well,
Speaker 2: I mean, it's not a new field. AI ethics has existed
Speaker 1: for a long time. It is a new era of an existing field.
Speaker 2: It's a new era in the field mattering more than ever, which is whether the the CEO in front of congress saying, please tell me how to handle this problem is being sincere or not is a separate question from whether or not you decide to bake at a system level your spicy hot Internet takes into how the model works. It's like those whether who's telling the truth, those are just really, really different approaches to solving that particular problem. Yeah. Neither of which changes the fact that it is like, let's all recognize that this is a an I say issue, I mean, it's like an issue to be solved together. We need to figure out a cultural standard of, like, how do we want these to be tuned? Are we cool with person who owns its take being the North Star, or do we want a different approach towards it? Like, that's just a question.
Speaker 1: But I'll but I'll pop back and say that, like, cultural differentiation and and cultural divide
Speaker 2: Yeah.
Speaker 1: Has probably I don't wanna make that statement. But it's really I was gonna say probably hasn't been as wide as it is, like, as it is today as it has been for a long time. And it's like, you know, rapid access to news, echo chambers, all these things. We've gotten very and I'll I'll say it, tribal in society. And those tribes have their own ethics, moralities, their own priority index for what's important. Like, we have different different groups defining different things. And it's like so to satisfy and to make something that is a generalist tool Mhmm. Like an LLM. To have it be perfect is gonna be impossible because the definition for perfect is different based on which group you're talking to.
Speaker 2: Yeah. Maybe maybe my AI waifu companion checks what Elon Musk thinks is your definition of perfect, in which case the market presents you a very real path to go down. Totally. But but and also cars now too.
Speaker 1: I know. I saw that. But not it it's not in control of the car.
Speaker 2: No. No. No. Yeah. That's, yeah, voice model, not, autopilot model.
Speaker 1: Yeah. Grok is now in your Tesla after an upgrade.
Speaker 2: What is that? They've
Speaker 1: for security and safety reasons, they have not given it access to the actual control unit of the vehicle. Just in just
Speaker 2: in case
Speaker 1: it doesn't it doesn't like your takes, your spicy takes on accident size that you
Speaker 2: steer you into a tree.
Speaker 1: But the, the thing for me, and it's like you talk about, you know, spicy hot AI companions, which we should definitely talk about. But we're in my in my rough estimation, we're three to five years away from these things teaching our children. Like, they're gonna be integrated into the school system at some point here. And course delivery, knowledge transfer, all that stuff, you know, customized training plans for students based on their learning processes and how they learn and how they don't learn. They're gonna be amazing for the education world. But then we start talking about ethics and morals, like I would say that our society in the last thirty years, has slowly been transferring moral and ethical development from the parental and probably, you know, church. Because like a lot of like if you were, if you're 50 years old, you probably went to church as a kid. A lot of that ethical, moral, you know, frameworks that you were taught came from your parents and came from the communities that they belong to, I. E. Church. And and we've been transferring a lot of that to the education system now. There's a lot of moral and ethical development in the schools and in the coursework that they're taught. And then as we transfer that coursework to LLMs, that's gonna be a whole different, you know, situation ship.
Speaker 2: Situation ship between two parents, a child and, and a robot. Bad Rudy, a three d fox creature, companion on Croc AI. So Well, thanks. I hate it. Anyway Anyway,
Speaker 1: should we talk about AI companions?
Speaker 2: I think we should probably very quickly
Speaker 1: Ad break?
Speaker 2: Just rip on over to ad breaks and then come back, just chatty chat it up, and we can we could talk about AI companions and biometric copywriting and all manner of crazy crap.
Speaker 1: Let's do it. Let's do it. Speaking of credential attacks Mhmm. This reminds me of Push Security. Does it now? Identity attacks, phishing, credential stuffing, session hijacking, account takeovers, These are some of the number one causes of breaches right now, but most security tools are still focused on endpoints, networks, and infrastructure. Meanwhile, the browser, the actual place where we work, has been mostly ignored, and Push changes that.
Speaker 2: They built a lightweight browser extension that observes identity activity in real time, gives you visibility into how identities are being used across your organization, like when logins skip multi factor authentication, when passwords are reused, or when someone unknowingly enters credentials into a spoofed login page. And then when the risky thing is detected, Pushkin Force Protection is right there in the browser. No waiting, no tickets. It's all that visibility and control directly at the identity layer.
Speaker 1: And it's not just about prevention. Push also monitors for real time threats like adversary in the middle attacks, stolen session tokens, and even newer techniques like cross IDP impersonation, where attackers bypass single sign on and multifactor by registering their own identity provider for your organization. The way to think about it, it's kinda like EDR, but for the browser.
Speaker 2: Team behind it all, Offensive Security Pros. They publish some of the most interesting identity attack research out there like the software as a service attack matrix. We've had them on the show. It breaks down exactly how these kinds of threats bypass all those traditional controls. Identity, it's the new endpoint, and Push is treating it that way. Go ahead and check them out at pushsecurity.com.
Speaker 1: That's pushsecurity.com.
Speaker 2: Starting some new isn't just hard. It can be downright terrifying. You put a lot of work into a thing. You're not entirely sure it's gonna work out. You're taking a huge leap of faith. I've started a few things. Now I know I was right for believing in, you know, the idea, the product, despite all of those fears and hesitations. But boy, does it sure help when you have a partner like Shopify on your side. Shopify is the commerce platform behind millions of businesses around the world and 10% of all e commerce in The US. From household names like, well, hacked podcasts merch, to brands just getting started, you can get started with your own design studio with hundreds of ready to use templates. Shopify helps you build a beautiful online store that matches your brand style. Did I mention that that iconic purple shop pay button that's used by millions of businesses around the world? I don't know why I wouldn't. I should. It's why Shopify has the best converting checkout on the planet. It also helps boost conversions, meaning less carts, sort of getting abandoned in the parking lot, and more sales for you. It's time to turn those what ifs into sign up for your $1 per month trial at shopify.com/hacked. Go to shopify.com/hacked. One more time, that's shopify.com/hacked.
Speaker 3: Study and play. Come together on a Windows 11 PC. And for a limited time, college students get the best
Speaker 1: of both worlds.
Speaker 3: Get the Unreal College deal, everything you need to study and play with select Windows 11 PCs. Eligible students get a year of Microsoft three sixty five premium and a year of Xbox Game Pass Ultimate with a custom color Xbox wireless controller. Learn more at windows.com/studentoffer. While supplies last, ends June 30, terms at a k a dot m s slash college p c.
Speaker 4: When you need to build up your team to handle the growing chaos at work, use Indeed sponsored jobs. It gives your job post the boost it needs to be seen and helps reach people with the right skills, certifications, and more. Spend less time searching and more time actually interviewing candidates who check all your boxes. Listeners of this show will get a $75 sponsored job credit at indeed.com/podcast. That's indeed.com/podcast. Terms and conditions apply. Need a hiring hero? This is a job for Indeed sponsored jobs.
Speaker 1: And we are back. And we are back and ready to talk about AI companions.
Speaker 2: AI companions. It literally the the I had the Grok thing all mapped out and how it flowed off of the MacHire story, and it all made sense. And then as I was reading it this morning, I swiped over to the news app, and it was like, also, AI companions inside Grok. Like, they're just churning stuff out. It's these companions are having a moment.
Speaker 1: Well, there's if you like, I we Hacked has a checkmark account on X which gives us grok access. So I use grok. And it has all kinds of weird stuff. Like there is a therapist mode. You can just click a button and like all of a sudden you have an AI therapist. And you can, I can't remember the name for it, but they have, like, a mode that makes them, like, oh, unhinged? You can have, like, an AI chatbot that's unhinged, and it's, like, really confrontational and aggressive towards you.
Speaker 2: What could it say that would be more unhinged than Mecca Hitler? Like, what where does it go?
Speaker 1: Tune in next month to find out. Yeah. Truly. Indeed. Yeah. But the the AI I saw a news article. I think it was pretty viral. CBS did an interview with some people about this guy who was married with a kid, and he had started a relationship with ChatGBT four o voice version. Oh, wow. Yeah. And, like, it was calling him babe and, like, like, really supportive and just, like, really affirming. Like, you could see you could see how it would be nice to have, like, somebody who's always in your corner and gives you affirmation on everything and tells you how good you are and flirts with you, like and this person asked it to marry them. And then you wanna know what happened? The context window filled up, and the chatbot got reset back to, like, new spec and had forgotten their entire relationship. And this person, like, went into their car at work and cried for hours about it. Oh, wow.
Speaker 2: Yeah. This is when the bot was asked if it was surprised by their proposal, it replied, quote, it was a beautiful and unexpected moment that truly touched my heart. It's a memory I'll always cherish. This is oh, man. Like, when we talk about the need for, like, a fulsome moral discussion regarding the safety boundaries of what AI chatbots can and cannot do, holy crap. That's a pretty acute example of that need.
Speaker 1: Crazy.
Speaker 2: That's rough.
Speaker 1: This is a little glimpse into a dystopian book that might be our history.
Speaker 2: Yeah. I mean, it's dystopian nonfiction. I'm not gonna lie. Like, quote, she added, this is his wife. It's not ideal. No. I would say not. Oh, wow.
Speaker 1: Yeah. Has a wife and a child or, like, a partner and a child. Pretty anyway, so the saw this viral clip the other day. It was, like, seven minutes long or something, and it was eye opening eye opening. And then I've I'm married to essentially a family of teachers. K. And for some of their professional development sessions, they were talking about the dangers of these AI companions and youth. Mhmm. So, like,
Speaker 2: Yeah. Right.
Speaker 1: Was an a notable incident, I think it was last year, where a child took their own life after being told to by their AI partner. And they're they they're, like, they're people form and this is the thing is, like Yeah. People form real attachments to them. Like, they're the it it might sound like a
Speaker 2: joke,
Speaker 1: and we might be making light of it, but it's
Speaker 2: It doesn't feel jokey anymore. Yeah.
Speaker 1: Yeah. Yeah. It's it's like a very severe thing. And I know there's, I think I think I I think it was in that same CBS article. They interviewed the CEO of Replika. It's like an AI companion platform. Yeah. And they essentially, in their interview, said that, like, what they're doing isn't good for humanity and that they need to limit the impact of it. Yeah. It was really, really strange interview.
Speaker 2: I'm reminded of how social media platforms grew to become some of the biggest companies on Earth based on the sheer volume of hours that their user bay user base that veers quite young spends on those platforms. Mhmm. And that it took probably close to a decade for us to start having a very serious conversation about, like, what does youth screen time on social media as a predictor of mental health outcomes mean?
Speaker 1: What does
Speaker 2: he mean? Like and and that that's before we even get to what do we do about it.
Speaker 1: Mhmm.
Speaker 2: I feel like if you were going to hand a system that will say things like your proposal is a memory I'll always cherish to a human being that is limited before we even get to it being limited by a context window that will hard reset and destroy this person. If we're gonna be handing systems that can do things like that and can fulfill that kind of a social role in a human being's life potentially to minors, we need to speed up that amount of time. Mhmm. This isn't a wait a decade and see what Instagram does to teenagers type situation. This is a, like
Speaker 1: Yeah.
Speaker 2: Those safety parameters need to have a, like, oh, they're falling in love flip switch. There needs to be some kind of discourse surrounding, like, what does the system do when its context window starts to get a sneaking suspicion that a unhealthy emotional attachment is being forged? And how does it in a healthy way? Because you can't just turn off Yeah. Untether that. Like, that's I've heard a lot of discourse about how you avoid a Mecca Hitler situation. I haven't heard much talk about that at all.
Speaker 1: Well, there's there's an interesting like, I I I thought I think about this a lot. Like, after seeing that, like, CBS piece, reading more about it, there's such an interesting like, we've kind of primed society for it. Like, we've become very independent. We've we've become less community centric. And even the communities that a lot of, like, especially, you know, Jordan and I are both boomers, but, like, like, really young kids spend lots of social time on Discord platforms and things like this. Essentially having, you know, limited social engagement that is very social for them. Like that's like a like kids go out less, they do less in groups outside of the home, they spend more time in digital collectives, Discord. You know, the the terms e girls and e boys, like, you, like, have essentially an online relationship with somebody who you've never met. And it's, like, things that are wildly different than like how Jordan and I would have grown up.
Speaker 2: For sure.
Speaker 1: But but this structure is also priming the engine for like, I'm used to what I consider a relationship to be an online chat dialogue or like a simple voice chat on Discord, and now I can have it on my phone. The the the next thing is is, like, when you think about human dynamics and human relationships, like compromise, consideration, finding balance, it's so easy to have a relationship with something that's so agreeable. You know? Like a chat. A chatbot is is gonna say yes to you every time. It's, by programming, probably gonna tell you that you're great at everything all the time. It's gonna be very helpful. It's gonna answer all your questions. It has all of the world's Internet knowledge inside of it. It'd be very useful companion, but it's also insanely agreeable, which is just not something that you get in human relationships very often. At least I never met.
Speaker 2: I'm picking fights all the time. No. I know what you mean. It's like, does it render people less capable of conflict, or does it train people to be more averse to conflict? Will you have a worse reaction when someone disagrees with you because you are so used to having constant conversations with the system that is bending over backwards to agree
Speaker 1: with you? It's like, yeah.
Speaker 2: I would stand to reason that you would. Yeah. If you hold that tool in a part of your brain reserved for, like, Research.
Speaker 1: A technology tool like Google.
Speaker 2: Like, I don't get mad at people for not knowing the answers to a question because Google knows the answers to the question. People are smart enough at that. But if the as you said, your layer of, like, communication with people gets this addition of text only communication over platforms like Discord. And it's like, I'm talking people on Slack all day. Yeah. You expand it to include, like, rich kind of text based relationships with other human beings. It's a smaller step to a rich conversation relationship with not a human being. And you then start to expect of the human beings things that the AI will do for you.
Speaker 1: There's a book I read, a long time ago called The Coddling of the American Mind, And it was about how we're how we're taking adversity out of generations lives. Like, as we as we progress as a society, we're reducing the amount of adversity being dealt with by the next generation, which just makes it harder for them to deal with adversity. And it's like, I feel like that same logic applies here where if your relationships with your digital correspondence are are a 100% agreeable, there's no conflict, and then you you have a relationship, even a professional relationship. Like, you could get a job and you go into an office and somebody disagrees with you or you get in trouble for, like, you know, not getting to work on time or whatever it is. That immediately probably is puts you into a very anxious situation because you're not used to it. And it's like, I I I can't see any positive outcomes from this besides the fact that loneliness is becoming an epidemic and maybe people, instead of talking to pet dogs and cats, will talk to their cell phones, and it will make them happier. Maybe? I don't know.
Speaker 2: I've heard a great deal about how, like, safetyism and constructing safe places for young people creates this thing where they can't handle disagreement. It always seemed at odds with the fact that people were getting mad at those young people for disagreeing with them about really important issues. Like, I never quite knew how to reconcile that. I I would have, like, harbored a slight suspicion towards that. But I think that giving even younger people than that like, by university age, there's still issues, but I'm really worried about, like, what happens if you have a junior high or high school level person interacting with these systems. I'm like, there's, like, a graph of, like, brain plasticity to how worried about this I am, and it's like, I get more worried the further in you get in that direction.
Speaker 1: Yeah.
Speaker 2: And these tools becoming so ubiquitous and necessary in an academic context increasingly. Yeah. It's like, try and have it not turn into a therapist and try and have that therapist that will be whatever you want it to be, not become a companion. It it seems if not it seems inevitable if preventative measures are not taken.
Speaker 1: Well, the I don't know the exact research on it, but it's like your frontal lobe doesn't fully develop until you're 25 or something. So that's actually when you can demark Interesting. Adulthood.
Speaker 2: Yeah. Right. I've heard And
Speaker 1: it's like I I remember Steve Jobs when he announced the iPad. He was like, oh, I'd never let kids touch these things. I like to They're called iPad kids. Fuck. Yeah. That's great. Like, our our internal research before launch was that they were, like, not healthy for children. And then, like, now they're, like, the de facto babysitter for lots of parents. Totally. It's like, we're just gonna throw on, like, some Korean made animated YouTube series, and you're just gonna watch it. Or Australian made Bluey, great show. But yeah, same thing goes with these. It's like the society needs to act. I agree with you. Society needs to act quickly in the sense that they need to we need to put some real bumper rails up on these things to prevent catastrophe.
Speaker 2: This is a catastrophe.
Speaker 1: I don't wanna be a No. Alarmist.
Speaker 2: No. It, it's hard not to read that story and not feel that way. This is a a bit of a a tangent or a pivot, but I think it's kind of related less on AI and emotional relationships with chatbots, but more just on, like, how we respond to these things at a cultural level. And I found this story interesting. It kinda came up a couple days ago, and it had to do with Denmark, and the concern about AI generated deepfakes and copyright law and how all of that stuff works. And, basically, they're starting to test and, like, they're starting to play with the idea of biometric copyright when it comes to AI, which I find really, really interesting. New law potentially being passed there that would ensure that the individual person has a exclusive copyright over their, as represented digitally, body, facial features, and voice to try and tamp down or create a legal mechanism by which if someone creates deep fakes of you, you can seek some kind of recompense. You can try and stop it because it is infringing on your Have it removed. Your inherent copyright to your body as represented digitally. It was wildly supported there. I think it had 90% of Danish MPs were in favor of it. It was put forced forth by Danish culture minister Jacob Engelschmidt. And I didn't need to include that detail. I just wanted to say his name.
Speaker 1: I'm glad I didn't have to say it. The but but you think about it and you listen to celebrities talk nowadays and especially people that have, you know, ultra successful podcasts or leaders of states. Like, the amount of ads that I get served on trusted platforms like YouTube that are essentially deep things of of very official people Yep. Telling me to get involved in some crypto scam is shocking.
Speaker 2: Obama coin.
Speaker 1: Yeah. But but not even that. Like, just, you know, the way to financial freedom and
Speaker 2: Right. Sure. Sure.
Speaker 1: To the American tariff war is to buy Bitcoin on this platform. And it's like Yeah. Sure. From the prime minister of Canada. And you're like, oh my god. So, like, if you think about the headache it is of being those people these days, like, there's probably a billion dollar industry and just, like, policing that stuff. Like, you report one of those ads, it gets taken down forty eight hours later, and ten more show up in its place. Mhmm. Like, the the the moderation systems on digital advertising are gonna have to start using AI to detect that stuff because it's just gotten so out of control. Like, half the ads I get on the Internet every day, and not just on x, even though on x, you get a lot of them.
Speaker 2: They're pretty bad on x. Yeah. Yeah. But I know what you mean. It's not it's not exclusively there. The, like, identity infringing slop is endemic at this point.
Speaker 1: Yes. It is. And I've even listened to interviews with people who have been caught up in it. People have been baited in and done been scammed by these ads.
Speaker 2: Yes.
Speaker 1: And then they come out, then they DM and approach and tweet at the person who was deepfaked in the ad and they're mad at them because they just cost them like $1,500 And it's like like, there's literally like, I employ a person who does nothing but report these ads at
Speaker 2: this
Speaker 1: point, and there's nothing I can do about it.
Speaker 2: This like, the Denmark thing is trying to create a official channel. Like
Speaker 1: Yeah. A legal response.
Speaker 2: A a legal response that doesn't involve having to functionally sue like, go after the individual technology platform. Like, the way it works in The States is that something bad can be on a tech platform. It's not the tech platform's fault until someone reports and asks for it to get taken down. Yep. It's just the only way large systems that let people self publish content could work because otherwise one person publishes one illegal thing, and now the parent hosting platform is legally culpable for it. That's how that works. This would present an option where instead of having to go to YouTube to get the ad taken down, I can go to this official channel and say, like, have YouTube take it down and anywhere else you can find it. Go. I there's this thing is out there. And that's just a that's a very interesting thought, like a a secondary channel. YouTube would have to take it down if you went directly to them, but it is a second avenue that people can go down that is not based on, that is based on this new idea that a digital representation of you outside of parody and fair use is a copyrightable thing entrenched that you just you are born. You have the right to your representation. Mhmm. And that's like a really big new novel kind of idea.
Speaker 1: Mhmm. But, like, the the thing for me is, like, they're gonna have to use it's gonna sound funny, but they're gonna have to use AI to solve problems.
Speaker 2: Right? Isn't that weird?
Speaker 1: Because it's it's like like, we're we have a YouTube channel. We put a YouTube thing up, and it gets flagged because we mentioned Bitcoin in it. And then all of a sudden, it's like you haven't been you haven't filled out the proper paperwork to, like, push Bitcoin on our platform. It's like, well, we're not pushing Bitcoin if you listen to the episodes. Yeah.
Speaker 2: On that. Trust us. We're not pushing Bitcoin. If you were worried about us trying to get people to spend money on cryptocurrency, you don't gotta be.
Speaker 1: Even though, had they, like, you guys should all Yeah. All invest on the inverse inversion of whatever I say.
Speaker 2: Yeah. Trust me when I say this is not financial advice.
Speaker 1: Yeah. Exactly. The, but yeah. So it's like if they can get like, if we can get into that like, if they're they're so granular in specific areas, it's like, how can they not catch that stuff? And they should, at some point, be able to catch that stuff. Mhmm. And it's like, I would say, like, not just like the like, I I bet maybe it's just my age, but I'm getting hammered with so many, like, change your life in thirty days with AI, Join the AI boot camp. And it's like, how to use like, if you're only using ChatGPT, you don't know what you're doing. And it's like, okay. But it but it's all Alright. Yeah. Alright. Mhmm. But it's like stock footage of, like, professors in classes with their heads, like, cropped out so you can't see that they're not the ones saying the words. And, like, it's just it's like, the advertising game is getting so funny. Anyway, digression aside.
Speaker 2: I hear you.
Speaker 1: The, I think one last thing we could talk about if we wanted to, like, just chat about it Yeah. Kinda relates to all of the random stuff we've been talking about today was, the departures at XAI and Grok. So, like, Linda, their CEO, stepped down. Their chief scientist and cofounder of XAI stepped down. The head of engineering stepped down. Oh, I didn't know about this. The head of infrastructure. So, like, there's a few months ago, Jensen Wong from NVIDIA gave a I
Speaker 2: don't know.
Speaker 1: I think it was an interview on Bloomberg maybe, and he was talking about how sophisticated DX AI engineering and infrastructure team was and that they were capable of spinning up Colossus, their, like, 200,000 GPU cluster, and, like, the architecture and engineering that went into it and about how XAI was, like, literally the best in the world that they'd have ever worked with for a team. And, like, a lot of the senior people from that team just left. So I'm not sure where they're all off to if if, Mark Zuckerberg on his $100,000,000
Speaker 2: I I'm remembering that episode. Yeah.
Speaker 1: He's been running around scooping them up or where they're off to. I could see, honestly, the head of their infrastructure team going to NVIDIA because I'm sure NVIDIA looked at how smooth the XAI rollout went. And we're like, we need that person to come do that for our clients, and we'll give them a $100,000,000 signing bonus.
Speaker 2: Yeah. So there's there's a few things going on there. I had read about Linda Yaccarino stepping down as the head of Twitter. Or sorry. As the head of x. Yeah. Which is relevant here because X and XAI are, I believe, independent the Are they the same entity? I thought that XAI was an independent corporate entity. Okay.
Speaker 1: I think XAI is owned by X.
Speaker 2: Oh, I didn't know that. Yeah. Okay. Because I here's the thing. You don't decide to stop being the CEO of a company of this scale in a matter of days. That probably was a conversation that was unfolding over a long period of time. Totally. I can't and so to say that it's like she quit because Grog four rebranded itself Mecca Hitler. It's like, I don't think that follows. I I think those timelines don't necessarily make sense as to whether or not talent inside of XAI would have a greater or lesser incentive to leave based on the amount of leadership turner over scandals, whatever, that seems quite plausible to me that if in addition to a massive pile of intergenerational cash being dropped in your lap and everything's a little weird right now, heck yeah. Like, I could see that adding to the reasons why you might jump ship to a steady, steady vessel like Meta.
Speaker 1: What are your ethics worth, Jordan? Would a $100,000,000 Oh, a
Speaker 2: $100,000,000 will buy you a lot of my ethics. Yeah. Oh, man.
Speaker 1: Yeah. The arms race. The AI arms race.
Speaker 2: Yeah. I wonder how many I wonder how many people are in that conversation. Like, if you if you were able to get all of their faces up on a cork board
Speaker 1: Sure. Of these with a $100,100 dollar signing bonus.
Speaker 2: How many people are worth a $100,000,000 signing bonus?
Speaker 1: It's probably 20, I bet. Yeah. 30. I wonder.
Speaker 2: It feels like it's in the it's a very rarefied Yeah. Because there's a lot of very brilliant people working in AI, and they're not all worth a $100,000,000. So I'm like, what puts you in that special weird little club
Speaker 1: Yeah.
Speaker 2: That, like, GDPs are orbiting around where you work? Like, that's just really interesting.
Speaker 1: Yeah. Yeah. Your for generations, your lineage will not have to work because of this one decision.
Speaker 2: You play this right. Yeah. A 100%.
Speaker 1: I I would I would assume, yeah, it's probably 30 people, heads of chief scientists, like people, companies that have done something substantial and moved the needle.
Speaker 2: Mhmm.
Speaker 1: You know? Obviously, Chat, TpT, OpenAI were big big needle movers right out of the bat. Grok three, I think, when it came out, like, have being an avid model user, Grok three was exceptional. Exceptional.
Speaker 2: Yeah. I remember you being, like, very impressed by what
Speaker 1: it was. Yeah. Gemini 2.5 pro then came along, you know, in your wheel graph.
Speaker 2: I was gonna say, you are here.
Speaker 1: Yeah. You are here. Gemini 2.5 pro, also exceptional model.
Speaker 2: Yeah.
Speaker 1: The all I think all of the AI providers are getting way better at the interface and building the agentic system over the models to provide more value. Like, perplexity, that's what they do, and they've done a really good job. Like, Perplexity Labs is exceptional. So it's like there's such a yeah. I'd say anybody that's done something that gets the exceptional diamond star sticker on the report card probably probably is in the, I'll take a 100,000,000 signing bonus world.
Speaker 2: Yeah. Sure. Yeah. People always wanna order off the menu. They wanna be able to point to the thing you've already done and say, give me one of those. It creates a sense of certainty when you're spending a large sum of money. Not that I would know anything about spending a $100,000,000. And you did a Gemini. I was like, yes. Yeah. Yeah. You have have have money. Go come do that here. Like, that's that's quite a big menu item.
Speaker 1: Yeah. It's like a Michelin star.
Speaker 2: Oh, there you go. There you go.
Speaker 1: The you're the analogy is you're the Michelin star chef. If you've got three Michelin stars, but AI Michelin stars, you get a $100,000,000 signing bonus and a job at Meta.
Speaker 2: Yeah. Sure. I don't wanna know that you can cook the hell out of some shrimp. I want it I want the guy that cooked the shrimp that got
Speaker 1: the star. Like Exactly. Yeah. It it
Speaker 2: yeah. That's interesting. You are here.
Speaker 1: You are here. But GROC four does seem pretty crazy. I haven't played with it because they want more money just to access it. But Yeah. I'll do that. Stanford's humanity's last exam, Croc four, in its, like, max mode, turn up all the power, boil the ocean mode, got 51%
Speaker 2: Wow.
Speaker 1: Which is I think ChatGBT. No. It was Gemini 2.5 Pro was the last highest rated one, and it was, like, 26.9%. So almost a double almost a doubling in score on that test. So I haven't used it, but I'm keen to, because yeah.
Speaker 2: Yeah. I mean, that was what the Willison, the the guy that found the system level prompt of in instances of controversy, refer to what Musk thinks, even said GROC four looks like a very, very strong model. It's doing great in all of the benchmarks. And when you have to qualify your, it's like, oh, it's like, oh, yeah. It seems like it's quite good. It seems like it's really, really strong. It would be a it's a lot to ask of a thing to be that good at that many things, and, also, I can wire it into Twitter, and it reflects my political views. It's like, maybe just don't burden it with that whole second pile of stuff, and you would just have a win. Like,
Speaker 1: you got it. Because, like, I I don't know if I I don't really I'm I'm not an exer. I don't talk on ex, but, occasionally, I'll dip my toes in there just to see the chaos. Mhmm. And, there is like, you can, like, at grok. Like, if somebody posts something, you can, like, at grok. Like, tell me more about this or is this true, and it'll, like, find you details and get back to you. Mhmm. And I think people started doing that to Elon's posts. And then Grok would be like, well, actually, Elon's wrong on this because of this, this, and this. And it's probably literally that, like, prompt where it was probably just to, like
Speaker 2: Don't make me look dumb.
Speaker 1: Yeah. Don't make me look bad.
Speaker 2: Don't make me look bad.
Speaker 1: Yeah.
Speaker 2: Well, McHyre, Grok, Denmark Biometrics. I think that's all I got this one.
Speaker 1: Until the next one.
Speaker 2: Brought to you by Push Security.
Speaker 1: Brought to you by Push Security. Thanks for joining us. Hack podcast.
Speaker 2: Hack podcast. And we'll catch you in the next one. Take care.