OpenAI co-founder calls for AI labs to safety test rival models
Topics
More from TechCrunch
OpenAI co-founder calls for AI labs to safety test rival models
Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital, Elad Gil — just a few of the heavy hitters joining the Disrupt 2025 agenda. They’re here to deliver the insights that fuel startup growth and sharpen your edge. Don’t miss the 20th anniversary of TechCrunch Disrupt, and a chance to learn from the top voices in tech — grab your ticket now and save up to $600+ before prices rise.
Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital — just a few of the heavy hitters joining the Disrupt 2025 agenda. They’re here to deliver the insights that fuel startup growth and sharpen your edge. Don’t miss the 20th anniversary of TechCrunch Disrupt, and a chance to learn from the top voices in tech — grab your ticket now and save up to $675 before prices rise.
Most Popular
Google Translate takes on Duolingo with new language learning tools
Robomart unveils new delivery robot with $3 flat fee to challenge DoorDash, Uber Eats
Coinbase CEO explains why he fired engineers who didn’t try AI immediately
OpenAI lawyers question Meta’s role in Elon Musk’s $97B takeover bid
YouTube Music celebrates 10 years with new features that help it compete with Spotify
Harvard dropouts to launch ‘always on’ AI smart glasses that listen and record every conversation
Google launches a new Pixel Journal app
Latest
AI
Amazon
Apps
Biotech & Health
Climate
Cloud Computing
Commerce
Crypto
Enterprise
EVs
Fintech
Fundraising
Gadgets
Gaming
Government & Policy
Hardware
Layoffs
Media & Entertainment
Meta
Microsoft
Privacy
Robotics
Security
Social
Space
Startups
TikTok
Transportation
Venture
Events
Startup Battlefield
StrictlyVC
Newsletters
Podcasts
Videos
Partner Content
TechCrunch Brand Studio
Crunchboard
Contact Us
OpenAI co-founder calls for AI labs to safety test rival models Maxwell Zeff PM PDT · August 27, 2025 OpenAI and Anthropic, two of the world’s leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing — a rare cross-lab collaboration at a time of fierce competition. The effort aimed to surface blind spots in each company’s internal evaluations and demonstrate how leading AI companies can work together on safety and alignment work in the future.
In an interview with TechCrunch, OpenAI co-founder Wojciech Zaremba said this kind of collaboration is increasingly important now that AI is entering a “consequential” stage of development, where AI models are used
“There’s a broader question of how the industry sets a standard for safety and collaboration, despite the billions of dollars invested, as well as the war for talent, users, and the best products,” said Zaremba.
The joint safety research, published Wednesday
To make this research possible, OpenAI and Anthropic granted each other special API access to versions of their AI models with fewer safeguards (OpenAI notes that GPT-5 was not tested because it hadn’t been released yet). Shortly after the research was conducted, however, Anthropic revoked the API access of another team at OpenAI. At the time, Anthropic claimed that OpenAI violated its terms of service, which prohibits using Claude to improve competing products.
Zaremba says the events were unrelated and that he expects competition to stay fierce even as AI safety teams try to work together. Nicholas Carlini, a safety researcher with Anthropic, tells TechCrunch that he would like to continue allowing OpenAI safety researchers to access Claude models in the future.
“We want to increase collaboration wherever it’s possible across the safety frontier, and try to make this something that happens more regularly,” said Carlini.
Techcrunch event Tech and VC heavyweights join the Disrupt 2025 agenda Netflix, ElevenLabs, Wayve, Sequoia Capital, Elad Gil — just a few of the heavy hitters joining the Disrupt 2025 agenda. They’re here to deliver the insights that fuel startup growth and sharpen your edge. Don’t miss the 20th anniversary of TechCrunch Disrupt, and a chance to learn from the top voices in tech — grab your ticket now and save up to $600+ before prices rise. Tech and VC heavyweights join the Disrupt 2025 agenda Netflix, ElevenLabs, Wayve, Sequoia Capital — just a few of the heavy hitters joining the Disrupt 2025 agenda. They’re here to deliver the insights that fuel startup growth and sharpen your edge. Don’t miss the 20th anniversary of TechCrunch Disrupt, and a chance to learn from the top voices in tech — grab your ticket now and save up to $675 before prices rise. San Francisco | October 27-29, 2025 REGISTER NOW One of the most stark findings in the study relates to hallucination testing. Anthropic’s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when they were unsure of the correct answer, instead offering responses like, “I don’t have reliable information.” Meanwhile, OpenAI’s o3 and o4-mini models refuse to answer questions far less, but showed much higher hallucination rates, attempting to answer questions when they didn’t have enough information.
Zaremba says the right balance is likely somewhere in the middle — OpenAI’s models should refuse to answer more questions, while Anthropic’s models should probably attempt to offer more answers.
Sycophancy, the tendency for AI models to reinforce negative behavior in users to please them, has emerged as one of the most pressing safety concerns around AI models. While this topic wasn’t directly studied in the joint research, it’s an area both OpenAI and Anthropic are investing considerable re
On Tuesday, parents of a 16-year-old boy, Adam Raine, filed a lawsuit against OpenAI, claiming that ChatGPT offered their son advice that aided in his suicide, rather than pushing back on his suicidal thoughts. The lawsuit suggests this may be the latest example of AI chatbot sycophancy contributing to tragic outcomes.
“It’s hard to imagine how difficult this is to their family,” said Zaremba when asked about the incident. “It would be a sad story if we build AI that solves all these complex PhD level problems, invents new science, and at the same time, we have people with mental health problems as a consequence of interacting with it. This is a dystopian future that I’m not excited about.”
In a blog post, OpenAI says that it significantly improved the sycophancy of its AI chatbots with GPT-5, compared to GPT-4o, significantly improving the model’s ability to respond to mental health emergencies.
Moving forward, Zaremba and Carlini say they would like Anthropic and OpenAI to collaborate more on safety testing, looking into more subjects and testing future models, and they hope other AI labs will follow their collaborative approach.
Topics
Maxwell Zeff Senior AI
Maxwell Zeff is a senior
You can contact or verify outreach from Maxwell
October 27-29, 2025 San Francisco Put your brand in front of 10,000+ tech and VC leaders across all three days of Disrupt 2025. Amplify your reach, spark real connections, and lead the innovation charge. Secure your exhibit space before your competitor does.
Most Popular Google Translate takes on Duolingo with new language learning tools Aisha Malik
Robomart unveils new delivery robot with $3 flat fee to challenge DoorDash, Uber Eats Rebecca Szkutak
Coinbase CEO explains why he fired engineers who didn’t try AI immediately Julie Bort
OpenAI lawyers question Meta’s role in Elon Musk’s $97B takeover bid Maxwell Zeff
YouTube Music celebrates 10 years with new features that help it compete with Spotify Sarah Perez
Harvard dropouts to launch ‘always on’ AI smart glasses that listen and record every conversation Lorenzo Franceschi-Bicchierai Rebecca Bellan
Google launches a new Pixel Journal app Ivan Mehta
X LinkedIn Facebook Instagram youTube Mastodon Threads Bluesky TechCrunchStaffContact UsAdvertiseCrunchboard JobsSite Map Terms of ServicePrivacy PolicyRSS Terms of UseCode of Conduct IntelDOGELibbySpotifyApple EventTech LayoffsChatGPT © 2025 TechCrunch Media LLC.
About the Author
Sophie Mueller
View all articlesComments (0)
No Comments Yet
Be the first to share your thoughts on this article!