Just one week after Anthropic’s Glasswing initiative, OpenAI fires back with GPT-5.4-Cyber, a model fine-tuned specifically for cybersecurity tasks. The timing is no coincidence: a brand new battlefield is opening up between the two AI giants.
What Can GPT-5.4-Cyber Do?
The model is a fine-tuned variant of GPT-5.4, purpose-built for defensive cybersecurity work. OpenAI calls it ‘cyber-permissive’: it has lower refusal thresholds for legitimate security tasks. In practice, that means it will reverse-engineer compiled binaries without source code, analyze malware, hunt for vulnerabilities, and run security audits, all things a standard LLM would reflexively refuse to help with.
Not for Everyone
Just like Anthropic with Mythos, OpenAI is going with controlled access. GPT-5.4-Cyber runs through the ‘Trusted Access for Cyber’ program, which OpenAI launched in February alongside a $10 million grant program for cybersecurity. The program now has tiered verification levels, and only the highest tier gets access to the Cyber model.
But there’s an important difference from Anthropic: OpenAI is planning a much broader rollout. Thousands of individual security researchers and hundreds of teams are supposed to get access. Anthropic, by contrast, is keeping Mythos locked down within a small consortium.
Two Philosophies, One Goal
Simon Willison compared both approaches on his blog and nailed the distinction: both companies acknowledge that AI-powered cyber defense has a gating problem. But while Anthropic bets on ‘elite access for the few,’ OpenAI is trying to democratize defense — at least within a verification framework.
My Take
This is a fascinating race. Anthropic set the narrative with Mythos and Glasswing: our model is so powerful that we can’t just release it. OpenAI counters with a more pragmatic approach: let’s get the tools into the hands of as many defenders as possible.
Both strategies have trade-offs. But one thing is clear: cybersecurity is becoming the most important differentiator in the AI industry. And that’s, frankly, more meaningful than most benchmarks.