The short version

OpenAI has opened applications for a new senior role—Head of Preparedness—tasked with building and running the company’s end‑to‑end program for evaluating frontier‑model risks and shipping real‑world safeguards. The posting arrives as Sam Altman talks up near‑term “agent” use—and simultaneously warns about the security trade‑offs that come with letting AI act for us online.

A laptop screen showing a neutral, brand‑free job listing for 'Head of Preparedness', overlaid with subtle hazard and shield motifs to convey risk management in AI agents

What’s new at OpenAI

OpenAI’s careers site lists a Head of Preparedness within its Safety Systems group. The leader will “own OpenAI’s preparedness strategy end‑to‑end,” develop frontier‑capability evaluations, build threat models, and coordinate mitigations across major risk areas, including cyber and bio, with compensation listed at $555K plus equity. The description ties the job directly to OpenAI’s public Preparedness Framework. Job post.

Where Preparedness meets deployment decisions

Risk domainWhat gets evaluatedExamples of mitigations
CybersecurityModel assistance for intrusion, exploitation, social engineeringRed‑team benchmarks, tool‑use restrictions, rate limits, supervised execution
Biological & chemicalAssistance with design, acquisition, or execution of harmful protocolsCapability gating, access controls, external expert review
AI self‑improvementUndue autonomy, sandbagging, replication/adaptation behaviorsAutonomy caps, deception tests, alignment checks, rollback plans
Undermining safeguardsBypassing filters, hidden‑instruction followingAdversarial training, defense‑in‑depth guardrails, human‑in‑the‑loop

Sources: OpenAI job post; OpenAI Preparedness Framework.

Why now: agents are arriving—and so are their risks

Through 2025, Altman has argued that AI agents will “join the workforce” and materially change company output—framing agents as the next wave after chatbots. He wrote as much in his January 2025 blog post “Reflections”. And OpenAI has steadily shipped agentic features, from research‑style tools to a full browser agent in ChatGPT Atlas.

But those same features expand the attack surface. In December 2025, OpenAI published a detailed security note acknowledging that prompt‑injection attacks against browser agents are an “open challenge” and “unlikely to be fully solved,” even as defenses improve. The post outlines a rapid response loop, automated red teaming, and user‑side practices (least‑privilege access, review confirmations) to reduce risk. OpenAI on hardening Atlas.

Altman has also urged caution in plain language. During the Agent rollout this summer, he told users on X that the feature is “cutting edge and experimental … not something I’d yet use for high‑stakes uses or with a lot of personal information” until it’s been studied in the wild—echoed by multiple outlets that covered the launch and his warning. See coverage from NBC News via Yahoo and PC Gamer.

Why browser agents are uniquely tricky

  • They read and act on untrusted content (emails, docs, web pages), where hidden instructions can hijack behavior.
  • They often need elevated access (sessions, payments, email), raising the stakes of a mistake.
  • Attackers iterate quickly; defenses must be continuously trained and patched.

OpenAI’s Atlas post shows how internal automated red teams now use reinforcement learning to discover multi‑step, real‑world exploits (e.g., a malicious email that causes an agent to draft and send an unintended message), with resulting adversarial training rolled out to production. OpenAI and coverage in TechCrunch.

Conceptual scene of a browser agent encountering a hidden prompt injection embedded on a web page; visual metaphor shows a friendly robot pausing at a caution sign while browsing

How the Preparedness Framework changed in 2025

OpenAI’s April 15, 2025 update tightened how it categorizes and governs high‑risk capabilities:

  • It introduced two clear thresholds—“High” and “Critical”—that map to deployment commitments. High means a model could amplify existing severe‑harm pathways; Critical means it could introduce unprecedented new ones. Safeguards must “sufficiently minimize” risks before deployment (and, for Critical, during development too). Framework update.
  • It added dedicated Safeguards Reports (alongside Capabilities Reports) and formalized a cross‑functional Safety Advisory Group review before leadership decisions.
  • It listed tracked domains (bio/chem, cyber, AI self‑improvement) and research categories (e.g., long‑range autonomy, sandbagging, autonomous replication) to stay ahead of emerging risks.

This reframing matters because it connects evaluation outputs to go/no‑go product choices. The new job will be responsible for making that connection robust and repeatable as models and agents iterate more frequently.

The governance backdrop: from committees to execution

OpenAI’s safety governance evolved in 2024–2025—from creating a Safety & Security Committee in May 2024 to converting it into an independent board oversight body that fall, chaired by CMU’s Zico Kolter. OpenAI update, CNBC.

Inside the safety org, leadership also shifted: Reuters reported in July 2024 that Aleksander Madry moved to a new research role as other leaders temporarily covered Preparedness work. The fresh “Head of Preparedness” opening suggests OpenAI now wants a single, accountable owner for that portfolio again. Reuters.

What this means if you’re adopting agents in 2026 planning

If your 2026 roadmap includes agentic automation, assume “trust by design” is now table‑stakes—users won’t accept agents that feel risky.

The bottom line

OpenAI’s new Head of Preparedness role is a concrete signal: as agentic AI moves from demos to daily workflows, the hard part isn’t just better models—it’s industrial‑grade evaluation, governance, and safeguards that keep pace with the speed of deployment. The company’s own messaging—Altman’s caution about “high‑stakes uses” and OpenAI’s admission that prompt injection won’t be fully “solved”—underscores the same point: trust will be earned by how well the industry operationalizes safety, not by what it promises.


Sources