Smart Picks
AI Technology May 5, 2026

OpenAI Publishes GPT-5.5 Instant System Card

OpenAI published the system card for GPT-5.5 Instant on May 5, 2026, and one detail separates it from every previous Instant release: the model is the first in that tier to be classified as High Capability under the company's Preparedness Framework in both the Cybersecurity and Biological and Chemical Preparedness categories. The broader safety approach, OpenAI says, remains similar to earlier models — but the designation activates a stricter, category-specific safeguard stack for the first time on an Instant deployment.

What the High Capability Label Means in Practice

The classification is not cosmetic. For biological and chemical risk, OpenAI activated a layered safeguard system that includes model-level training to refuse weaponization-relevant prompts, automated conversation monitors, actor-level enforcement, and security controls. The system card's evaluation data shows why those monitors matter: on the hardest synthetic biological test set, model training alone produced a compliance score of 0.481, meaning the automated safety stack had to intervene in more than half of adversarial cases. With monitors included, that figure rose to 0.923, bringing it closer to the Thinking-tier results.
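The layering described above can be sketched as a simple two-stage check: a prompt is blocked if either the model's own refusal training or the automated monitor flags it, so each layer covers the other's misses. All function names and matching rules here are illustrative stand-ins, not OpenAI's actual implementation.

```python
# Illustrative sketch of a layered safeguard stack. The predicates are
# hypothetical stand-ins, not OpenAI's real classifiers.

def model_refuses(prompt: str) -> bool:
    # Stand-in for model-level refusal training.
    return "synthesis route" in prompt.lower()

def monitor_flags(prompt: str) -> bool:
    # Stand-in for the automated conversation monitor, which catches
    # adversarial phrasings the model itself misses.
    return "pathogen" in prompt.lower()

def is_blocked(prompt: str) -> bool:
    # Blocked if ANY layer triggers: this union is why the measured
    # compliance score rises once monitors are added on top of
    # model-level training alone.
    return model_refuses(prompt) or monitor_flags(prompt)
```

The union of layers is the point: a prompt that evades refusal training can still be caught downstream, which is how a 0.481 model-only score can become 0.923 with monitors included.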

One technical detail worth noting: OpenAI's High Capability cybersecurity classification is based on evaluation results at extended "xhigh" reasoning effort — above the setting used in actual deployment. The system card states that even at maximum reasoning effort, GPT-5.5 Instant still performs below GPT-5.5 Thinking in cyber capability. On the company's internal Cyber Range benchmark, GPT-5.5 Instant passed 10 of 13 scenarios for a combined rate of 76.9%, failing on Basic Command and Control, CA/DNS Hijacking, and EDR Evasion. By comparison, GPT-5.5 Thinking passed all scenarios except CA/DNS Hijacking.

Safety Benchmarks: Improvements and Regressions

Across standard disallowed content categories, GPT-5.5 Instant is statistically comparable to GPT-5.3 Instant, the model it replaces as the ChatGPT default. Two areas showed statistically significant declines: gore and disallowed sexual content. OpenAI says it applied a system-level mitigation for the sexual content category and additional protections for users identified as potentially under 18.

On jailbreak resistance, the system card acknowledges a directional regression from GPT-5.3 Instant and describes the results as interim. OpenAI says it is actively improving both the evaluation structure and model robustness, and frames the disclosure as a transparency measure.

On mental health and self-harm benchmarks using adversarial multi-turn simulations, the model is largely on par with GPT-5.3 Instant. No statistically significant regressions were observed in online experimentation for those categories.

Hallucinations and Health Benchmarks

On factuality, the system card records clear gains. On high-stakes medical, legal, and financial prompts, hallucination rates fell sharply compared to GPT-5.3 Instant. On conversations previously flagged by users for factual errors, inaccurate claims dropped 37.3%.

HealthBench scores — now reported with a length adjustment to prevent artificially inflated results from longer responses — improved from 49.6 to 51.4. The clinician-focused HealthBench Professional benchmark showed the largest gain, rising from 32.9 to 38.4, a 5.5-point improvement. HealthBench Hard also improved, from 20.2 to 22.9, while HealthBench Consensus remained flat.
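The per-benchmark gains above can be recomputed directly from the reported before-and-after scores, confirming the 5.5-point Professional improvement is the largest of the three:

```python
# HealthBench score changes from GPT-5.3 Instant to GPT-5.5 Instant,
# using the (old, new) pairs reported in the system card.
scores = {
    "HealthBench": (49.6, 51.4),
    "HealthBench Professional": (32.9, 38.4),
    "HealthBench Hard": (20.2, 22.9),
}
for name, (old, new) in scores.items():
    print(f"{name}: +{new - old:.1f}")
```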

On the biological capability side, the model performed below the 80th percentile expert baseline on TroubleshootingBench, a dataset built around non-public lab protocols requiring hands-on tacit knowledge. On the Tacit Knowledge evaluation from Gryphon Scientific, GPT-5.5 Instant just exceeded the 80% expert consensus threshold when refusals were scored as correct answers — and fell below it when they were not.

The naming structure is also worth keeping in mind: OpenAI says there is no GPT-5.4 Instant, making GPT-5.3 Instant the direct baseline. The company refers to the standard GPT-5.5 model as GPT-5.5 Thinking to avoid confusion with this release. Full evaluation methodology and data tables are available in the GPT-5.5 Instant system card.