15-May-2025:
OpenAI has launched a dedicated Safety Evaluations Hub, a web-based platform for publishing internal safety test results for its AI models in a structured, ongoing manner. The launch marks a notable step toward greater transparency in safety reporting, especially as generative AI models grow more powerful and influential.
The new hub includes scores and summaries from OpenAI’s evaluations of key safety issues such as harmful content generation, jailbreaks (adversarial prompts designed to bypass safeguards), hallucination rates, and how well models handle instructions that could facilitate misuse. These metrics give the public, researchers, and policymakers deeper insight into how OpenAI’s models perform in high-risk scenarios, going beyond the high-level system cards traditionally shared at major launches.
According to OpenAI, the hub will be updated regularly and will include results corresponding to "major model updates." It’s designed to serve as a central resource where the AI community can track changes in model behavior and understand what OpenAI is doing to mitigate risks as capabilities evolve.
A recent post from the company’s X (formerly Twitter) account highlighted that the hub complements OpenAI’s broader safety initiatives by creating a reliable and consistent place to track safety improvements—or regressions—across model generations. Rather than publishing safety evaluations as isolated reports, OpenAI now plans to treat them as living documents that reflect the real-world performance of its technologies over time.
With growing public scrutiny around AI alignment and responsible deployment, this move is expected to encourage similar practices across the AI industry. While the hub’s current data is limited to specific safety evaluations, OpenAI hinted that more types of assessments and cross-model comparisons may be added in the future.
This new hub represents a shift in how AI companies engage with the public on safety and accountability—offering not just claims of responsible AI, but a trail of data and metrics to back them up.