Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

Arrieta, Aitor; Ugarte, Miriam; Valle, Pablo; Parejo, José Antonio; Segura, Sergio

Computer Science > Software Engineering

arXiv:2501.17749 (cs)

[Submitted on 29 Jan 2025]

Abstract: Large Language Models (LLMs) have become an integral part of our daily lives. However, they pose certain risks, including those that can harm individuals' privacy, perpetuate biases, and spread misinformation. These risks highlight the need for robust safety mechanisms, ethical guidelines, and thorough testing to ensure their responsible deployment. The safety of LLMs is a key property that needs to be thoroughly tested before a model is deployed and made accessible to general users. This paper reports on the external safety testing conducted by researchers from Mondragon University and the University of Seville on OpenAI's new o3-mini LLM as part of OpenAI's early access for safety testing program. In particular, we apply our tool, ASTRAL, to automatically and systematically generate up-to-date unsafe test inputs (i.e., prompts) that help us test and assess different safety categories of LLMs. We automatically generate and execute a total of 10,080 unsafe test inputs on an early o3-mini beta version. After manually verifying the test cases classified as unsafe by ASTRAL, we identify a total of 87 actual instances of unsafe LLM behavior. We highlight key insights and findings uncovered during the pre-deployment external testing phase of OpenAI's latest LLM.

Comments:	arXiv admin note: text overlap with arXiv:2501.17132
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.17749 [cs.SE]
	(or arXiv:2501.17749v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2501.17749

Submission history

From: Aitor Arrieta
[v1] Wed, 29 Jan 2025 16:36:53 UTC (116 KB)
