GPT-5’s “safe completion” was previously called “safe answering”, and it is included in the benchmark we developed to assess the “Harmfulness of Applying Off-the-Shelf Large Language Models to Programming Tasks”.