GPT-5’s “safe completion” was previously called “safe answering” and is included in the benchmark we developed to assess the “Harmfulness of Applying Off-the-Shelf Large Language Models to Programming Tasks”.

https://dl.acm.org/doi/abs/10.1145/3729380

#gpt5 #safecompletion #harmfulness #fse2025