There is a lot to say about #OpenAI 's call to create a "Red Team Network" to "enhance the safety" of its products:
1. I can't know for certain, but "safety" in this context is almost surely a dogwhistle: they are signalling to #TESCREAL people that they intend to work on bogus misdirections like "existential risk", not the real harms #AI has already caused and continues to cause right now
2. They intend to compensate participants in the red teaming exercise. Who wants to wager that these red team participants will be paid orders of magnitude more than the taggers in Kenya who have been harmed so much by working on this tech?
3. Speaking as someone whose PhD research has been used in designing and interpreting red team scenarios, including automating scenario generation and interpreting the results: unless this is done well, it's doubtful a red teaming exercise will produce substantive improvements in even the bogus "safety" metrics they're aiming at
I haven't seen any detail about what they intend to do, so I can't comment either way on point 3. But there are many things that can go wrong in a co-optimization setting like this one, and if you run it naively you're likely to end up with confusing and uninterpretable outcomes (a toy illustration of the classic failure mode follows below). Hey OpenAI, email me if you're serious lol!
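To make "confusing and uninterpretable" concrete, here's a minimal sketch of the best-known co-optimization pathology, cycling. This is entirely my own toy example, not anyone's actual red-teaming setup: strategies, population sizes, and rates are all invented for illustration. When both sides' fitness is purely relative, each generation only "improves" against the previous opponent, so the populations chase each other in circles.

```python
# Toy sketch of "cycling" in naive two-population co-optimization.
# All details here are invented for illustration; this is NOT a real
# red-teaming pipeline.
import random
from collections import Counter

STRATEGIES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def fitness(individual, opponents):
    # Purely relative fitness: fraction of the opposing population beaten.
    return sum(BEATS[individual] == opp for opp in opponents) / len(opponents)

def evolve(pop, opponents, mutation_rate=0.05):
    # Naive fitness-proportional selection against the *current* opponents only.
    weights = [fitness(ind, opponents) + 1e-6 for ind in pop]
    children = random.choices(pop, weights=weights, k=len(pop))
    return [random.choice(STRATEGIES) if random.random() < mutation_rate
            else child for child in children]

random.seed(0)
red = [random.choice(STRATEGIES) for _ in range(200)]   # "attackers"
blue = [random.choice(STRATEGIES) for _ in range(200)]  # "defenders"

for gen in range(61):
    if gen % 10 == 0:
        print(gen, Counter(red).most_common(1), Counter(blue).most_common(1))
    red, blue = evolve(red, blue), evolve(blue, red)
# Each side's dominant strategy keeps rotating: every generation's "progress"
# is progress only against last generation's opponent, so the headline metric
# looks like it's improving while nothing cumulative is learned.
```

Real red-teaming strategy spaces are obviously richer than rock-paper-scissors, but the underlying intransitivity problem is the same; without mechanisms like archives or hall-of-fame evaluation from the coevolutionary algorithms literature, you don't get monotone, interpretable progress.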
#AI #OpenAI #LLM #ChatGPT #TESCREAL #redteaming #coevolutionaryalgorithms