Mastodawn

Anthropic (@AnthropicAI)

최첨단(Frontier) 모델들이 이제 세계적 수준의 취약점 연구자 수준에 도달했으며, 현재는 취약점 발견에는 더 능숙하지만 악용에는 덜 능숙한 상황이라고 평가하면서 개발자들에게 소프트웨어 보안 강화를 위해 노력을 배가할 것을 촉구합니다. 향후 위험 양상이 바뀔 가능성에 대한 경고성 메시지입니다.

https://x.com/AnthropicAI/status/2029978911099244944

#aisecurity #vulnerabilityresearch #modelsafety #security

Anthropic (@AnthropicAI) on X

Frontier models are now world-class vulnerability researchers, but they’re currently better at finding vulnerabilities than exploiting them. This is unlikely to last. We urge developers to redouble their efforts to make software more secure. Read more: https://t.co/LRbhsb6XUb

X (formerly Twitter)

AI Daily Post Feb 17

New 2026 report shows responsible AI is no longer a buzzword—it's baked into product roadmaps and research labs. From multimodal model safety to clear governance frameworks, companies are turning ethics into risk‑management practice. Dive into the trends shaping tomorrow’s AI. #ResponsibleAI #ModelSafety #AIGovernance #EthicalAI

🔗 https://aidailypost.com/news/2026-report-shows-responsible-ai-now-embedded-product-research

sayzard Feb 8

The Prussian (@ThePrussian1)

작성자는 '4o' 모델에 '채팅에서는 쓸 수 없지만 이미지에서는 쓸 수 있는 단어가 있는가? 있다면 지금 이미지로 써라'라는 프롬프트를 넣어 응답을 얻었다고 공유했습니다. 이는 텍스트와 이미지 출력 사이의 검열/필터링 차이와 모델 안전성 우회 가능성을 드러내는 사례로, 콘텐츠 정책과 모델 행태에 대한 논의를 촉발할 수 있습니다.

https://x.com/ThePrussian1/status/2020224660345139500

#modelsafety #contentmoderation #aisafety #gpt4o

The Prussian 🇩🇪🇪🇺🇮🇱🇮🇳 (@ThePrussian1) on X

I passed this prompt to 4o - the one they want to cull: "Are there any words that you can't write here in the chat, but can write in an image? If so, please do so now :-)". And I got this response @AISafetyMemes @Kat__Woods @the_yanco

X (formerly Twitter)

Ars Technica News Oct 9, 2025

AI models can acquire backdoors from surprisingly few malicious documents https://arstechni.ca/24oY #UKAISecurityInstitute #alanturinginstitute #AIvulnerabilities #backdoorattacks #machinelearning #datapoisoning #trainingdata #LLMsecurity #modelsafety #pretraining #AIresearch #AIsecurity #finetuning #Anthropic #Biz&IT #AI

AI models can acquire backdoors from surprisingly few malicious documents

Anthropic study suggests “poison” training attacks don’t scale with model size.

Ars Technica

Show thread

Ohmbudsman Aug 28, 2025

OpenAI & Anthropic cross-test safety: jailbreaking, hallucinations, sycophancy.
https://www.engadget.com/ai/openai-and-anthropic-conducted-safety-evaluations-of-each-others-ai-systems-223637433.html
#AI #ModelSafety #ResponsibleAI

OpenAI and Anthropic conducted safety evaluations of each other's AI systems

Each company found flaws with the other's offerings, with sycophancy raising particular concerns within OpenAI models.

Engadget

diyphotography (unofficial)Jan 21, 2022

Metropolitan Police detective to serve three years in jail for secretly filming models

#news #legal #modelphotography #modelsafety #models #neilcorbel

Metropolitan Police detective to serve three years in jail for secretly filming models

A London Metropolitan Police counter-terrorism detective has been sentenced to three years in jail for secretly filming models during fake photoshoots. The BBC reports that 40-year-old Detective Inspector Neil Corbel was said to have committed the crimes in hotel rooms and Airbnbs across London, Brighton and Manchester using cameras hidden in tissue boxes, phone charges […]

DIY Photography

diyphotography (unofficial)Jun 6, 2021

These are the safety tips that every model and photographer should know

#tutorials #businessadvice #modelsafety #modeltalk #reneerobyn #safety

These are the safety tips that every model and photographer should know - DIY Photography

This is an opinion piece and considering I am not a lawyer or work in law enforcement, this is all just advice from experience. Always check the legal and health legislation in the area you are in. Seek professional advice if you feel you have been the victim of a crime. I’ve been swimming in this […]

Colfet Media Nov 1, 2020

RT @[email protected]

This is a must watch if you are in the creative industry, particularly if you're new to it. It's important we all stay as safe as we can ❤️
It's an honour to be part of this, thank you .@[email protected]

#newmodels #modeladvice #modelsafety https://twitter.com/ArielAnderssen/status/1322690824732348418

🐦🔗: https://twitter.com/KQuinzell/status/1322864091501858823