A comparison test of Gemini 2.5 Flash and open-source (OSS) models on UI generation, using a detailed 62.9k-token prompt. Gemini finished smoothly, while OSS models such as Qwen, GPT-OSS, Llama-70B… mostly ran into errors: getting stuck in reasoning (even when set to "low"), calling tools incorrectly, or skipping steps in the workflow. Only Kwaipilot-kat-coder completed the task, but it was 3x slower and made tool-call errors. Is the root cause an architectural difference or an implementation bug?

#AI #MôHìnhĐạiDiện #Gemini #MãNguồnMở #LậpTrình #TestingAI #TốiTânAI #G

Financial Times: OpenAI slashes AI model safety testing time. “OpenAI has slashed the time and resources it spends on testing the safety of its powerful artificial intelligence models, raising concerns that its technology is being rushed out without sufficient safeguards. Staff and third-party groups have recently been given just days to conduct ‘evaluations’, the term given to tests for […]

https://rbfirehose.com/2025/04/11/financial-times-openai-slashes-ai-model-safety-testing-time/

AI struggles with less common data: Inconsistent results for Valletta Bastions (actual mean height: 25m) highlight issues with insufficient training data. We also touch on AI poisoning.

https://www.alanbonnici.com/2025/03/ai-got-it-wrong-missing-information-or.html

#AI #DataBias #Valletta #TTMO #ArtificialIntelligence #hallucination #Mistakes #TestingAI #InsufficientData #DataPoisoning

AI got it wrong - Missing Information (or AI Poisoning)

This blog is about security and computing-related topics, with occasional hobby activities thrown in.

How does AI handle insufficient information? 🤔 We tested an AI with questions about the Eiffel Tower, Big Ben, and the bastions of Valletta. The AI gave inconsistent answers when its training data was limited or unclear. We also touch on AI poisoning, where AI models can be misled by fake data.
▶️ https://buff.ly/yRDWPTf
#AI #InsufficientData #DataPoisoning #EiffelTower #BigBen #Valletta #TestingAI #Accuracy #TTMO

One final reminder about members-only pricing and your chance to save over $300 on registration for CAST 2025!

Yes!!! If you become a member you do qualify for this sale, but only for the next few hours!

Pricing goes up tomorrow and registration opens to the general public, which means SPOTS WILL FILL UP!

REGISTER NOW: https://associationforsoftwaretesting.org/conference/cast-2025/

#TestingConference #CAST2025 #SoftwareTesting #testing #aitestingtools #TestingAi #quality #SiliconSlopes #SiliconValley

CAST 2025 - Association for Software Testing

Well, this is not going to work out between us, #ChatGPT!

#testingAI

I'm excited and really looking forward to our #testingAI research project 🤖⚖️👩‍🔬
---
RT @informatikradar
How can we test #KI (AI) for #Diskriminierungsfreiheit (non-discrimination) and #Fairnes (fairness)? 🤔 In our research project, @nettwerkerin, Prof. Borges (@Saar_Uni), @LeonieBeining, @HeidrichJens, and @redmonkey_, among others, will investigate this question for @denkfabrik_bmas.
#testingAI https://testing-ai.gi.de/meldung/ki-in-der-arbeitswelt-forschungsprojekt-zur…
https://twitter.com/informatikradar/status/1264919326064365570