GPT-5-Pro scored 90% on the 2025 Miklós Schweitzer competition, beating expectations from Metaculus. An impressive result that marks another step forward for artificial intelligence. #GPT5Pro #AI #Metaculus #ArtificialIntelligence
GPT-5-pro may be a universal agentic gateway / Large Agent Model. Indicators suggest gpt-5-pro is a large agent model. #GPT5pro #LargeAgentModel #ArtificialIntelligence #AI #UniversalAgenticGateway
https://www.reddit.com/r/LocalLLaMA/comments/1oz6msr/gpt5pro_is_likely_a_universal_agentic_gateway/
Has there been any improvement in AI detectors over the last 12 months? A GPT-5 Pro literature review
I ran this report to help me explore whether my 2023/24 arguments from Generative AI for Academics still hold. Sharing it here because others might find it useful.
TL;DR: Over the last 12 months, the centre of gravity has moved further away from “catch‑and‑punish” AI detection toward assessment redesign, process evidence, and transparency. Regulators (e.g., TEQSA in Australia) now say reliable detection “is all but impossible,” several universities have disabled AI detection features, and independent guidance for instructors argues detectors don’t work well enough to be relied on. Vendors keep publishing high headline accuracy numbers, and there’s active research on watermarking and authorship verification—but none of this has translated into a dependable, classroom‑safe detector for typical student writing.
From the “assessment panic” to a new normal
In the book we described the 2023–24 “great assessment panic”—a rush to outsource academic judgment to detectors in hopes of restoring order overnight. The last year shows that order won’t be restored by a score. What is emerging instead is a culture shift: instructors accept that students will use GenAI, and programmes re-emphasise authentic tasks, process artefacts, and viva/oral components to evidence learning.
That turn is also consistent with your broader argument about not over‑automating interpersonal judgment: even where automation looks tempting, we should beware brittle tools that offload risk onto students and staff. Detection has become the latest test case for that etiquette.
What changed in 2024–25
1) Policy and regulator signals hardened
2) Universities kept switching off or downgrading detectors
3) Vendors continued strong claims; independent guidance stayed sceptical
4) The equity problem didn’t go away
Has AI detection gotten more reliable?
Short answer: not in the way that matters for day‑to‑day teaching.
Bottom line: There’s no compelling evidence of a step‑change in detector reliability over the last year for typical coursework (short, hybrid, multi‑draft, multilingual). What has changed is policy clarity: “use with caution—never as sole evidence.”
What’s promising (but not there yet): provenance & watermarking
If there’s progress, it’s more on provenance than on retroactive text detection.
Implication: Expect forward‑looking provenance (label at creation) to matter more than after‑the‑fact detection. That helps in journalism and platform governance; it’s much less helpful when grading a student draft pasted from an unknown source.
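One way to see why creation-time labelling is more tractable: statistical watermarking schemes in the research literature (e.g., the “green-list” approach) bias generation toward a pseudorandom subset of tokens, so anyone holding the key can run a simple hypothesis test afterwards. A toy Python sketch of that test; the hashing scheme, green fraction, and threshold here are illustrative, not any vendor’s actual implementation:

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # gamma: expected green-token rate in unwatermarked text

def is_green(prev_token: str, token: str) -> bool:
    """Pseudorandom green/red split keyed on the previous token
    (illustrative stand-in for a keyed watermarking PRF)."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the observed green-token count against the null
    hypothesis (no watermark => green rate ~= GREEN_FRACTION)."""
    n = len(tokens) - 1
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (greens - expected) / std

# Unwatermarked text should hover near z ~ 0; a watermarked generation,
# which deliberately oversamples green tokens, pushes z far above that.
print(watermark_z_score("the quick brown fox jumps over the lazy dog".split()))
```

The point is the asymmetry: with a key and a label applied at creation, detection is a clean statistical test; for arbitrary text of unknown origin there is no such test to run, which is why retroactive detection keeps disappointing.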
Where detectors might fit (narrow, safeguarded use)
If your institution still exposes an AI score, keep it in a triage role only: a prompt for human review, never evidence of misconduct on its own.
This aligns with your book’s argument to use GAI as a meta‑collaborator for process (drafts, notes, meeting digests) rather than as a shortcut to verdicts. Building a traceable writing process—version histories, short oral defenses, design logs—yields better authorship evidence than any detector.
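To make that concrete, here is a minimal Python sketch of a triage rule in the spirit described above; the fields, thresholds, and outcomes are all hypothetical, and any real policy would be set by your institution:

```python
from dataclasses import dataclass

@dataclass
class Submission:
    detector_score: float      # vendor's "AI-likelihood", 0..1 (noisy; unreliable alone)
    has_version_history: bool  # process evidence: drafts, edit trail, design logs
    passed_oral_check: bool    # short viva / authorship conversation

def triage(sub: Submission) -> str:
    """Use the detector score only to prioritise human attention,
    never as evidence of misconduct on its own."""
    if sub.detector_score < 0.8:  # hypothetical triage threshold
        return "no action"
    if sub.has_version_history or sub.passed_oral_check:
        return "no action: process evidence outweighs the score"
    return "invite a conversation about process (not an accusation)"
```

Note the design choice: process evidence always overrides the score, so the detector can only ever add work for staff, never subtract due process for students.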
What universities actually did instead (2024–25)
Why the consensus calls detection a “dead end” (for now)
A practical playbook (what we recommend now)
What to watch next
Bottom line
The weight of evidence this year backs your intuition: AI detection has not become reliably trustworthy for routine academic integrity decisions. The consensus hasn’t just declared it a dead end—it has pivoted toward making assessment resilient to AI rather than attempting to police it perfectly. That shift fits the values you map throughout the project: keep the human at the centre, build better processes, and treat GAI as a tool for learning—not a trap for students.
References & further reading (selection)
Dev Day surprises from OpenAI! 🚀 GPT-5 Pro, Sora 2, and gpt-realtime mini models are coming to the API. A new era is beginning for developers in the AI world. Innovation is about to accelerate!
🚩 #OpenAI #ArtificialIntelligence #GPT5Pro #Sora2 #AITechnology #Developer
TechCrunch: OpenAI ramps up developer push with more powerful models in its API. “OpenAI unveiled new API updates at its Dev Day on Monday, introducing GPT-5 Pro, its latest language model, its new video generation model Sora 2, and a smaller, cheaper voice model.”
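For developers, access goes through the standard OpenAI Python SDK. A minimal sketch, assuming the API model identifier matches the announced name ("gpt-5-pro"); verify what your account can actually see before relying on it:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-5-pro" is the name from the Dev Day coverage above; confirm the
# exact identifier via client.models.list() before depending on it.
response = client.responses.create(
    model="gpt-5-pro",
    input="In three bullets, when is Pro worth the extra cost over GPT-5?",
)
print(response.output_text)
```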
GPT-5 vs GPT-5 Pro Differences
GPT-5 and GPT-5 Pro differ mainly in accuracy, speed, and cost. Pro delivers deeper reasoning with internet access.
https://www.olamnews.com/technology/ai/1482/gpt-5-vs-gpt-5-pro-differences/
Claim: GPT-5-pro can prove new interesting mathematics
https://twitter.com/SebastienBubeck/status/1958198661139009862
Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than the one in the paper, and I checked that the proof is correct. Details below.
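For readers who want the shape of the claim: the linked thread concerned gradient descent on smooth convex functions, where the open question was the range of step sizes for which the paper's guarantee holds. A hedged LaTeX sketch of that setup; the exact property and constants are not stated here and should be checked against the linked tweet and paper:

```latex
% Setup (as discussed around the linked thread): f convex and L-smooth,
% gradient descent with a fixed step size \eta. The paper reportedly
% established its guarantee for \eta \le 1/L; the claim is that
% gpt-5-pro's proof extends the admissible range to a strictly larger
% constant c > 1, i.e. "a better bound than what is in the paper".
\[
  x_{k+1} = x_k - \eta\,\nabla f(x_k), \qquad
  \text{guarantee holds for } \eta \le \frac{c}{L}, \; c > 1 .
\]
```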