Aman Sanger (@amanrsanger)

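He reports that, after comparing Kimi k2.5 against several base models on perplexity-based evaluations, it came out as the strongest. He adds that they then raised performance further with continued pre-training and high-compute RL scaled up 4x, which makes this notable for both current model evaluation and training strategy.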

https://x.com/amanrsanger/status/2035079293257359663

#kimi #llm #reinforcementlearning #pretraining #evaluations

Aman Sanger (@amanrsanger) on X

We've evaluated a lot of base models on perplexity-based evals and Kimi k2.5 proved to be the strongest! After that, we do continued pre-training and high-compute RL (a 4x scale-up). The combination of the strong base, CPT and RL, and Fireworks' inference and RL samplers make

X (formerly Twitter)

via #AIFoundry : Foundry Agent Service is GA: private networking, Voice Live, and enterprise-grade evaluations

https://ift.tt/T189MBt
#FoundryAgentService #GA #PrivateNetworking #VoiceLive #Evaluations #AzureFoundry #OpenAIResponsesAPI #MCP #EntraIdentity #OAuthPassthrough #P

Foundry Agent Service is GA: private networking, Voice Live, and enterprise-grade evaluations | Microsoft Foundry Blog

Explore the Microsoft Foundry Agent Service for seamless production AI deployment with enhanced networking and compliance features.

Microsoft Foundry Blog

Check out tutorial no. 07: setting up a competency-based assessment in Pronote. A clear step-by-step guide for Economics and Management teachers who use Moodle. Save time and improve your competency assessments! #Pronote #Moodle #Evaluations #Compétences #Enseignement #EdTech #EconomieGestion #French
https://tube-sciences-technologies.apps.education.fr/videos/watch/072a4792-4195-41c1-ada0-a5ac385988c0
07 Setting up a competency-based assessment in Pronote

PeerTube

How bad smells, hand sanitizer, and Israeli judges affect your evaluation of an event. Yes, our bodies guide our judgments!

https://www.conferencesthatwork.com/index.php/event-design/2012/10/how-bad-smells-hand-sanitizer-and-israeli-judges-affect-your-evaluation-of-an-event

#meetings #evaluations #bias #psychology

Here are two ways to take a hard look at your conference evaluations. You may be surprised by what you find.

https://www.conferencesthatwork.com/index.php/event-design/2016/03/look-conference-evaluations

#meetings #EventDesign #evaluations #surveys #FacilitatingChange #eventprofs

Short-term traditional meeting evaluations are unreliable. They tell you nothing about the long-term effects of a session. We can do better.

https://www.conferencesthatwork.com/index.php/event-design/2015/11/why-meeting-evaluations-are-unreliable-and-how-we-can-improve-them

#meetings #EventDesign #evaluations #bias #unreliable #HowToImprove #eventprofs

Can conference organizers get evaluative feedback on the long-term outcomes of their events? Try The Reminder and find out!

https://www.conferencesthatwork.com/index.php/event-design/2015/11/the-reminder

#meetings #EventDesign #evaluations #FacilitatingChange #TheReminder #eventprofs