Dan McAteer (@daniel_mac8)

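Reports that Gemini 3.1 Pro excels at video understanding. Given a YouTube video, it accurately extracts whatever information is requested, which makes it useful as a learning tool; the author expects a major impact once it gets faster and is connected to robots and similar systems.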

https://x.com/daniel_mac8/status/2026485892504789039

#gemini #videounderstanding #multimodal #youtube

Dan McAteer (@daniel_mac8) on X

Gemini 3.1 Pro is incredible at video understanding. Give it any YT video and it can flawlessly pull out any information you want from it. Great learning tool. Can't imagine what the world looks like when it gets fast enough and hooked up to a robot.

X (formerly Twitter)

Tsinghua University (@Tsinghua_Uni)

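Reports that Xu, in a paper presented at ICML 2025 (CCF A-ranked), introduced Xattention, achieving a 13.5× algorithmic speedup, and developed StreamingVLM for real-time, infinite-length video understanding. The research has drawn 100+ citations and 300k+ media views.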

https://x.com/Tsinghua_Uni/status/2018565287906517400

#xattention #streamingvlm #icml2025 #videounderstanding #efficientattention

Tsinghua University (@Tsinghua_Uni) on X

As the first author, Xu published a paper at @icmlconf 2025, a CCF A–ranked conference. He introduced Xattention, achieving 13.5× speedup for algorithms & developed StreamingVLM for real-time, infinite-length video understanding, gaining 100+ citations & 300k+ media views.

cedric (@cedric_chee)

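The author lists items to test next: game development (gamedev), video understanding capabilities (compared against Gemini 3 Pro/Flash), large-scale voxel rendering, and core reasoning (IMO 2025 problems and SimpleBench puzzles). The plan is to compare multimodal, graphics, and reasoning performance across a range of benchmarks.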

https://x.com/cedric_chee/status/2016207172175331445

#gamedev #videounderstanding #gemini #voxelrendering

cedric (@cedric_chee) on X

Planning to test:
- gamedev
- video understanding capabilities (compares to Gemini 3 Pro/Flash)
- large scale voxel rendering
- core reasoning: IMO 2025, puzzles from SimpleBench

UC Berkeley develops a groundbreaking language model with video understanding!

Researchers at UC Berkeley have made a significant advancement in Gen AI with their new "World Model on Million-Length Video and Language". Such models could develop an understanding of both human textual knowledge and the physical world, enabling broader AI capabilities for assisting humans.

https://largeworldmodel.github.io/

#AI #NLP #languagemodel #videounderstanding #research #opensourcing

Large World Models