Mastodawn

fly51fly (@fly51fly)

LG 소속의 G. Gülmez(2026) 연구 'DynaMoE'는 토큰 단위로 전문가(Expert)를 동적으로 활성화하고 레이어별 적응형 용량(layer-wise adaptive capacity)을 적용해 Mixture-of-Experts(MoE) 신경망의 효율성·자원 할당을 개선하는 방법을 제안한다. arXiv에 공개된 논문으로 MoE의 활성화 패턴과 계산 비용 최적화에 중점.

https://x.com/fly51fly/status/2030759672681279967

#moe #mixtureofexperts #dynamoe #arxiv #deeplearning

fly51fly (@fly51fly) on X

[LG] DynaMoE: Dynamic Token-Level Expert Activation with Layer-Wise Adaptive Capacity for Mixture-of-Experts Neural Networks G Gülmez (2026) https://t.co/zEc7F82uSw

X (formerly Twitter)

sayzard Mar 3

Gökdeniz Gülmez (@ActuallyIsaak)

새 연구 논문 발표: DynaMoE라는 Mixture-of-Experts 프레임워크를 소개합니다. DynaMoE는 토큰별로 활성화되는 전문가(expert)의 수를 동적으로 결정하고, 전체 전문가 수를 상황에 따라 스케줄링할 수 있는 구조를 제안합니다. MoE 계열의 효율성·확장성 개선을 목표로 한 아키텍처 연구입니다.

https://x.com/ActuallyIsaak/status/2028770014883418222

#dynamoe #mixtureofexperts #moe #research

Gökdeniz Gülmez (@ActuallyIsaak) on X

Today I’m sharing a new research paper that explores a new idea in mixture of experts architecture called “DynaMoE”. DynaMoE is a Mixture-of-Experts framework where: - the number of active experts per token is dynamic. - the number of all experts can be scheduled differently

X (formerly Twitter)