Rohan Paul (@rohanpaul_ai)
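Shares a study finding that reinforcement learning (RL) post-training in medical vision-language models mainly refines existing capabilities and optimizes the output distribution to improve efficiency, rather than teaching entirely new capabilities.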
https://x.com/rohanpaul_ai/status/2036653802204561594
#reinforcementlearning #medai #visionlanguage #machinelearning #research

Rohan Paul (@rohanpaul_ai) on X
This research shows that reinforcement learning (RL) in medical vision-language models mostly sharpens existing skills rather than teaching entirely new ones. Reinforcement learning post-training primarily refines output distributions to improve efficiency, while supervised
