Recent studies have indicated that effectively utilizing inference-time compute is crucial for attaining better performance from large language models (LLMs). In this work, we propose a novel inference-aware fine-tuning paradigm, in which the model is fine-tuned in a manner that directly optimizes the performance of the inference-time strategy. We study this paradigm using the simple yet effective Best-of-N (BoN) inference strategy, in which a verifier selects the best out of a set of LLM-generated responses. We devise the first imitation learning and reinforcement learning (RL) methods for BoN-aware fine-tuning, overcoming the challenging, non-differentiable argmax operator within BoN. We empirically demonstrate that our BoN-aware models implicitly learn a meta-strategy that interleaves best responses with more diverse responses that might be better suited to a test-time input, a process reminiscent of the exploration-exploitation trade-off in RL. Our experiments demonstrate the effectiveness of BoN-aware fine-tuning in terms of improved performance and inference-time compute. In particular, we show that our methods improve the Bo32 (Best-of-32) performance of Gemma 2B on Hendrycks MATH from 26.8% to 30.8%, and pass@32 from 60.0% to 67.0%, as well as the pass@16 on HumanEval from 61.6% to 67.1%.
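To make the two inference-time quantities in the abstract concrete, here is a minimal Python sketch of Best-of-N selection and a single-prompt pass@N check. It is an illustration under stated assumptions, not the paper's implementation: the names `best_of_n`, `pass_at_n`, `make_toy_sampler`, `toy_verifier`, and `toy_is_correct` are hypothetical, and the sampler and verifier are toy stand-ins for an LLM and a learned verifier. The `max` call plays the role of the non-differentiable argmax that the abstract mentions.

```python
from itertools import cycle
from typing import Callable, Iterable


def best_of_n(prompt: str,
              sample: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int) -> str:
    """Draw n responses and return the one the verifier scores highest."""
    candidates = [sample(prompt) for _ in range(n)]
    # argmax over candidates; ties resolve to the earliest sample
    return max(candidates, key=lambda response: score(prompt, response))


def pass_at_n(prompt: str,
              sample: Callable[[str], str],
              is_correct: Callable[[str, str], bool],
              n: int) -> bool:
    """True if at least one of up to n sampled responses is correct."""
    return any(is_correct(prompt, sample(prompt)) for _ in range(n))


# --- Toy stand-ins (assumptions for illustration only) ---

def make_toy_sampler(responses: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical 'model' that cycles through a fixed list of answers."""
    stream = cycle(responses)
    return lambda prompt: next(stream)


def toy_verifier(prompt: str, response: str) -> float:
    """Hypothetical verifier: higher score the closer the answer is to 4."""
    return -abs(int(response) - 4)


def toy_is_correct(prompt: str, response: str) -> bool:
    """Hypothetical ground-truth check for the prompt '2+2'."""
    return response == "4"


if __name__ == "__main__":
    sampler = make_toy_sampler(["3", "5", "4"])
    print(best_of_n("2+2", sampler, toy_verifier, n=3))
```

In this toy setup, a larger N gives the verifier more candidates to choose from, which is why a model that produces more diverse samples can score higher under BoN even if its single best guess is unchanged.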
Learn over 60 terms in our artificial intelligence glossary. #AIterms
Hashtags: #AIterms #AIvocabulary #AIjargon Summary: AI Glossary: Understanding the Jargon and Terms. Artificial Intelligence (AI) is a complex field with a growing list of jargon and scientific terms that can be difficult to keep up with. This glossary aims to provide a resource for both newcomers to AI and those looking to refresh their vocabulary. Agent: An intelligent agent is an AI system that can…
https://webappia.com/learn-over-60-terms-in-our-artificial-intelligence-glossary-aiterms/