[thread] Small language models
see also: https://en.wikipedia.org/wiki/Large_language_model
https://www.ibm.com/think/topics/small-language-models

* machine learning models
* processing, understanding, generating natural language content
* SLMs are more compact/efficient than large language models (LLMs)
* few million to a few billion parameters, vs. hundreds of billions to trillions for LLMs
* parameters: internal variables that a model learns during training
* influence how model behaves/performs
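
To make the parameter counts above concrete, here is a rough sketch of how parameters add up in a single layer and how much memory the weights alone would take. The helper names and the example sizes (3B and 100B, 16-bit weights) are illustrative assumptions, not figures from the linked articles.

```python
def dense_layer_params(n_in: int, n_out: int) -> int:
    """Learned values in one fully connected layer: a weight matrix plus a bias vector."""
    return n_in * n_out + n_out


def weight_memory_gib(num_params: int, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed just to store the weights, in GiB.

    bytes_per_param=2.0 assumes 16-bit weights; activations and caches are ignored.
    """
    return num_params * bytes_per_param / 2**30


# Illustrative model sizes only (assumed, not taken from the notes).
for label, n in [("SLM, 3B params", 3_000_000_000),
                 ("LLM, 100B params", 100_000_000_000)]:
    print(f"{label}: ~{weight_memory_gib(n):.1f} GiB of weights at 16-bit")
```

The gap in weight storage is one reason a model with a few billion parameters can plausibly run on a single consumer device while a much larger one generally cannot.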

#LLM #SLM #SmallLanguageModels #LanguageModels #NLP #ML #AI
#Microsoft #Phi3 #Phi4

Phi-4: Microsoft’s Newest Small Language Model Specializing in Complex Reasoning
https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft%E2%80%99s-newest-small-language-model-specializing-in-comple/4357090
https://arxiv.org/abs/2412.08905
https://news.ycombinator.com/item?id=42405323

* most language models' pre-training is based primarily on organic data sources such as web content or code
* phi-4 strategically incorp. synthetic data throughout training
* strong performance relative to its size, esp. on reasoning-focused benchmarks
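
As a toy illustration of mixing organic and synthetic data during training, the sketch below draws each training example from a synthetic pool with a fixed probability. This is not phi-4's actual recipe: `mix_examples`, the `synthetic_fraction` value of 0.4, and the example strings are all hypothetical.

```python
import random


def mix_examples(organic, synthetic, synthetic_fraction, n, seed=0):
    """Draw n examples; each comes from the synthetic pool with probability synthetic_fraction."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    out = []
    for _ in range(n):
        pool = synthetic if rng.random() < synthetic_fraction else organic
        out.append(rng.choice(pool))
    return out


# Hypothetical stand-ins for real corpora.
organic = ["web page text", "code snippet"]
synthetic = ["generated Q&A pair", "generated reasoning trace"]

sample = mix_examples(organic, synthetic, synthetic_fraction=0.4, n=10)
print(sample)
```

A real pipeline would apply a mixture like this over whole datasets and training stages rather than single strings, but the weighting idea is the same.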

#LLM #SLM #SmallLanguageModels #LanguageModels #NLP #ML #AI
#Microsoft #Phi3 #Phi4 #SyntheticData