Bigger is not always necessary in the rapidly evolving world of AI, and that is true in the case of small language models (SLMs). SLMs are compact AI systems designed for high-volume processing of the simpler tasks developers might assign to them. They are optimized for efficiency and performance on resource-constrained devices or in environments with limited connectivity, memory, and electricity, which makes them an ideal choice for on-device deployment.1
Researchers at The Center for Information and Language Processing in Munich, Germany, found that “… performance similar to GPT-3 can be obtained with language models that are much ‘greener’ in that their parameter count is several orders of magnitude smaller.”2 Minimizing computational complexity while balancing performance against resource consumption is a vital strategy for SLMs. Typically, SLMs are sized at just under 10 billion parameters, making them five to ten times smaller than large language models (LLMs).
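To make the on-device point concrete, here is a minimal sketch of running a compact model entirely on CPU with the Hugging Face transformers library; the checkpoint name (microsoft/Phi-3-mini-4k-instruct, from the Phi family mentioned later in this post) and the generation settings are illustrative assumptions, not a recommended configuration.

```python
# Minimal sketch: running a compact language model on CPU, no GPU required.
# Assumes the Hugging Face transformers library (a version that supports this
# checkpoint) and the illustrative microsoft/Phi-3-mini-4k-instruct model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "microsoft/Phi-3-mini-4k-instruct"  # assumed compact checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)  # loads on CPU by default

prompt = "Summarize the benefits of small language models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```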
Tiny yet mighty, and ready to use off-the-shelf to build more customized AI experiences
While small language models offer many advantages, here are three key features and their benefits.
An advantage SLMs have over LLMs is that they can be fine-tuned more easily and cost-effectively, with repeated sampling, to achieve a high level of accuracy on relevant tasks in a limited domain, requiring fewer graphics processing units (GPUs) and less time. Fine-tuning SLMs for specific industries, such as customer service, healthcare, or finance, therefore lets businesses benefit from their efficiency and specialization while also taking advantage of their computational frugality.
Benefit: This task-specific optimization makes small models particularly valuable in industry-specific applications or in scenarios where high accuracy matters more than broad general knowledge. For example, a small model fine-tuned for an online retailer to run sentiment analysis on product reviews might achieve higher accuracy on that specific task than a general-purpose large model would.
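As a rough sketch of what such domain-specific tuning could look like, the example below applies parameter-efficient (LoRA) fine-tuning to a compact model for review sentiment classification, assuming the Hugging Face transformers, peft, and datasets libraries; the base model, dataset, and hyperparameters are illustrative placeholders rather than a recommended recipe.

```python
# Sketch: parameter-efficient (LoRA) fine-tuning of a small model for
# product-review sentiment analysis. Base model, dataset, and hyperparameters
# are illustrative assumptions, not a tuned recipe.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "distilbert-base-uncased"  # assumed compact base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# Wrap the model with low-rank adapters so only a small fraction of weights train.
lora = LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16,
                  target_modules=["q_lin", "v_lin"])  # DistilBERT attention layers
model = get_peft_model(model, lora)

# Public movie-review dataset standing in for product reviews.
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="slm-sentiment",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```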
SLMs have a lower parameter count than LLMs and are trained to discern fewer and less intricate patterns from their training data. Parameters are the weights and biases that define how a model handles and interprets inputs and, in turn, produces outputs. While LLMs might have billions or even trillions of parameters, SLMs often range from several million to a few hundred million parameters.
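For a concrete sense of what a parameter count means, the short sketch below tallies a model's weights and biases in PyTorch via the Hugging Face transformers library; the checkpoint named is just an illustrative assumption.

```python
# Sketch: counting a model's parameters (the weights and biases described above).
# The checkpoint is an illustrative assumption; any transformers model works.
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased")  # assumed small model
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e6:.1f}M parameters")  # well under the billions typical of LLMs
```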
A reduced parameter count brings several key benefits: lower memory and compute requirements, faster and cheaper inference, and suitability for on-device or otherwise resource-constrained deployment.
Look for a small language model that provides streamlined full-stack development and hosting across static content and serverless application programming interfaces (APIs), empowering your development teams to scale productivity from source code through to global high availability.
Benefit: For example, Microsoft Azure hosting on a globally distributed network enables faster page loads and enhanced security, and helps deliver your cloud content to users worldwide with minimal configuration and without copious code. Once your development team enables this feature for all required production applications in your ecosystem, we will then migrate your live traffic (at a convenient time for your business) to our enhanced globally distributed network with no downtime.
To recap, when deploying an SLM for cloud-based services, smaller organizations, resource-constrained environments, or smaller departments within larger enterprises, the main advantages are efficiency, task-specific accuracy, and computational frugality.
The features and benefits outlined above make small language models such as the Phi model family and GPT-4o mini on Azure AI attractive options for businesses seeking efficient and cost-effective AI solutions. It is worth noting that these compact yet powerful tools also play a role in democratizing AI technology, enabling even smaller organizations to leverage advanced language processing capabilities.
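As one hedged illustration of calling a compact hosted model such as a GPT-4o mini deployment, the sketch below uses the openai Python SDK against an Azure OpenAI endpoint; the endpoint, API version, and deployment name are placeholders you would replace with your own values.

```python
# Sketch: calling a small hosted model (for example, a GPT-4o mini deployment)
# through Azure OpenAI. Endpoint, API version, and deployment name are
# placeholders/assumptions, not real values.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g., https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # assumed API version
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # the name of your deployment, assumed here
    messages=[{"role": "user",
               "content": "Classify this review as positive or negative: 'Arrived late and broken.'"}],
)
print(response.choices[0].message.content)
```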
Choose SLMs over LLMs when you are processing specific language and vision tasks, when more focused training is needed, or when you are managing multiple applications, especially where resources are limited or where performance on a specific task is prioritized over broad capabilities. Because of their different advantages, many organizations find the best solution is to use a combination of SLMs and LLMs to suit their needs.
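One simple way to combine the two is a router that sends short, routine requests to a small model and escalates everything else to a larger one; the sketch below is hypothetical, and the keyword heuristic and deployment names are assumptions for illustration only.

```python
# Sketch: a hypothetical router that combines an SLM and an LLM.
# The heuristic and deployment names are illustrative assumptions.
SLM_DEPLOYMENT = "phi-3-mini"  # assumed small-model deployment name
LLM_DEPLOYMENT = "gpt-4o"      # assumed large-model deployment name

ROUTINE_KEYWORDS = {"classify", "extract", "summarize", "translate"}

def pick_model(prompt: str) -> str:
    """Send short, routine, single-task prompts to the SLM; escalate the rest."""
    words = prompt.lower().split()
    is_short = len(words) < 60
    is_routine = any(keyword in words for keyword in ROUTINE_KEYWORDS)
    return SLM_DEPLOYMENT if (is_short and is_routine) else LLM_DEPLOYMENT

print(pick_model("Classify this support ticket by product area."))           # phi-3-mini
print(pick_model("Draft a detailed migration plan for our data platform."))  # gpt-4o
```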
Organizations across industries are leveraging Microsoft Azure OpenAI Service and Microsoft Copilot services and capabilities to drive growth, increase productivity, and create value-added experiences. From advancing medical breakthroughs to streamlining manufacturing operations, our customers trust that their data is protected by robust privacy and data governance practices. As they continue to expand their use of our AI solutions, they can be confident that their valuable data is safeguarded by industry-leading practices in the most trusted cloud on the market today.
At Microsoft, we have a long-standing practice of protecting our customers’ information. Our approach to responsible AI is built on a foundation of privacy, and we remain dedicated to upholding core values of privacy, security, and safety in all our generative AI products and solutions.
1. MobileBERT: A Compact Task-Agnostic BERT for Resource-Limited Devices, Cornell University.
2. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners, The Center for Information and Language Processing, Munich, Germany.