Use (Almost) Any Language Model Locally with Ollama and Hugging Face Hub

Image source: Hugging Face

Ollama, an application built on llama.cpp, now offers easy integration with a huge vault of GGUF format language models hosted on Hugging Face. This new feature allows users to run any of the 45,000+ public GGUF checkpoints on their local machines using a single command, eliminating the need for any setup procedure whatsoever. The integration provides flexibility in model selection, quantization schemes, and customization options, making this arguably the easiest way to acquire and run language models on your local machine.

The new functionality extends beyond model compatibility, offering users the ability to fine-tune (pun intended) their interaction with these models. Custom quantization options allow for optimized performance based on available hardware, while user-defined chat templates and system prompts enable personalized conversational workflows. Additionally, the ability to adjust sampling parameters allows for granular control over model output. This combination of accessibility and customization empowers users to leverage state-of-the-art language models locally, and makes AI-driven application development and research easier than ever.

Getting started is as easy as this:

Our Top 3 Partner Recommendations

1. Best VPN for Engineers – 3 Months Free – Stay secure online with a free trial

2. Best Project Management Tool for Tech Teams – Boost team efficiency today

4. Best Password Management for Tech Teams – zero-trust and zero-knowledge security

# Run Ollama with specified model
# ollama run hf.co/{username}/{repository}
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF

# Run Ollama with specified model and desired quantization
# ollama run hf.co/{username}/{repository}:{quantization}
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:IQ3_M

That’s it. After this, chat with the model at the command line or create your own programs that leverage the locally-running models.

Find out more here, then get started with the fantastic development right away.

Matthew Mayo (@mattmayo13) holds a master’s degree in computer science and a graduate diploma in data mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.

Use (Almost) Any Language Model Locally with Ollama and Hugging Face Hub

Our Top 3 Partner Recommendations

Recent Articles

Shopify Summer ’25 Edition Introduces Horizon, a New Standard for Creative Control

Automating complex document processing: How Onity Group built an intelligent solution using Amazon Bedrock

Divorce by coffee grounds, and why AI robots need your brain • Graham Cluley

Lyma Laser Review: Clinical Results Without the Clinic

What the Most Detailed Peer-Reviewed Study on AI in the Classroom Taught Us

Related Stories

Leave A Reply Cancel reply