How to Use Hugging Face: Beginner's Guide
Hugging Face hosts over 2 million models, 500,000 datasets, and 1 million Spaces. That scale makes it the largest open-source ML platform available to individual practitioners and enterprise teams alike. This guide walks you through the full setup: creating an account, installing the Transformers library, running your first inference pipeline, and deploying an interactive Space. Every step uses verified commands from official Hugging Face documentation.
Prerequisites
Before writing your first line of Hugging Face code, confirm you have the following in place. Missing any one of them will cause errors during installation or inference.
- Python 3.8 or later. Run python --version to verify. If you see a version below 3.8, upgrade before continuing. Recent Transformers releases may require a newer minimum, so check the installation page if pip rejects your interpreter.
- A virtual environment. Create one with python -m venv hf-env and activate it. Isolating dependencies prevents conflicts with system packages.
- Git LFS. Run git lfs install after installing Git LFS from git-lfs.com.
If you already have an active Python environment with PyTorch installed, you can skip ahead to account creation.
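The prerequisite checks can also be scripted. The following is a minimal sketch using only the standard library. The function names are illustrative, and the 3.8 minimum is the one stated in this guide rather than the requirement of any specific Transformers release.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum stated in this guide; newer releases may need more


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def in_virtualenv():
    """Return True inside a venv/virtualenv (conda envs are not detected this way)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def git_lfs_on_path():
    """Return True if a git-lfs executable is found on PATH."""
    return shutil.which("git-lfs") is not None


def report():
    """Print one line per prerequisite."""
    print(f"Python >= {MIN_PYTHON[0]}.{MIN_PYTHON[1]}: {python_ok()}")
    print(f"Virtual environment active: {in_virtualenv()}")
    print(f"git-lfs on PATH: {git_lfs_on_path()}")
```

Calling report() prints the three results. A False on any line points to the matching item in the list above.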
Create Your Account
A Hugging Face account is free and unlocks the full platform: model downloads, dataset access, Spaces hosting, and Inference API calls. You can sign up in under two minutes.
Step 1: Register. Go to huggingface.co/join and create an account with your email, Google, or GitHub credentials. No credit card required.
Step 2: Generate an access token. Navigate to huggingface.co/settings/tokens. Click "New token," give it a name, and select the read role for downloading models or write if you plan to push models or datasets.
Step 3: Authenticate your CLI. Open your terminal and run:
huggingface-cli login
Paste your token when prompted. This stores the credential locally so every Transformers and Hub operation authenticates automatically. Your token is saved at ~/.cache/huggingface/token.
Security note: Treat your access token like a password. Never commit it to version control. Use environment variables (HF_TOKEN) in CI/CD pipelines rather than hardcoding the value. Organizations operating under AI governance policies should store tokens in a centralized secrets manager.
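For CI/CD jobs, the environment-variable approach might look like the sketch below. The helper names are hypothetical. The only library call used is huggingface_hub.login(token=...), and recent versions of huggingface_hub also read HF_TOKEN on their own, so the explicit login may be optional in your setup.

```python
import os


def token_from_env(var_name="HF_TOKEN"):
    """Return the token stored in the environment, or raise a clear error."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"{var_name} is not set; add it to your CI secrets.")
    return token


def login_from_env():
    """Authenticate this process with the Hub using the environment token."""
    # Imported lazily so token_from_env() works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token_from_env())
```

Failing fast with a clear message keeps a missing secret from surfacing later as an opaque authorization error.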
Explore the Hub
The Hugging Face Hub is a Git-based registry for models, datasets, and Spaces. Think of it as GitHub designed specifically for machine learning artifacts. A well-maintained model card covers architecture details, training data provenance, evaluation metrics, and usage examples, though the level of detail varies by author.
Models. Start at huggingface.co/models. The filtering system lets you narrow by task (text generation, image classification, speech recognition), library (Transformers, Diffusers, spaCy), and license. This is the fastest way to find a pre-trained model that fits your use case without training from scratch.
Datasets. The Datasets hub hosts over 500,000 datasets with built-in streaming support. You can load any public dataset in two lines of Python without downloading the entire file first. The Datasets library uses Apache Arrow as its backend, which means columnar storage and memory-mapped access for large datasets.
Spaces. Spaces are hosted web applications built with Gradio or Streamlit. Over 1 million Spaces run on the platform, from text-to-image demos to full chatbot interfaces. You can fork any public Space and modify it within minutes. Hardware options range from free CPU instances to paid GPU environments including ZeroGPU.
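The streaming access described in the Datasets paragraph can be sketched as follows. The helper names are illustrative and the dataset name in the usage comment is an assumption. load_dataset(..., streaming=True) returns an iterable, so rows arrive as you consume them.

```python
from itertools import islice


def open_stream(name, split="train"):
    """Return an iterable over a Hub dataset without downloading it in full."""
    # Imported lazily; requires `pip install datasets` and network access.
    from datasets import load_dataset

    return load_dataset(name, split=split, streaming=True)


def first_rows(stream, n=3):
    """Take the first n rows from any iterable of rows."""
    return list(islice(stream, n))


# Example usage ("imdb" is an assumed public dataset name):
# rows = first_rows(open_stream("imdb"), n=2)
# print(rows[0].keys())
```

Because first_rows() stops after n items, only the data needed for those rows is fetched.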
Install Transformers
The Transformers library is the core execution engine. It provides a unified API for running inference with thousands of pre-trained models through AutoModel, AutoTokenizer, and the pipeline() abstraction. One install gives you access to the entire model ecosystem.
Core Installation
With your virtual environment activated, run:
pip install transformers torch
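This installs the Transformers library and PyTorch together. On a machine with an NVIDIA GPU and working drivers, PyTorch can use the GPU as long as the installed build includes CUDA support. The Troubleshooting section shows how to check with torch.cuda.is_available().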
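To make the device choice explicit rather than relying on defaults (which may differ between Transformers versions), a sketch like the following passes the device argument to pipeline(). The helper names are hypothetical. A device of 0 means the first GPU and -1 means CPU.

```python
def pipeline_device(cuda_available):
    """Map a CUDA availability flag to pipeline()'s `device` argument."""
    return 0 if cuda_available else -1  # 0 = first GPU, -1 = CPU


def build_classifier():
    """Create a sentiment pipeline on the GPU when one is available."""
    # Imported lazily so pipeline_device() can be used on its own.
    import torch
    from transformers import pipeline

    device = pipeline_device(torch.cuda.is_available())
    return pipeline("sentiment-analysis", device=device)
```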
Optional Libraries
Depending on your workflow, add these companion packages:
pip install datasets # Load and stream datasets
pip install huggingface_hub # Hub CLI and Python client
pip install accelerate # Distributed training
pip install evaluate # Standardized metrics
Verify the installation by checking the library version:
python -c "import transformers; print(transformers.__version__)"
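To check every package from this section at once, the standard library's importlib.metadata can report installed versions without importing the packages themselves. This is a small sketch. The function names are illustrative and the package list mirrors the pip commands above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def print_report(packages=("transformers", "torch", "datasets",
                           "huggingface_hub", "accelerate", "evaluate")):
    """Print one aligned line per package."""
    for name, version in installed_versions(packages).items():
        print(f"{name:<16} {version or 'not installed'}")
```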
Tip: Use python -m venv hf-env or conda to isolate dependencies.
Security note: Model weights stored in .pkl, .pt, or .bin format can execute arbitrary Python code during deserialization. Prefer models that use the safetensors format, which is memory-safe and avoids pickle-based code execution risks.
Your First Pipeline
The pipeline() function is the fastest path from zero to inference. It bundles tokenization, model forward pass, and post-processing into a single callable. You specify the task and optionally a model; it handles everything else.
Sentiment Analysis
This is the "hello world" of Hugging Face. A few lines, no configuration:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("Hugging Face makes ML accessible to everyone.")
print(result)
# [{'label': 'POSITIVE', 'score': 0.9998}]
The function downloads a default model (distilbert-base-uncased-finetuned-sst-2-english), tokenizes your input, runs inference, and returns a structured result. The model is cached locally after the first download.
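The same classifier also accepts a list of strings and returns one result dictionary per input. The sketch below pairs each text with its prediction and keeps only the confident ones. The helper names and the 0.9 threshold are illustrative choices, not part of the library.

```python
def confident_predictions(texts, results, threshold=0.9):
    """Pair each text with its prediction and drop low-confidence ones."""
    return [
        (text, r["label"], r["score"])
        for text, r in zip(texts, results)
        if r["score"] >= threshold
    ]


def classify_many(texts, threshold=0.9):
    """Run the default sentiment pipeline over several texts."""
    # Imported lazily; the first call downloads the default model.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    texts = list(texts)
    results = classifier(texts)  # a list in, a list of dicts out
    return confident_predictions(texts, results, threshold)
```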
Summarization with a Specific Model
For more control, specify the model explicitly:
from transformers import pipeline
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
text = """Hugging Face is a platform that provides tools for building,
training, and deploying machine learning models. It hosts over 2 million
models and 500,000 datasets. The Transformers library offers a unified
API for working with pre-trained models across NLP, vision, and audio."""
summary = summarizer(text, max_length=50, min_length=25)
print(summary[0]["summary_text"])
Loading Models Directly
When you need more control than pipeline() offers, load models and tokenizers directly with the Auto classes:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("Hello, Hugging Face!", return_tensors="pt")
outputs = model(**inputs)
This approach gives you access to the raw model outputs (hidden states and, when you pass output_attentions=True, attention weights) for custom downstream tasks like fine-tuning or feature extraction.
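To connect the two approaches, the sketch below reproduces by hand the three steps that pipeline() bundles: tokenization, forward pass, and post-processing. It assumes AutoModelForSequenceClassification (a task-specific Auto class not used elsewhere in this guide) and the default sentiment model named earlier. The softmax is written in plain Python only to make the post-processing step visible.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_by_hand(text,
                     model_name="distilbert-base-uncased-finetuned-sst-2-english"):
    """Return {'label', 'score'} for one text without using pipeline()."""
    # Imported lazily so softmax() works without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")    # 1. tokenization
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()  # 2. forward pass
    probs = softmax(logits)                          # 3. post-processing
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": model.config.id2label[best], "score": probs[best]}
```

In practice you would likely use torch.softmax directly. The point here is that the label and score the pipeline returns come from the largest softmax probability over the model's logits.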
Spaces and Deployment
Spaces turn your Python scripts into interactive web apps. The platform handles hosting, SSL, and scaling. You write the application logic; Hugging Face runs it.
Create a Space. On the Hub, click "New Space," choose Gradio or Streamlit as your SDK, and select a hardware tier. Free CPU Spaces cost nothing. GPU options range from $0.40/hr (T4) up to $23.50/hr for high-end hardware.
Minimal Gradio App
Create an app.py file with the following code:
import gradio as gr
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
def analyze(text):
    result = classifier(text)
    return f"{result[0]['label']} ({result[0]['score']:.4f})"
demo = gr.Interface(fn=analyze, inputs="text", outputs="text",
                    title="Sentiment Analysis")
demo.launch()
Push this file to your Space repository together with a requirements.txt that lists transformers and torch. Hugging Face detects the app.py, installs dependencies from requirements.txt, and deploys the interface automatically. Your app gets a public URL within minutes.
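If you prefer to push from Python instead of Git, one possible sketch uses the huggingface_hub client. The function names and the repo_id you pass are placeholders. The calls used are HfApi.create_repo (with repo_type="space" and space_sdk="gradio") and HfApi.upload_folder, and a write-scoped token is assumed to be configured.

```python
from pathlib import Path


def write_requirements(folder, packages=("transformers", "torch")):
    """Write a requirements.txt next to app.py and return its path."""
    path = Path(folder) / "requirements.txt"
    path.write_text("\n".join(packages) + "\n", encoding="utf-8")
    return path


def push_space(folder, repo_id):
    """Create the Space if needed and upload the folder contents."""
    # Imported lazily; requires `pip install huggingface_hub` and a write token.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id, repo_type="space", space_sdk="gradio", exist_ok=True)
    api.upload_folder(folder_path=str(folder), repo_id=repo_id, repo_type="space")
```

A typical call would be write_requirements("my-space") followed by push_space("my-space", "your-username/your-space"), where both names are yours to choose.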
For production workloads, consider Inference Endpoints. These are dedicated auto-scaling GPU environments with scale-to-zero capability. CPU endpoints start at $0.03/hr, and GPU endpoints (A100, H100) scale up to $10/hr depending on the hardware. Unlike Spaces, Inference Endpoints are designed for API-level traffic with SLA guarantees.
Cost control: The free tier includes unlimited public repos and basic CPU Spaces. The Pro plan ($9/month) adds 1 TB private storage, 10 ZeroGPU Spaces, and 20x Inference Providers quota. Enterprise Hub starts around $20/user/month for teams with custom contracts.
Next Steps
With the fundamentals covered, here is where to go deeper depending on your goals:
- Fine-tuning. Use the Trainer API to fine-tune models on your own data. Start with a small dataset and a LoRA adapter via the PEFT library to reduce compute costs.
- Diffusion models. The Diffusers library provides a framework for text-to-image, image-to-image, and inpainting tasks using models like Stable Diffusion.
- Distributed training. The Accelerate library abstracts multi-GPU and TPU training. It requires minimal code changes to scale from one GPU to a cluster.
- Evaluation. The Evaluate library provides standardized metric computation (BLEU, ROUGE, accuracy, F1) so your benchmarks are reproducible.
- Inference Providers. Hugging Face offers pass-through access to third-party inference providers with no markup. This gives you vendor flexibility without API integration overhead. Teams building agentic AI workflows can chain these providers into multi-step pipelines.
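To see why a LoRA adapter reduces compute costs, as the fine-tuning item above suggests, it helps to count trainable weights. This is a back-of-the-envelope sketch for a single dense layer. The sizes are illustrative (a hidden width of 768 and rank 8), and real savings depend on which layers the adapter targets.

```python
def full_params(d_in, d_out):
    """Trainable weights when fine-tuning a dense d_in x d_out matrix directly."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable weights for a rank-r update: A is r x d_in, B is d_out x r."""
    return rank * (d_in + d_out)


d, r = 768, 8  # illustrative sizes only
print(full_params(d, d))                                  # 589824
print(lora_params(d, d, r))                               # 12288
print(f"{lora_params(d, d, r) / full_params(d, d):.2%}")  # 2.08%
```

For this one layer, the adapter trains about 2% of the weights that full fine-tuning would update.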
Troubleshooting
These are the most common issues beginners encounter. Each solution comes from official Hugging Face documentation and verified community fixes.
GPU not detected. Run import torch; print(torch.cuda.is_available()) to verify. If the result is False, your PyTorch installation does not include CUDA bindings. Reinstall PyTorch with the correct CUDA version from pytorch.org/get-started/locally/. Check that your NVIDIA drivers are up to date with nvidia-smi.
CUDA out of memory. Reduce batch size first. If that is not enough, enable gradient accumulation, switch to mixed precision training (fp16=True in Trainer), or apply model quantization with bitsandbytes. For very large models, use device_map="auto" to spread layers across available GPUs.
Access denied on gated models. Some models (Llama, Gemma) require you to accept their license on the model page before downloading. Visit the model card, accept the terms, then run huggingface-cli login with a valid token from huggingface.co/settings/tokens.
Module not found after installation. Confirm you are in the correct virtual environment. Run which python (Linux/Mac) or where python (Windows) to verify the active interpreter. If the path does not point to your venv, activate it with source hf-env/bin/activate or hf-env\Scripts\activate on Windows.
Slow or failed model downloads. Large models (7B+ parameters) often exceed ten gigabytes. If you clone repositories with Git, ensure Git LFS is installed (git lfs install), and check that your network connection is stable. You can also set HF_HUB_ENABLE_HF_TRANSFER=1 and install the hf_transfer package for faster downloads using the Rust-based transfer client.
Dependency conflicts. These typically happen when Transformers, PyTorch, and other ML libraries have overlapping dependency requirements. The fix is to always use a dedicated virtual environment. Run pip install --upgrade transformers torch in a clean environment. If using conda, prefer conda install pytorch -c pytorch -c nvidia to get a pre-resolved dependency set.
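For the out-of-memory advice above, a rough estimate of weight memory shows why lower precision and quantization help. This sketch counts the weights only. Activations, optimizer state, and caches add more, so treat the numbers as a lower bound. The precision labels are illustrative shorthand.

```python
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_memory_gb(num_params, precision):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"7B model, {precision}: ~{weight_memory_gb(7e9, precision):.1f} GB")
# 7B model, fp32: ~28.0 GB
# 7B model, fp16: ~14.0 GB
# 7B model, int8: ~7.0 GB
# 7B model, 4bit: ~3.5 GB
```

Comparing these figures with your GPU's memory is a quick way to decide between mixed precision, quantization, and spreading layers across devices.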