**Preparatory Steps**
Before using pre-trained models in PyTorch on Ubuntu, ensure your environment is correctly configured. Install Python (≥ 3.8) and pip via `sudo apt update && sudo apt install python3 python3-pip`. For GPU acceleration, install the appropriate NVIDIA driver (use `ubuntu-drivers devices` to check compatibility) and CUDA/cuDNN versions aligned with your PyTorch version (refer to the official PyTorch website for guidance). Install PyTorch with the correct CUDA support; for example, for CUDA 11.8, use `pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118`. Common utilities like matplotlib and numpy can be added via `pip3 install matplotlib numpy`.
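To confirm the installation works before going further, a quick sanity check in Python (this assumes only the `torch` package installed above):

```python
import torch

print(torch.__version__)          # Installed PyTorch version
print(torch.cuda.is_available())  # True if the GPU build, driver, and CUDA stack line up
print(torch.version.cuda)         # CUDA version this PyTorch build was compiled against
```

If `torch.cuda.is_available()` returns `False` on a GPU machine, the installed wheel usually does not match the driver or CUDA version.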
**Loading a Pre-trained Model**
PyTorch’s `torchvision.models` module provides a range of pre-trained models (e.g., ResNet, VGG, DenseNet). To load a model with pre-trained weights, specify `pretrained=True` when instantiating the model. For example, to load ResNet-18:
```python
import torchvision.models as models

model = models.resnet18(pretrained=True)  # Downloads and loads ImageNet-pretrained weights
```
This call automatically downloads the model weights (if they are not already cached locally) and loads them into the model. Note that it does not put the model into inference mode; call `model.eval()` before running predictions, as shown below.
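On torchvision 0.13 and newer, `pretrained=True` still works but emits a deprecation warning; its replacement is the `weights` argument. A minimal equivalent, assuming that torchvision version:

```python
import torchvision.models as models

# torchvision >= 0.13 style: the weights enum replaces pretrained=True
weights = models.ResNet18_Weights.DEFAULT  # Best available ImageNet weights
model = models.resnet18(weights=weights)
```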
**Preparing Input Data**
Pre-trained models expect input images to be preprocessed in a specific way (e.g., resized to 224x224 pixels and normalized with ImageNet’s mean and standard deviation). Use `torchvision.transforms` to define a preprocessing pipeline:
```python
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),        # Resize the shorter side to 256 pixels
    transforms.CenterCrop(224),    # Crop the center 224x224 region
    transforms.ToTensor(),         # Convert PIL image to a [0, 1] tensor
    transforms.Normalize(          # Normalize with ImageNet statistics
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])
```
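If you load the model through the torchvision 0.13+ weights enum shown earlier, each weight set also bundles its canonical preprocessing, which can stand in for the hand-written pipeline above (a sketch assuming that version):

```python
import torchvision.models as models

# The bundled transform performs the resize, crop, tensor conversion, and normalization
weights = models.ResNet18_Weights.DEFAULT
preprocess = weights.transforms()
```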
Load an image using PIL (`from PIL import Image`) and apply the preprocessing:
```python
from PIL import Image

img = Image.open('path_to_image.jpg').convert('RGB')  # Replace with your image path; force 3 channels
img_tensor = preprocess(img)
img_tensor = img_tensor.unsqueeze(0)  # Add batch dimension (models expect NCHW input)
```
**Running Inference**
With the model loaded and input preprocessed, perform inference in evaluation mode (so dropout is disabled and batch normalization uses its running statistics) and inside `torch.no_grad()` to avoid unnecessary gradient calculations:
```python
import torch

model.eval()           # Set model to evaluation mode
with torch.no_grad():  # Disable gradient computation
    output = model(img_tensor)  # Forward pass
```
The output is a tensor of logits (raw scores) for each class. Convert these to probabilities using `torch.nn.functional.softmax`:
```python
import torch.nn.functional as F

probabilities = F.softmax(output[0], dim=0)  # Compute class probabilities
```
Print the top-5 predicted classes with their probabilities:
```python
top5_prob, top5_indices = torch.topk(probabilities, 5)  # Top 5 probabilities and their indices
for idx, prob in zip(top5_indices, top5_prob):
    print(f"Class {idx.item()}: {prob.item():.4f}")
```
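These are raw ImageNet class indices. To print human-readable labels instead, torchvision 0.13+ exposes the class names through the weights metadata (a sketch assuming that version and the `weights` enum from the loading step):

```python
# weights.meta["categories"] is the list of 1000 ImageNet class names
categories = weights.meta["categories"]
for idx, prob in zip(top5_indices, top5_prob):
    print(f"{categories[idx]}: {prob.item():.4f}")
```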
**Fine-tuning for Custom Tasks (Transfer Learning)**
To adapt a pre-trained model to a new task (e.g., classifying 10 custom classes), modify the final fully connected (FC) layer to match the new number of output classes. For ResNet-18:
```python
import torch.nn as nn

num_classes = 10  # Number of custom classes
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # Replace the final FC layer
```
Adjust the data preprocessing to match the new dataset’s requirements (e.g., resize to 224x224; keep the pre-trained model’s normalization statistics, as noted in the key notes below). Define a loss function (e.g., `CrossEntropyLoss` for classification) and an optimizer (e.g., SGD with momentum):
```python
import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # Standard loss for multi-class classification
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
```
Train the model on your custom dataset (split into training and validation sets) and evaluate performance periodically.
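A minimal training loop might look like the following sketch; `train_loader` is a hypothetical `DataLoader` over your dataset, and the epoch count is task-dependent:

```python
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

model.train()  # Enable training behavior for dropout/batch norm
for epoch in range(10):                  # Epoch count is task-dependent
    for images, labels in train_loader:  # train_loader: your DataLoader (hypothetical)
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()            # Clear gradients from the previous step
        loss = criterion(model(images), labels)
        loss.backward()                  # Backpropagate
        optimizer.step()                 # Update the weights
```

After training, save the model’s state dictionary to reuse it later: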
```python
torch.save(model.state_dict(), 'custom_model.pth')  # Save only the weights, not the architecture
```
To reload the model, instantiate the same architecture and load the saved weights:
```python
model = models.resnet18(pretrained=False)  # Same architecture, without the default weights
model.fc = nn.Linear(model.fc.in_features, num_classes)  # Reapply the custom FC layer
model.load_state_dict(torch.load('custom_model.pth'))  # Load the saved weights
model.eval()  # Set to evaluation mode
```
**Key Notes**
- Always set the model to `eval()` mode during inference or fine-tuning validation to ensure correct behavior of layers like dropout and batch normalization.
- For transfer learning, freezing earlier layers (e.g., `for param in model.parameters(): param.requires_grad = False`) can speed up training and prevent overfitting when you have limited data; see the sketch after this list.
- Normalize input data using the same mean and standard deviation as the original pre-trained model to avoid performance degradation.
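As a sketch of the freezing approach from the second note (assuming the ResNet-18 setup above): freeze the backbone first, then replace the head so that only the new layer is trained:

```python
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False  # Freeze all pre-trained layers

# The new head is created after freezing, so its parameters stay trainable
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Pass only the trainable head parameters to the optimizer
optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
```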