
Machine Learning in Practice with PyTorch on CentOS

小樊
2025-11-08 06:20:25
Column: Intelligent O&M

Installing Python and Dependencies
Before installing PyTorch, make sure your CentOS system has a reasonably recent Python 3 (current PyTorch releases require 3.8 or newer) and the essential build tools. Run the following commands to update the system, install Python, and set up development dependencies:

sudo yum update -y
sudo yum install -y python3 python3-pip python3-devel
sudo yum groupinstall -y "Development Tools"  # Includes gcc, make, etc.

These steps ensure compatibility with PyTorch and its dependencies.

Creating a Virtual Environment
Isolate PyTorch and project dependencies by creating a virtual environment. This prevents conflicts with system-wide packages:

python3 -m venv pytorch_env  # Create environment
source pytorch_env/bin/activate  # Activate (run this in every new terminal)

For advanced isolation, use conda (if installed):

conda create -n pytorch_env python=3.9 -y
conda activate pytorch_env

Virtual environments are optional but highly recommended for clean project management.
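If you want to confirm that the environment is actually active before installing anything, a quick check from Python itself is to look at the interpreter path. This is just a convenience sketch, not a required step:

import sys

# When the environment is active, both paths should point inside pytorch_env
print(sys.executable)  # e.g. .../pytorch_env/bin/python3
print(sys.prefix)      # root directory of the active environment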

Installing PyTorch
Choose between CPU and GPU versions based on your hardware. For CPU-only systems, use:

pip install torch torchvision torchaudio

For GPU acceleration, you need NVIDIA drivers, CUDA Toolkit, and cuDNN. After installing these, use:

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117  # Replace cu117 with your CUDA version (e.g., cu113, cu116)

Verify installation with:

import torch
print(torch.__version__)  # Check PyTorch version
print(torch.cuda.is_available())  # Should return True for GPU version

This confirms PyTorch is correctly installed and accessible.
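A common follow-up, regardless of whether you installed the CPU or GPU build, is to pick a device once and reuse it throughout your code. The snippet below is a typical pattern rather than a required step:

import torch

# Fall back to CPU automatically when CUDA is not available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

if device.type == "cuda":
    print(torch.version.cuda)             # CUDA version PyTorch was built against
    print(torch.cuda.get_device_name(0))  # Name of the first GPU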

Preparing Data
Use torchvision for common datasets (e.g., MNIST, CIFAR-10) or pandas/numpy for custom data. A typical pipeline loads the data, applies preprocessing, and wraps the result in a DataLoader for batched iteration:

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define preprocessing (convert images to tensors and normalize)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # Normalize pixel values to [-1, 1]
])

# Load MNIST dataset (download if not exists)
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoaders (shuffle training data, batch size 64)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

This setup is efficient for small to medium datasets. For larger or custom datasets, consider writing your own torch.utils.data.Dataset (sketched below) and raising num_workers in the DataLoader so batches are prepared in parallel.
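For custom data, for example a CSV file read with pandas, the usual approach is to wrap it in a torch.utils.data.Dataset subclass and hand that to a DataLoader. The sketch below assumes a hypothetical features.csv whose last column is an integer class label; adjust the column handling to your own data:

import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

class CsvDataset(Dataset):
    def __init__(self, csv_path):
        df = pd.read_csv(csv_path)
        # Assumes the last column is the integer class label; change as needed
        self.features = torch.tensor(df.iloc[:, :-1].values, dtype=torch.float32)
        self.labels = torch.tensor(df.iloc[:, -1].values, dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

dataset = CsvDataset("features.csv")  # hypothetical file name
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=2)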

Defining a Model
Use PyTorch’s nn.Module to define a neural network. A simple feedforward model for MNIST (28x28 images → 10 classes) looks like this:

import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.fc1 = nn.Linear(28*28, 128)  # Input layer (flatten 28x28 to 784)
        self.relu = nn.ReLU()             # Activation function
        self.fc2 = nn.Linear(128, 10)     # Output layer (10 classes)

    def forward(self, x):
        x = x.view(-1, 28*28)  # Flatten input tensor
        x = self.relu(self.fc1(x))  # First layer + activation
        x = self.fc2(x)          # Output layer
        return x

model = SimpleNet()  # Instantiate model

For complex tasks (e.g., CNNs for images, LSTMs for sequences), use specialized layers like nn.Conv2d or nn.LSTM.
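As a rough illustration of the convolutional alternative mentioned above, a small CNN for the same 28x28 MNIST input could look like the following. The layer sizes here are arbitrary choices for the sketch, not tuned values:

import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # 1x28x28 -> 16x28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 16x14x14 -> 32x14x14
        self.pool = nn.MaxPool2d(2)                               # halves spatial size
        self.relu = nn.ReLU()
        self.fc = nn.Linear(32 * 7 * 7, 10)                       # 32x7x7 flattened -> 10 classes

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))  # -> 16x14x14
        x = self.pool(self.relu(self.conv2(x)))  # -> 32x7x7
        x = x.view(x.size(0), -1)                # flatten per sample
        return self.fc(x)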

Training the Model
Training involves a loop over epochs, where the model learns from data by minimizing a loss function. Key steps include:

  1. Defining a loss function (e.g., cross-entropy for classification).
  2. Selecting an optimizer (e.g., SGD, Adam).
  3. Iterating over batches, computing loss, backpropagating gradients, and updating weights.

Here’s a complete training loop for the MNIST example:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # Loss function (classification)
optimizer = optim.SGD(model.parameters(), lr=0.01)  # Optimizer (SGD with learning rate 0.01)

num_epochs = 5
for epoch in range(num_epochs):
    model.train()  # Set model to training mode
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()  # Clear previous gradients
        outputs = model(images)  # Forward pass
        loss = criterion(outputs, labels)  # Compute loss
        loss.backward()  # Backpropagate gradients
        optimizer.step()  # Update weights

        running_loss += loss.item()  # Track loss

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}")

This loop trains the model for 5 epochs, printing the average loss per epoch.
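If you installed the GPU build, the same loop works after moving the model and each batch to the device chosen earlier; swapping SGD for Adam is equally mechanical. The following sketch reuses SimpleNet, criterion, num_epochs, and train_loader from the code above and is one common variant, not the only way to do it:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleNet().to(device)                       # move parameters to the GPU (or stay on CPU)
optimizer = optim.Adam(model.parameters(), lr=1e-3)  # Adam is a common alternative to SGD

for epoch in range(num_epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)  # move each batch to the same device
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()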

Evaluating the Model
After training, evaluate the model on the test set to measure accuracy. Use torch.no_grad() to disable gradient tracking, which saves memory and speeds up inference:

model.eval()  # Set model to evaluation mode
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)  # Get predicted class
        total += labels.size(0)  # Total samples
        correct += (predicted == labels).sum().item()  # Correct predictions

print(f"Test Accuracy: {100 * correct / total:.2f}%")

This gives the percentage of correctly classified images out of the total test images.
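If a single accuracy number is too coarse, a per-class breakdown reuses the same evaluation loop. This is an optional diagnostic, sketched here for the 10 MNIST digit classes:

correct_per_class = [0] * 10
total_per_class = [0] * 10

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        _, predicted = torch.max(model(images), 1)
        for label, pred in zip(labels, predicted):
            label = label.item()
            total_per_class[label] += 1
            correct_per_class[label] += int(pred.item() == label)

for cls in range(10):
    if total_per_class[cls] > 0:
        print(f"Digit {cls}: {100 * correct_per_class[cls] / total_per_class[cls]:.2f}%")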

Saving and Loading the Model
Save the trained model’s parameters to a file for later use (e.g., inference or further training):

# Save model
torch.save(model.state_dict(), 'simple_net.pth')

# Load model (in a new script or after restarting)
model = SimpleNet()  # Recreate model architecture
model.load_state_dict(torch.load('simple_net.pth'))  # Load saved weights
model.eval()  # Set to evaluation mode

Saving/loading ensures you don’t lose trained models and can reuse them efficiently.
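Once the weights are reloaded, running inference on a single example only needs a forward pass under torch.no_grad(). The snippet below pulls one image from the test_dataset defined earlier and is meant purely as an illustrative sketch:

image, true_label = test_dataset[0]          # one preprocessed 1x28x28 tensor and its label
with torch.no_grad():
    logits = model(image.unsqueeze(0))       # add a batch dimension -> 1x1x28x28
    predicted = logits.argmax(dim=1).item()  # index of the highest-scoring class

print(f"Predicted: {predicted}, actual: {true_label}")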
