Transfer Learning in Python Deep Learning: Core Concepts and Practical Guide
Transfer learning is a deep learning technique that reuses the knowledge of a pretrained model (one already trained on a large dataset such as ImageNet) for a new task. It addresses problems such as insufficient data and limited compute, and can significantly improve model performance. The core idea is knowledge transfer: the lower layers of a pretrained network extract generic features (edges, textures, and the like), while the top layers are specific to the source task. A new task can therefore keep the generic feature extractor and replace or fine-tune the top layers to fit its own needs.
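As a minimal illustration of this idea (a sketch, not part of the original examples), the snippet below treats a pretrained ResNet-18 as a fixed, generic feature extractor by swapping its classification head for an identity mapping; the resulting 512-dimensional features can then feed any new task-specific classifier.
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on ImageNet and freeze its weights
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False
backbone.fc = nn.Identity()  # drop the source-task classifier, keep the generic features
backbone.eval()

# A dummy batch: 4 RGB images of size 224x224
images = torch.randn(4, 3, 224, 224)
with torch.no_grad():
    features = backbone(images)  # shape: (4, 512)
print(features.shape)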
Split the dataset into training, validation, and test sets (for example in an 8:1:1 ratio) and preprocess the images (resizing, normalization, and data augmentation, as shown in the code examples below).
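A minimal sketch of the 8:1:1 split, assuming an ImageFolder-style dataset and using torch.utils.data.random_split (the path 'data/all' and the seed are placeholders):
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_dataset = datasets.ImageFolder(root='data/all', transform=transforms.ToTensor())

# 8:1:1 split into train / validation / test
n_total = len(full_dataset)
n_train = int(0.8 * n_total)
n_val = int(0.1 * n_total)
n_test = n_total - n_train - n_val
train_set, val_set, test_set = random_split(
    full_dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42)  # reproducible split
)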
Choose a suitable pretrained model for the task type:
- Loading: for image tasks, load a pretrained CNN such as torchvision.models.resnet18(pretrained=True) (PyTorch) or tensorflow.keras.applications.VGG16(weights='imagenet', include_top=False) (Keras).
- Replacing the head: swap ResNet's fc layer for nn.Linear(in_features, num_classes); for VGG, add Dense(256, activation='relu') and Dense(num_classes, activation='softmax') after the Flatten layer.
Set up the loss function, optimizer, and evaluation metrics:
- Loss and optimizer: for multi-class classification, CrossEntropyLoss with Adam (learning rate 0.001) or SGD (learning rate 0.01, momentum 0.9); CrossEntropyLoss with AdamW (learning rate 2e-5) is a common choice when fine-tuning Transformer-based pretrained models.
- Metrics: accuracy, precision, and recall.
Choose a training strategy:
- Feature extraction: freeze the pretrained layers (e.g., layer.trainable = False) and train only the newly added top layers.
- Fine-tuning: unfreeze some of the higher layers (e.g., ResNet's layer4) and continue training with a smaller learning rate (e.g., 0.0001).
Train iteratively with model.fit() (TensorFlow) or trainer.train() (PyTorch Lightning), monitoring performance on the validation set. Finally, evaluate the model on the test set; if the results are unsatisfactory, consider adjustments such as lowering the learning rate (e.g., with a StepLR scheduler).
A complete TensorFlow/Keras example (feature extraction with ResNet50):
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Data preprocessing and augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
# Load the pretrained model (without the top classification layers)
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the pretrained layers
for layer in base_model.layers:
    layer.trainable = False
# Add a new classification head
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)  # assuming 10 classes
# Build the model
model = Model(inputs=base_model.input, outputs=predictions)
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model (feature extraction)
model.fit(train_generator, epochs=10, steps_per_epoch=100)
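The example above is the feature-extraction stage only. A follow-up fine-tuning stage, as described in the workflow, might look like the sketch below, continuing from the variables defined above; the number of unfrozen layers and the learning rate are illustrative choices, not values from the original article.
# Unfreeze the top of the backbone for fine-tuning (illustrative: last 10 layers)
for layer in base_model.layers[-10:]:
    layer.trainable = True

# Recompile with a much smaller learning rate so the pretrained weights change gently
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Continue training for a few more epochs
model.fit(train_generator, epochs=5, steps_per_epoch=100)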
The same feature-extraction workflow in PyTorch:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models, transforms
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
# Data preprocessing and augmentation
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet statistics
])
train_dataset = ImageFolder(root='data/train', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
# Load the pretrained model
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# Freeze the pretrained layers
for param in model.parameters():
    param.requires_grad = False
# Replace the classification head
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)  # assuming 10 classes
# Define the loss function and optimizer (only the new head's parameters are trainable)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.001)
# Train the model (feature extraction)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
for epoch in range(10):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}')
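Like the Keras example, this loop trains only the new head. A sketch of the subsequent fine-tuning stage, continuing from the variables above and using the layer4 unfreezing, smaller learning rate, and StepLR scheduler mentioned in the workflow (epoch count, step_size, and gamma are illustrative values):
# Unfreeze the last residual stage for fine-tuning
for param in model.layer4.parameters():
    param.requires_grad = True

# Smaller learning rate for the unfrozen layers; StepLR decays it further during training
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

for epoch in range(5):
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    scheduler.step()
    print(f'Fine-tune epoch {epoch+1}, Loss: {running_loss/len(train_loader)}')
After fine-tuning, evaluate on the held-out test set (accuracy, precision, recall) before deciding on any further adjustments.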