Best Practices for Deploying PyTorch Models on CentOS

1. Environment Preparation and Version Selection
Install Python and pip, create an isolated virtual environment, then install PyTorch and its companion libraries:

```shell
sudo yum install -y python3 python3-pip
python3 -m venv venv && source venv/bin/activate
pip install torch torchvision torchaudio
```

2. Model Export and Optimization
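The export snippets in this section assume a trained `model` and an `example_input` tensor are already defined. As a minimal self-contained sketch (the two-layer network here is purely illustrative, not part of any real deployment):

```python
import torch
import torch.nn as nn

# Illustrative stand-in for a trained model; substitute your own network.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

# A representative input with the same shape the deployed model will see.
example_input = torch.randn(1, 8)

# Tracing records the operations executed for this concrete input.
traced = torch.jit.trace(model, example_input)
print(traced(example_input).shape)  # torch.Size([1, 2])
```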
TorchScript offers two export paths: tracing records the operations executed on an example input, while scripting compiles the module's Python source and preserves data-dependent control flow:

```python
import torch

# Tracing: fast, but only captures the code path taken by example_input.
traced = torch.jit.trace(model, example_input)
traced.save("traced.pt")

# Scripting: compiles the module, preserving control flow such as if/for.
scripted = torch.jit.script(model)
scripted.save("scripted.pt")
```

For cross-runtime deployment, export to ONNX, validate the graph, and load it with ONNX Runtime, selecting a GPU provider only when CUDA is available:

```python
import onnx
import onnxruntime as ort
import torch

torch.onnx.export(model, dummy_input, "model.onnx",
                  opset_version=14,
                  dynamic_axes=...)  # supply your dynamic_axes mapping here

onnx_model = onnx.load("model.onnx")
onnx.checker.check_model(onnx_model)

provider = ("CUDAExecutionProvider" if torch.cuda.is_available()
            else "CPUExecutionProvider")
ort_session = ort.InferenceSession("model.onnx", providers=[provider])
```

For CPU serving, post-training static quantization with the fbgemm backend can shrink the model and speed up inference. Calibration data must be run through the prepared model before conversion, and the quantized module is saved via TorchScript (a plain nn.Module has no `.save` method):

```python
import torch
from torch.ao import quantization

model.eval()
model.qconfig = quantization.get_default_qconfig("fbgemm")
prepared = quantization.prepare(model, inplace=False)

# Run representative calibration batches through `prepared` here so the
# observers can record activation ranges. Note that eager-mode static
# quantization also requires QuantStub/DeQuantStub around the forward pass.

quantized = quantization.convert(prepared, inplace=False)
torch.jit.script(quantized).save("quantized.pt")
```

3. Service Deployment and Process Management
Serve the model behind FastAPI with uvicorn:

```shell
pip install fastapi 'uvicorn[standard]'
uvicorn app:app --host 0.0.0.0 --port 8000
```

For production, run the service under systemd so it restarts on failure and starts at boot. Create /etc/systemd/system/model-api.service:

```ini
[Unit]
Description=Model API Service

[Service]
ExecStart=/path/to/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8000
WorkingDirectory=/path/to/app
User=appuser
Restart=always

[Install]
WantedBy=multi-user.target
```

Reload systemd, enable and start the service, then open the port in firewalld:

```shell
sudo systemctl daemon-reload
sudo systemctl enable --now model-api
sudo systemctl status model-api
sudo firewall-cmd --zone=public --add-port=8000/tcp --permanent
sudo firewall-cmd --reload
```

4. Containerization and Delivery
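The Dockerfile below installs dependencies from a requirements.txt; pinning exact versions keeps image builds reproducible. A sketch of such a file (the pins shown are illustrative, not a recommendation — match the versions you actually tested):

```text
# requirements.txt -- pins are illustrative; match your tested environment
torch==2.3.1
torchvision==0.18.1
fastapi==0.111.0
uvicorn[standard]==0.30.1
gunicorn==22.0.0
```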
Note that `COPY` takes no shell operators such as `&&`, and `WORKDIR` must be its own instruction. Since FastAPI is an ASGI application, gunicorn also needs the uvicorn worker class:

```dockerfile
FROM nvidia/cuda:12.3.2-base-centos7

RUN yum install -y python3 python3-pip && \
    pip3 install --upgrade pip

WORKDIR /app
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY . .

# gunicorn alone speaks WSGI; the uvicorn worker bridges it to ASGI.
CMD ["gunicorn", "app:app", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8000", "--timeout", "120"]
```

Build the image and run it with GPU access:

```shell
docker build -t model-api:latest .
docker run --gpus all -p 8000:8000 model-api:latest
```

5. Launch Checklist and Common Issues
- GPU not visible: check the nvidia-smi output and driver version, and confirm the container was started with --gpus.
- ONNX Runtime falls back to CPU: install the onnxruntime-gpu package and pass the correct ExecutionProvider.
- Service fails to start: inspect journalctl -u model-api -xe and the container logs, and verify the working directory, dependencies, and port conflicts.