Documenting Debian Timers in Practice
1 Documentation Goals and Deliverables
2 Recommended Metadata Fields and Template
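| Field | Meaning | Example |
|---|---|---|
| Name | Unique identifier of the timer/task | backup-db.timer |
| Purpose | Business purpose and scope | Daily full backup of PostgreSQL to remote storage |
| Owner | Owner and contact details | alice@example.com |
| Schedule | Trigger rule and time zone | `OnCalendar=*-*-* 02:00:00`; TZ=Asia/Shanghai |
| OnFailure | Failure-handling policy | Restart=on-failure; OnFailure=notify-failed@%n |
| RunAs | Executing user/group | User=backup; Group=backup |
| Logs | Log location and retention | journalctl -u backup-db.timer -u backup-db.service; retain 30 days |
| Metrics | Monitoring and alerting | Prometheus: up{job="backup-db"}; threshold: failure rate > 1%/day |
| Concurrency | Concurrency and staggering | RandomizedDelaySec=5m; avoid running at the same time as db-cleanup |
| Artifacts | Outputs and retention | /var/backups/db-YYYYMMDD.sql.gz; retain 7 days |
| Rollback | Rollback and emergency handling | Manually restore from the most recent backup; emergency switch: systemctl stop backup-db.timer |
| ChangeLog | Change history | 2025-09-01: moved from 03:00 to 02:00 to avoid peak business load |
| Related | Related units/dependencies | Requires=network-online.target; Wants=postgresql.service |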
# Name: backup-db.timer
# Purpose: Daily PostgreSQL full backup at 02:00 (CST)
# Owner: alice@example.com
# Schedule: OnCalendar=*-*-* 02:00:00; TZ=Asia/Shanghai
# RunAs: User=backup; Group=backup
# Logs: journalctl -u backup-db.timer -u backup-db.service
# Metrics: up{job="backup-db"}; alert if failure > 1%/day
# Concurrency: RandomizedDelaySec=5m
# Artifacts: /var/backups/db-*.sql.gz (7d)
# ChangeLog:
# 2025-09-01: Shift schedule to 02:00 to avoid peak load
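A header like the one above only helps if it stays complete. As a minimal sketch, assuming the metadata is kept as `# Key: value` comment lines at the top of the unit file, the following Python helper extracts the fields and reports which ones are absent. The names `parse_header`, `missing_fields`, and `REQUIRED_FIELDS` are hypothetical, and the choice of mandatory fields is an assumption for illustration, not something systemd defines.

```python
import re

# Assumption: fields this sketch treats as mandatory; adjust to your template.
REQUIRED_FIELDS = ("Name", "Purpose", "Owner", "Schedule", "RunAs", "Logs")

# Matches "# Key: value" where Key is a single word starting with a capital letter.
# Dated ChangeLog entries such as "# 2025-09-01: ..." do not match and are skipped.
_FIELD_RE = re.compile(r"^#\s?([A-Z][A-Za-z]*):\s*(.*)$")


def parse_header(text: str) -> dict:
    """Collect `# Key: value` pairs from the leading comment block of a unit file."""
    fields = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            if line.strip() == "":
                continue  # tolerate blank lines inside the header
            break  # the first non-comment line (e.g. "[Unit]") ends the header
        match = _FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def missing_fields(fields: dict) -> list:
    """Return required field names that are absent or empty, in template order."""
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


if __name__ == "__main__":
    sample = (
        "# Name: backup-db.timer\n"
        "# Owner: alice@example.com\n"
        "[Unit]\n"
    )
    print(missing_fields(parse_header(sample)))
```

A check like this could run in a pre-commit hook or CI job so that a unit file without an owner or schedule note is caught before it is deployed.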
3 Recording Key Information in systemd Units
# /etc/systemd/system/backup-db.timer
[Unit]
Description=Daily PostgreSQL backup at 02:00 (CST) | Owner: alice@example.com
Documentation=https://git.example.com/ops/backup-db
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
AccuracySec=1s
RandomizedDelaySec=5m
Unit=backup-db.service
[Install]
WantedBy=timers.target
# /etc/systemd/system/backup-db.service
[Unit]
Description=Run PostgreSQL backup | Owner: alice@example.com
After=postgresql.service network-online.target
Requires=network-online.target
[Service]
Type=oneshot
User=backup
Group=backup
ExecStartPre=/usr/local/bin/backup-db-prepare.sh
ExecStart=/usr/local/bin/backup-db.sh
ExecStartPost=/usr/local/bin/backup-db-verify.sh
StandardOutput=journal
StandardError=journal
# Exit status 1 means "backup already exists / nothing to redo" and is not treated as a failure
SuccessExitStatus=0 1
Restart=on-failure
RestartSec=30
# Optional: hook into monitoring/alerting
# Environment=NOTIFY_SOCKET=/run/systemd/notify
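Because the unit files double as documentation, it can help to lint them for the documentation-related details. The sketch below is a hypothetical Python checker (`lint_unit` is an illustrative name, not a systemd tool). It flags two things: a `[Unit]` section without `Description=` or `Documentation=`, and values that carry a trailing ` # ...` remark. systemd only recognizes comments on lines that begin with `#` or `;`, so a trailing remark becomes part of the value. The trailing-comment check is a simple heuristic and may produce false positives for commands that legitimately contain `#`. Line continuations with a backslash are not handled.

```python
def lint_unit(text: str) -> list:
    """Return human-readable findings for the text of a systemd unit file."""
    findings = []
    section = None
    seen = set()  # (section, key) pairs that appear in the file
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or full-line comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a Key=Value assignment; ignored by this sketch
        key = key.strip()
        seen.add((section, key))
        # Heuristic: a '#' after whitespace inside a value is probably an
        # intended comment, which systemd would read as part of the value.
        if " #" in value:
            findings.append(
                f"line {lineno}: trailing comment in {key}= is parsed as part of the value"
            )
    for key in ("Description", "Documentation"):
        if ("Unit", key) not in seen:
            findings.append(f"[Unit] is missing {key}=")
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python lint_unit.py /etc/systemd/system/backup-db.service
    with open(sys.argv[1], encoding="utf-8") as handle:
        for finding in lint_unit(handle.read()):
            print(finding)
```

Against the service unit above, which has no `Documentation=` line, the sketch would report that field as missing. It complements rather than replaces `systemd-analyze verify`, which checks whether systemd itself can load the unit.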
4 Change and Verification Workflow
5 Visualization and Audit Recommendations