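Monitoring the performance of GitLab projects on Linux combines system-level tools, GitLab's built-in features, and third-party visualization platforms to cover CPU, memory, disk, network, and project-level metrics. The steps are as follows.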
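GitLab must have metrics export enabled before an external monitoring tool such as Prometheus can scrape it.

Edit the GitLab configuration file (/etc/gitlab/gitlab.rb) and add the following settings. Setting names can differ between GitLab versions, so check them against the gitlab.rb template shipped with your installation:

    gitlab_rails['prometheus_export_address'] = 'localhost'  # metrics export address
    gitlab_rails['prometheus_export_port'] = '9090'          # metrics export port (default 9090)
    gitlab['monitoring'] = {
      'enable' => true  # enable monitoring
    }

Apply the configuration and restart GitLab:

    sudo gitlab-ctl reconfigure
    sudo gitlab-ctl restart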
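GitLab now exposes performance metrics in Prometheus format at localhost:9090/metrics.

Prometheus is an open-source time-series database that collects metrics from GitLab and the Linux host; Grafana is a visualization tool that turns those metrics into dashboards.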
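Before pointing Prometheus at the endpoint, it can help to confirm that it returns parseable data. The following is a minimal Python sketch, not part of GitLab or Prometheus: the function names `parse_metrics` and `fetch_metrics` are hypothetical, the default URL is the one configured above, and the parser is deliberately simplified (it ignores timestamps and does not handle every escaping rule of the exposition format).

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}.

    Simplified: comment lines (# HELP / # TYPE) are skipped and any
    trailing timestamp is ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: everything up to the last "}" is the key.
            idx = line.rindex("}") + 1
            series, rest = line[:idx], line[idx:]
        else:
            # Series without labels: the key ends at the first space.
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        samples[series] = float(fields[0])
    return samples


def fetch_metrics(url="http://localhost:9090/metrics", timeout=5):
    """Fetch the endpoint (assumed URL from the configuration above) and parse it."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()` on the GitLab server and checking that the returned dictionary is non-empty is a quick sanity check; an exception or an empty result suggests the export address or port differs from what was configured.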
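Edit the prometheus.yml configuration file and add GitLab as a scrape target:

    scrape_configs:
      - job_name: 'gitlab'  # name of the scrape job
        static_configs:
          - targets: ['gitlab.example.com:9090']  # GitLab metrics address (replace with the actual domain or IP)

If the export address is bound to localhost, the target is reachable only from the GitLab server itself, so either run Prometheus on that server or bind the exporter to an address Prometheus can reach.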
Start Prometheus:

    ./prometheus --config.file=prometheus.yml
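Once Prometheus is running, one way to confirm that the `gitlab` job is being scraped is to ask its HTTP API for the `up` series. The sketch below assumes Prometheus listens on `http://localhost:9090` (adjust to your server) and uses the standard `/api/v1/query` endpoint; the helper names `build_query_url`, `extract_values`, and `query` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(payload):
    """Return {instance: value} from an instant-vector API response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for item in payload["data"]["result"]:
        instance = item["metric"].get("instance", "")
        # Each sample is [unix_timestamp, "value-as-string"].
        values[instance] = float(item["value"][1])
    return values


def query(base_url, promql, timeout=5):
    """Run an instant query against Prometheus (base_url is an assumption)."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return extract_values(json.load(resp))
```

For example, `query("http://localhost:9090", 'up{job="gitlab"}')` should map each scraped target to `1.0` when the scrape succeeds and `0.0` when it fails.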
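In Grafana, add Prometheus as a data source with the URL http://<prometheus-server-ip>:9090, then import a GitLab monitoring dashboard (for example dashboard ID 4379, which covers CPU, memory, job status, and other metrics) or write custom PromQL queries:

CPU usage in cores, per instance:

    sum(rate(process_cpu_seconds_total[1m])) by (instance)

Host memory usage in percent (requires node_exporter metrics; the ratio of resident to virtual process memory does not measure memory usage):

    (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

Free space on the root filesystem in percent:

    (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100

Successful CI jobs per project:

    sum(gitlab_runner_jobs_status{status="success"}) by (project_name)

Alert rules can be evaluated by Prometheus and routed through Alertmanager, or defined in Grafana Alerting, so that a notification (email, Slack, and so on) is sent when a metric crosses its threshold.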
Create an alert rules file (alerts.yml):

    groups:
      - name: gitlab_performance
        rules:
          - alert: HighMemoryUsage
            # Host memory in use above 80% (node_exporter metrics). The resident/virtual
            # process-memory ratio is not a usage figure and carries no instance label.
            expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 80
            for: 5m  # condition must hold for 5 minutes
            labels:
              severity: warning
            annotations:
              summary: "High memory usage in GitLab ({{ $labels.instance }})"
              description: "Memory usage has been above 80% for 5 minutes."
          - alert: HighCPUUsage
            # More than 0.8 CPU cores in use, i.e. 80% of one core
            expr: sum(rate(process_cpu_seconds_total[1m])) by (instance) > 0.8
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "High CPU usage in GitLab ({{ $labels.instance }})"
              description: "CPU usage has been above 80% for 5 minutes."
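The `for: 5m` clause in these rules means a breach does not fire immediately: the alert is pending while the expression stays true and fires only once it has held for the full duration. The Python sketch below is a simplified model of that behavior for illustration, not Prometheus's implementation; `alert_states` is a hypothetical name, and the model assumes one evaluation per sample.

```python
def alert_states(samples, threshold, for_seconds):
    """Model the inactive -> pending -> firing lifecycle of an alert rule.

    samples: list of (timestamp_seconds, value) tuples in time order.
    Returns one state string per sample.
    """
    states = []
    active_since = None  # time at which the condition first became true
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            held = ts - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            # Any sample at or below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states


# One sample per minute; the threshold and duration mirror the CPU rule above.
readings = [0.5, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.5, 0.9]
samples = [(i * 60, v) for i, v in enumerate(readings)]
states = alert_states(samples, threshold=0.8, for_seconds=300)
```

In this run the condition becomes true at 60 s and the alert fires at 360 s, after holding for 300 s. The dip at 420 s resets it, so the final reading is only pending again. A short spike therefore never notifies anyone, which is the purpose of the `for` clause.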
In prometheus.yml, reference the rules file under rule_files:

    rule_files:
      - "alerts.yml"
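After restarting Prometheus, open Alerting → Alert rules in Grafana to view and manage the alert rules.

A .gitlab-ci.yml file can collect project metrics such as build time and test coverage in the CI/CD pipeline and store the results in GitLab or in an external system such as Prometheus.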
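Example .gitlab-ci.yml configuration:

    stages:
      - test
      - metrics

    test_job:
      stage: test
      script:
        - echo "Running tests..."
        # pytest-cov collects coverage; --cov-report writes the Cobertura XML that the artifact below expects
        - pytest --cov=./ --cov-report=xml:coverage.xml
      artifacts:
        reports:
          coverage_report:
            coverage_format: cobertura
            path: coverage.xml

    metrics_job:
      stage: metrics
      script:
        - echo "Collecting system metrics..."
        # Collect GitLab metrics; localhost works only if the runner executes on the GitLab host
        - curl -s http://localhost:9090/metrics > metrics.txt
      artifacts:
        paths:
          - metrics.txt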
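The coverage.xml artifact produced by the test job can also be read by a script, for instance to fail a job when coverage drops. The sketch below is a minimal illustration in Python and assumes the Cobertura layout, in which the root `coverage` element carries a `line-rate` attribute between 0 and 1; the function name `line_coverage_percent` is hypothetical.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_source):
    """Return overall line coverage in percent from a Cobertura report.

    xml_source: the report contents as str or bytes.
    """
    root = ET.fromstring(xml_source)
    if root.tag != "coverage":
        raise ValueError(f"unexpected root element: {root.tag}")
    # line-rate is a fraction (0.0 to 1.0) on the root element.
    return round(float(root.attrib["line-rate"]) * 100, 2)
```

In a pipeline this could be used as `line_coverage_percent(open("coverage.xml", "rb").read())`, comparing the result against a minimum the project chooses.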
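The CI/CD → Pipelines page in GitLab shows the test coverage, build time, and metric artifacts for each pipeline run.

Command-line tools such as htop (real-time process monitoring), vmstat (memory, disk, and CPU statistics), and iostat (disk I/O monitoring) give a quick view of resource usage on the Linux server.

GitLab logs (in the /var/log/gitlab directory) can be shipped to the ELK Stack (Elasticsearch + Logstash + Kibana) for centralized storage, search, and visualization, which helps locate performance bottlenecks such as slow queries and abnormal requests.

Together, these steps provide end-to-end performance monitoring for GitLab projects on Linux, from system resources to project metrics, so that problems can be detected and resolved promptly.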