Prometheus installation and usage notes

Date: 2023-06-19 19:31:05 Source: cnblogs (博客园)

Getting started | Prometheus
Configuration | Prometheus
Download | Prometheus
Download Grafana | Grafana Labs

# Prometheus
mkdir -m=777 -p /data/{download,app_logs,app/prometheus}
cd /data/download
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0-rc.0/prometheus-2.45.0-rc.0.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
ln -s /data/download/prometheus-2.45.0-rc.0.linux-amd64/prometheus /usr/bin/prometheus
cp /data/download/prometheus-2.45.0-rc.0.linux-amd64/prometheus.yml /data/app/prometheus/prometheus.yml
prometheus --config.file=/data/app/prometheus/prometheus.yml \
  --web.listen-address=:9090 \
  --web.enable-lifecycle \
  --storage.tsdb.path=/data/app/prometheus/data >>/data/app_logs/prometheus.log 2>&1 &

# node_exporter: install it on every server that needs to be monitored
mkdir -m=777 -p /data/{download,app_logs,app/prometheus}
cd /data/download
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
tar xvfz node_exporter*
ln -s /data/download/node_exporter-1.6.0.linux-amd64/node_exporter /usr/bin/node_exporter

# Start node_exporter. The server only exposes port 8080 and another service
# already listens on 8080, so nginx can be used to expose the node_exporter
# metrics endpoint:
# location /metrics {
#     proxy_pass http://127.0.0.1:9000/metrics;
# }
node_exporter --web.listen-address 127.0.0.1:9000 >>/data/app_logs/node_exporter.log 2>&1 &

# After adding a node_exporter, add it to the targets in prometheus.yml, then
# reload the configuration with: curl -X PUT http://server_address:port/-/reload

# Alertmanager can be installed on the same server as Prometheus
cd /data/download
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvfz alertmanager*
ln -s /data/download/alertmanager-0.25.0.linux-amd64/alertmanager /usr/bin/alertmanager
cp /data/download/alertmanager-0.25.0.linux-amd64/alertmanager.yml /data/app/prometheus/alertmanager.yml
alertmanager --config.file=/data/app/prometheus/alertmanager.yml --web.listen-address 127.0.0.1:9001 >>/data/app_logs/alertmanager.log 2>&1 &

# Add the Alertmanager address to the alertmanagers targets in prometheus.yml, then
# reload the configuration with: curl -X PUT http://server_address:port/-/reload

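Testing alert emails: send an email when memory usage on a server running the exporter exceeds 50%, or when the TCP TIME_WAIT count exceeds 10 (in real work, pick thresholds that suit your environment):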

Add the rule_files path to prometheus.yml:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - 127.0.0.1:9001

# Load rules once and periodically evaluate them according to the global "evaluation_interval".
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - "/data/app/prometheus/alert.rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to "/metrics"
    # scheme defaults to "http".
    scrape_interval: 5s
    static_configs:
      - targets: ["node1_ip:8080"]
      - targets: ["node2_ip:8080"]
        labels:
          groups: "container"

Add the concrete rules to alert.rules.yml. The exact metric names, such as node_sockstat_TCP_tw, can be looked up at http://node_exporter_ip:port/metrics
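Before writing a rule it can help to confirm that the metric really exists on the exporter. The following is a minimal sketch, not part of the original notes: `metric_value` is a hypothetical helper name, and it only handles metrics without labels.

```shell
# Print the value of an unlabelled metric from Prometheus text-format input on stdin.
# "# HELP" and "# TYPE" lines are skipped because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage against a running exporter (address as used earlier in these notes):
#   curl -s http://127.0.0.1:9000/metrics | metric_value node_sockstat_TCP_tw
```

An empty result suggests the metric name is misspelled or the collector is not enabled on that host.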

groups:
- name: tcp-alert-group
  rules:
  - alert: TcpTimeWait
    expr: node_sockstat_TCP_tw > 10
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: tcp time wait more than 10
      description: please check node_sockstat_TCP_tw metric
  - alert: MemoryUse
    expr: (node_memory_MemTotal_bytes-node_memory_MemFree_bytes-node_memory_Buffers_bytes-node_memory_Cached_bytes)/node_memory_MemTotal_bytes > 0.5
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: memory use more than 50% for 10 min
      description: please check memory use
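To see what the MemoryUse expression evaluates to on a given host, the same arithmetic can be applied to the raw exporter output. This is a rough sketch under the assumption that the four memory metrics appear without labels; `mem_usage_ratio` is a hypothetical helper name.

```shell
# Compute (MemTotal - MemFree - Buffers - Cached) / MemTotal from
# node_exporter text-format output on stdin, mirroring the MemoryUse rule.
mem_usage_ratio() {
  awk '
    $1 == "node_memory_MemTotal_bytes" { total = $2 }
    $1 == "node_memory_MemFree_bytes"  { free = $2 }
    $1 == "node_memory_Buffers_bytes"  { buffers = $2 }
    $1 == "node_memory_Cached_bytes"   { cached = $2 }
    END {
      if (total <= 0) exit 1   # MemTotal missing: nothing to compute
      printf "%.2f\n", (total - free - buffers - cached) / total
    }'
}

# Usage against a running exporter:
#   curl -s http://127.0.0.1:9000/metrics | mem_usage_ratio
```

The rule fires only when this ratio stays above 0.5 for the full 10 minutes given in `for`.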

Configure the alert email settings in alertmanager.yml:

global:
  resolve_timeout: 5m
  smtp_smarthost: your_smtp_host:port
  smtp_from: alertmanager@your_email_domain
  smtp_require_tls: false
route:
  group_by: ["alertname"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 10m
  receiver: "email"
receivers:
  - name: "email"
    email_configs:
    - to: "receiver_email"
      send_resolved: true

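Whenever a yml file changes, the configuration has to be reloaded: curl -X POST http://server_address:port/-/reload (Prometheus accepts either PUT or POST on this endpoint; Alertmanager documents POST)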
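A broken file makes the reload fail, so it may be worth validating the configuration first. The sketch below assumes `promtool` (shipped in the same Prometheus tarball) is on the PATH, which the steps in these notes do not set up; `reload_prometheus` is a hypothetical helper name.

```shell
# reload_prometheus CONFIG_FILE BASE_URL
# Validate the config (and the rule files it references) with promtool,
# then ask Prometheus to reload. Requires --web.enable-lifecycle.
reload_prometheus() {
  local config_file="$1" base_url="$2"
  if ! promtool check config "$config_file"; then
    echo "config check failed, not reloading" >&2
    return 1
  fi
  curl -fsS -X POST "${base_url}/-/reload"
}

# Usage:
#   reload_prometheus /data/app/prometheus/prometheus.yml http://127.0.0.1:9090
```

With `-f`, curl returns a non-zero exit code if the server answers with an HTTP error, so the function's exit status reflects whether the reload was accepted.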

The added alerts are visible in the Prometheus UI:

Once an alert's condition is met, Alertmanager sends the email.
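To check the email path without waiting for a rule to fire, an alert can be posted to Alertmanager directly through its v2 API. This is a sketch only: the helper names, the alert name, and the annotation text are made up for illustration, and the address is the one Alertmanager was started on above.

```shell
# Build a one-element JSON array in the shape expected by POST /api/v2/alerts.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"warning"},"annotations":{"summary":"manual test alert"}}]' "$1"
}

# send_test_alert BASE_URL ALERT_NAME
send_test_alert() {
  test_alert_payload "$2" |
    curl -fsS -X POST -H 'Content-Type: application/json' --data-binary @- "$1/api/v2/alerts"
}

# Usage:
#   send_test_alert http://127.0.0.1:9001 ManualTest
```

With the route shown above, the email should go out after the 30s `group_wait` has passed.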

Installing and starting Grafana:

# Grafana can be installed on the same server as Prometheus
yum install -y https://dl.grafana.com/enterprise/release/grafana-enterprise-10.0.0-1.x86_64.rpm

# Grafana listens on port 3000 by default. If the server does not expose
# port 3000, change the port in the Grafana configuration file
sed -i "s/3000/8080/g" /usr/share/grafana/conf/defaults.ini
grafana server >> /data/app_logs/grafana.log 2>&1 &

# Grafana data is stored at: /var/lib/grafana.db

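Once Grafana is running, open its address in a browser. The username and password for the first login are admin/admin.

Add Prometheus under Data sources. If Grafana and Prometheus run on the same server, localhost can be used as the address.

Add a dashboard. Metrics can be queried in Explore and added to the dashboard from there.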

CPU usage: avg(1-irate(node_cpu_seconds_total{mode="idle"}[1m])) by(instance)
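The CPU query can be checked by hand: `irate` takes the last two samples in the range and divides the counter increase by the time between them, and usage is one minus that idle rate. A small worked sketch for a single CPU follows; `cpu_usage_from_idle` and the sample numbers are illustrative only.

```shell
# cpu_usage_from_idle T1 V1 T2 V2
# T1/T2 are sample timestamps in seconds, V1/V2 the idle counter values
# (node_cpu_seconds_total{mode="idle"}) at those times.
cpu_usage_from_idle() {
  awk -v t1="$1" -v v1="$2" -v t2="$3" -v v2="$4" 'BEGIN {
    idle_rate = (v2 - v1) / (t2 - t1)   # idle seconds per second
    printf "%.2f\n", 1 - idle_rate
  }'
}

# Two samples 5 seconds apart, idle counter grew by 3 seconds:
cpu_usage_from_idle 0 100.0 5 103.0   # prints 0.40
```

The full query additionally averages this value over all CPUs of each instance.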

Memory usage: (node_memory_MemTotal_bytes-node_memory_MemFree_bytes-node_memory_Buffers_bytes-node_memory_Cached_bytes)/node_memory_MemTotal_bytes

TCP connections: node_sockstat_TCP_alloc

Dashboard:

Notes:

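1. Prometheus only allows reloading the configuration through the /-/reload endpoint when it is started with --web.enable-lifecycle.
2. Start Prometheus with a fixed data directory, --storage.tsdb.path=/data/app/prometheus/data. If the data directory differs between runs, historical data cannot be queried after startup. If the historical data is backed up, Prometheus can also be moved to a different server.
3. Grafana's data is stored at /var/lib/grafana.db. Back it up regularly. If the server hits a system failure and becomes unusable, copy the /var/lib/grafana.db file to a new server before starting Grafana there, and the previous configuration is not lost.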
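The regular backup from note 3 could be scripted along these lines. This is only a sketch: `backup_file` and the backup directory are hypothetical, and copying the database while Grafana is stopped is probably the safer choice.

```shell
# backup_file SRC DEST_DIR
# Copy SRC into DEST_DIR with a timestamp suffix and print the new path.
backup_file() {
  local src="$1" dest_dir="$2" stamp target
  stamp="$(date +%Y%m%d-%H%M%S)"
  target="${dest_dir}/$(basename "$src").${stamp}"
  mkdir -p "$dest_dir" || return 1
  cp -p "$src" "$target" || return 1
  echo "$target"
}

# Usage (the source path is the one given in these notes, the target is an assumption):
#   backup_file /var/lib/grafana.db /data/backup/grafana
```

Run from cron, this keeps one timestamped copy per run; pruning old copies is left out of the sketch.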
