Building an Enterprise-Grade Monitoring Environment with Prometheus + Grafana (Part 1)

Author: 愚人
Published: February 26, 2019

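### I. Overview

Prometheus + Grafana is a popular monitoring stack, designed mainly for Docker and cloud environments; combined with Kubernetes, it makes enterprise-grade monitoring easy to set up. I built this setup partly for practice, to learn more about Linux and Docker, and partly as a real trial, with the aim of using the result in production.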

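### II. Comparison with Other Common Monitoring Platforms

Any comparison has to mention the famous Zabbix. Zabbix has been around for a long time and is very strong at monitoring physical machines; after years of development it has also expanded to monitoring applications such as databases and nginx.

However:

Although it arrived early, it is a product of the previous generation of server environments, and the impression it gives me is simply that it is old. It is too old to meet my "dating standards"; I always prefer the new over the old. In today's large-scale cloud environments, Prometheus, a monitoring platform born for that world, fits the trend better. Prometheus is fully open source, so it is developing very quickly. Although it was designed for monitoring cloud environments, it is no worse at monitoring basic infrastructure, and because it is written in Go, a more modern language, it is small and runs smoothly. In short, it is just really good!! (My writing skills are limited.)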

### III. Prometheus + Grafana Architecture

![Official architecture diagram](https://wuyn.net/usr/uploads/sina/5cd978a7602f6.jpg)

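Prometheus server: the main program. It collects data mainly by scraping (pulling) it over HTTP. (Zabbix, by contrast, relies on an agent installed on each node, which can also push data to the server in active mode.)

Pushgateway: an intermediary gateway where batch and short-lived jobs push their metrics so that Prometheus can scrape them later; mainly used for reporting business data and similar cases.

Jobs/exporters: programs deployed on the monitored machines. They are written in Go and very lightweight, and because the server pulls the data, the impact on the monitored machine is very small.

Alertmanager: when it receives alerts, it groups, deduplicates, and reduces noise according to its configuration, and then sends out the notifications.

Grafana: a third-party front end that provides polished visualizations (Prometheus itself only ships a simple monitoring UI).

Prometheus uses its own query language, PromQL, to query and present the scraped data; for time-series data it is simpler and more efficient than writing MySQL-style queries.

Both Prometheus and Grafana ship with a built-in web interface, so there is no dependency on PHP, nginx, or similar environments.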
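To make the PromQL point more concrete, here is a minimal Python sketch of how a query could be sent to the Prometheus HTTP API (`/api/v1/query`) and how an instant-vector response could be unpacked. The server address is the one from this post's environment; the helper names `build_query_url`, `extract_values`, and `query`, and the sample response, are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Prometheus listens on the server address used in this post.
PROMETHEUS_URL = "http://192.168.10.1:9090"


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def extract_values(response):
    """Map each series' label set to its current value (instant vectors only)."""
    if response.get("status") != "success":
        raise ValueError("query failed: %r" % response.get("error"))
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got %s" % data["resultType"])
    # Each sample looks like {"metric": {labels}, "value": [timestamp, "value-as-string"]}
    return {
        tuple(sorted(sample["metric"].items())): float(sample["value"][1])
        for sample in data["result"]
    }


def query(promql, base_url=PROMETHEUS_URL):
    """Run an instant query against a live Prometheus server."""
    with urlopen(build_query_url(base_url, promql), timeout=5) as resp:
        return extract_values(json.load(resp))


if __name__ == "__main__":
    # Hypothetical response for the expression `up`, so the sketch runs offline.
    sample = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {"metric": {"__name__": "up", "job": "test"}, "value": [1551168000, "1"]}
            ],
        },
    }
    print(build_query_url(PROMETHEUS_URL, "up"))
    print(extract_values(sample))
```

With a running server, `query("up")` would return one entry per scrape target, with a value of 1.0 for targets that are reachable.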

### IV. Deployment

##### Environment

Server: 192.168.10.1 (Prometheus on port 9090, Grafana on port 3000)

Node: 192.168.10.2 (node_exporter on port 9100)

OS: CentOS 7 x64

The firewall and SELinux are assumed to be disabled.

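##### 1. Install Prometheus on the server

The official docs offer a Docker image, which I would strongly recommend if you are comfortable with Docker. I am still a newbie, so I will install from the official precompiled binaries.

See the [official installation guide](https://prometheus.io/docs/prometheus/latest/installation/)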

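```
# Download the release tarball from the official GitHub repository
sudo mkdir -p /data
cd /data
sudo wget https://github.com/prometheus/prometheus/releases/download/v2.7.1/prometheus-2.7.1.linux-amd64.tar.gz
sudo tar zxvf prometheus-2.7.1.linux-amd64.tar.gz
# The archive unpacks to prometheus-2.7.1.linux-amd64; rename it to match the paths used below
sudo mv prometheus-2.7.1.linux-amd64 prometheus
# Optional quick test in the background. Stop it again before starting the
# systemd service, otherwise port 9090 is already in use:
# (cd /data/prometheus && sudo nohup ./prometheus &)
# Create a prometheus user and register Prometheus as a systemd service
sudo groupadd prometheus
sudo useradd -g prometheus -M -d /data/prometheus -s /sbin/nologin prometheus
sudo chown -R prometheus:prometheus /data/prometheus
# Write the unit file (same as pasting the content in an editor)
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'
[Unit]
Description=prometheus
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/data/prometheus/prometheus --config.file=/data/prometheus/prometheus.yml --storage.tsdb.path=/data/prometheus
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
# Load the new unit, start it, and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
```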

Visit http://ip:9090, replacing `ip` with the server address (Prometheus ships with a simple built-in web UI).
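Besides opening the page in a browser, the server can be checked from a script. The following is a minimal sketch, assuming the `/-/ready` management endpoint that Prometheus 2.x exposes; the function names and the injectable `fetch` parameter are illustrative choices, not part of any Prometheus client library.

```python
from urllib.request import urlopen


def http_status(url, timeout=3):
    """Return the HTTP status code of a GET request to `url`."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


def prometheus_ready(base_url, fetch=http_status):
    """Return True if Prometheus answers 200 on its readiness endpoint.

    `fetch` is injectable so the logic can be exercised without a server.
    """
    try:
        return fetch(base_url.rstrip("/") + "/-/ready") == 200
    except OSError:
        # Covers refused connections, timeouts, and non-2xx HTTP errors,
        # all of which urllib reports as OSError subclasses.
        return False
```

Calling `prometheus_ready("http://192.168.10.1:9090")` on the setup above should return `True` once the service has finished starting.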

The main configuration lives in `/data/prometheus/prometheus.yml`; it is covered in section 4.1.

##### 2. Install Grafana on the server

See the [official installation guide](http://docs.grafana.org/installation/rpm/)

Docker is again the recommended way to install, but here I use the RPM package.

```
wget https://dl.grafana.com/oss/release/grafana-6.0.0-1.x86_64.rpm 
sudo yum localinstall grafana-6.0.0-1.x86_64.rpm 
```

Start Grafana:

```
sudo systemctl daemon-reload
sudo systemctl start grafana-server
```

Visit http://ip:3000 (default username/password: admin/admin)

###### 2.1 Enable the Grafana service to start at boot via systemd

```
sudo systemctl enable grafana-server.service
```

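###### 2.2 If text in rendered graphs is cut off, install the font packages:

```
sudo yum install fontconfig
sudo yum install 'freetype*'
sudo yum install urw-fonts
```

###### 2.3 Notes:

- Default database location: `/var/lib/grafana/grafana.db` (SQLite 3 by default; can be changed to MySQL or PostgreSQL)

- The default configuration writes its log to `/var/log/grafana/grafana.log`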

##### 3. Install node_exporter on the node (192.168.10.2)

```
sudo mkdir -p /data
cd /data
sudo wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
sudo tar zxvf node_exporter-0.17.0.linux-amd64.tar.gz
# The archive unpacks to node_exporter-0.17.0.linux-amd64; rename it to match the paths used below
sudo mv node_exporter-0.17.0.linux-amd64 node_exporter
sudo rm node_exporter-0.17.0.linux-amd64.tar.gz
# Optional quick test in the background. Stop it again before starting the
# systemd service, otherwise port 9100 is already in use:
# sudo nohup /data/node_exporter/node_exporter &
# Create a node_exporter user and register node_exporter as a systemd service
sudo groupadd node_exporter
sudo useradd -g node_exporter -M -d /data/node_exporter -s /sbin/nologin node_exporter
sudo chown -R node_exporter:node_exporter /data/node_exporter
# Write the unit file (same as pasting the content in an editor)
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<'EOF'
[Unit]
Description=Node_exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/data/node_exporter/node_exporter --collector.textfile.directory=/data/node_exporter/ --web.listen-address=:9100
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
# Load the new unit, start it, and enable it at boot
sudo systemctl daemon-reload
sudo systemctl start node_exporter.service
sudo systemctl enable node_exporter.service
```
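Once node_exporter is running, it serves its metrics as plain text over HTTP on port 9100 (under `/metrics`, the default path Prometheus scrapes). The sketch below shows roughly what Prometheus reads on each scrape. It is a simplified parser written for illustration: it assumes samples carry no trailing timestamp, the function names are made up, and the sample values are invented.

```python
from urllib.request import urlopen

# Invented sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
"""


def parse_metrics(text):
    """Parse exposition-format text into {series: value}.

    Simplification: every sample line is assumed to be "<series> <value>"
    with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # Split on the last space so label values containing spaces stay intact.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://192.168.10.2:9100/metrics"):
    """Fetch and parse the metrics page of a running node_exporter."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

Because the exporter only answers HTTP requests and keeps no history of its own, a quick `fetch_metrics()` from the server is a simple way to confirm that port 9100 is reachable before wiring it into Prometheus.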

##### 4. Integrate node_exporter + Prometheus server + Grafana

###### 4.1 Configure the monitored node on the server

```
# Edit the configuration, then restart Prometheus to apply it:
#   sudo systemctl restart prometheus
sudo vim /data/prometheus/prometheus.yml
# Add the following configuration
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations: Prometheus itself, plus the node running node_exporter.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['127.0.0.1:9090']

  - job_name: 'test'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['192.168.10.2:9100']
      labels:
        instance: server1
```

Then visit http://192.168.10.1:9090; the UI looks like this:

![Web UI](https://wuyn.net/usr/uploads/sina/5cd978a7b6044.jpg)
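The target status shown in the web UI can also be read from a script through the HTTP API endpoint `/api/v1/targets`. The following is a minimal sketch under that assumption; `target_health` and `fetch_targets` are illustrative names, and the sample response is hand-written to match the two jobs configured above rather than captured from a real server.

```python
import json
from urllib.request import urlopen

# Hand-written sample mirroring the 'prometheus' and 'test' jobs from the config.
SAMPLE_TARGETS = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"instance": "127.0.0.1:9090", "job": "prometheus"},
                "scrapeUrl": "http://127.0.0.1:9090/metrics",
                "health": "up",
            },
            {
                "labels": {"instance": "server1", "job": "test"},
                "scrapeUrl": "http://192.168.10.2:9100/metrics",
                "health": "down",
            },
        ],
        "droppedTargets": [],
    },
}


def target_health(response):
    """Map (job, scrape URL) to the reported health of each active target."""
    return {
        (t["labels"].get("job", ""), t["scrapeUrl"]): t["health"]
        for t in response["data"]["activeTargets"]
    }


def fetch_targets(base_url="http://192.168.10.1:9090"):
    """Fetch the target list from a live Prometheus server."""
    with urlopen(base_url.rstrip("/") + "/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for (job, url), health in sorted(target_health(SAMPLE_TARGETS).items()):
        print(job, url, health)
```

A target that stays `down` usually points at the node itself: node_exporter not running, or port 9100 blocked.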

###### 4.2 Add the data source in Grafana

Visit http://192.168.10.1:3000 (admin/admin).

Choose to add a data source and select the Prometheus type.

![data source](https://wuyn.net/usr/uploads/sina/5cd978a8652ba.jpg)

Fill in the data source server as shown:

![Adding the data source](https://wuyn.net/usr/uploads/sina/5cd978a8b61a8.jpg)
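The same data source could also be created without the UI. Below is a minimal sketch, assuming Grafana's HTTP API endpoint `POST /api/datasources` with basic authentication and the default admin/admin credentials from this post. The helper names are made up, and the data source name "Prometheus" is an arbitrary choice.

```python
import base64
import json
from urllib.request import Request, urlopen


def datasource_payload(name, prometheus_url):
    """JSON body describing a Prometheus data source proxied through Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend talks to Prometheus, not the browser
        "isDefault": True,
    }


def build_request(grafana_url, payload, user="admin", password="admin"):
    """Build (but do not send) the POST request that creates the data source."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    return Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "http://192.168.10.1:3000",
        datasource_payload("Prometheus", "http://192.168.10.1:9090"),
    )
    print(req.get_method(), req.full_url)
    # To actually create the data source on a live Grafana:
    # with urlopen(req, timeout=5) as resp:
    #     print(resp.status)
```

Building the request separately from sending it keeps the sketch safe to run anywhere; the default admin/admin password should of course be changed before exposing Grafana to anyone else.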

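###### 4.3 Dashboards

I am lazy and did not want to write queries and build panels one by one. All I need is basic monitoring such as CPU and memory, so I simply grabbed a ready-made template from the web:

[Ready-made dashboards on the official site (contributed by other users)](https://grafana.com/dashboards)

Just note the template's ID and enter it when importing a dashboard in Grafana.

###### 4.4 Result

![Finished dashboard](https://wuyn.net/usr/uploads/sina/5cd978a940dcc.jpg)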

### V. Appendix: References

★ [prometheus.io official docs](https://prometheus.io/docs/prometheus/latest/installation/)

★ [grafana.org official docs](http://docs.grafana.org/installation/rpm/#start-the-server-init-d-service)

[www.zhukun.net blog tutorials](https://www.zhukun.net/?s=prometheus&submit=%E6%90%9C%E7%B4%A2)

[Prometheus Operations Guide (in Chinese)](https://www.ctolib.com/docs/sfile/prometheus-book/index.html)

[Unofficial Prometheus manual in Chinese](https://www.bookstack.cn/books/prometheus-manual)

Last modified: June 27th, 2019 at 08:12 pm

2 comments

Lutzow
June 27th, 2019 at 07:11 pm

http://IP:9090 Not https://IP:9090

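1. 愚人
  June 27th, 2019 at 08:13 pm
  
  @Lutzow
  Thanks for pointing that out; it has been corrected. I assumed nobody read this little site and wrote the post as a personal archive, but it looks like I should find the time to tidy it up properly.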
