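# 16 - Zabbix Monitoring Configuration Guide

This document describes how to deploy and configure Zabbix to monitor a 3-node Docker cluster.

## Overview

Zabbix is an enterprise-grade open-source monitoring solution that supports:

- Host and container monitoring
- Network device monitoring
- Application monitoring
- Alerting and notifications

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                   manage-net (172.20.5.0/24)                    │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                      Zabbix Server                      │   │
│   │                    172.20.5.31:10051                    │   │
│   └────────────────────────┬────────────────────────────────┘   │
│                            │                                    │
│   ┌────────────────────────┴────────────────────────────────┐   │
│   │                       Zabbix Web                        │   │
│   │                    172.20.5.32:8080                     │   │
│   │                     (Apache + PHP)                      │   │
│   └────────────────────────┬────────────────────────────────┘   │
│                            │                                    │
│   ┌────────────────────────┴────────────────────────────────┐   │
│   │                      Zabbix MySQL                       │   │
│   │                    172.20.5.33:3306                     │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          │
│  │ Agent-Node1 │    │ Agent-Node2 │    │ Agent-Node3 │          │
│  │ 172.20.5.41 │    │ 172.20.5.42 │    │ 172.20.5.43 │          │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘          │
└─────────┼──────────────────┼──────────────────┼─────────────────┘
          │                  │                  │
   Monitors Node1     Monitors Node2     Monitors Node3
```

## IP Plan

| Component | IP address | Node | Port | Description |
|---|---|---|---|---|
| Zabbix Server | 172.20.5.31 | Node3 | 10051 | Zabbix main service |
| Zabbix Web | 172.20.5.32 | Node3 | 8080 | Web interface |
| Zabbix MySQL | 172.20.5.33 | Node3 | 3306 | Database |
| Zabbix Agent | 172.20.5.41 | Node1 | 10050 | Agent2 |
| Zabbix Agent | 172.20.5.42 | Node2 | 10050 | Agent2 |
| Zabbix Agent | 172.20.5.43 | Node3 | 10050 | Agent2 |

## Deployment Steps

### Step 1: Create the configuration directories

Run on all nodes:

```bash
mkdir -p /opt/cluster-deploy/config/{zabbix,zabbix-mysql}
```

### Step 2: Create the Zabbix MySQL configuration

Run on Node3:

```bash
cat > /opt/cluster-deploy/config/zabbix-mysql/my.cnf << 'EOF'
[mysqld]
server-id = 100
bind-address = 0.0.0.0
port = 3306
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock

log_bin = mysql-bin
binlog_format = ROW
expire_logs_days = 7

# Zabbix requires utf8mb4 with the utf8mb4_bin collation
character-set-server = utf8mb4
collation-server = utf8mb4_bin

max_connections = 200
max_allowed_packet = 64M

innodb_buffer_pool_size = 256M
innodb_log_file_size = 64M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT

[client]
socket = /var/lib/mysql/mysql.sock

[mysql]
socket = /var/lib/mysql/mysql.sock
EOF
```

### Step 3: Create the Zabbix Server configuration

Run on Node3:

```bash
cat > /opt/cluster-deploy/config/zabbix/zabbix_server.conf << 'EOF'
ListenPort=10051
LogType=console
DBHost=zabbix-mysql
DBPort=3306
DBName=zabbix
DBUser=zabbix
DBPassword=ZabbixStr0ng!Pass
# HANodeName switches the server to HA mode; omit both lines for standalone mode
HANodeName=ZabbixServer
NodeAddress=172.20.5.31:10051
EOF
```

### Step 4: Create the Zabbix Agent configuration

Run the matching block on each node. The three files differ only in `Hostname`. Because the containers run Agent2, the log rate limit is set with `Plugins.Log.MaxLinesPerSecond` (the bare `MaxLinesPerSecond` parameter belongs to the classic agent).

Node1 Agent configuration:

```bash
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node1.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node1-Agent

BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10

LogType=console
EOF
```

Node2 Agent configuration:

```bash
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node2.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node2-Agent

BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10

LogType=console
EOF
```

Node3 Agent configuration:

```bash
cat > /opt/cluster-deploy/config/zabbix/zabbix_agentd-node3.conf << 'EOF'
Server=172.20.5.31
ServerActive=172.20.5.31
Hostname=Node3-Agent

BufferSend=5
BufferSize=100
Plugins.Log.MaxLinesPerSecond=20
Timeout=10

LogType=console
EOF
```

### Step 5: Create the Docker Compose files

The official Zabbix images read the server address from `ZBX_SERVER_HOST`, so that name is used below.

Node1 (Zabbix Agent):

```bash
cat > /opt/cluster-deploy/docker-compose-zabbix-node1.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.41
    volumes:
      - ./config/zabbix/zabbix_agentd-node1.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped

networks:
  manage-net:
    external: true
EOF
```

Node2 (Zabbix Agent):

```bash
cat > /opt/cluster-deploy/docker-compose-zabbix-node2.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.42
    volumes:
      - ./config/zabbix/zabbix_agentd-node2.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped

networks:
  manage-net:
    external: true
EOF
```

Node3 (Zabbix Server + Web + Agent):

```bash
cat > /opt/cluster-deploy/docker-compose-zabbix-node3.yml << 'EOF'
services:
  zabbix-agent:
    image: zabbix/zabbix-agent2:alpine-7.0-latest
    container_name: zabbix-agent
    networks:
      manage-net:
        ipv4_address: 172.20.5.43
    volumes:
      - ./config/zabbix/zabbix_agentd-node3.conf:/etc/zabbix/zabbix_agent2.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - ZBX_SERVER_HOST=172.20.5.31
    restart: unless-stopped

  zabbix-mysql:
    image: mysql:8.0
    container_name: zabbix-mysql
    hostname: zabbix-mysql
    networks:
      manage-net:
        ipv4_address: 172.20.5.33
    volumes:
      - zabbix-mysql-data:/var/lib/mysql
      - ./config/zabbix-mysql/my.cnf:/etc/mysql/conf.d/my.cnf:ro
    environment:
      - MYSQL_ROOT_PASSWORD=RootStr0ng!Pass
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
    command:
      - --default-authentication-plugin=mysql_native_password
    restart: unless-stopped

  zabbix-server:
    image: zabbix/zabbix-server-mysql:alpine-7.0-latest
    container_name: zabbix-server
    hostname: zabbix-server
    networks:
      manage-net:
        ipv4_address: 172.20.5.31
    volumes:
      - zabbix-server-data:/var/lib/zabbix
      - ./config/zabbix/zabbix_server.conf:/etc/zabbix/zabbix_server.conf:ro
    environment:
      - DB_SERVER_HOST=zabbix-mysql
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
      - ZBX_CACHESIZE=128M
      - ZBX_HISTORYCACHESIZE=64M
      - ZBX_TRENDCACHESIZE=32M
      - ZBX_VALUECACHESIZE=64M
    ports:
      - "10051:10051"
    depends_on:
      - zabbix-mysql
    restart: unless-stopped

  zabbix-web:
    image: zabbix/zabbix-web-apache-mysql:alpine-7.0-latest
    container_name: zabbix-web
    hostname: zabbix-web
    networks:
      manage-net:
        ipv4_address: 172.20.5.32
    volumes:
      - zabbix-web-data:/etc/zabbix/web
      - zabbix-web-logs:/var/log/httpd
    environment:
      - DB_SERVER_HOST=zabbix-mysql
      - MYSQL_DATABASE=zabbix
      - MYSQL_USER=zabbix
      - MYSQL_PASSWORD=ZabbixStr0ng!Pass
      - ZBX_SERVER_HOST=172.20.5.31
      - PHP_TZ=Asia/Shanghai
    ports:
      - "8080:8080"
    depends_on:
      - zabbix-mysql
      - zabbix-server
    restart: unless-stopped

networks:
  manage-net:
    external: true

volumes:
  zabbix-mysql-data:
  zabbix-server-data:
  zabbix-web-data:
  zabbix-web-logs:
EOF
```

### Step 6: Start the Zabbix services

```bash
# Node1 - start the Agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node1.yml up -d

# Node2 - start the Agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node2.yml up -d

# Node3 - start Server + Web + Agent
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node3.yml up -d
```

## Initializing the Zabbix Web Interface

### First access

Open a browser and go to `http://192.168.64.130:8080`.

Default login:

- Username: `Admin`
- Password: `zabbix`

### Initial setup wizard

If the setup wizard appears (images preconfigured through environment variables may go straight to the login page), work through it as follows:

1. Welcome: click **Next step**.
2. Check of pre-requisites: confirm that every check passes, then click **Next step**.
3. Configure DB connection: keep the defaults and click **Next step**.
4. Zabbix server details:
   - Host: `172.20.5.31`
   - Port: `10051`
   - Name: `Zabbix server`
5. Time zone: select `Asia/Shanghai`.
6. Finish: click **Finish**.

## Adding Hosts

The menu paths below follow the Zabbix 7.0 frontend, which matches the `7.0` images used above. In releases before 6.4 the same pages sit under **Configuration** and **Administration**.

### Add the Node1 host

1. Go to **Data collection → Hosts → Create host**.
2. Fill in the host details:
   - Host name: `Node1-Agent` (this must match `Hostname` in the agent configuration, otherwise active checks are rejected)
   - Host groups: select `Linux servers`
   - Interfaces:
     - Type: `Agent`
     - IP address: `172.20.5.41`
     - Port: `10050`
3. In **Templates**, link:
   - `Linux by Zabbix agent`
   - `Docker by Zabbix agent 2` (if needed)
4. Click **Add**.

### Add the Node2 host

Same as above, with Host name `Node2-Agent` and IP `172.20.5.42`.

### Add the Node3 host

Same as above, with Host name `Node3-Agent` and IP `172.20.5.43`.

## Configuring Items

### Common items

| Item | Key | Description |
|---|---|---|
| CPU utilization | `system.cpu.util` | Total CPU utilization |
| Memory usage | `vm.memory.size` | Total memory by default; other modes via `vm.memory.size[<mode>]` |
| Disk usage | `vfs.fs.size[<fs>,<mode>]` | File system usage |
| Network traffic | `net.if.in[<if>]` / `net.if.out[<if>]` | Network interface traffic |
| Container count | `docker.info` | Docker daemon information as JSON, including container counts |

### Add a custom item

1. Go to **Data collection → Hosts**.
2. Click **Items** for the host, then **Create item**.
3. Fill in the item details:
   - Name: `Docker info`
   - Type: `Zabbix agent`
   - Key: `docker.info`
   - Type of information: `Text`

`docker.info` returns a JSON document rather than a number. To track the total number of containers, add a dependent item on it with a JSONPath preprocessing step of `$.Containers` and Type of information `Numeric (unsigned)`.

## Configuring Alerts

### Create a media type

1. Go to **Alerts → Media types**.
2. Click **Email**.
3. Configure the SMTP server:
   - SMTP server: `smtp.example.com`
   - SMTP server port: `587`
   - SMTP helo: `zabbix`
   - SMTP email: `zabbix@example.com`
4. Click **Update**.

### Create a trigger

1. Go to **Data collection → Hosts**.
2. Click **Triggers** for the host, then **Create trigger**.
3. Fill in the trigger details:
   - Name: `High CPU utilization`
   - Severity: `Warning`
   - Expression: `last(/Node1-Agent/system.cpu.util)>80`
4. Click **Add**.

The `{host:key.function()}` expression form from Zabbix 5.2 and earlier is not accepted by 7.0.

### Create an action

1. Go to **Alerts → Actions → Trigger actions → Create action**.
2. Fill in the action details:
   - Name: `CPU alert notification`
   - Conditions: Trigger equals `High CPU utilization`
3. On the **Operations** tab, add an operation:
   - Operation type: `Send message`
   - Send to: select a user group
   - Media type: `Email`
4. Click **Add**.

## Monitoring Docker with Zabbix

### Enable Docker monitoring

Zabbix Agent2 has built-in Docker monitoring. Configure it as follows.

1. Append the plugin setting to the agent configuration (shown for Node1; repeat with the matching file on each node):

```bash
cat >> /opt/cluster-deploy/config/zabbix/zabbix_agentd-node1.conf << 'EOF'

# Docker monitoring
Plugins.Docker.Endpoint=unix:///var/run/docker.sock
EOF
```

2. Restart the agent:

```bash
docker restart zabbix-agent
```

3. If the Docker template is missing from the frontend, import it:
   - Download the template: https://git.zabbix.com/projects/ZBX/repos/zabbix/raw/templates/app/docker.yaml
   - Go to **Data collection → Templates → Import**.
   - Select the downloaded YAML file and click **Import**.

If Docker items fail with a permission error, the agent container's user probably cannot read the mounted socket. Running the service with `user: root`, or adding the host's `docker` group with `group_add`, is a common workaround.

### Monitoring container state

Docker item keys provided by Agent2 include:

- `docker.container_info[<container>]`: container information
- `docker.container_stats[<container>]`: container resource statistics
- `docker.containers`: container list
- `docker.images`: image list

## Verifying Monitoring

### Check host status

1. Go to **Monitoring → Hosts**.
2. Confirm that the ZBX availability icon is green for all three nodes.

### View latest data

1. Go to **Monitoring → Latest data**.
2. Select a host to view its collected values.

### View graphs

1. Go to **Monitoring → Hosts** and click **Graphs** for the host.
2. Select the graph or item to view its trend.

## Common Queries

### List containers

```bash
docker exec zabbix-agent zabbix_agent2 -t docker.containers
```

### Check system load

```bash
docker exec zabbix-agent zabbix_agent2 -t system.cpu.load
```

### Check memory usage

```bash
docker exec zabbix-agent zabbix_agent2 -t vm.memory.size
```

## Troubleshooting

### Zabbix Server does not start

```bash
# View the logs
docker logs zabbix-server

# Check connectivity to MySQL
docker exec zabbix-server nc -zv zabbix-mysql 3306
```

### Agent cannot connect to the Server

```bash
# View the agent logs
docker logs zabbix-agent

# Test that the agent answers locally
docker exec zabbix-agent zabbix_agent2 -t agent.ping
```

### Database initialization fails

Zabbix initializes the database automatically on first start. If that fails, remove the volumes and start again. This deletes all Zabbix data on Node3.

```bash
# Delete the database volume and re-initialize
cd /opt/cluster-deploy
docker compose -f docker-compose-zabbix-node3.yml down -v
docker compose -f docker-compose-zabbix-node3.yml up -d
```

If the server log shows a SUPER-privilege error during schema import while binary logging is enabled, setting `log_bin_trust_function_creators = 1` under `[mysqld]` in `my.cnf` is the usual fix.

### Web interface shows "Services are not running"

```bash
# Check the Zabbix Server process
docker exec zabbix-server pgrep -a zabbix_server

# Restart the services
docker restart zabbix-server zabbix-web
```

## Performance Tuning

### Adjust Housekeeper settings

```bash
# Edit zabbix_server.conf
HousekeepingFrequency=4
MaxHousekeeperDelete=5000
```

### Configure data retention

1. Go to **Administration → Housekeeping**.
2. Adjust the number of days that history and trends are kept.

### Adjust cache sizes

```bash
# Edit zabbix_server.conf
CacheSize=64M
StartPollers=10
StartPollersUnreachable=5
```

## Backup and Restore

### Back up Zabbix data

The password is single-quoted because an unquoted `!` triggers history expansion in an interactive bash session.

```bash
# Back up the MySQL data
docker exec zabbix-mysql mysqldump -uroot -p'RootStr0ng!Pass' zabbix > zabbix_backup.sql

# Back up the configuration files
tar czf zabbix_config_backup.tar.gz /opt/cluster-deploy/config/zabbix
```

### Restore Zabbix data

```bash
# Restore the MySQL data
docker exec -i zabbix-mysql mysql -uroot -p'RootStr0ng!Pass' zabbix < zabbix_backup.sql

# Restore the configuration files
tar xzf zabbix_config_backup.tar.gz -C /
```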
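
The backup commands above can be wrapped in a small function for scheduled runs. The following is a minimal sketch, not a definitive implementation: the function name `zabbix_backup` and the timestamped file names are illustrative assumptions, while the container name, database name, and default config path come from this guide. It reads the root password from the `MYSQL_ROOT_PASSWORD` variable already set inside the `zabbix-mysql` container, so the password does not appear on the host command line.

```shell
#!/usr/bin/env bash
# Sketch only: timestamped backup of the Zabbix database and config directory.
# The function name and archive naming are hypothetical; adjust to your setup.

zabbix_backup() {
  local out_dir="$1"
  local config_dir="${2:-/opt/cluster-deploy/config/zabbix}"
  local stamp sql archive

  stamp="$(date +%Y%m%d-%H%M%S)"
  sql="$out_dir/zabbix_backup-$stamp.sql"
  archive="$out_dir/zabbix_config_backup-$stamp.tar.gz"

  mkdir -p "$out_dir" || return 1

  # Use the password from the container's own environment so it never
  # shows up in the host's process list or shell history.
  docker exec zabbix-mysql sh -c \
    'exec mysqldump -uroot -p"$MYSQL_ROOT_PASSWORD" --single-transaction zabbix' \
    > "$sql" || return 1

  # Archive the config directory relative to its parent directory.
  tar czf "$archive" -C "$(dirname "$config_dir")" "$(basename "$config_dir")" || return 1

  # Print the dump path so callers can verify or ship it.
  printf '%s\n' "$sql"
}
```

Calling `zabbix_backup /var/backups/zabbix` (the output directory is an example) writes both files there. Because this sketch archives the config directory relative to its parent, an archive it produces would be restored with `-C /opt/cluster-deploy/config` rather than `-C /`.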