
Monitoring MHA's masterha_manager process with supervisor

2022-07-13 12:57:07
Source: reposted
Contributed by: site user
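  When we use masterha_manager, the script bundled with MHA, for automatic failover of a MySQL master, we need a way to keep the masterha_manager monitoring process running at all times. supervisor solves this well: it turns an ordinary command-line process into a background daemon, monitors its state, and restarts it automatically when it exits abnormally.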
 
  The key deployment points and management commands are listed below.
 
  I. Installing supervisor:
  sudo pip install supervisor
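After the install, it can be worth confirming that the command-line tools actually landed on the PATH. The following is a minimal sketch: `have_cmd` is a helper name made up for this example, and the three tool names are the entry points that the supervisor package installs.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Entry points installed by the supervisor package.
for tool in supervisord supervisorctl echo_supervisord_conf; do
    if have_cmd "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If any of the three is reported as missing, the later steps that call it will fail, so it is cheaper to catch that here.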
 
  II. Configuring supervisor:
 
  mkdir -p /etc/supervisor/conf.d/
 
  Generate the configuration file:
 
  # echo_supervisord_conf > /etc/supervisor/supervisord.conf
 
  This step may fail with the following error:
 
  Traceback (most recent call last):
    File "/usr/bin/echo_supervisord_conf", line 5, in <module>
      from pkg_resources import load_entry_point
    File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
      working_set.require(__requires__)
    File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
      needed = self.resolve(parse_requirements(requirements))
    File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
      raise DistributionNotFound(req)
  pkg_resources.DistributionNotFound: meld3>=0.6.5
  A quick search suggested that the cause is related to the Python or pip version. Installing meld3 once from source fixed it, in three simple steps:
 
  git clone https://github.com/Supervisor/meld3
  cd meld3
  python setup.py install
  View the configuration file:
 
  cat /etc/supervisor/supervisord.conf
 
  [unix_http_server]
  file=/tmp/supervisor.sock ; the path to the socket file
 
  [supervisord]
  logfile=/tmp/supervisord.log ; main log file; default $CWD/supervisord.log
  logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
  logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
  loglevel=info ; log level; default info; others: debug,warn,trace
  pidfile=/tmp/supervisord.pid ; supervisord pidfile; default supervisord.pid
  nodaemon=false ; start in foreground if true; default false
  minfds=1024 ; min. avail startup file descriptors; default 1024
  minprocs=200 ; min. avail process descriptors;default 200
  user=dbadmin ; default is current user, required if root
 
  [rpcinterface:supervisor]
  supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
 
  [supervisorctl]
  serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
 
  [include]
  files = /etc/supervisor/conf.d/*.conf
 
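  Configuration notes:
  1. The other settings can keep the generated defaults, but user must be changed to the account that is set up for passwordless SSH login, dbadmin in this case. Otherwise masterha_manager fails to start, because all of MHA's passwordless logins use the dbadmin account.
  2. Managed processes can be configured directly in [program:xxx] sections of supervisor's main configuration file, but one configuration file per process is easier to manage. The files setting in the [include] section specifies where those files live.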
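The two settings from the notes above are easy to leave commented out, since the generated file ships them with a leading `;`. A small check like the one below can catch that. It is only a sketch: `check_supervisord_conf` is a hypothetical helper, not part of supervisor, and it only looks for active lines anywhere in the file rather than verifying which section they belong to.

```shell
#!/bin/sh
# check_supervisord_conf is a hypothetical helper, not part of supervisor.
# It reports whether the settings discussed above are active, that is,
# present and not commented out with a leading ';'.
check_supervisord_conf() {
    conf="$1"
    rc=0
    if ! grep -Eq '^user[[:space:]]*=' "$conf"; then
        echo "no active user= line in $conf"
        rc=1
    fi
    if ! grep -Eq '^\[include\]' "$conf"; then
        echo "[include] section is missing or commented out in $conf"
        rc=1
    fi
    if ! grep -Eq '^files[[:space:]]*=' "$conf"; then
        echo "no active files= line in $conf"
        rc=1
    fi
    return $rc
}

# Example call (path taken from this article):
#   check_supervisord_conf /etc/supervisor/supervisord.conf
```

The function returns non-zero when something is missing, so it can also be used as a guard before starting supervisord.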
 
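  III. supervisor operations, using masterha_manager monitoring of test as an example
  1. Prepare the configuration file for the masterha_manager instance that monitors test
  cat /etc/supervisor/conf.d/masterha_manager_test.conf
  [program:masterha_manager_test]
  command=masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover ; command to start
  stdout_logfile=/tmp/manager.log ; where stdout is logged
  stderr_logfile=/tmp/manager.log ; where stderr is logged
  autostart=true ; start automatically when supervisord starts
  autorestart=true ; restart automatically whenever the program exits
  startsecs=10 ; treat the program as started once it has run 10 seconds without exiting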
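With several MHA applications, the per-process files differ only in the application name, so they can be generated from one template. The sketch below assumes that each application has its configuration at /etc/mha/<app>.cnf, as test does here, and it uses a per-application log file name, which is a choice made for this example and not something the setup above requires. `gen_program_conf` is a hypothetical helper.

```shell
#!/bin/sh
# gen_program_conf is a hypothetical helper: it prints a [program:...] section
# for one MHA application, following the layout of masterha_manager_test.conf.
# Assumptions: each application has /etc/mha/<app>.cnf, and a per-application
# log file name is used so that several managers do not share one log.
gen_program_conf() {
    app="$1"
    cat <<EOF
[program:masterha_manager_${app}]
command=masterha_manager --conf=/etc/mha/${app}.cnf --ignore_last_failover
stdout_logfile=/tmp/manager_${app}.log
stderr_logfile=/tmp/manager_${app}.log
autostart=true
autorestart=true
startsecs=10
EOF
}

# Print the section for the "test" application. Redirect the output to
# /etc/supervisor/conf.d/masterha_manager_test.conf to install it.
gen_program_conf test
```

Keeping the program name and the file name in sync makes it obvious which file to edit when supervisorctl reports a problem with a given program.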
 
  2. Start the supervisord process
  # supervisord -c /etc/supervisor/supervisord.conf
  # ps -ef | grep super
  dbadmin 11892 1 0 02:56 ? 00:00:00 /usr/bin/python /usr/bin/supervisord
  root 13340 31610 0 02:56 pts/0 00:00:00 grep super
 
  3. Check the status of the monitored process
  # supervisorctl status
 
  masterha_manager_test RUNNING pid 11912, uptime 0:03:08
 
  # ps -ef | grep master
  root 1343 31610 0 02:59 pts/0 00:00:00 grep master
  root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
  dbadmin 11912 11892 0 02:56 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
  masterha_manager is now up and running.
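The same check can be scripted by reading the state column of the `supervisorctl status` output. This is a minimal sketch that works on a sample line copied from the status output above; `program_state` is a hypothetical helper, and in real use its input would be piped from `supervisorctl status`.

```shell
#!/bin/sh
# program_state is a hypothetical helper: it reads `supervisorctl status`
# output on stdin and prints the state column for the named program.
program_state() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Sample line copied from the status output above.
sample='masterha_manager_test RUNNING pid 11912, uptime 0:03:08'
state=$(printf '%s\n' "$sample" | program_state masterha_manager_test)
if [ "$state" = "RUNNING" ]; then
    echo "masterha_manager_test is running"
else
    echo "masterha_manager_test is in state: ${state:-unknown}"
fi
```

Matching on the exact program name in the first column avoids false positives from other programs whose names merely contain the same text.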
 
 
  4. Test
  Kill the masterha_manager process directly to simulate an abnormal exit:
  # ps -ef | grep master
  root 1343 31610 0 02:59 pts/0 00:00:00 grep master
  root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
  dbadmin 11912 11892 0 02:56 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
 
  # kill -9 11912
 
  # ps -ef | grep master
  dbadmin 1707 11892 5 03:30 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
  root 2054 31610 0 03:30 pts/0 00:00:00 grep master
  root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
 
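  The new masterha_manager process (pid 1707) is again a child of supervisord (pid 11892), which shows that supervisor restarted the monitoring process.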
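Comparing the pid before and after the kill is what confirms the restart, and that comparison can be scripted too. In the sketch below, `status_pid` and `restarted` are hypothetical helpers. The first sample line is the status output shown earlier; the second is illustrative, using the new pid 1707 from the ps output with an invented uptime.

```shell
#!/bin/sh
# status_pid is a hypothetical helper: it extracts the pid from one
# `supervisorctl status` line of the form "<name> RUNNING pid <pid>, uptime ...".
status_pid() {
    sed -n 's/.* pid \([0-9][0-9]*\),.*/\1/p'
}

# restarted succeeds when both pids are known and differ,
# i.e. supervisor replaced the killed process with a new one.
restarted() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# First line: status output from earlier. Second line: illustrative only.
before=$(echo 'masterha_manager_test RUNNING pid 11912, uptime 0:03:08' | status_pid)
after=$(echo 'masterha_manager_test RUNNING pid 1707, uptime 0:00:12' | status_pid)
if restarted "$before" "$after"; then
    echo "restarted: pid $before -> $after"
fi
```

On a live system the two values would come from running `supervisorctl status masterha_manager_test` before and after the `kill -9`, leaving a few seconds in between for supervisor to respawn the process.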
 
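  5. Common management commands
  supervisord: starts the supervisord daemon, which launches and manages the processes defined in the configuration;
  supervisorctl stop (start, restart) xxx: stops (starts, restarts) a single process (xxx);
  supervisorctl reread: rereads the configuration files only, without starting or restarting any process;
  supervisorctl reload: restarts supervisord itself, which loads the latest configuration, stops all existing processes, and starts them again under the new configuration;
  supervisorctl update: applies the latest configuration, starting new programs and restarting changed ones; programs whose configuration is unchanged are not affected;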
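A common sequence after adding or editing a file under /etc/supervisor/conf.d/ is reread followed by update, so that only the affected programs are touched. The sketch below wraps the two subcommands. `apply_conf_changes` and the `SUPERVISORCTL` override variable are names invented for this example.

```shell
#!/bin/sh
# apply_conf_changes is a hypothetical wrapper around two real supervisorctl
# subcommands. SUPERVISORCTL can be overridden, for example for a dry run.
SUPERVISORCTL=${SUPERVISORCTL:-supervisorctl}

apply_conf_changes() {
    # reread: report which program configs were added, changed, or removed.
    $SUPERVISORCTL reread || return 1
    # update: apply those changes; unchanged programs keep running.
    $SUPERVISORCTL update
}

# Typical use after adding or editing a file under /etc/supervisor/conf.d/:
#   apply_conf_changes
# Dry run that only prints the commands:
#   SUPERVISORCTL="echo supervisorctl" apply_conf_changes
```

Preferring this pair over reload keeps the other masterha_manager instances running while one configuration is changed.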
 
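  6. Register supervisord as a Linux system service so that it starts at boot
  Prepare the startup script supervisord.sh (its contents are shown in the listing below)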
 
  # chmod +x supervisord.sh
  # mv supervisord.sh  /etc/init.d/supervisord
  # chkconfig --add  supervisord
  # chkconfig --level 345 supervisord on
 
 
  cat /etc/rc.d/init.d/supervisord
  #!/bin/sh
  #
  # /etc/rc.d/init.d/supervisord
  #
  # Supervisor is a client/server system that
  # allows its users to monitor and control a
  # number of processes on UNIX-like operating
  # systems.
  #
  # chkconfig: - 64 36
  # description: Supervisor Server
  # processname: supervisord
  # Source init functions
  . /etc/rc.d/init.d/functions

  prog="supervisord"
  prog_bin="/usr/bin/supervisord"
  # Must match the pidfile setting in supervisord.conf
  PIDFILE="/tmp/supervisord.pid"
  CONFILE="/etc/supervisor/supervisord.conf"

  start()
  {
      echo -n $"Starting $prog: "
      daemon "$prog_bin" -c "$CONFILE" --pidfile "$PIDFILE"
      [ -f "$PIDFILE" ] && success $"$prog startup" || failure $"$prog startup"
      echo
  }

  stop()
  {
      echo -n $"Shutting down $prog: "
      if [ -f "$PIDFILE" ]; then
          # Look the daemon up through its pid file rather than by name
          killproc -p "$PIDFILE" "$prog"
      else
          success $"$prog shutdown"
      fi
      echo
  }

  case "$1" in
      start)
          start
          ;;
      stop)
          stop
          ;;
      status)
          status -p "$PIDFILE" "$prog"
          ;;
      restart)
          stop
          start
          ;;
      *)
          echo "Usage: $0 {start|stop|restart|status}"
          ;;
  esac

