ntpd dead but pid file exists
Source: cnblogs  Author: 潇湘隐者  Date: 2021/5/17 12:59:54

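A Linux host monitored by Zabbix raised the alert "System time is out of sync (diff with Zabbix server > 60s)". On inspection, its clock turned out to be more than an hour behind. The host had been set up to run the ntpd service, so I logged in over ssh and checked ntpd, which reported the following error: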

 

    # service ntpd status
    ntpd dead but pid file exists

 

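The ntpd service had died. I started it, and in what seemed like under a minute it died again. After a second start the service stayed up, but the time was still not being synchronized.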

 

    # service ntpd start
    Starting ntpd: [  OK  ]
    # service ntpd status
    ntpd (pid  14956) is running...
    # service ntpd status
    ntpd dead but pid file exists
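
The status message above means that the PID file is still on disk while no live process owns the PID recorded in it. As a rough illustration of that check (not the actual init script), the sketch below reproduces the three possible outcomes. The function name pid_state and the path /var/run/ntpd.pid are assumptions; the return codes follow the LSB convention for status actions.

```shell
#!/bin/sh
# Report the state of a daemon from its PID file.
# Usage: pid_state /path/to/pidfile
# Run as root: kill -0 fails for processes owned by another user.
pid_state() {
    pidfile=$1
    if [ ! -f "$pidfile" ]; then
        echo "stopped"
        return 3
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running (pid $pid)"
        return 0
    fi
    echo "dead but pid file exists"
    return 1
}

# Assumed default PID file location; adjust for your distribution:
# pid_state /var/run/ntpd.pid
```

A stale PID file only says that the daemon exited without cleaning up; it does not say why, which is what the log has to answer.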

 

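The log showed the following error: "time correction of 4988 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time." With its default settings, ntpd refuses to synchronize the clock when the offset exceeds 1000 seconds. This is a built-in safeguard: ntpd logs the message and exits, which is why the service died again shortly after being started.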

 

    May 16 04:02:03 xxxxx syslogd 1.4.1: restart.
    May 17 00:38:03 xxxxx last message repeated 5 times
    May 17 07:07:18 xxxxx ntpd[14955]: ntpd 4.2.2p1@1.1570-o Fri Jul 22 18:07:53 UTC 2011 (1)
    May 17 07:07:18 xxxxx ntpd[14956]: precision = 1.000 usec
    May 17 07:07:18 xxxxx ntpd[14956]: Listening on interface wildcard, 0.0.0.0#123 Disabled
    May 17 07:07:18 xxxxx ntpd[14956]: Listening on interface lo, 127.0.0.1#123 Enabled
    May 17 07:07:18 xxxxx ntpd[14956]: Listening on interface eth1, 192.168.xxx.xxx#123 Enabled
    May 17 07:07:18 xxxxx ntpd[14956]: kernel time sync status 0040
    May 17 07:07:18 xxxxx ntpd[14956]: getaddrinfo: "::1" invalid host address, ignored
    May 17 07:07:18 xxxxx ntpd[14956]: frequency initialized 26.675 PPM from /var/lib/ntp/drift
    May 17 07:10:33 xxxxx ntpd[14956]: synchronized to LOCAL(0), stratum 10
    May 17 07:10:33 xxxxx ntpd[14956]: kernel time sync enabled 0001
    May 17 07:12:41 xxxxx ntpd[14956]: synchronized to 192.168.xxx.xxx, stratum 5
    May 17 07:19:12 xxxxx ntpd[14956]: time correction of 4988 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
    May 17 07:25:21 xxxxx ntpd[15681]: ntpd 4.2.2p1@1.1570-o Fri Jul 22 18:07:53 UTC 2011 (1)
    May 17 07:25:21 xxxxx ntpd[15682]: precision = 1.000 usec
    May 17 07:25:21 xxxxx ntpd[15682]: Listening on interface wildcard, 0.0.0.0#123 Disabled
    May 17 07:25:21 xxxxx ntpd[15682]: Listening on interface lo, 127.0.0.1#123 Enabled
    May 17 07:25:21 xxxxx ntpd[15682]: Listening on interface eth1, 192.168.xxx.xxx#123 Enabled
    May 17 07:25:21 xxxxx ntpd[15682]: kernel time sync status 0040
    May 17 07:25:21 xxxxx ntpd[15682]: getaddrinfo: "::1" invalid host address, ignored
    May 17 07:25:21 xxxxx ntpd[15682]: frequency initialized 26.675 PPM from /var/lib/ntp/drift
    May 17 07:28:37 xxxxx ntpd[15682]: synchronized to LOCAL(0), stratum 10
    May 17 07:28:37 xxxxx ntpd[15682]: kernel time sync enabled 0001
    May 17 07:29:43 xxxxx ntpd[15682]: synchronized to 192.168.xxx.xxx, stratum 5
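
To spot this condition quickly, the sanity-limit message can be pulled out of the log and compared against the threshold. The following is only a sketch: the helper names panic_offset and exceeds_sanity_limit are made up for illustration, and the default of 1000 seconds is taken from the log message rather than read from the running configuration.

```shell
#!/bin/sh
# Read syslog lines on stdin and print the offset (in seconds) from every
# ntpd "exceeds sanity limit" message.
panic_offset() {
    sed -n 's/.*time correction of \(-\{0,1\}[0-9][0-9]*\) seconds exceeds sanity limit.*/\1/p'
}

# Succeed if the absolute offset is larger than the limit.
# Usage: exceeds_sanity_limit OFFSET [LIMIT]   (LIMIT defaults to 1000)
exceeds_sanity_limit() {
    offset=${1#-}        # drop a leading minus sign to get the absolute value
    limit=${2:-1000}
    [ "$offset" -gt "$limit" ]
}

# Example, assuming ntpd logs to /var/log/messages:
# grep 'ntpd\[' /var/log/messages | panic_offset
```

For the log above this yields 4988, roughly 83 minutes, which matches the clock being more than an hour behind.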

 

A clock that is this far off can be corrected with ntpdate, or by running ntpd by hand as follows:

 

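1: Stop the ntpd service

 

2: Run ntpd -gnqd (-g allows the first correction to exceed the sanity limit, -q sets the clock once and exits, -n keeps ntpd in the foreground, -d prints debug output)

 

3: Start the ntpd service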
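
The three steps above can be wrapped in a small script. This is a sketch rather than a tested procedure for any particular distribution: it assumes the SysV-style service command used throughout this article, and the run helper and the DRY_RUN variable are hypothetical names introduced here so that the commands can be previewed without touching the system.

```shell
#!/bin/sh
# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop ntpd, step the clock once, then start ntpd again.
# Each step runs only if the previous one succeeded.
resync_clock() {
    run service ntpd stop &&
    run ntpd -gnqd &&
    run service ntpd start
}

# Preview first, then run for real as root:
# DRY_RUN=1 resync_clock
# resync_clock
```

Chaining the steps with && means ntpd is not restarted if the one-shot correction fails, so a failed sync does not go unnoticed.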

 

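In my own test, running ntpd -gnqd by hand without stopping the ntpd service still synchronized the time, so it is not a serious problem. It does, however, report the error "addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor" (the bind() failures in the output show that port 123 is already in use by the running daemon), so it is better to stop the ntpd service before running the command.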

 

    # ntpd -gnqd
    ntpd 4.2.2p1@1.1570-o Fri Jul 22 18:07:53 UTC 2011 (1)
    addto_syslog: precision = 1.000 usec
    create_sockets(123)
    addto_syslog: no IPv6 interfaces found
    addto_syslog: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
    addto_syslog: bind() fd 16, family 2, port 123, addr 0.0.0.0, in_classd=0 flags=9 fails: Address already in use
    addto_syslog: bind() fd 16, family 2, port 123, addr 127.0.0.1, in_classd=0 flags=5 fails: Address already in use
    addto_syslog: bind() fd 16, family 2, port 123, addr 192.168.xxx.xxx, in_classd=0 flags=25 fails: Address already in use
    init_io: maxactivefd 0
    local_clock: time 0 base 0.000000 offset 0.000000 freq 0.000 state 0
    addto_syslog: getaddrinfo: "::1" invalid host address, ignored
    getaddrinfo: "::1" invalid host address, ignored.
    key_expire: at 0
    peer_clear: at 0 next 1 assoc ID 8694 refid INIT
    newpeer: 192.168.xxx.xxx->192.168.xxx.xxx mode 3 vers 4 poll 6 10 flags 0x281 0x1 ttl 0 key 00000000
    key_expire: at 0
    peer_clear: at 0 next 2 assoc ID 8695 refid INIT
    newpeer: 127.0.0.1->127.127.1.0 mode 3 vers 4 poll 6 10 flags 0x1221 0x1 ttl 0 key 00000000
    addto_syslog: frequency initialized 26.675 PPM from /var/lib/ntp/drift
    local_clock: time 0 base 0.000000 offset 0.000000 freq 26.675 state 1
    report_event: system event 'event_restart' (0x01) status 'sync_alarm, sync_unspec, 1 event, event_unspec' (0xc010)
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 1 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    auth_agekeys: at 1 keys 1 expired 0
    timer: refresh ts 0
    refclock_transmit: at 2 127.127.1.0
    refclock_receive: at 2 127.127.1.0
    peer LOCAL(0) event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014)
    refclock_sample: n 1 offset 0.000000 disp 0.010000 jitter 0.000001
    clock_filter: n 1 off 0.000000 del 0.000000 dsp 7.937500 jit 0.000001, age 0
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 3 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 5 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 7 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 9 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 11 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 13 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 15 192.168.xxx.xxx->192.168.xxx.xxx mode 3
    addto_syslog: sendto(192.168.xxx.xxx) (fd=-1): Bad file descriptor
    transmit: at 17 192.168.xxx.xxx->192.168.xxx.xxx mode 3
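
The "Address already in use" lines in this output come from the running daemon still holding UDP port 123, which leaves the second ntpd without a usable socket (fd=-1). One way to avoid the situation is to check the port before running the one-shot command. The sketch below assumes the column layout printed by netstat -uln (protocol in column 1, local address in column 4); the function name ntp_port_in_use is made up for illustration, and the output of other tools such as ss would need a different filter.

```shell
#!/bin/sh
# Read `netstat -uln` output on stdin and succeed if any UDP socket
# is bound to port 123.
ntp_port_in_use() {
    awk '$1 ~ /^udp/ && $4 ~ /:123$/ { found = 1 }
         END { exit(found ? 0 : 1) }'
}

# Example:
# if netstat -uln | ntp_port_in_use; then
#     echo "UDP port 123 is busy; stop the ntpd service first" >&2
# fi
```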

 

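After running the command above, the time was back in sync and the ntpd service was running normally. That leaves the root question: why did the ntpd service die in the first place? Answering it requires the logs. Unless configured otherwise, ntpd writes its log messages to /var/log/messages. I checked that file, but very little had been written to it and I found nothing relevant. Unfortunately, I still do not know what originally caused the ntpd service to die. As for the Zabbix alert, there had been a lot of alerts and I had been busy lately, so this one was overlooked.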
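
For this kind of log search, a small filter that keeps only the ntpd entries makes gaps easier to see. This is a minimal sketch: the function name ntpd_log_lines is hypothetical, and the default path is the one mentioned above, which will differ if ntpd or syslog has been configured to log elsewhere.

```shell
#!/bin/sh
# Print only the ntpd entries from a syslog-format file.
# Usage: ntpd_log_lines [FILE]   (FILE defaults to /var/log/messages)
ntpd_log_lines() {
    grep 'ntpd\[[0-9]*\]:' "${1:-/var/log/messages}"
}

# Example: show the last entries ntpd wrote
# ntpd_log_lines | tail -n 20
```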

Original article: http://www.cnblogs.com/kerrycode/p/14776429.html
