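Whether it is a hardware watchdog or a software emulation of one, it is ultimately managed through the /dev/watchdog device file. The Linux kernel ships its own watchdog driver framework, with most of the code under drivers/watchdog. The rough workflow is as follows: a watchdog driver is registered; once the /dev/watchdog device has been started through start, periodic pings feed the watchdog; if no ping arrives within the timeout period, the system is reset to recover.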
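The start / ping / timeout cycle described above can be sketched as a small userspace model. This is only an illustration of the countdown logic, not kernel code: the struct wd_model type and the wd_model_* functions are hypothetical names invented for this example, and time is passed in as a plain number of seconds.

```c
#include <stdbool.h>

/* Hypothetical model of a watchdog countdown; not taken from the kernel. */
struct wd_model {
    bool active;            /* set by start, similar in spirit to WDOG_ACTIVE */
    unsigned int timeout;   /* seconds allowed between two pings */
    unsigned int last_ping; /* time of the most recent ping, in seconds */
};

/* start: arm the watchdog and begin counting from `now`. */
void wd_model_start(struct wd_model *wd, unsigned int timeout, unsigned int now)
{
    wd->active = true;
    wd->timeout = timeout;
    wd->last_ping = now;
}

/* ping: restart the countdown ("feed the dog"); ignored if not armed. */
void wd_model_ping(struct wd_model *wd, unsigned int now)
{
    if (wd->active)
        wd->last_ping = now;
}

/*
 * Returns true when an armed watchdog has gone unfed for `timeout`
 * seconds or longer. A real watchdog would reset the system here.
 */
bool wd_model_expired(const struct wd_model *wd, unsigned int now)
{
    return wd->active && (now - wd->last_ping) >= wd->timeout;
}
```

With a 10-second timeout started at time 0, the model has not expired at second 9 and has expired at second 10; a ping at second 9 moves the deadline to second 19.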
Registering the watchdog driver
A watchdog driver is registered by passing a watchdog_device object to watchdog_register_device. The watchdog_device structure looks like this:
struct watchdog_device {
    int id;
    struct cdev cdev;
    struct device *dev;
    struct device *parent;
    const struct watchdog_info *info;
    const struct watchdog_ops *ops;
    unsigned int bootstatus;
    unsigned int timeout;
    unsigned int min_timeout;
    unsigned int max_timeout;
    void *driver_data;
    struct mutex lock;
    unsigned long status;
/* Bit numbers for status flags */
#define WDOG_ACTIVE         0   /* Is the watchdog running/active */
#define WDOG_DEV_OPEN       1   /* Opened via /dev/watchdog ? */
#define WDOG_ALLOW_RELEASE  2   /* Did we receive the magic char ? */
#define WDOG_NO_WAY_OUT     3   /* Is 'nowayout' feature set ? */
#define WDOG_UNREGISTERED   4   /* Has the device been unregistered */
};
The key part is implementing the ops operation set, which holds the watchdog's essential operations such as start and ping.
struct watchdog_ops {
    struct module *owner;
    /* mandatory operations */
    int (*start)(struct watchdog_device *);
    int (*stop)(struct watchdog_device *);
    /* optional operations */
    int (*ping)(struct watchdog_device *);
    unsigned int (*status)(struct watchdog_device *);
    int (*set_timeout)(struct watchdog_device *, unsigned int);
    unsigned int (*get_timeleft)(struct watchdog_device *);
    void (*ref)(struct watchdog_device *);
    void (*unref)(struct watchdog_device *);
    long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long);
};
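To illustrate the split between mandatory and optional operations, here is a minimal userspace model of an ops table. It is a sketch under stated assumptions, not the kernel's implementation: struct model_ops, struct model_wdd, model_feed and the demo_* callbacks are hypothetical names. The fallback it shows (feeding through start when a driver provides no ping) reflects my understanding of how the watchdog core treats a missing ping callback; check drivers/watchdog in your kernel version before relying on it.

```c
#include <stddef.h>

/* Hypothetical userspace model of an ops table; all names are illustrative. */
struct model_wdd;

struct model_ops {
    int (*start)(struct model_wdd *); /* mandatory */
    int (*stop)(struct model_wdd *);  /* mandatory */
    int (*ping)(struct model_wdd *);  /* optional, may be NULL */
};

struct model_wdd {
    const struct model_ops *ops;
    int start_calls; /* how often start was invoked */
    int ping_calls;  /* how often ping was invoked */
};

/* Feed the watchdog: prefer ping, fall back to start when ping is absent. */
int model_feed(struct model_wdd *wdd)
{
    if (wdd == NULL || wdd->ops == NULL || wdd->ops->start == NULL)
        return -1;
    if (wdd->ops->ping != NULL)
        return wdd->ops->ping(wdd);
    return wdd->ops->start(wdd);
}

/* Example callbacks that only count invocations. */
int demo_start(struct model_wdd *wdd) { wdd->start_calls++; return 0; }
int demo_stop(struct model_wdd *wdd)  { (void)wdd; return 0; }
int demo_ping(struct model_wdd *wdd)  { wdd->ping_calls++; return 0; }
```

A device whose table sets ping gets fed through demo_ping; one whose ping is NULL gets fed through demo_start instead.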
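Accessing and configuring the watchdog through ioctl
The kernel exposes a single, unified interface for configuring the watchdog, which hides the differences between individual watchdog drivers. By issuing different ioctl commands on /dev/watchdog, you can set the timeout, ping the watchdog, and so on. The handler behind these ioctl calls is watchdog_ioctl.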
static long watchdog_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
For example, WDIOC_SETTIMEOUT sets the timeout, WDIOC_KEEPALIVE invokes the ping operation, and WDIOC_SETOPTIONS enables or disables the watchdog.
int fd = open("/dev/watchdog", O_WRONLY);
int timeout = 10;
int options = WDIOS_ENABLECARD;
ioctl(fd, WDIOC_SETOPTIONS, &options); /* enable the watchdog */
ioctl(fd, WDIOC_SETTIMEOUT, &timeout); /* set the timeout in seconds */
ioctl(fd, WDIOC_KEEPALIVE, 0);         /* ping the watchdog */
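The calls above ignore the return value of ioctl. A slightly more defensive variant is sketched below; wd_set_timeout_checked is a hypothetical helper name, not part of any library. It relies on the fact that WDIOC_SETTIMEOUT writes the timeout the driver actually applied back into its argument, which may differ from the requested value if the hardware cannot represent it exactly.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Hypothetical helper: request a timeout and report the value the
 * driver actually applied. Returns 0 on success, or -1 if the ioctl
 * fails (errno is left as set by ioctl and *applied is untouched).
 */
int wd_set_timeout_checked(int fd, int requested, int *applied)
{
    int value = requested;

    if (ioctl(fd, WDIOC_SETTIMEOUT, &value) == -1)
        return -1;
    *applied = value; /* the driver may have rounded the request */
    return 0;
}
```

A caller would typically compare *applied with the requested value and size its feeding interval from the applied one.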
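Running the watchdog
Although the details vary with the hardware and the driver, the general pattern is the same: a timer is started when the watchdog device is probed and is refreshed by a periodic health check; if the watchdog hardware sees no refresh within the configured time, the hardware resets the system.
By calling the interfaces that the watchdog driver exposes through ioctl, users can define the health-check conditions themselves.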
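One way to put the health-check condition under user control is to feed the watchdog only when an application-defined check passes. The sketch below assumes a hypothetical helper wd_feed_if_healthy and a hypothetical callback type wd_health_check; the two sample checks exist only for illustration.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Hypothetical callback type: returns nonzero when the system is healthy. */
typedef int (*wd_health_check)(void);

/*
 * Hypothetical helper: ping the watchdog only if `check` passes.
 * Returns 1 if the watchdog was fed, 0 if the ping was withheld
 * (check missing or failed), and -1 if the ioctl itself failed.
 */
int wd_feed_if_healthy(int fd, wd_health_check check)
{
    if (check == NULL || !check())
        return 0; /* withhold the ping so the timeout can run out */
    if (ioctl(fd, WDIOC_KEEPALIVE, 0) == -1)
        return -1;
    return 1;
}

/* Sample checks used only for illustration. */
int always_unhealthy(void) { return 0; }
int always_healthy(void)   { return 1; }
```

Calling this helper from the feeding loop instead of pinging unconditionally means that a failed check eventually leads to a watchdog reset, even though the feeding process itself is still running.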
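The watchdog drivers from the various vendors live under /lib/modules/<kernel version>/kernel/drivers/watchdog. The following uses softdog (source in drivers/watchdog/softdog.c) to illustrate how a watchdog is used:
1. insmod softdog.ko loads the softdog driver.
2. wdctl /dev/watchdog* shows the watchdog devices currently present on the system.
3. After insmod the watchdog is not enabled by default; it has to be enabled first and then fed continuously. Once it is enabled, the watchdog triggers a recovery reset if feeding stops. Note that in some vendors' implementations, stopping the watchdog after it has been enabled also triggers a reset.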
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>     /* sleep */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>  /* ioctl */
#include <linux/watchdog.h>

int main(void)
{
    int fd = open("/dev/watchdog", O_WRONLY);
    int options = WDIOS_ENABLECARD;
    int timeout = 10;

    if (fd == -1)
    {
        perror("Failed to open /dev/watchdog");
        return EXIT_FAILURE;
    }
    ioctl(fd, WDIOC_SETTIMEOUT, &timeout); /* set the watchdog timeout in seconds */
    ioctl(fd, WDIOC_SETOPTIONS, &options); /* enable the watchdog */
    while (1)
    {
        ioctl(fd, WDIOC_KEEPALIVE, 0);     /* feed the watchdog */
        sleep(1);
    }
    return 0;
}
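The program above never exits cleanly, and as noted, closing the device can itself trigger a reset. The WDOG_ALLOW_RELEASE flag ("Did we receive the magic char ?") points at the usual way out: for drivers that support the magic close feature, writing the character 'V' to the device just before closing it tells the driver that the close is intentional. Below is a minimal sketch; wd_release is a hypothetical helper name, and whether the watchdog really stops depends on the driver and on the nowayout setting.

```c
#include <unistd.h>

/*
 * Hypothetical helper: announce an intentional close by writing the
 * magic character 'V', then close the descriptor.
 * Returns 0 if both steps succeed, -1 otherwise.
 * Assumption: the driver supports magic close and nowayout is not
 * set; otherwise the watchdog may keep running after the close.
 */
int wd_release(int fd)
{
    int rc = 0;

    if (write(fd, "V", 1) != 1)
        rc = -1; /* the magic character was not delivered */
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

A feeding program would call wd_release(fd) from its shutdown path, for example after leaving the loop in response to a signal.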