In my last blog post, I walked through building NUT from source to monitor my new APC UPS. Why monitor a UPS? One benefit is a bunch of pretty graphs for my Kibana dashboard, but there’s a much more practical reason: monitoring the UPS lets my servers shut themselves down on a set schedule when the power goes out. In this blog post I’ll explain how I set this up.
There are many reasons to use a UPS. One is to allow a clean shutdown when the power goes out, instead of hoping for the best as power drops with no warning. That alone would be fine, but in my case I also want to maximize how long the UPS can keep my network running.
For my setup, I have my fiber modem, router, and switches on my UPS in addition to a couple of servers. I could just let them all run on battery power until it runs out, but instead I want to shut my servers down if the power is out for more than a few minutes as opposed to just a temporary blip. Fortunately, this is a doable task, and below is how I did it.
Basic configuration
There are a couple of binaries you need to set up to monitor UPS status: upsd and upsmon. Upsd is the daemon that serves data to clients that are monitoring a UPS, and upsmon does the monitoring, natch. First comes upsd: since I want other servers to be able to get data on the UPS over the network, I added the following to upsd.conf in /etc/nut to allow upsd to listen on the network interface:
LISTEN 0.0.0.0 3493
Next came upsd.users, which allows for some minimal amount of access control. Nothing fancy here nor a strong password, since this is all internal and I’m not terribly worried about security. This is the user I added to upsd.users:
[upsmon]
password = secret
upsmon primary
actions = SET
instcmds = ALL
After that I moved on to the upsmon configuration. There is some good documentation available which came in handy, since I needed to set up a few options here. First was the MONITOR block:
MONITOR apcups@localhost 1 upsmon secret primary
This says I want to monitor my APC UPS on localhost, using the username and super-secret password above. The 1 is the power value: the number of power supplies on this host that the UPS feeds. “Primary” means that the UPS is connected to the host that upsmon is running on, versus a different host. Easy!
Getting fancy with upssched
Next, I needed to change how shutdowns are handled. By default, upsmon will not signal a shutdown until the battery reaches a critical level. That is not what I wanted, however: I want to shut down my servers well before that. As a result, I needed to do something more advanced: use upssched to schedule a shutdown after X minutes on battery power. This required a bit of extra work.
Step one: set the NOTIFYCMD in upsmon.conf to upssched. For whatever reason, the example in the sample file had the wrong path, so make sure you run “which upssched” to get the right one! In my case it was this:
NOTIFYCMD /usr/sbin/upssched
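If which comes up empty because the sbin directories are not on your PATH, a small helper along these lines can find the binary. This is only a sketch: find_nut_binary is a name I made up, and the extra directories are guesses at common install locations.

```shell
#!/bin/sh
# Print the absolute path of a binary, also searching the sbin directories
# that are often missing from a regular user's PATH.
# The function body runs in a subshell so the caller's PATH is left untouched.
find_nut_binary() (
    PATH="$PATH:/usr/sbin:/sbin:/usr/local/sbin"
    command -v "$1"
)

# Example: find_nut_binary upssched
```

Whatever path it prints is what goes on the NOTIFYCMD line.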
Now we need to have upsmon run that binary when we get a few key messages: when the UPS is on battery, when it is back on line power, and when the battery is critical (just for funsies). I did this by adding EXEC to the NOTIFYFLAG lines for those events, which as the documentation says tells upsmon to call NOTIFYCMD when they happen:
NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
That brought me to the upssched.conf file, which needed a few changes as well. The CMDSCRIPT line stayed the same, although I did edit the script it points to (more on that further down). The PIPEFN and LOCKFN lines gave me a lot of trouble: upssched failed to run in my first tests because of wrong permissions, and it only worked once I changed those lines and gave the stated files the right permissions. This is what I used:
PIPEFN /run/nut/upssched/upssched.pipe
LOCKFN /run/nut/upssched/upssched.lock
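For reference, here is a minimal sketch of creating that directory so the user upssched runs as can write to it. The nut:nut owner in the example is an assumption (the account NUT runs as varies by distribution and build), and prepare_upssched_dir is just an illustrative name.

```shell
#!/bin/sh
# Create the directory that holds upssched's pipe and lock files.
# $1: directory path
# $2: owner in user:group form (an assumption; check what your NUT runs as)
prepare_upssched_dir() {
    dir=$1
    owner=$2
    mkdir -p "$dir" || return 1
    chmod 0770 "$dir" || return 1
    # Changing ownership needs root; skip it otherwise so a dry run still works.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# Example (as root): prepare_upssched_dir /run/nut/upssched nut:nut
```

On many systems /run is a tmpfs that is emptied at boot, so a directory created this way may need to be recreated on startup.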
Finally, the heart of the work. For my monitoring server, I wanted it to shut down after 5 minutes on battery power. Upssched has timers to do this, so I added this pair of lines to start the timer when running on battery, and crucially cancel it when back on line power:
AT ONBATT * START-TIMER timedshutdown 300
AT ONLINE * CANCEL-TIMER timedshutdown
What is timedshutdown? That’s essentially the argument that is passed to the CMDSCRIPT when the timer is done. Remember a few paragraphs up when I said I changed this? As the doc says, it’s basically one big case..esac block, so I put the following block in to handle that timedshutdown signal:
timedshutdown)
    # Set the forced shutdown (FSD) flag
    /usr/sbin/upsmon -c fsd
    ;;
The upsmon -c fsd line is what sets forced shutdown mode.
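For context, here is a sketch of how that branch might sit inside a complete CMDSCRIPT. It is not my exact file: the handle_timer function and the overridable UPSMON variable are additions for illustration, there so the logic can be dry-run without actually shutting anything down.

```shell
#!/bin/sh
# Sketch of an upssched CMDSCRIPT. upssched passes the timer name as $1.
# UPSMON can be overridden (e.g. UPSMON=echo) to dry-run the logic.
UPSMON=${UPSMON:-/usr/sbin/upsmon}

handle_timer() {
    case "$1" in
        timedshutdown)
            # Timer expired while still on battery: set the forced shutdown flag.
            "$UPSMON" -c fsd
            ;;
        *)
            # Log unknown timer names instead of silently ignoring them.
            logger -t upssched-cmd "Unrecognized command: $1"
            ;;
    esac
}

# Only act when called with an argument, which is how upssched invokes it.
if [ "$#" -gt 0 ]; then
    handle_timer "$1"
fi
```

Running it by hand as UPSMON=echo ./upssched-cmd timedshutdown should just print the arguments that would have been passed to upsmon.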
To sum up what I did:
- If the UPS is on battery, start a timer for 5 minutes
- When the timer reaches 0, set the forced shutdown flag
- If the UPS gets back on line power before 5 minutes is up, then cancel the timer
Monitoring remotely
That took care of my primary monitoring server, but I have more than one server that needs to monitor the UPS. For remote monitoring from my other server, I did the same thing with a few tweaks:
- the MONITOR line in upsmon.conf has the server name instead of localhost, and “secondary” instead of “primary”.
- I want the secondary servers to shut down first, so I set a shorter timer (3 minutes).
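To make those tweaks concrete, the helper below prints the lines that differ on a secondary host. It is a sketch under assumptions: nut-primary.lan is a hypothetical hostname standing in for my primary server, secondary_config is a made-up name, and the user and password are the same ones from upsd.users above.

```shell
#!/bin/sh
# Print the config lines that differ on a secondary host.
# $1: hostname of the server running upsd (hypothetical in the example)
# $2: seconds on battery before this host shuts down
secondary_config() {
    printf '# upsmon.conf\n'
    printf 'MONITOR apcups@%s 1 upsmon secret secondary\n' "$1"
    printf '# upssched.conf\n'
    printf 'AT ONBATT * START-TIMER timedshutdown %s\n' "$2"
    printf 'AT ONLINE * CANCEL-TIMER timedshutdown\n'
}

# Example: secondary_config nut-primary.lan 180
```

The 180 matches the 3 minute timer, so the secondary is already down by the time the primary's 5 minute timer fires.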
The test
After all of this was configured and I figured out the pipe and lock file permissions, it was time for the test: I pulled the power plug for the UPS and waited. As configured, one server shut down at 3 minutes and the other at 5. Success!
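If you want to watch the state change during a test like this, polling upsd with upsc is one option. The sketch below assumes upsc is on the PATH and that the UPS reports the usual OL, OB, and LB flags in ups.status; describe_status and watch_ups are names I made up for illustration.

```shell
#!/bin/sh
# Translate a NUT ups.status value (space-separated flags) into a short label.
describe_status() {
    case " $1 " in
        *" LB "*) echo "low battery" ;;
        *" OB "*) echo "on battery" ;;
        *" OL "*) echo "on line power" ;;
        *)        echo "unknown: $1" ;;
    esac
}

# Poll upsd and print a timestamped state until interrupted.
# $1: UPS target in name@host form
# $2: poll interval in seconds (defaults to 5)
watch_ups() {
    while :; do
        status=$(upsc "$1" ups.status 2>/dev/null)
        printf '%s %s\n' "$(date +%T)" "$(describe_status "$status")"
        sleep "${2:-5}"
    done
}

# Example: watch_ups apcups@localhost 5
```

Run it from a machine that is not on the timer, since the servers being tested will go down partway through.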
With this setup, if I experience a lengthy power outage, I will lose my nice Elastic dashboards, Home Assistant, and other apps running on those servers, but hopefully I will be able to use my network for as long as possible.
