dmraid is a tool for software RAID, fakeraid and onboard RAID.
As far as I can tell, the only error reporting that dmraid does is by hooking into logwatch, which emails me a very long report that I don’t often read. I would like to know immediately if my RAID array is degraded.
This works for me on CentOS 5.6 with dmraid installed – no guarantees on other flavours/combinations. My dmraid version is dmraid-1.0.0.rc13-63.el5. I haven’t tested the output from other versions of dmraid, but it would be pretty trivial to update the script if they are different (or your path is different).
So what we are going to do is write a simple shell script that checks the array status and emails if there is a problem. Then we run the script every 15 (or whatever) minutes.
dmraid needs to be run as root, so you might as well su - for the whole of this.
To create the file, just vi /raid_status.sh, hit i to insert and paste this:
#!/bin/sh
# check raid status and email if not ok
STATUS=`/sbin/dmraid -s | grep "status"`
if [ "$STATUS" != "status : ok" ]
then
  # send the full dmraid summary, with the status line in the subject
  /sbin/dmraid -s | mail -s "RAID ERROR ON `hostname`: $STATUS" email@example.com
fi
Hit Esc, ZZ to save, then make the file executable:
chmod 755 /raid_status.sh
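Before leaving this to cron, it can be useful to try the comparison against saved text rather than a live array. The following is only a sketch under a few assumptions: check_status is a made-up helper name, the sample lines imitate the "key : value" layout that dmraid -s prints, and "broken" is used purely as an illustrative non-ok value. It also tolerates extra spacing around the colon and more than one status line, which the plain string comparison above would report as an error.

```shell
#!/bin/sh
# Sketch only: the status comparison pulled into a function so it can be
# fed sample text. check_status is a hypothetical helper, not part of dmraid.
check_status() {
    # Reads "dmraid -s" style output on stdin.
    # Succeeds only if at least one status line exists and every one is "ok".
    awk -F: '
        $1 ~ /^[ \t]*status[ \t]*$/ {
            seen = 1
            gsub(/[ \t]/, "", $2)          # drop padding around the value
            if ($2 != "ok") bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Sample input (assumed format); no array or root access needed for this part.
printf 'type : mirror\nstatus : ok\ndevs : 2\n' | check_status \
    && echo "healthy sample: no mail"
printf 'type : mirror\nstatus : broken\ndevs : 2\n' | check_status \
    || echo "degraded sample: would send mail"
```

On a real system the same function could be fed from /sbin/dmraid -s in place of the printf lines.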
Now add it to cron so that it runs regularly:
crontab -e
…and insert the line (i to insert, then Esc, ZZ to save):
00,15,30,45 * * * * /raid_status.sh
Voila! dmraid with email error reporting.
very cool script it helped me
Cool script, but when I type this command I get this output:
*** Group superset .ddf1_disks
--> Active Subset
name : ddf1_4035305a8680222820202020202020209b7dac183a354a45
size : 312237824
stride : 128
type : mirror
status : ok
devs : 2
spares : 0
OK, but when I remove one drive to simulate its failure, I still get status ok, and the only difference is that it says “devs : 1”. I don’t have a genuinely faulty drive to check what happens in that case, but when a drive is totally missing it says the status is OK, which is not good. 🙁
If you are only interested in status, you might use “dmraid -s -csta” which returns the status only: “ok” for example.
Any updates on the status staying unchanged while the number of devices is reduced?
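One possible way to cover the missing-drive case described above is to compare the devs line against an expected count as well as checking status. This is an untested sketch, not a confirmed fix: EXPECTED_DEVS and check_array are made-up names, the value 2 assumes a two-disk mirror, and the parsing assumes the "key : value" layout shown in the output quoted above.

```shell
#!/bin/sh
# Sketch only: flag a problem if status is not "ok" OR devs is lower than
# expected. EXPECTED_DEVS and check_array are hypothetical names.
EXPECTED_DEVS=2   # assumption: a two-disk mirror

check_array() {
    # Reads "dmraid -s" style output on stdin, prints one line per problem
    # found, and returns non-zero if anything looks wrong.
    awk -F: -v want="$EXPECTED_DEVS" '
        { key = $1; val = $2; gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val) }
        key == "status" {
            seen = 1
            if (val != "ok") { print "status is " val; bad = 1 }
        }
        key == "devs" && val + 0 < want + 0 {
            print "devs is " val ", expected " want
            bad = 1
        }
        END {
            if (!seen) { print "no status line found"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    '
}

# Try it on sample text shaped like the output quoted above (one drive pulled).
printf 'type : mirror\nstatus : ok\ndevs : 1\nspares : 0\n' | check_array \
    || echo "(the mail command from the script would run here)"
```

In the cron script, the printf line would be replaced by /sbin/dmraid -s, run as root, with the existing mail command in place of the echo.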