Gentoo Forums
Howto: Email updates for 3ware controller (3dm2 fails)
Crash_Maxed
n00b
Joined: 28 Apr 2006
Posts: 17

Posted: Wed Dec 10, 2008 7:37 pm    Post subject: Howto: Email updates for 3ware controller (3dm2 fails)

So here's the basic problem. I have a 3Ware 9690SA-8I SATA/SAS controller and everything works perfectly except for one thing. 3DM2, the management package that AMCC/3Ware provides, has e-mail alert support, but I have never gotten it to work in Linux. 3DM2 does all of its other jobs flawlessly (remote management, unit migration, etc.), but this one hole (no e-mail alerts) didn't sit well with me. I'd like to know the status of my arrays and be e-mailed if something goes wrong. So for the past few days I've been working on some shell scripts to complement 3DM2. These do NOT replace 3DM2; they just fill in the hole left by the missing alerts.

Since these are shell scripts they require a few programs to work:
crontab (schedule these scripts to run at intervals)
smartctl - part of the sys-apps/smartmontools package (used to pull information about drives behind a 3ware controller)
grep and sed (used for filtering and text reformatting)
awk (used to strip information out of the smartctl dumps)
mutt (used to email the results)
3dm2 (optional) + tw_cli command line interface application (required) for your 3ware controller
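
Before scheduling anything, it may help to confirm the tools listed above are actually installed. This is a minimal sketch, not part of the original scripts: the function name missing_tools is made up for illustration, and it only checks that each name resolves on the PATH, not that the tool is configured correctly.

```shell
# Print the name of every listed tool that cannot be found on the PATH.
# Prints nothing when all tools are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call with the tools this post relies on ("cli" is the author's
# symlink to tw_cli; use whatever name your setup has):
#   missing_tools smartctl grep sed awk mutt cli
```

Any name it prints needs to be installed (or symlinked) before the scripts will run.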

In addition to the above, you will need mutt correctly configured to send mail through an SMTP server of your choice. If you have some other command-line e-mail sender, you can edit the scripts and substitute that instead. The scripts are complete as they stand, but I didn't bother turning every setting that will vary between setups into a variable at the top. Also, I haven't written many big shell scripts, so if some of my "programming" practices are out of line, please do comment, as I would like to make these scripts better in the future.
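
One way to make the mailer easy to swap is to route every send through a single function. The sketch below is an assumption on my part, not something the original scripts do: MAILER and send_report are hypothetical names, and the only alternative shown is the common mail (mailx) command with its usual -s subject option.

```shell
# MAILER is a hypothetical setting; the original scripts call mutt directly.
MAILER=${MAILER:-mutt}

# send_report SUBJECT RECIPIENT BODY_FILE
# Sends BODY_FILE as the message body using the configured mailer.
send_report() {
    case $MAILER in
        mutt) mutt -s "$1" "$2" < "$3" ;;
        mail) mail -s "$1" "$2" < "$3" ;;
        *)    echo "unknown mailer: $MAILER" >&2; return 1 ;;
    esac
}
```

With this in place, changing mailers means editing one variable instead of every mutt line.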

Without further ado, here are the scripts. Up first is the main script, which gathers the SMART status of all drives on the 3ware controller as well as the overall controller status and sends nicely formatted output to an e-mail address of your choosing. Please note there are two changes everyone will have to make to use this. First, change every instance of "/c4" in the file to whatever your controller is. I'll change it to a variable sometime. Second, check the location of tw_cli, the command line interface program for 3ware cards. The standard package (not available in Portage; get the 3dm2+cli package from the AMCC/3Ware site) installs into /opt. What I did was make a symlink to my tw_cli file in /usr/bin called "cli". This is why the calls in my script use cli rather than tw_cli. Just to show my symlink:
Code:

ragnarok crash # ls -l /usr/bin/cli
lrwxrwxrwx 1 root root 20 Nov 22 17:33 /usr/bin/cli -> /opt/AMCC/CLI/tw_cli

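The hard-coded "/c4" and the cli symlink could instead be replaced by two variables and a small wrapper. This is only a sketch of that idea: CONTROLLER, TW_CLI, and tw are names I made up, and the default path is simply the symlink target shown above.

```shell
# Assumed defaults; adjust to your controller ID and install path.
CONTROLLER=${CONTROLLER:-/c4}
TW_CLI=${TW_CLI:-/opt/AMCC/CLI/tw_cli}

# Run a tw_cli command against the configured controller, so that
# "tw show unitstatus" runs "$TW_CLI $CONTROLLER show unitstatus".
tw() {
    "$TW_CLI" "$CONTROLLER" "$@"
}
```

Calls that address a unit rather than the controller (such as /c4/u0) would still need the path built by hand, for example "$TW_CLI" "$CONTROLLER/u0" show status.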

Last but not least, the script:
Code:

#!/bin/bash
# Drive Report Generator Script v1.0
# 12/10/2008

# cron runs jobs with only a minimal environment, so set PATH explicitly
export PATH=/sbin:/bin:/usr/sbin:/usr/bin

export EMAILTO=xxxxxxx@yyyyyy.zzz # Who to e-mail reports to
export DRV_ID=(0 1 2 3 4 5) # For multiple drives use DRV_ID=(0 1 2 3 etc)
export DEV=/dev/twa0 # Device your 3ware controller is behind (usually /dev/twaX or /dev/tweX)

# These are the attributes via smartctl that will be read.  The "AWK" and "NOAWK" are used
# on the status attribs as some of these don't need to pipe to awk to get what we need.
# If it is found under "Vendor Specific SMART Attributes with Thresholds:" in a smartctl
# report you may add that attribute to the respective array here.  Remember the underscores!
# If you wish to put one of the attributes from the above mentioned area into the "status" array
# put it into the ATTRIBS_STATUS_AWK array.  For prefail and life obviously put them
# in the ATTRIBS_PREFAIL and ATTRIBS_LIFE array.
export ATTRIBS_STATUS_NOAWK=(Model: Number: overall-health)
export ATTRIBS_STATUS_AWK=(Temperature_Celsius)
export ATTRIBS_PREFAIL=(Raw_Read_Error_Rate Reallocated_Sector_Ct Reallocated_Event_Count Current_Pending_Sector Offline_Uncorrectable)
export ATTRIBS_LIFE=(Start_Stop_Count Power_On_Hours Power_Cycle_Count)

# Bail out with a non-zero exit code if the required settings are missing
if [ ${#DRV_ID[@]} -eq 0 ]
then
  echo "No drive IDs set."
  exit 1
fi

if [ -z "$DEV" ]
then
  echo "No device node set."
  exit 1
fi

for id in ${DRV_ID[@]}
do #Do for the main function
   touch /var/tmp/3wdev."$id".raw
   touch /var/tmp/3wdev."$id".extracted
   smartctl --all --device=3ware,"$id" $DEV >> /var/tmp/3wdev."$id".raw

echo "###################################################" >> /var/tmp/3wdev."$id".extracted
echo "Report for Drive ID: $id" >> /var/tmp/3wdev."$id".extracted
echo "--> Drive Status: <--------------------------------" >> /var/tmp/3wdev."$id".extracted

for attrib in ${ATTRIBS_STATUS_NOAWK[@]}
do
        cat /var/tmp/3wdev."$id".raw | grep $attrib >> /var/tmp/3wdev."$id".extracted
done

for attrib in ${ATTRIBS_STATUS_AWK[@]}
do
   cat /var/tmp/3wdev."$id".raw | grep $attrib | awk '{print $2, $10}' >> /var/tmp/3wdev."$id".extracted
done

echo "--> Pre-Fail Statistics: <-------------------------" >> /var/tmp/3wdev."$id".extracted
for attrib in ${ATTRIBS_PREFAIL[@]}
do
   cat /var/tmp/3wdev."$id".raw | grep $attrib | awk '{print $2, $10}' >> /var/tmp/3wdev."$id".extracted
done

echo "--> Lifetime Statistics: <-------------------------" >> /var/tmp/3wdev."$id".extracted
for attrib in ${ATTRIBS_LIFE[@]}
do
        cat /var/tmp/3wdev."$id".raw | grep $attrib | awk '{print $2, $10}' >> /var/tmp/3wdev."$id".extracted
done

echo "###################################################" >> /var/tmp/3wdev."$id".extracted

# Formatting via sed
sed -i s/"Device Model:"/"Device Model:      "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Serial Number:"/"Serial Number:      "/g /var/tmp/3wdev."$id".extracted
sed -i s/"SMART overall-health self-assessment test result:"/"SMART Status:          "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Temperature_Celsius"/"Temperature [C]:       "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Raw_Read_Error_Rate"/"Raw Read Error Rate:   "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Reallocated_Sector_Ct"/"Reallocated Sectors:   "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Reallocated_Event_Count"/"Reallocation Events:   "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Current_Pending_Sector"/"Pending Sectors:       "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Offline_Uncorrectable"/"Offline Uncorrectable: "/g /var/tmp/3wdev."$id".extracted
sed -i s/"Start_Stop_Count"/"Start\/Stop Count:      "/g /var/tmp/3wdev."$id".extracted
# Formatting spaces not added on Power_On_Hours in order to make the Yrs/Mo/Days operation work
sed -i s/"Power_On_Hours"/"Power On Hours:"/g /var/tmp/3wdev."$id".extracted
sed -i s/"Power_Cycle_Count"/"Power Cycle Count:     "/g /var/tmp/3wdev."$id".extracted

# Make a fancy Yrs/Mo/Days for Power On Hours
export hours=`cat /var/tmp/3wdev."$id".extracted | grep -i "Power On Hours" | awk '{print $4}'`
export years=$(($hours/24/365))
export days=$((($hours/24)-(365*$years)))
export months=$(($days/30))
export days=$(($days-($months*30))) #Approximation since we only have integers
sed -i s/"Power On Hours: $hours"/"Power On Hours:         $hours ($years Yrs $months Mo $days Days)"/g /var/tmp/3wdev."$id".extracted

done #Done for the smartctl data extraction

# Generate 3ware controller report
echo "###################################################" >> /var/tmp/3wcont.extracted
echo "--> 3Ware 9690SA-8I Statistics: <------------------" >> /var/tmp/3wcont.extracted
cli /c4 show allunitstatus >> /var/tmp/3wcont.extracted
cli /c4 show unitstatus >> /var/tmp/3wcont.extracted
cli /c4 show drivestatus >> /var/tmp/3wcont.extracted
echo "###################################################" >> /var/tmp/3wcont.extracted
sed -i '/^$/d' /var/tmp/3wcont.extracted
sed -i s/"\/c4 Total Optimal Units ="/"Total Optimal Units:   "/g /var/tmp/3wcont.extracted
sed -i s/"\/c4 Not Optimal Units ="/"Non-Optimal Units:     "/g /var/tmp/3wcont.extracted

# Get the unit status - This will show in email subject
export status=`cli /c4 show unitstatus | grep u0 | awk '{print $3}'`

# If the unit is initializing/verifying/rebuilding/migrating, grab the progress percentage.
# The test is inverted because OK and DEGRADED are the only states that don't show a percentage
export percentage="" # We need to initialize it in case it doesn't get used
if [[ "$status" != "DEGRADED" && "$status" != "OK" ]]
then
   export percentage=`cli /c4 show unitstatus | grep u0 | awk '{print $4 $5}' | sed -r 's/(\(.\))|-//g'`
   # The MIGRATING state doesn't show a percent sign for some reason
   if [ "$status" == "MIGRATING" ]
   then
      export percentage="$percentage%"
   fi
fi

# Concatenate all the extracted log files
cat /var/tmp/3wdev.*.extracted > /var/tmp/3wstatus.txt
cat /var/tmp/3wcont.extracted >> /var/tmp/3wstatus.txt

# Generate e-mail body
echo "Report for `date +%Y%m%d-%H:%M` attached." > /var/tmp/3wmailbody.txt
echo "" >> /var/tmp/3wmailbody.txt
cat /var/tmp/3wstatus.txt >> /var/tmp/3wmailbody.txt

# Email that beautiful bean footage
# The "--" separates the attachment list from the recipient; newer mutt releases require it
mutt -s "3WStatus Report for `date +%Y%m%d-%H:%M` STATUS: $status $percentage" -a /var/tmp/3wstatus.txt -- $EMAILTO < /var/tmp/3wmailbody.txt

# Clean up files and archive log file
mv /var/tmp/3wstatus.txt /var/log/3wstatus-logs/3wstatus-`date +%Y%m%d-%H:%M`.log
rm /var/tmp/3wdev.*.extracted
rm /var/tmp/3wdev.*.raw
rm /var/tmp/3wcont.extracted
rm /var/tmp/3wmailbody.txt

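The attribute loops in the script all boil down to "find the row for this attribute in the smartctl dump and print its name and raw value". A sketch of that step as one function follows. It assumes the usual attribute table layout, where the name is field 2 and the raw value is field 10 (the same fields the script's awk prints); smart_attr is a made-up name. Unlike grep, it matches the attribute name exactly.

```shell
# smart_attr NAME FILE
# Print "NAME RAW_VALUE" for one attribute from a saved "smartctl --all" dump.
# Field 2 is the attribute name and field 10 the raw value.
smart_attr() {
    awk -v name="$1" '$2 == name { print $2, $10 }' "$2"
}

# Example call against the raw dump the script writes for drive 0:
#   smart_attr Power_On_Hours /var/tmp/3wdev.0.raw
```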

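The Yrs/Mo/Days arithmetic from the script can be pulled out into a function to make it easier to follow and test. This sketch uses the same integer approximations as the script (365-day years, 30-day months); hours_to_ymd is a made-up name.

```shell
# hours_to_ymd HOURS
# Convert power-on hours to an approximate "Y Yrs M Mo D Days" string
# using integer division, 365-day years and 30-day months.
hours_to_ymd() {
    total_days=$(($1 / 24))
    years=$((total_days / 365))
    remaining=$((total_days - years * 365))
    months=$((remaining / 30))
    days=$((remaining - months * 30))
    printf '%s Yrs %s Mo %s Days\n' "$years" "$months" "$days"
}
```

For example, hours_to_ymd 10000 gives 416 whole days, which comes out as "1 Yrs 1 Mo 21 Days".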
You can invoke this manually from the command line or from crontab at whatever interval you wish. Reports will be e-mailed to you and saved in /var/log/3wstatus-logs. On that note, make sure to either comment out the mv line at the end, change the folder to one you want, or create the directory /var/log/3wstatus-logs if you wish to keep the default location I set.
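
For the crontab route, here is a rough sketch of adding an entry without duplicating it on repeated runs. Everything specific in it is an assumption: add_cron_line is a made-up helper, and the script path and 06:00 schedule in the usage comment are placeholders, not the author's setup.

```shell
# add_cron_line LINE
# Read an existing crontab on stdin and print it with LINE appended,
# unless an identical line is already present.
add_cron_line() {
    awk -v line="$1" '
        { print }
        $0 == line { found = 1 }
        END { if (!found) print line }
    '
}

# Example (placeholder path and schedule): run the report daily at 06:00
#   crontab -l 2>/dev/null | add_cron_line "0 6 * * * /usr/local/sbin/3wreport.sh" | crontab -
```

If you keep the default log location, create it once with mkdir -p /var/log/3wstatus-logs before the first run.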

The second script is optional, but allows you to get an alert e-mail every time the unit on your controller switches status. At the moment it's hard coded for unit 0 (u0 on the card), but I'll probably re-code it as a variable and add a loop so that it can monitor multiple units individually.

Code:

#!/bin/bash
# Unit Alert Script
# Sends an e-mail if an array has changed states
# 12/10/2008

# cron runs jobs with only a minimal environment, so set PATH explicitly
export PATH=/sbin:/bin:/usr/sbin:/usr/bin
export EMAILTO=xxxxxxx@yyyyyy.zzz

export new_status=`cli /c4/u0 show status | awk '{print $4}'`
# Ignore the error on the first run, when no state file exists yet
export old_status=`cat /var/state/.3wstate 2>/dev/null`
echo $new_status > /var/state/.3wstate

if [ "$new_status" != "$old_status" ]
then
   echo "Report for `date +%Y%m%d-%H:%M`." > /var/tmp/3walertbody.txt
   echo "" >> /var/tmp/3walertbody.txt
   echo "There has been a status change on the 3Ware controller." >> /var/tmp/3walertbody.txt
   echo "Previous system status: $old_status" >> /var/tmp/3walertbody.txt
   echo "Updated system status: $new_status" >> /var/tmp/3walertbody.txt
   echo "" >> /var/tmp/3walertbody.txt
   mutt -s "3WStatus Report - STATUS CHANGE: $new_status" $EMAILTO < /var/tmp/3walertbody.txt
   rm /var/tmp/3walertbody.txt
fi

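To watch more than one unit, the compare-and-store step could be moved into a function that keeps one state file per unit. This is a sketch of that idea only, not the author's planned rewrite: STATE_DIR and status_changed are hypothetical names, and the per-unit file name is an assumption modeled on the .3wstate file above.

```shell
# Assumed default; matches the directory used by the alert script.
STATE_DIR=${STATE_DIR:-/var/state}

# status_changed UNIT NEW_STATUS
# Store NEW_STATUS for UNIT and return 0 only if it differs from the
# previously stored status. A missing state file counts as a change.
status_changed() {
    state_file="$STATE_DIR/.3wstate.$1"
    old_status=""
    if [ -f "$state_file" ]; then
        old_status=$(cat "$state_file")
    fi
    printf '%s\n' "$2" > "$state_file"
    [ "$2" != "$old_status" ]
}

# Example loop over two units (unit names are placeholders):
#   for unit in u0 u1; do
#       new=$(cli /c4/$unit show status | awk '{print $4}')
#       status_changed "$unit" "$new" && echo "$unit changed to $new"
#   done
```

The echo in the example loop is where the mutt call from the script above would go.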

Ask questions if you have them. Hopefully these scripts are of use to someone ;).