This post is heavily influenced by SysAdvent 2011 Day 6, titled Always Be Hacking (rated R for language). It is my attempt to get feedback and to offer my methods up for public consumption and debate.

That title becomes truer for me all the time. I am never happier than when I sit down, shut the world out and start busting out line after line of code. There is something really peaceful and pleasant about it. You feel your mind turn, spin and whir a little with every press of a key. You sit back after a couple of hours and think, “Ok… that’s decent. It will work but how will it break? Does it matter? Should I keep going or work on this more?”

When I write scripts I try to think of how they will be used. I think back to my English classes at Northeastern, which is odd because I didn’t find English especially useful. “Know your audience.” If I’m writing a one-off for just me, I don’t worry about making it print nice usage statements or help details. If I’m writing something that I want people to use, I spend time writing it to accept certain arguments and choke on others.
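As a rough sketch of what "accept certain arguments and choke on others" can look like in bash: the function names here (usage, valid_month, check_args) are made up for illustration, and the three-letter month format is an assumption, not something my script actually enforces.

```shell
#!/bin/bash
# Hypothetical helpers; names and the month format are illustrative assumptions.

# Print a short usage line to stderr.
usage() {
  echo "Usage: ${0##*/} <log directory> <month, e.g. Nov>" >&2
}

# Return 0 for a three-letter month abbreviation, 1 for anything else.
valid_month() {
  case "$1" in
    Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 only for exactly two arguments:
# an existing directory followed by a recognized month.
check_args() {
  if [ "$#" -ne 2 ]; then
    usage
    return 1
  fi
  if [ ! -d "$1" ]; then
    echo "Not a directory: $1" >&2
    return 1
  fi
  if ! valid_month "$2"; then
    echo "Unrecognized month: $2" >&2
    return 1
  fi
}
```

Near the top of a script this would be called as `check_args "$@" || exit 1`, so a bad invocation stops before anything gets copied or deleted.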

Several weeks ago, I had to gather a bunch of logs together across a number of our servers. This is something we have to do every so often, so I wanted to make it faster.  I wrote the following bash script (see Scratch an Itch from SysAdvent Day 6).

#!/bin/bash
# iFactory log reprocess script
# Authored by: Matt Warren on 12/4/2011
#
# Given a log directory and month, this script will
#  * copy $MONTH's zip file
#  * unzip logs
#  * dump them into one file
#  * backup -daily.log
#  * move giant file to -daily.log
# Some generic variables here
# Change as needed
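source /etc/profile
EXPECTED_ARGS=2
TMPDIR=/tmp/oldlogs
# Bail out with a usage message unless we get exactly two arguments
if [ $# -ne $EXPECTED_ARGS ]; then
 echo "Usage: $(basename "$0") <log directory> <month>" >&2
 exit 1
fi
SOURCEDIR=$1
MONTH=$2
# Assumes the logs are from the current year
YEAR=$(date -d today +%Y)
# Convert the month name to a two-digit number (needs GNU date)
MONTHNUM=$(date -d "$MONTH 1" +%m)
# Start from an empty scratch directory
rm -rf "$TMPDIR"
mkdir "$TMPDIR"
cp "$SOURCEDIR"/*acctinfo_"$MONTH$YEAR".zip "$TMPDIR"
# Stop if cd fails, so the rm's below never run in the wrong directory
cd "$TMPDIR" || exit 1
# Site name is everything before the first underscore in each zip name
for SITE in $(ls -1 | awk -F "_" '{print $1}' | uniq) ; do
 unzip "$SITE"*"$MONTH"*.zip
 rm "$SITE"*"$MONTH"*.zip
 # ${SITE} needs braces here; $SITE_acctinfo would name a different variable
 cat "${SITE}"_acctinfo*"$YEAR"*"$MONTHNUM"* > "$SITE-month.log"
 rm "${SITE}"_acctinfo*"$YEAR"*"$MONTHNUM"*
 # Back up the current daily log, then replace it with the month's data
 cp "$SOURCEDIR/$SITE-daily.log" "$TMPDIR"
 cp "$SITE-month.log" "$SOURCEDIR/$SITE-daily.log"
 rm "$SITE-month.log"
 echo "$SITE" >> "$TMPDIR/record.txt"
done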

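Notice that the last few lines contain some rm’s, but some files are deliberately left behind as a safety net in case things go wrong: the backup copy of each -daily.log and record.txt stay in the temp directory. Looking back, it would be good to point that out in an echo statement or a comment at the top.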
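Something like the following could do that pointing out. It is only a sketch: report_leftovers is a hypothetical helper name, and it assumes it gets passed the temp directory once the loop has finished.

```shell
#!/bin/bash
# Hypothetical helper: list whatever is still in the scratch directory
# so whoever runs the script knows there is something to clean up.
report_leftovers() {
  local dir="$1" count=0 f
  echo "Files left in $dir for manual cleanup:"
  for f in "$dir"/*; do
    # An unmatched glob stays literal, so skip names that do not exist
    [ -e "$f" ] || continue
    echo "  ${f##*/}"
    count=$((count + 1))
  done
  echo "Total: $count"
}
```

Calling `report_leftovers "$TMPDIR"` after the `done` would list the backed-up -daily.log files and record.txt, which is exactly the stuff I would otherwise forget about.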

Now, in no way is that an amazing script.  I’m more than certain it has bugs and won’t work for every use case.  But it focused my mind on the task. I thought about the files and how I would have to manipulate them to get all the data. I knew it was going to take all day to get the logs reprocessed but putting in some work before getting to the task allowed me, and those who follow, to perform it faster. Hopefully this will also lead to fewer errors.

I’m very open to feedback on my script. It seems every time I use Google to find how to do one thing in bash, it’s different from the last time I saw it. Are there other ways you focus yourself on a task? Going back to the idea of an English class, do you have pre-writing exercises? Should I rewrite this in Ruby?
