Archive for August 2009

Buying time   1 comment

Wearing both hats of developer and SysAdmin, I believe you shouldn’t work hard as a SysAdmin. It is for a reason that sometimes developers look at SysAdmins as inferior. It’s not because SysAdmins are really inferior, or has an easier job. I think it is mainly because a large portion of their workload could be automated. And if it can’t be automated (very little things), you should at least be able to do it quickly.

Obviously, we cannot buy time, but we can at least be efficient. In the following post I’ll introduce a few of my most useful aliases (or functions) I’d written and used in the last few years. Some of them are development related while the others could make you a happy SysAdmin.
You may find them childish – but trust me, sometimes the most childish alias is your best friend.

Jumping on the tree

Our subversion trunk really reminds me of a tree sometimes. The trunk is very thick, but it has many branches and eventually many leaves. While dealing with the leaves is rare, jumping on the branches is very common. Many times i have found myself typing a lot of ‘cd’ commands, where some are longer than others, repeatedly, just to get to a certain place. Here my stupid aliases come to help me.

lib takes me straight away to our libraries sub directory, where sim takes me to the simulators sub directory. Not to mention tr (shortcut for trunk), which takes me exactly to the sub directory where the sources are checked out. Oh, and pup which takes me to my puppet root, on my puppet master. Yes, I’m aware to the fact that you probably wonder now “Hey, is this guy going to teach me something new today? – Aliases are for babies!!”, I can identify. I didn’t come to teach you how to write aliases, I’m here to preach you to start using aliases. Ask yourself how many useful day-to-day aliases you really have defined. Do you have any at all? – Don’t be shy to answer no.

*nix jobs are diverse and heterogeneous, but let’s see if I can encourage you to write some useful aliases after all.
In case you need some ideas for aliases, run the following:

$ history | tr -s ' ' | cut -d' ' -f3- | sort | uniq -c | sort -n

Yes, this will show the count of the most recent commands you have used. Still not getting it? – OK, I’ll give you a hint, it should be similar to this:

$ alias recently_used_commands="history | tr -s ' ' | cut -d' ' -f3- | sort | uniq -c | sort -n"

If you did it – you’ve just kick-started your way to liberation. Enjoy.
As a dessert – my last childish alias:

$ alias rsrc='source ~/.bashrc'

Always useful if you want to re-source your .bashrc while working on some new aliases.

Two more things I must mention though:

  1. Enlarge your history size, I guess you can figure out alone how to do it.
  2. If you’re feeling generous – periodically collect the history files from your fellow team members (automatically of course, with another alias) and create aliases that will suit them too.

The serial SSHer

Our network is on Many times I’ve found myself issuing commands like:

$ ssh root@
$ ssh root@
$ ssh root@

Dozens of these in a single day. It was really frustrating. One day I decided to make an end to it:

# function for easier ssh
# $1 - network
# $2 - subnet
# $3 - host in subnet
_ssh_specific_network() {
	local network=$1; shift
	local subnet=$1; shift
	local host=$1; shift
	ssh root@$network.$subnet.$host

# easy ssh to
# $1 - host
_ssh_net_192_168_8() {
	local host=$1; shift
	_ssh_specific_network 192.168 8 $host
alias ssh8='_ssh_net_192_168_8'

Splendid, now I can run the following:

$ ssh8 1

Which is equal to:

$ ssh root@

Childish, but extremely efficient. Do it also for other commands you would like to use, such as ping, telnet, rdesktop and many others.

The cop

Using KDE’s konsole? – I really like DCOP, let’s add some spice to the above function, we’ll rename the session name to the host we’ll ssh to, and then restore the session name back after logging out:

# returns session name
_konsole_get_session_name() {
	# using dcop - obtain session name
	dcop $KONSOLE_DCOP_SESSION sessionName

# renames a konsole session
# $1 - session name
_konsole_rename_session_name() {
	local session_name=$1; shift
	_konsole_store_session_name `_konsole_get_session_name`
	dcop $KONSOLE_DCOP_SESSION renameSession "$session_name"

# store the current session name
_konsole_store_session_name() {

# restores session name
_konsole_restore_session_name() {
	if [ x"$STORED_SESSION_NAME" != x ]; then
		_konsole_rename_session_name "$STORED_SESSION_NAME"

# function for easier ssh
# $1 - network
# $2 - subnet
# $3 - host in subnet
_ssh_specific_network() {
	local network=$1; shift
	local subnet=$1; shift
	local host=$1; shift
	# rename the konsole session name
	_konsole_rename_session_name .$subnet.$host
	ssh root@$network.$subnet.$host
	# restore the konsole session name

Extend it as needed, this is only the tip of the iceberg! I can assure you that my aliases are much more complex than these.

Finders keepers

For the next one I’m not taking full credit – this one belongs to Uri Sivan – obviously one of the better developers I’ve met along the way.
Grepping cpp files is essential, many times I’ve found myself looking for a function reference on all of our cpp files.
The following usually does it:

$ find . -name "*.cpp" | xargs grep -H -r 'HomeCake'

But seriously, do I look like someone that likes to work hard?

# $* - string to grep
grepcpp() {
	local grep_string="$*";
	local filename=""
	find . -name "*.cpp" -exec grep -l "$grep_string" "{}" ";" | while read filename; do
		echo "=== $filename"
		grep -C 3 --color=AUTO "$grep_string" "$filename"
		echo ""

OK, let’s generalize it:

# grepping made easy, taken from the suri
# grepext - grep by extension
# $1 - extension of file
# $* - string to grep
_grepext() {
	local extension=$1; shift
	local grep_string="$*"
	local filename=""
	find . -name "*.${extension}" -exec grep -l "$grep_string" "{}" ";" | while read filename; do
		echo "=== $filename"
		grep -C 3 --color=AUTO "$grep_string" "$filename"
		echo ""

# meta generate the grepext functions
declare -r GREPEXT_EXTENSIONS="h c cpp spec vpj sh php html js"
_meta_generate_grepext_functions() {
	local tmp_grepext_functions=`mktemp`
	local extension=$1; shift
	for extension in $GREPEXT_EXTENSIONS; do
		echo "grep$extension() {" >> $tmp_grepext_functions
		echo '  local grep_string=$*' >> $tmp_grepext_functions
		echo '  _grepext '"$extension"' "$grep_string"' >> $tmp_grepext_functions
		echo "}" >> $tmp_grepext_functions
	source $tmp_grepext_functions
	rm -f $tmp_grepext_functions

After this, you have all of your C++/Bash/PHP/etc developers happy!

Time to showoff

My development environment is my theme park, here is my proof:

$ (env; declare -f; alias) | wc -l

I encourage you to run this as well, if my line count is small and I’m bragging about something I shouldn’t – let me know!


Posted August 21, 2009 by malkodan in Bash, Linux, System Administration

Tagged with , , , , , , , , ,

Feeling at /home…   3 comments

The following took place more than a year ago, but it is still fresh in my mind. After a few colleagues urged me to write about it, I decided to finally do it. If the output of the commands does not match exactly whatever I had while dealing with it – bear with it. It’s far from being the point.

The horror

I took a break from work last year and decided to go and have some fun in NZ. Oh, did I have fun there!
There’s nothing more frustrating than returning to work, turning on your dusty computer and witnessing the following:

*** An error ocfcurred during the filesystem check.
Give root password for maintenance (or type Control-D to continue):

Investigating it just a bit more I got to a conclusion that my /home is not mounting. OMG!!! all of my personal customizations and some private data is at /home!
I must admit, it’s nothing that couldn’t be reproduced at a reasonable amount of time, but having my neat KDE customizations I didn’t want to start the process from the beginning. Think about yourself losing your /home, it’s no fun. I decided I wanted it back.
OK, so I’m running e2fsck:

# e2fsck /dev/sda5
e2fsck 1.39 (29-May-2006)
e2fsck: No such file or directory while trying to open /dev/sda5

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 


Oh man, how frustrating, e2fsck can’t read my superblock!
Something I did notice during boot up, is that the HD is very noisy, in addition to very slow boot process. It wasn’t a new HD and it has worked hard (I tested on it numerous times our FS infrastructure of video recording). Probably it’s time has come.

I wanted to feel at /home again.
I decided the smartest thing would be to first try and copy this whole partition aside as I knew there is a hardware problem with the HD. After I’ll solve that one, I could hopefully handle the missing superblock problem much better.

Getting physical

So I quickly inserted a fresh new HD to my machine, disconnected the old faulty HD (it caused the computer to boot so slow because of it’s defects) and issued a network install.
15 minutes later I’m again at linux, with a bare new /home, and the faulty HD connected to it, slowing the computer as hell.
I was sure dd would come for the rescue:

# dd if=/dev/sda5 of=home-sweet-home.ext3 bs=4M

After a few minutes of anticipation and cranky HD noises, I’m with:

dd: reading `/dev/sda5': Input/output error
0+0 records in
0+0 records out
# ls -l /home/home-sweet-home.ext3
-rw-r--r-- 1 root root 0 Jul  10 08:26 home-sweet-home.ext3

Great :(. I’m searching the net, searching for an aggressive dd program, something that instead of giving up on bad sectors, would fill them with zeros and continue on (hoping the defects on the HD are at a very specific place). I must admit I have almost written something by myself, but finally I’ve found dd_rescue.
And off we go:

# dd_rescue -e 1 -A /dev/sda5 /home/home-sweet-home.ext3

It ran for hours! It was 65GB that dd_rescue had to tackle. With a dying HD that could take a lot of time. After more or less 8 hours I was back at my desktop, looking at my old home:

# ls -lh /home/home-sweet-home.ext3
-rw-r--r--   1 root   root        61G Jul  10 20:43 home-sweet-home.ext3

Being logical

OK, that’s it, I have my data. Time to dump the old HD and deal with the logical errors I still have with this partition dump. Mounting the partition gave me the same result as I pasted above: no superblock – no fun!
Oh! but ext3 always creates a few backup superblocks, maybe this is my lucky day where I will finally be able to use one of these backups. You are probably familiar with the following output:

# e2fsck /tmp/e2fsck-test
27 block groups
8192 blocks per group, 8192 fragments per group
1992 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345, 73729, 204801

Writing inode tables: done
Writing superblocks and filesystem accounting information: done

Now go figure out where your backup superblocks are. Trying the obvious of 8193 and 32768 did not work for me. I knew there should be more backup superblocks. Google comes for the rescue again. I was quite close at this time as well to writing a small C program that would search the partition dump for ext3 superblock signatures and tell me where the backup superblocks are. But then again, I thought I’m probably not the first one who needs such a utility, here TestDisk came for the rescue.
I simply ran TestDisk which reveled the remaining trustworthy superblocks on my damaged filesystem.
Later on I discovered that it is also possible to run e2fsck on a partition with the same size and see where the superblocks get written. However, I think that probing the superblocks is much cleaner altogether.

# mkdir /home2
# mount -o loop,ro,sb=884736 /home/home-sweet-home.ext3 /home2


Did it work?

# ls -l /home2
drwxr-xr-x  41 dan    dan    8192 Feb  5 13:39 dan
drwx------   7 distcc distcc   88 Jul 26 15:24 distcc
drwxrwxrwx   2 nobody nobody    1 Feb  5 05:39 public

Wow, it did!
So how much of it was eventually damaged? – less than 1%!
So I’ve found a few garbled files I didn’t need anyway, but I was more than 99% back at /home.


Needless to say that ever since I’m backing up my /home in a strict manner. But this is obviously not the point.
The simplicity of Linux in general and ext2/3 in specific is something we should adore. I wouldn’t want to imagine what would have happened on a different OS or a different filesystem (and please don’t start a flame war about it now…).

Conventions   11 comments

I like Bash, I really like Bash. It’s simple, it’s neat, it’s quick, it’s many things. No one will convince me otherwise. It’s not for big things though, I know, but for most system administration tasks it will suffice.
The problem with Bash begins where incompetent SysAdmins start to use it. They will most likely treat it poorly and lamely.
Conventions are truly needed when writing in Bash. In our subversion we have ~30,000 lines of bash. And HELL NO! our product is definitely not a lame conglomerate of Bash scripts gluing many pieces of binaries together. Bash is there for everything that C++ (in our case) shouldn’t do, things like:

  • Packaging and anything related to packaging (post scripts of packages for instance)
  • SysV service infrastructure
  • Backups
  • Customization of the development environment
  • Deployment infrastructure

Yes, so where have we been? – Oh, how do we manage ~30,000 lines of Bash without getting everything messed out. For this, we need 3 main things:

  • Version control system (CVS, SVN, git, pick your kick)
  • Competent people
  • Coding conventions

Configuring a version control system is easy, espcially if you are a competent developer, I’ll skip to the 3rd one.

Conventions. Show me just one competent C/C++/Java/C# developer who would skip on using coding conventions and program like an ape. OK, I know, there might be some, but we both think the same thing about them. Scripting in Bash shouldn’t be an exception in that case. We should use strict scripting conventions on Bash in particular and on any other scripting language in general.
There’s nothing uglier and messier than a Bash script that is written without conventions (well, maybe Perl without conventions).

I’ll try in the following post to introduce my Bash scripting conventions and if you find them neat – feel free to adopt them. Up until today, sadly, I havn’t seen anyone suggesting any Bash scripting conventions.

Before starting any Bash script, I have the following skeleton :


main() {

main "$@"

Call me crazy, but without my main() function I ain’t going anywhere. Now we can start writing some bash. Once you have your main() it means you’re going to write code ONLY inside functions. I don’t care it’s a scripting language – let’s be strict.


# $1 - hostname
# $2 - destination directory
main() {
	local hostname=$1; shift
	local destination_directory=$1; shift

main "$@"

Need to receive any arguments in a function? – Do the following:

  • Document the variables above the function
  • Use shift after receiving each variable, and always use $1, it’s easier to later move the order of the variables

Notice that i’m using the local keyword to make sure the scope of the variable is only within its function. Be strict with variables. If for some reason you decide you need a global variable, it’s OK, but make sure it’s in capital letters and declared as needed:

# a read only variable
declare -r CONFIGURATION_FILE="/etc/named.conf"
# a read only integer variable
declare -i -r TIMEOUT=600

# inside a function
sample_function() {
	local -i retries=0

You are getting the point – I’m not going to make it any easier for you. Adopt whatever you like and remember that being strict in Bash yields code in higher quality, not to mention better readability. Feeling like writing a sloppy script today? – Go home and come back tomorrow.

Following is a script written by the conventions I’m suggesting. This is a very simple script that simply displays a dialog and asks the user which host he would like to ping, then displays the result in a tailbox dialog. This is more or less how many of my scripts look like. I admit it took me a while to find a script which is not too long (under 100 lines) and can still represent some of the ideas I was mentioning.

I urge you to question my way and conventions and suggest some of your own, anyway, here it is:


# dialog options (notice it's read only)
# notice as well it's in capital letters
declare -r DIALOG_OPTS="0 0 0"
declare -i -r OUTPUT_DIALOG_WIDTH=60

# ping timeout in seconds (using declare -i because it's an integer and -r because it's read only)
declare -i -r PING_TIMEOUT=5
# size of pings
declare -i -r DEFAULT_SIZE_OF_PINGS=56
# number of pings to perform

# neatly pipes a command to a tailbox dialog
# here we can specify the parameters the function expects
# this is how i like to do it, perhaps there are smarter ways that may integrate better
# with doxygen or some other tools
# $1 - tailbox dialog height
# $2 - title
# $3 - command
pipe_command_to_tailbox_dialog() {
	# parameters extraction, always using $1, with `shift` right afterwards
	# in case you want to play with the order of parameters, just move the lines around
	local -i height=5+$1; shift
	local title="$1"; shift
	local command="$1"; shift

	# need a temporary file? - always use `mktemp`, please spare me the `/tmp/$$` stuff
	# or other insecure alternative to temporary file creations, use only `mktemp`
	local output_file=`mktemp`
	# run in a subshell and with eval, so we can pass pipes and stuff...
	# eval is a favorite of mine, it means you truely understand the Bash shell
	# nevertheless - it is really needed in this case
	(eval $command) >& $output_file &
	# need to obtain a pid? - it's surely an integer, so use 'local -i'
	local -i bg_job=$!

	# ok, show the user the dialog
	dialog --title "$title" --tailbox $output_file $height $OUTPUT_DIALOG_WIDTH

	# TODO my lame way of checking if a process is running on linux, anyone has a better way??
	# my way of specifying a 'TODO' is in the above line
	if [ -d /proc/$bg_job ]; then
		# if the process is stubborn, use 'kill -9'
		kill $bg_job || kill -9 $bg_job >& /dev/null
	# wait for process to end itself
	wait $bg_job

	# not cleaning up your temporary files is similar to a memory leak in C++
	rm -f $output_file

# pings a host with a nice dialog
ping_host_dialog() {
	local ping_params_tmp=`mktemp`
	# slice lines nicely, i have long commands, i'm not going even to mention
	# line indentation - that goes without saying
	# i like to use dialogs, it makes more people use your scripts
	if dialog --ok-label "Submit" \
		--form "Ping host" \
			"Address:" 1 1 "" 1 30 40 0 \
			"Size of pings:" 2 1 "$DEFAULT_SIZE_OF_PINGS" 2 30 40 0 \
			"Number of pings:" 3 1 "$DEFAULT_NUMBER_OF_PINGS" 3 30 40 0 2> $ping_params_tmp; then
		# ping_params_tmp will be empty if the user aborted the dialog...
		local address=`head -1 $ping_params_tmp | tail -1`
		# yet again if you expect an integer, use 'local -i'
		local -i size_of_pings=`head -2 $ping_params_tmp | tail -1`
		local -i number_of_pings=`head -3 $ping_params_tmp | tail -1`
	rm -f $ping_params_tmp

	# this is my standard way of checking if a variable is empty
	# may not be the prettiest way, but it surely catches the eye...
	if [ x"$address" != x ]; then
		pipe_command_to_tailbox_dialog 15 "Pinging host \"$address\"" "ping -c $number_of_pings -s $size_of_pings -W $PING_TIMEOUT \"$address\""

# main function, can't live without it
main() {

# although there are no parameters passed in this script, i still pass $* to main, as a habit
#main $*
# after Uri's comment, I'm fixing the following and calling "$@" instead
main "$@"

I don’t want this post to be too long (it is already quite long), but I think you got the idea. I still have some conventions I did not introduce here. In case you liked what you’ve seen – do not hesitate to contact me so I can provide you with a document describing all of my Bash scripting conventions.