Archive for the ‘System Administration’ Category

Handling many files in one directory

The Assignment

You have a directory with gazillion files. Since most filesystems are not very efficient with many files in one directory, it is advisable to spread them among a hierarchy of directories. Write a program (or script) which handles a directory with many files and spreads them in an efficient hierarchy.

Does that sound like a university assignment or something? Yes, it does.

Well, apparently exactly that situation just happened to me in real life. Searching across the internet, I couldn’t find anything too useful. I’ll stand corrected if something already deals with this problem – post a comment if so.

And yes, thank god I’m using Unix (Linux) – I don’t even want to think what one would do on Windows.

The Situation

An application was spooling many files into the same directory, generating up to a million files in a single directory. I’m sorry I cannot disclose any more information about it, but let’s just say it is a well known open source application.
Access to individual files was still fast thanks to ext4 and dir_index, but the directory index was too big to actually list files or do anything else without clogging everything in the system. And we need these files.
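
As an aside, if you just need to peek into such a directory, avoid a plain ls – it sorts (and, with common aliases, stats) every entry before printing anything. An unsorted listing streams straight away:

# -f disables sorting (and implies -a); entries stream as they are read
ls -1 -f /path/to/huge/dir | head
# find likewise streams entries without sorting
find /path/to/huge/dir -maxdepth 1 | head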

So we’ve decided to model the files in a way that’ll be more efficient for browsing and we can then handle it from there.

The Solution

After implementing something quick and dirty to mitigate the immediate pain, I sat down and wrote something a bit more generic. I’m happy to introduce the spread_files.sh utility.
What does it take care of:

  • Reading the directory index just once
  • Hierarchy depth as a parameter
  • Stacking up to X files per mv command
  • Has recursion in Bash!!

Obviously the best solution is to never get into this situation in the first place, but if you do, feel free to use spread_files.sh. The sketch below shows the core idea.
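
A minimal sketch of the approach – not the actual spread_files.sh (the real utility also stacks multiple files per mv call and recurses) – assuming filenames without newlines and one character per hierarchy level:

#!/bin/bash
# spread_sketch.sh - spread a flat directory into a hierarchy keyed on
# filename prefix, e.g. with depth=2 "abcdef.dat" moves to <dir>/a/b/
src=$1
depth=${2:-2}

# read the huge directory index only once
index=`mktemp`
find "$src" -maxdepth 1 -type f -printf '%f\n' > "$index"

while read -r f; do
    dest=$src
    for ((i = 0; i < depth; i++)); do
        dest=$dest/${f:$i:1}
    done
    mkdir -p "$dest"
    mv "$src/$f" "$dest/"
done < "$index"

rm -f "$index"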


SSHing efficiently

I personally have a large number of hosts which I sometimes have to SSH to. It can get rather confusing and inefficient if you get lost among them.

I’m going to show you here how to make your SSHing heaps more efficient with just 5 minutes of your time.

.ssh/config

In $HOME/.ssh/config I usually store all my hosts in the following way:

Host host1
    Port 1234
    User root
    HostName host1.potentially.very.long.domain.name.com

Host host2
    Port 5678
    User root
    HostName host2.potentially.very.long.domain.name.com

Host host3
    Port 9012
    User root
    HostName host3.potentially.very.long.domain.name.com

You get the idea. So if I’d like to ssh to host2, all I have to do is:

ssh host2

That will ssh as root to host2.potentially.very.long.domain.name.com on port 5678 – saves a bit of time.

I usually manage all of my hosts in that file. It makes life simpler – you can even keep it in git if you feel like it…
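
If you do feel like it, a minimal start could look like this (just be careful never to git add private keys):

cd ~/.ssh && git init
git add config && git commit -m 'initial SSH host list'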

Auto complete

I’ve added the following to my .bashrc:

_ssh_hosts() {
    local cur="${COMP_WORDS[COMP_CWORD]}"
    COMPREPLY=()
    # collect every Host alias defined in ~/.ssh/config
    local ssh_hosts=$(grep '^Host ' ~/.ssh/config | cut -d' ' -f2 | xargs)
    # complete hostnames only, not options (words starting with -)
    [[ ! ${cur} == -* ]] && COMPREPLY=( $(compgen -W "${ssh_hosts}" -- ${cur}) )
}

complete -o bashdefault -o default -o nospace -F _ssh_hosts ssh 2>/dev/null \
    || complete -o default -o nospace -F _ssh_hosts ssh
complete -o bashdefault -o default -o nospace -F _ssh_hosts scp 2>/dev/null \
    || complete -o default -o nospace -F _ssh_hosts scp

Sweet. All that you have to do now is:

$ ssh TAB TAB
host1 host2 host3

We are a bit more efficient today.

Posted March 31, 2013 by malkodan in Bash, Linux, System Administration


Creating a puppet ready image (CentOS/Fedora)

Cloud computing and being lazy

The need to create template images in our cloud environment is obvious, especially with Amazon EC2 offering an amazing API and spot instances at ridiculously low prices.
In the following post I’ll show what I do in order to prepare a “puppet-ready” image.

Puppet to the rescue

In my environment puppet is configured to provision all of my machines. With puppet I can deploy anything I need – “if it’s not in puppet – it doesn’t exist”.
Coupled with Puppet Dashboard, the interface for manually adding nodes is rather simple. But doing things manually is slow. I assume that, given the right base image, I (and you) can deploy and configure that machine with puppet.
In other words, I take for granted the ability to convert a bare machine into a usable one (although it is heaps of work on its own).

Handling the “bare” image

Most cloud computing providers today give you an interface for starting/stopping/provisioning machines on their cloud.
The images they supply are usually bare – say, CentOS 6.3 with nothing on it. Configuring an image like that requires some manual labour, as you can’t even log in to it automatically without some random password or something similar.

Create a “puppet ready” image

So when I boot up a plain CentOS 6.x image, these are the steps I take in order to make it “puppet ready” (and I do it only once per cloud computing provider):

# install EPEL, because it's really useful
rpm -q epel-release-6-8 || rpm -Uvh http://download.fedoraproject.org/pub/epel/6/`uname -i`/epel-release-6-8.noarch.rpm

# install puppet labs repository
rpm -q puppetlabs-release-6-6 || rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-6.noarch.rpm

# i usually disable selinux, because it's mostly a pain
setenforce 0
sed -i -e 's!^SELINUX=.*!SELINUX=disabled!' /etc/selinux/config

# install puppet
yum -y install puppet

# basic puppet configuration
echo '[agent]' > /etc/puppet/puppet.conf
echo '  pluginsync = true' >> /etc/puppet/puppet.conf
echo '  report = true' >> /etc/puppet/puppet.conf
echo '  server = YOUR_PUPPETMASTER_ADDRESS' >> /etc/puppet/puppet.conf
echo '  rundir = /var/run/puppet' >> /etc/puppet/puppet.conf

# run an update
yum update -y

# highly recommended is to install any package you might deploy later on
# the reason behind it is that it will save a lot of precious time if you
# install 'httpd' just once, instead of 300 times, if you deploy 300 machines
# also recommended is to run any 'baseline' configuration you have for your nodes here
# such as changing SSH port or applying common firewall configuration for instance
yum install -y MANY_PACKAGES_YOU_MIGHT_USE

# and now comes the cleanup phase, where we actually make the machine "bare", removing
# any identity it could have

# set machine hostname to 'changeme'
hostname changeme
sed -i -e "s/^HOSTNAME=.*/HOSTNAME=changeme" /etc/sysconfig/network

# remove puppet generated certificates (they should be recreated)
rm -rf /etc/puppet/ssl

# stop puppet, as the hostname should be changed before it is permitted to run again
service puppet stop; chkconfig puppet off

# remove SSH keys - they should be recreated with the new machine identity
rm -f /etc/ssh/ssh_host_*

# finally add your key to authorized_keys
mkdir -p /root/.ssh; echo "YOUR_SSH_PUBLIC_KEY" > /root/.ssh/authorized_keys

Power off the machine and create an image. This is your “puppet-ready” image.

Using the image

Now you’re good to go – any machine you create in the future should be based on that image.

When creating a new machine the steps you should follow are:

  • Start the machine with the “puppet-ready” image
  • Set the machine’s hostname:

    hostname=uga.bait.com
    hostname $hostname
    sed -i -e "s/^HOSTNAME=.*/HOSTNAME=$hostname/" /etc/sysconfig/network

  • Run ‘puppet agent --test’ to generate a new certificate request
  • Add the puppet configuration for the machine; for Puppet Dashboard it’ll be something similar to:

    hostname=uga.bait.com
    sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:add name=$hostname
    sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:groups name=$hostname groups=group1,group2
    sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:parameters name=$hostname parameters=parameter1=value1,parameter2=value2

  • Authorize the machine in the puppetmaster (if autosign is disabled)
  • Run puppet:

    # initial run, might actually change stuff
    puppet agent --test
    service puppet start; chkconfig puppet on

This is 90% of the work needed to quickly create usable machines on the fly; it shortens the process significantly and can easily be implemented on virtually any cloud computing provider!
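
Glued together, the whole flow might look like the sketch below – a hypothetical helper (new_node.sh is my name for it, not part of any official tooling), assuming root SSH access to the fresh instance and that it runs on a host which can reach the puppetmaster and the dashboard:

#!/bin/bash
# new_node.sh - hypothetical glue for the steps above
# usage: ./new_node.sh <instance-ip> <fqdn> <group1,group2>
ip=$1 fqdn=$2 groups=$3

# set the hostname on the new machine
ssh root@$ip "hostname $fqdn && \
    sed -i -e 's/^HOSTNAME=.*/HOSTNAME=$fqdn/' /etc/sysconfig/network"

# first agent run generates the certificate request
ssh root@$ip "puppet agent --test"

# register the node in Puppet Dashboard
sudo -u puppet-dashboard RAILS_ENV=production \
    rake -f /usr/share/puppet-dashboard/Rakefile node:add name=$fqdn
sudo -u puppet-dashboard RAILS_ENV=production \
    rake -f /usr/share/puppet-dashboard/Rakefile node:groups name=$fqdn groups=$groups

# sign the certificate on the puppetmaster (skip if autosign is enabled)
puppet cert sign $fqdn

# real first run, then leave the agent running
ssh root@$ip "puppet agent --test; service puppet start; chkconfig puppet on"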

I personally have it all scripted and a new instance on EC2 takes me 2-3 minutes to load + configure. It even notifies me politely via email when it’s done.

I’m such a lazy bastard.

Posted March 23, 2013 by malkodan in Bash, Linux, System Administration


Primus (primusrun) and FC18

Continued from my previous article at:
https://bashinglinux.wordpress.com/2013/02/18/bumblebee-and-fc18-a-horror-show/

This is a little manual about running primus on top of the setup I suggested in that post.

Packages

Pretty simple:

yum install glibc-devel.x86_64 glibc-devel.i686 libX11-devel.x86_64 libX11-devel.i686

We should be good to go in terms of packages (both x86_64 and i686).

Download and compile primus

Clone from github:

cd /tmp && git clone https://github.com/amonakov/primus.git

Compiling for x86_64:

# libGLd points at the displaying (mesa) libGL, libGLa at the accelerating (nvidia) one
export PRIMUS_libGLd='/usr/lib64/libGL.so.1'
export PRIMUS_libGLa='/usr/lib64/nvidia/libGL.so.1'
LIBDIR=lib64 make
unset PRIMUS_libGLd PRIMUS_libGLa

And for i686 (32 bit):

export PRIMUS_libGLd='/usr/lib/libGL.so.1'
export PRIMUS_libGLa='/usr/lib/nvidia/libGL.so.1'
CXX=g++\ -m32 LIBDIR=lib make
unset PRIMUS_libGLd PRIMUS_libGLa

Running

Running with x86_64:

cd /tmp/primus && \
LD_LIBRARY_PATH=/usr/lib64/nvidia:lib64 ./primusrun glxspheres

Untested by me, but that should be the procedure for i686 (32 bit):

cd /tmp/primus && \
LD_LIBRARY_PATH=/usr/lib/nvidia:lib ./primusrun YOUR_32_BIT_OPENGL_APP
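
If you get tired of typing the LD_LIBRARY_PATH incantation, a tiny wrapper might help – purely a convenience sketch, assuming the /tmp/primus build location used above:

#!/bin/bash
# /usr/local/bin/primus - convenience wrapper around primusrun
export LD_LIBRARY_PATH=/usr/lib64/nvidia:/tmp/primus/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
exec /tmp/primus/primusrun "$@"

Then it’s just ‘primus glxspheres’.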

Bumblebee and FC18 – a horror show

Preface

I’ve seen numerous posts about how to get bumblebee, optirun and nvidia to run on Fedora 18; the only problem was that all of them were using the open source (and somewhat slow) nouveau driver.

I wanted to use the official Nvidia binary driver which is heaps faster.

My configuration is a Lenovo T430s with an NVS 5200M.

Following is a tick list of things to do to get it running (at least on my configuration with FC18 x86_64).

Installing the Nvidia driver

The purpose of this section is to show you how to install the Nvidia driver without overwriting your current OpenGL libraries. Simply download the installer and run:

yum install libbsd-devel dkms
./NVIDIA-Linux-x86_64-XXX.XX.run --x-module-path=/usr/lib64/xorg/nvidia --opengl-libdir=lib64/nvidia --compat32-libdir=lib/nvidia --utility-libdir=lib64/nvidia --no-x-check --disable-nouveau --no-recursion

Even though we ask it to disable nouveau, it still won’t – we’ll do that ourselves in a moment.

This method will not ruin all the good stuff in /usr/lib and /usr/lib64.

Disabling nouveau

Disabling nouveau is rather simple: we need to blacklist it and remove it from the initrd:

echo "blacklist nouveau" > /etc/modprobe.d/nvidia.conf 
dracut /boot/initramfs-$(uname -r).img $(uname -r) --omit-drivers nouveau

Good on us. You may either reboot now to verify that nouveau is out of the house, or manually rmmod it:

rmmod nouveau
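
Either way, a quick sanity check – no output means nouveau is indeed out of the house:

lsmod | grep nouveau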

Bumblebee and all the rest

Install VirtualGL:

yum --enablerepo=updates-testing install VirtualGL

Remove some rubbish xorg files:

rm -f /etc/X11/xorg.conf

Download bbswitch and install it with dkms:

cd /tmp && wget https://github.com/downloads/Bumblebee-Project/bbswitch/bbswitch-0.5.tar.gz 
tar -xf bbswitch-0.5.tar.gz
cp -av bbswitch-0.5 /usr/src
ln -s /usr/src/bbswitch-0.5/dkms/dkms.conf /usr/src/bbswitch-0.5/dkms.conf
dkms add -m bbswitch -v 0.5
dkms build -m bbswitch -v 0.5
dkms install -m bbswitch -v 0.5
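
A quick way to confirm the module registered and built for your kernel (output format varies between dkms versions):

dkms status | grep bbswitch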

Download bumblebee and install:

cd /tmp && wget https://github.com/downloads/Bumblebee-Project/Bumblebee/bumblebee-3.0.1.tar.gz
tar -xf bumblebee-3.0.1.tar.gz
cd bumblebee-3.0.1
./configure --prefix=/usr --sysconfdir=/etc 
make && make install 
cp scripts/systemd/bumblebeed.service /lib/systemd/system/
sed -i -e 's#ExecStart=.*#ExecStart=/usr/sbin/bumblebeed --config /etc/bumblebee/bumblebee.conf#g' /lib/systemd/system/bumblebeed.service
chkconfig bumblebeed on

Bumblebee’s configuration is at /etc/bumblebee/bumblebee.conf; edit it to look like this:

[bumblebeed]
VirtualDisplay=:8
KeepUnusedXServer=false
ServerGroup=bumblebee
TurnCardOffAtExit=false
NoEcoModeOverride=false
Driver=nvidia

[optirun]
VGLTransport=proxy
AllowFallbackToIGC=false

[driver-nvidia]
KernelDriver=nvidia
Module=nvidia
PMMethod=bbswitch
LibraryPath=/usr/lib64/nvidia:/usr/lib/nvidia:/usr/lib64/xorg/nvidia
XorgModulePath=/usr/lib64/xorg/nvidia/extensions,/usr/lib64/xorg/nvidia/drivers,/usr/lib64/xorg/modules
XorgConfFile=/etc/bumblebee/xorg.conf.nvidia

The default /etc/bumblebee/xorg.conf.nvidia which comes with bumblebee is ok, but in case you want to make sure, here is mine:

Section "ServerLayout"
    Identifier "Layout0"
    Option "AutoAddDevices" "false"
EndSection

Section "Device"
    Identifier "Device1"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    Option "NoLogo" "true"
    Option "UseEDID" "false"
    Option "ConnectedMonitor" "DFP"
EndSection

Create the bumblebee group and add yourself to it:

groupadd bumblebee
usermod -a -G bumblebee YOUR_USERNAME

Restart bumblebeed:

systemctl restart bumblebeed.service

Testing

To run with your onboard graphics card:

glxspheres

Running with nvidia discrete graphics:

optirun glxspheres
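
To verify bbswitch is actually powering the discrete card up and down, it exposes the state through proc – it should read OFF when nothing is using optirun:

cat /proc/acpi/bbswitch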

Useful links

Inspiration from:
http://duxyng.wordpress.com/2012/01/26/finally-working-nvidia-optimus-on-fedora/
Download links:
https://github.com/Bumblebee-Project/Bumblebee/downloads
https://github.com/Bumblebee-Project/bbswitch/downloads
http://www.nvidia.com/Download/index.aspx

If it works for you, please let me know!!
And if it doesn’t, perhaps I could help.

Bash Scripting Conventions

I’ve decided to publish the infamous Bash scripting conventions.

Here they are:
https://github.com/danfruehauf/Scripts/tree/master/bash_scripting_conventions

Please comment, challenge, and help me modify it. I’m very open to feedback.

Posted January 28, 2013 by malkodan in Bash, Linux, System Administration


Rocket science

I still remember my Linux nightmares of the previous century – wiping my whole HD while trying to multi-boot RedHat 5.0.
There was a reason they said you had to be a rocket scientist to install Linux properly.
Times have changed and Linux is easy to install now. Perhaps two things are different: one is that Linux objectively became much easier to handle, and the second is probably that I gained much more experience.
In my opinion, one of the reasons Linux became easier over the years is the improving support for various device drivers. For home users this is excellent news. However, for the SysAdmin who deals mostly with servers and some high-end devices, the headache, I believe, still exists.
If you thought that having a NIC without a driver is a problem, I can assure you that having a RAID controller without a driver is ten times the headache.
I bring you here the story of the RocketRAID device: how to remaster initrd images and driver disks and, of course, how to become a rocket scientist!

WTF

With CentOS 5.4 you get an ugly error in the middle of the installation saying you have no devices to partition.
DOH!!! That’s because it discovered no HDs.

So now you’re asking yourself, where am I going? To Google, of course.
RocketRAID 3530 driver page

And you discover there are drivers only for RHEL/CentOS 5.3. Oh! But there’s also source code!
That means we can do either of the following:

  1. Remaster initrd and insert the RocketRAID drivers where needed
  2. Create a new driver disk and use it

I’ll show how to do both.
I’ll assume you have the RocketRAID driver compiled for the installation kernel.
In addition, I’m also going to assume you have a network installation that’s easy to remaster.

Remastering the initrd

What do we have?

# file initrd.img
initrd.img: gzip compressed data, from Unix, last modified: Sun Jul 26 17:39:09 2009, max compression

I’ll make it quick for you: it’s a gzipped cpio archive.
Let’s open it:

# mkdir initrd; gunzip -c initrd.img | (cd initrd && cpio -idm)
12113 blocks

It’s open, let’s modify what’s needed.

  • modules/modules.alias – Contains a list of PCI device IDs and the module to load
  • modules/pci.ids – Common names for PCI devices
  • modules/modules.dep – Dependency tree for modules (loading order of modules)
  • modules/modules.cgz – The actual modules inside this initrd

Most of the work has already been done for us in the official driver package from HighPoint.
Edit modules.alias and add the relevant new IDs:

alias pci:v00001103d00003220sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003320sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003410sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003510sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003511sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003520sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003521sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003522sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003530sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003540sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003560sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004210sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004211sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004310sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004311sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004320sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004321sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004322sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004400sv*sd*bc*sc*i* hptiop

This was taken from the RHEL5.3 package on the HighPoint website.

So now the installer (anaconda) knows it should load hptiop for the relevant devices. But it needs the module itself!
Download the source package and do the usual configure/make/make install – I’m not planning to go into it. I assume you now have hptiop.ko compiled against the kernel version the installation is going to use.
OK, so the real deal is in modules.cgz, let’s open it:

# file modules/modules.cgz
modules/modules.cgz: gzip compressed data, from Unix, last modified: Sat Mar 21 15:13:43 2009, max compression
# mkdir /tmp/modules; gunzip -c modules/modules.cgz | (cd /tmp/modules && cpio -idm)
41082 blocks
# cp /home/dan/hptiop.ko /tmp/modules/2.6.18-164.el5/x86_64

Now we need to repackage both modules.cgz and initrd.img:

# (cd /tmp/modules && find . -print | cpio -c -o | gzip -c9 > /tmp/initrd/modules/modules.cgz)
41083 blocks
# (cd /tmp/initrd && find . -print | cpio -c -o | gzip -c9 > /tmp/initrd-with-rr.img)

Great, use initrd-with-rr.img now for your installation, it should load your RocketRAID device!
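
If you need to do this more than once, the steps above wrap up nicely into a function – a hypothetical sketch (remaster_initrd is my name for it), assuming the same initrd layout, arch and kernel version as in this post:

#!/bin/bash

# $1 - installer initrd, $2 - driver module (.ko)
# $3 - kernel version, $4 - output initrd
remaster_initrd() {
        local initrd=`readlink -f $1`; shift
        local module=`readlink -f $1`; shift
        local kver=$1; shift
        local output=$1; shift

        local tmp=`mktemp -d`

        # unpack the initrd (gzipped cpio), then modules.cgz inside it
        (cd $tmp && gunzip -c $initrd | cpio -idm)
        mkdir $tmp/mods
        (cd $tmp/mods && gunzip -c $tmp/modules/modules.cgz | cpio -idm)

        # drop the new module in and repackage modules.cgz
        cp $module $tmp/mods/$kver/x86_64
        (cd $tmp/mods && find . -print | cpio -c -o | gzip -c9 > $tmp/modules/modules.cgz)
        rm -rf $tmp/mods

        # repackage the whole initrd
        (cd $tmp && find . -print | cpio -c -o | gzip -c9) > $output
        rm -rf $tmp
}

# remember to add the PCI IDs to modules.alias first (see above)
remaster_initrd initrd.img hptiop.ko 2.6.18-164.el5 /tmp/initrd-with-rr.img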

A driver disk

Creating a driver disk is much cleaner in my opinion – you do not remaster a stock initrd just for one stupid driver.
So you ask, what is a driver disk? Without going into the bits and bytes, I’ll just say it’s a brilliant way of incorporating a custom modules.cgz and modules.alias without touching the installation initrd at all!
I knew I couldn’t live quietly with the initrd remaster, so choosing the driver disk (dd for short) option was inevitable.
As I noted before, HighPoint provided me only a RHEL/CentOS 5.3 driver disk (and binary), but they also provided the source, and I knew it was a matter of some adjustments to get it working on 5.4 as well.
It is much easier to approach the driver disk now that we are familiar with how the installation initrd works.
I’m lazy, so I already wrote a script that takes the 5.3 driver package and creates a dd:

#!/bin/bash

# $1 - driver_package
# $2 - destination of driver disk
make_rocketraid_driverdisk() {
        local driver_package=$1; shift
        local destination=$1; shift

        local tmp_image=`mktemp`
        local tmp_mount_dir=`mktemp -d`

        dd if=/dev/zero of=$tmp_image count=1 bs=1M && \
        mkdosfs $tmp_image && \
        mount -o loop $tmp_image $tmp_mount_dir && \
        tar -xf $driver_package -C $tmp_mount_dir && \
        umount $tmp_mount_dir && \
        local -i retval=$?

        if [ $retval -eq 0 ]; then
                cp -aL $tmp_image $destination
                chmod 644 $destination
                echo "Driver disk created at: $destination"
        fi

        rm -f $tmp_image
        rmdir $tmp_mount_dir

        return $retval
}

make_rocketraid_driverdisk rr3xxx_4xxx-rhel_centos-5u3-x86_64-v1.6.09.0702.tgz /tmp/rr.img

Want it for 5.4? Easy. Just remaster the modules.cgz inside rr3xxx_4xxx-rhel_centos-5u3-x86_64-v1.6.09.0702.tgz and replace hptiop.ko with one built for the relevant kernel 🙂

Edit your kickstart to load the driver disk:

driverdisk --source=http://UGAT/HA/BAIT/INC/HighPoint/RocketRAID/3xxx-4xxx/rr3xxx-4xxx-2.6.18-164.el5.img

Make sure you have this line in the main section and not meta-generated in your %pre section, as the driverdisk directive is processed before the %pre section runs.

The OS doesn’t boot after installation

You moron! That’s because the installation kernel/initrd and the one that boots afterwards are not the same!
You can fix it in one of the following three ways:

  1. Recompile the CentOS/RHEL kernel and repackage it with the RocketRAID driver – pretty ugly, not to mention time consuming.
  2. Build a module RPM for the specific kernel version you’re going to use – very clean but also very time consuming!
  3. Just build the module for the relevant kernel in the %post section – my way.

In the %post section of your kickstart, add the following:

(cd /tmp && \
        wget http://UGAT/HA/BAIT/INC/rr3xxx_4xxx-linux-src-v1.6-072009-1131.tar.gz && \
        tar -xf rr3xxx_4xxx-linux-src-v1.6-072009-1131.tar.gz && \
        cd rr3xxx_4xxx-linux-src-v1.6 && \
        make install)

The next boot will obviously use a different initrd image. Generally speaking, initrd creation happens after the %post section, so you shouldn’t worry about it too much…
The server should boot now. Go play with your 12x2TB RAID array.

I hope I could teach you something in this post. It was a hell of a war discovering how to do all of this properly.
Now if you’ll excuse me – I’m off to play with spaceships and shoot rockets!