Farm Guide

From Pmgwiki

Revision as of 23:48, 8 December 2008 by Evanj (Talk | contribs)

The farm machines are a set of 14 rack-mounted servers in the CSAIL machine room, available for experiments. Warning: The data on these machines is not backed up. Use at your own risk. These machines are shared, so please be courteous: run top or uptime to check whether someone else is already using a machine for something computationally expensive and, if so, use a different machine.
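
If you want to script that check, one option is to pull the 1-minute load average out of the uptime output and compare it against a threshold. This is only a minimal sketch: the function names load1 and is_busy and the default threshold of 1.0 are my own assumptions, and it expects the Linux "load average: a, b, c" output format.

```shell
#!/bin/sh
# Extract the 1-minute load average from uptime-style text on stdin.
# Assumes the Linux format: "... load average: 0.15, 0.10, 0.05"
load1() {
    sed -n 's/.*load average: \([0-9.]*\).*/\1/p'
}

# Succeed (exit 0) when the given load exceeds the threshold.
# The threshold defaults to 1.0, which is an arbitrary illustrative value.
is_busy() {
    awk -v l="$1" -v t="${2:-1.0}" 'BEGIN { exit !(l > t) }'
}

# Example (not run here):
#   is_busy "$(ssh farm3 uptime | load1)" && echo "farm3 looks busy"
```

A load close to or above the number of virtual CPUs is a reasonable hint that somebody else is running an experiment.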


Disk Space

Each of the farm machines has the following configuration:

  • Root partition: /dev/sda1 using ext3, occupying the entire disk (minus swap)
  • Swap: /dev/sda2 (supposed to be 2x RAM = ~4 GB)
  • /space: /dev/sdb1 using ext3, occupying the entire disk

In other words, if you need more disk space, put your files in /space.
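
Before writing a large data set, it can help to check how much room is left. The following is a small sketch using df; the function name free_kb is an assumption of mine, and the path defaults to /space.

```shell
#!/bin/sh
# Print the free space, in kilobytes, of the filesystem holding a path.
# Defaults to /space. -P asks df for the portable one-line-per-filesystem
# format, so the fourth column of the second line is "Available".
free_kb() {
    df -Pk "${1:-/space}" | awk 'NR == 2 { print $4 }'
}

# Example (not run here):
#   free_kb /space
```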


Installing Software

Please don't run make install as root to install software system-wide. Please use Ubuntu packages, since that will permit the software to be painlessly installed across all the machines, and to be removed/upgraded. If you need something specific for your research, please consider installing it into your home directory, to avoid cluttering up the systems.

To install software:

  1. Find the package name in the Ubuntu package archive
  2. Install the package on all the machines as root: for i in `seq 1 14`; do ssh farm$i apt-get install [package-name]; done
  3. Document the package that you wanted in the "important packages" section below, so we don't forget in the future.
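
The loop in step 2 can be wrapped up so that you can preview it before touching any machine. This is a sketch under a few assumptions: the function names farm_hosts and install_everywhere and the DRY_RUN variable are hypothetical, and I have added apt-get's -y flag so the loop does not stop to ask for confirmation on every host.

```shell
#!/bin/sh
# Print the farm host names, one per line (farm1 .. farm14 by default).
farm_hosts() {
    i=1
    while [ "$i" -le "${1:-14}" ]; do
        echo "farm$i"
        i=$((i + 1))
    done
}

# Install one package on every farm machine (run as root).
# With DRY_RUN=1, the default in this sketch, the commands are only printed.
install_everywhere() {
    pkg=$1
    for h in $(farm_hosts); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $h apt-get install -y $pkg"
        else
            # Keep going if one host fails, but report it.
            ssh "$h" apt-get install -y "$pkg" || echo "install failed on $h" >&2
        fi
    done
}

# Example (not run here):
#   install_everywhere valgrind              # preview only
#   DRY_RUN=0 install_everywhere valgrind    # really install
```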

Using AFS

The farm machines are configured to use AFS. To access your CSAIL AFS files:

  1. Get a Kerberos ticket: kinit [username]. Type your CSAIL password.
  2. Access your files: ls /afs/csail.mit.edu/u/[first letter of user name]/[user name] Example: ls /afs/csail.mit.edu/u/e/evanj
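
The path in step 2 follows a fixed pattern, so it can be computed from the user name. A tiny sketch (the function name afs_home is my own, not something installed on the farm):

```shell
#!/bin/sh
# Build the CSAIL AFS home directory path for a user name:
# /afs/csail.mit.edu/u/<first letter>/<user name>
afs_home() {
    user=$1
    first=$(printf '%s' "$user" | cut -c 1)
    echo "/afs/csail.mit.edu/u/$first/$user"
}

# Example (not run here; needs a valid Kerberos ticket):
#   ls "$(afs_home evanj)"
```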

To access your Athena AFS account, first follow the above to get a Kerberos ticket for your CSAIL account. The following are adapted from the CSAIL cross-cell HOWTO:

  1. Create a cross-cell entry: aklog -cell athena.mit.edu. You will get a message like: created cross-cell entry for [username]@csail.mit.edu (Id 16383603) at athena.mit.edu
  2. Log in to an Athena machine and give your CSAIL account (the cross-cell entry created in step 1) access to all the files in your home directory: cd; find . -name .snapshot -prune -o -type d -exec fs sa {} [username]@csail.mit.edu all \;

Important Packages

  • am-utils: used for amd to automount NFS. At some point we might want to migrate to autofs, the in-kernel implementation.
  • g++ gdb make gcc-4.2 subversion valgrind git-core: Development tools, including GCC 4.2 for compiling older C code
  • csh tcsh emacs: shells and editors that people use
  • sun-java6-jdk ant ant-optional: Java development environment
  • ntp: for time synchronization
  • libxp6: needed for matlab
  • python-psyco: much faster Python execution for long running programs
  • libgoogle-perftools0: Includes Google's profiler as google-pprof

Hardware Details

farm1-4

  • Dell PowerEdge 650
  • 1x Intel(R) Pentium(R) 4 CPU 3.06GHz (with hyperthreading: two virtual CPUs)
  • 2 GB RAM
  • 2x 120 GB disks
  • 2x Intel e1000 gigabit Ethernet (eth0 connected)
  • BIOS: Revision A05 (except farm1, which is using A04)

farm5-14

  • Dell PowerEdge SC1420
  • 2x Intel(R) Xeon(TM) CPU 3.20GHz (with Hyperthreading: 4 virtual CPUs)
  • 2 GB RAM
  • 2x 160 GB disks
  • 2x Intel e1000 gigabit Ethernet (eth0 connected)
  • BIOS: Revision A03

Special Configurations

  • farm2-4 have a RAID controller card in them. farm1 has this card removed. Besides that, their hardware is identical.
  • farm2 has backups on its second disk (/dev/sdb). These backups are mounted in /archive. As such, it does not have a second disk mounted on /space like the others. It still has the /space directory, to maintain compatibility with the matlab symlinks.
  • farm6 has backups located in /space/archive.

Installing Ubuntu Server

Installing Ubuntu on these machines is relatively straightforward.

  1. Back up important directories (I used the second disk /dev/sdb1): /etc /home /root
  2. Grab the Ubuntu Server CD from the Media Lab Ubuntu Mirror
  3. (Optional): I put Ubuntu on a USB key, instead of on a CD, by following the "flexible" directions in the Ubuntu install guide.
  4. Reboot. (Optional): For the USB key: Press F2 to enter the BIOS; change the Hard Drive boot order to put USB Flash first.
  5. Use the default partitioning on /dev/sda.
  6. Tell the installer to mount /dev/sdb1 as /space, but not partition it
  7. Install the default Ubuntu Server. Add the OpenSSH package, but no others.
  8. When logged in to the new system, edit /etc/network/interfaces to have the correct static IP address. Edit /etc/resolv.conf to have the right DNS server and search domain (copy from an existing machine)
  9. ifdown eth0; ifup eth0 to use the new configuration
  10. Edit /etc/apt/sources.list to use the Media Lab Ubuntu mirror, since that is faster: perl -pi -e 's/us\.archive\.ubuntu\.com/ubuntu.media.mit.edu/' /etc/apt/sources.list
  11. apt-get update; apt-get upgrade
  12. With the base system installed, install the "important" packages above.
  13. Restore the SSH host keys: cp [backup location]/etc/ssh/ssh_host* /etc/ssh
  14. Fix automount to start at boot: update-rc.d am-utils defaults (Ubuntu bug filed: hopefully this will be unneeded in the future)
  15. Uncomment the line server pool.ntp.org in /etc/ntp.conf to get more accurate NTP synchronization.
  16. Copy home directories from a backup.
  17. Remove shadow passwords (needed for PMG stuff): pwunconv
  18. Copy passwd file to passwd.base. This gets used to produce the "real" passwd file, by adding users to it: cp /etc/passwd /etc/passwd.base
  19. Edit /etc/passwd.base to remove any user accounts. You can store any local passwords in /etc/passwd.local
  20. Install the PMG scripts and accounts:
cd /
curl http://pmg.csail.mit.edu/internal/new-pmg.tar.gz | tar xzf -
/usr/local/adm/bin/updatemachine
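
The two file edits in steps 10 and 15 can be tried on a copy before changing the real files. The sketch below writes the rewritten file to stdout instead of editing in place; the function names are hypothetical, and the ntp.conf pattern assumes the line is commented out with a single leading "#".

```shell
#!/bin/sh
# Step 10: point apt at the Media Lab mirror. Reads a sources.list on
# stdin and writes the result to stdout. The dots are escaped so they
# only match literal dots.
use_media_lab_mirror() {
    sed 's/us\.archive\.ubuntu\.com/ubuntu.media.mit.edu/g'
}

# Step 15: uncomment the "server pool.ntp.org" line of an ntp.conf.
# All other lines pass through unchanged.
enable_ntp_pool() {
    sed 's/^#[[:space:]]*\(server[[:space:]]\{1,\}pool\.ntp\.org\)/\1/'
}

# Example (not run here):
#   use_media_lab_mirror < /etc/apt/sources.list > /tmp/sources.list.new
#   diff /etc/apt/sources.list /tmp/sources.list.new
```

Looking at the diff first makes it easy to confirm that only the intended lines changed before copying the new file into place.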


Installing AFS

The following procedure builds an AFS module package specific for the kernel being used on the system. This package can be used on other systems, provided that they have the same kernel. This can save some time on other systems.

  1. sudo apt-get install openafs-krb5 openafs-client krb5-user module-assistant
  2. Accept the defaults. Set the AFS cell to csail.mit.edu
  3. sudo module-assistant prepare
  4. sudo module-assistant auto-install openafs
  5. sudo /etc/init.d/openafs-client restart


Installing Matlab

These directions are stolen from [1]

mkdir /space/matlab
cp /space/backup*/space/matlab7.4/etc/license.dat /space/matlab
cd /space/backup*/space/matlab-download*
./install -t
        (probably don't need -t: -t only necessary when X is not available...)
a       (accept license)
/space/matlab   (where the install should go)
c       (continue)
y       (make links)
        (/usr/local/bin is fine)
y       (begin installation)
matlab  (test it minimally)
quit

Cloning an Ubuntu Server

Installing individual machines using the procedure above is somewhat reasonable, but when installing more than a few servers, you want to automate the task. Here is how I installed Ubuntu across all the servers:

  1. Install and configure one machine with Ubuntu, as it should be replicated across all the machines.
  2. Burn System Rescue CD onto a CD or put it on a USB key (I put it on the same USB key as the Ubuntu installer, so I could choose which to boot)
  3. Remount root using a bind mount, to avoid cloning stuff from other file systems (such as devfs, proc, etc): mkdir /temproot; mount --bind / /temproot
  4. Create a tar archive containing all the files in the root filesystem: cd /temproot; tar cpf /tmp/image.tar .
  5. Unmount the bind mount: cd; umount /temproot; rm -rf /temproot
  6. Start an "image server": while true; do nc -l 12345 < /tmp/image.tar; done
  7. Boot the destination machine using the System Rescue CD. (if you boot with rescuecd docache at the syslinux prompt you will be able to eject the CD/unmount the disk, which can be useful)
  8. Configure the network: net-setup eth0
  9. Run the script to clone the image, passing the host name of the machine being installed: ./clone_farm.sh [hostname]
  10. When done, use CTRL-C to stop the "image server", and rm /tmp/image.tar.
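
The core of this procedure is streaming a tar archive through a pipe and unpacking it on the other side with permissions intact. The sketch below shows the same idea on a single machine, so it can be tried safely on a scratch directory. The function name copy_tree is my own; in the real procedure the two halves of the pipe run on different machines and are joined by nc.

```shell
#!/bin/sh
# Copy a directory tree through a pipe, preserving permissions.
# Left side:  what the "image server" sends (tar cpf).
# Right side: what the destination machine unpacks (tar xpf).
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    (cd "$src" && tar cpf - .) | (cd "$dst" && tar xpf -)
}

# Example (not run here):
#   copy_tree /etc /tmp/etc-copy
```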

The clone script does all the main work. It will need to be customized, depending on what kind of configuration needs to be done. Look for TODO comments in the script:

#!/bin/sh

set -e

HOSTNAME=$1
if [ -z "$HOSTNAME" ]; then
    echo "missing hostname"
    exit 1
fi

IP=`host -t A $1 | cut -d " " -f 4`
if [ -z "$IP" -o "$IP" = "out;" ]; then
    echo "missing ip"
    exit 1
fi

# Partition first hard drive
# TODO: Fix size for your disk size. Size is in megabytes
# Value for farm1-4: 110332
# Value for farm5-14: 148500
sfdisk /dev/sda -uM << EOF
0,110332,L,*
,,S
EOF

# Make file systems and mount root
mke2fs -j /dev/sda1
mkswap /dev/sda2
mkdir /mnt/disk
mount /dev/sda1 /mnt/disk
cd /mnt/disk

# Fetch and extract disk image
# TODO: Replace host name with location of your "image server"
nc farm13 12345 | tar xvf - --preserve

# Fix UUIDs in /etc/fstab and /boot/grub/menu.lst
UUID_ROOT=`vol_id --uuid /dev/sda1`
UUID_SWAP=`vol_id --uuid /dev/sda2`
UUID_SPACE=`vol_id --uuid /dev/sdb1`

# TODO: Replace these UUIDs with the UUIDs from the source system
perl -pi -e "s/d9539d01-2090-484e-8e70-067f40d6bd35/$UUID_ROOT/g;" etc/fstab boot/grub/menu.lst
perl -pi -e "s/26a0ceda-4a9f-4a6f-9744-57d389088dc1/$UUID_SWAP/g;" etc/fstab
perl -pi -e "s/21afaa7b-7cd9-403b-9cc4-794377a5b888/$UUID_SPACE/g;" etc/fstab

# Mount proc and dev to make grub work
mount -t proc none proc
mount -o bind /dev dev

# Install grub on the disk to make it bootable
chroot . grub-install /dev/sda
chroot . update-grub

# Fix hostname and /etc/network/interfaces
echo $HOSTNAME > etc/hostname
# TODO: Fix IP address to match your source machine
perl -pi -e "s/18.26.1.62/$IP/;" etc/network/interfaces

# Remove the record of the network interfaces so they get redetected
rm etc/udev/rules.d/70-persistent-net.rules

# Copy original SSH keys from /space
mkdir /mnt/space
mount /dev/sdb1 /mnt/space -o ro
cp /mnt/space/backup-200811*/etc/ssh/ssh_host_* etc/ssh
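
The IP lookup at the top of the script takes the fourth space-separated field of the host output and only rejects an empty value or "out;". As far as I know, a failed lookup prints a line like "Host NAME not found: 3(NXDOMAIN)", whose fourth field is "found:", so that case would slip through. A stricter parser could look like the sketch below; the function name parse_host_ip and the host name farm1.csail.mit.edu in the example are assumptions for illustration.

```shell
#!/bin/sh
# Extract the IPv4 address from `host -t A NAME` output on stdin.
# Only a line of the form "NAME has address A.B.C.D" is accepted.
# Prints nothing and returns non-zero when no such line is present.
parse_host_ip() {
    awk '$2 == "has" && $3 == "address" { print $4; found = 1; exit }
         END { exit !found }'
}

# Example (not run here):
#   IP=$(host -t A farm1.csail.mit.edu | parse_host_ip) || { echo "missing ip"; exit 1; }
```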