Mini Blog

I don't need a full-featured blog, so I'm just using this simple announcement page to record whatever pops into my head.  I set up a full blog in 2004, but I pretty much stopped posting in 2006.  There wasn't much of interest there, so I just nuked it when I updated my site recently.

Upgrade FreeBSD 8.2 to 9.0 with GMIRROR boot disk

posted Jan 31, 2012, 8:32 AM by Vick Khera   [ updated Jan 31, 2012, 8:48 AM ]

Many of my servers have a software-mirrored boot volume set up as per the FreeBSD handbook. Unfortunately, the process outlined there results in a mirror configuration that sometimes works in FreeBSD 9.0 and sometimes does not.  Here is how I handle the situation where it does not.

Firstly, I assume you know how to run a source upgrade, and you already have mirroring configured.  The instructions are pretty clear in the handbook and the UPDATING file in the source tree.

The trouble will become apparent on the initial reboot into the new kernel.  The tell-tale sign is that you get the following error:

GEOM_PART: integrity check failed (mirror/gm0, MBR)

and you end up at the mountroot> prompt.  If you do not hit this problem, then your gmirror will work fine with FreeBSD 9.0, and you don't need to do anything special to survive this transition.

If you do get this error, the fix is to break the mirror and re-create it using the GPT partitions which FreeBSD 9 prefers.  My examples assume two SATA drives which in FreeBSD 8 and earlier show up as /dev/ad4 and /dev/ad6.  Once the system is running FreeBSD 9, the commands below refer to the same two drives as ada0 and ada1.

Breaking the Mirror

  1. Reboot the system.
  2. Escape to the boot loader prompt at the FreeBSD boot menu.
  3. At the OK prompt, type boot kernel.old -s and it will get you to the single-user FreeBSD 8 system again.
  4. Clear the existing gmirror configuration:
gmirror remove gm0 ad6
gmirror remove gm0 ad4

This will destroy the mirror. Reboot again into single-user mode in FreeBSD 9 and resume the upgrade process. Return here to rebuild the mirrors using GPT partitions before the final reboot (or reboot back into single-user mode).

Rebuilding GMIRROR

The process is to clear the secondary drive, partition it using GPT, copy the data over to it, boot from this drive, re-partition the first drive similarly, and set up a mirror for each partition. Be sure to keep the partitions similarly sized to the existing ones in the old gmirror, and don't forget any data partitions you may have. This is done after booting the system into single-user mode. A good time is prior to the final reboot of the upgrade steps, after the delete-old step or after mergemaster.

Set up the partitions

We need to clear the drive and partition it.  In this example, I set up a 512MB root file system, 2GB each for swap and /usr, and the rest for /var.

dd if=/dev/zero of=/dev/ada1 count=2
gpart create -s gpt ada1
gpart add -t freebsd-boot -l boot-secondary -s 64 ada1
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1
gpart add -t freebsd-ufs -l rootfs-secondary -s 512M ada1
gpart add -t freebsd-swap -l swap-secondary -s 2G ada1
gpart add -t freebsd-ufs -l usr-secondary -s 2G ada1
gpart add -t freebsd-ufs -l var-secondary ada1
gpart show ada1

Create new file systems


Soft updates are not recommended on the root partition. Also, if you have an SSD, the -t flag (which enables TRIM) is useless once the mirror is set up, so there is no point in using it.  Since this is an SSD, I'm going to specify the -E flag to erase the partition all the way down to the hardware before the file system is created. I also like to label my file systems so my /etc/fstab file is consistent across all machines.

newfs -E -L root /dev/gpt/rootfs-secondary
newfs -E -L usr -U /dev/gpt/usr-secondary
newfs -E -L var -U /dev/gpt/var-secondary

Turn on soft-updates journaling

Soft updates journaling is a new feature in FreeBSD 9.  We enable it now on the new file systems, and should also do it for existing file systems which are not being re-mirrored.

tunefs -A -j enable /dev/gpt/var-secondary
tunefs -A -j enable /dev/gpt/usr-secondary

Copy the data

Use dump + restore to copy the file systems.  The -L is probably not necessary since they are not mounted (and in the case of root, mounted read-only).

(mount /dev/gpt/rootfs-secondary /mnt && cd /mnt && dump -C24 -0aL -f - / | restore -rf - && cd && umount /mnt)
(mount /dev/gpt/var-secondary /mnt && cd /mnt && dump -C24 -0aL -f - /var | restore -rf - && cd && umount /mnt)
(mount /dev/gpt/usr-secondary /mnt && cd /mnt && dump -C24 -0aL -f - /usr | restore -rf - && cd && umount /mnt)

Reboot into new drive

reboot

Break into the boot prompt and specify the alternate boot device:

set vfs.root.mountfrom=ufs:gpt/rootfs-secondary
boot -s

The system is now running in single-user mode from the second drive.

Partition the first drive

We partition the first drive identically to the second drive above, just using different labels:

dd if=/dev/zero of=/dev/ada0 count=2
gpart create -s gpt ada0
gpart add -t freebsd-boot -l boot-primary -s 64 ada0
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada0
gpart add -t freebsd-ufs -l rootfs-primary -s 512M ada0
gpart add -t freebsd-swap -l swap-primary -s 2G ada0
gpart add -t freebsd-ufs -l usr-primary -s 2G ada0
gpart add -t freebsd-ufs -l var-primary ada0
gpart show ada0
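
Since the two drives get the same layout and differ only in the device name and the label suffix, the command list can be generated rather than typed twice. The following is a minimal Python sketch of that idea; the gpart_commands helper and its arguments are hypothetical names of mine, it only prints the commands (it does not run them), and it leaves out the dd step that clears the drive.

```python
def gpart_commands(disk: str, suffix: str) -> list[str]:
    """Return the gpart command lines for one drive of the mirror."""
    # (partition type, label stem, size); a size of None means "rest of the disk".
    layout = [
        ("freebsd-ufs", "rootfs", "512M"),
        ("freebsd-swap", "swap", "2G"),
        ("freebsd-ufs", "usr", "2G"),
        ("freebsd-ufs", "var", None),
    ]
    cmds = [
        f"gpart create -s gpt {disk}",
        f"gpart add -t freebsd-boot -l boot-{suffix} -s 64 {disk}",
        f"gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 {disk}",
    ]
    for ptype, stem, size in layout:
        size_arg = f" -s {size}" if size else ""
        cmds.append(f"gpart add -t {ptype} -l {stem}-{suffix}{size_arg} {disk}")
    cmds.append(f"gpart show {disk}")
    return cmds


if __name__ == "__main__":
    # Print the commands for the first drive so they can be reviewed before use.
    for line in gpart_commands("ada0", "primary"):
        print(line)
```

Calling gpart_commands("ada1", "secondary") yields the same list with the secondary labels, which makes it easy to check that both drives really are partitioned alike.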

Set up the mirrors

gmirror label -v gmroot /dev/gpt/rootfs-secondary
gmirror label -v gmusr /dev/gpt/usr-secondary
gmirror label -v gmvar /dev/gpt/var-secondary

And insert the partitions from the empty drive:

gmirror insert gmroot /dev/gpt/rootfs-primary 
gmirror insert gmusr /dev/gpt/usr-primary
gmirror insert gmvar /dev/gpt/var-primary

Wait a while for the sync to finish. gmirror status is your friend here.
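
If you would rather have a script tell you when the sync is done, something like the following Python sketch could check the text printed by gmirror status. It assumes each mirror row starts with a name like mirror/gmroot followed by a status word, and that a fully synced mirror reports COMPLETE; the mirrors_complete helper is hypothetical and the SAMPLE text is illustrative only, not captured from a real system, so check it against what your system actually prints.

```python
def mirrors_complete(status_text: str) -> bool:
    """Return True when every mirror row in the text reports COMPLETE."""
    states = []
    for line in status_text.splitlines():
        fields = line.split()
        # Assumed format: mirror rows begin with "mirror/<name>" then the status.
        if len(fields) >= 2 and fields[0].startswith("mirror/"):
            states.append(fields[1])
    # No mirror rows at all is treated as "not complete".
    return bool(states) and all(state == "COMPLETE" for state in states)


# Illustrative text only (assumed layout), not real command output.
SAMPLE = """\
mirror/gmroot  COMPLETE  first-component second-component
mirror/gmusr   DEGRADED  first-component second-component
"""

if __name__ == "__main__":
    print(mirrors_complete(SAMPLE))
```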

Finishing Up

The file system labels seem to get masked by the mirror labels in this configuration, so you need to mount the mirror devices and update /etc/fstab to use them.

mount -u /
mount /dev/mirror/gmusr /usr
mount /dev/mirror/gmvar /var

Edit /etc/fstab to change the /dev/ufs/XYZ to /dev/mirror/gmXYZ devices for root, usr, and var partitions.
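
The fstab change is mechanical, so it can be sketched as a small text substitution. This Python snippet is only an illustration under the assumption that the old entries use the labels root, usr, and var exactly as created above; the use_mirror_devices helper is a hypothetical name, and it works on a string rather than touching /etc/fstab itself.

```python
import re

# Matches /dev/ufs/<label> for the three mirrored file system labels only.
UFS_LABEL = re.compile(r"/dev/ufs/(root|usr|var)\b")


def use_mirror_devices(fstab_text: str) -> str:
    """Rewrite /dev/ufs/XYZ to /dev/mirror/gmXYZ, leaving other lines alone."""
    return UFS_LABEL.sub(r"/dev/mirror/gm\1", fstab_text)


if __name__ == "__main__":
    # Hypothetical fstab line, for demonstration only.
    print(use_mirror_devices("/dev/ufs/usr  /usr  ufs  rw  2  2"))
```

Review the result by hand before replacing the real file; a broken /etc/fstab is no fun to fix from the mountroot> prompt.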

Reboot.

Profit.

Almost Perfect Disk Crash Recovery

posted Oct 31, 2011, 9:11 PM by Vick Khera

The other day my daughter told me there was some beeping near the main family iMac.  It sounded bad to me, so I set it to shut down.  It never completed.  When I powered it off a while later, the beeping stopped.  Clearly something bad was happening with it.

I tried to reboot it, but all I got was the flashing question mark of a missing boot volume.  Sadness.

As it turns out, the brand new disk (almost exactly one month old) had failed.  I had it upgraded recently to give me more room to hold all my pictures and music.  The store replaced the failed drive and I just finished the restore from the Time Machine backup using the Lion Recovery Disk on a USB stick.  It was all incredibly slick.

Once it rebooted about 7 hours later (the disk was network mounted, not directly attached), I decided to check everything out.  There were three programs that failed to run: Firefox, Thunderbird, and Chrome.

Firefox and Thunderbird both complained they could not read my profile.  An easy test was to move the profile directory (and indeed, the entire Firefox and Thunderbird directories under Application Support) out of the way and restart.  Still, same error.  Then I found some web page that said to try running the binary directly on the command line.  This showed me the error message "Error: Access was denied while trying to open files in your profile directory."  A quick Google search later, and the most improbable solution was offered:  run sudo chown -R vivek ~ to fix the ownership of all files in my home directory.  At first I couldn't see how this could work.  I had already compared the file permissions of my Firefox directory (and all directories above it) to those on a working machine, and all was the same.  Nevertheless, I decided to go ahead and try it.  Amazingly, it fixed all three programs.  I still have no idea why this fixed it, but I have better things to worry about.

All in all, the Mac OS Time Machine is absolutely amazing.  The simplicity of recovering from a complete disk failure is astonishing.  The recovery USB stick image booted, found my network backup, and restored it to near perfect working condition with zero data loss.  All my pictures are there and all my music is there.  Needless to say I'm anxiously waiting for the current backup to complete.  I think I may also get a second backup drive to keep my most precious files off-site and back it up once a week or so.

Is Date Arithmetic Really That Difficult?

posted Mar 3, 2011, 7:57 AM by Vick Khera

I was just reviewing some invoices from one of my vendors, a content delivery network.  The one for last month showed the services rendered were for "1/31/2011 to 2/30/2011".  I went back and looked at some others.  I paid for services on 11/31/2010 last fall.  I'd also say I paid for 9/31/2010 but I can't because the invoice where that was expected reads 0/31 instead.  I fully expect to pay for 2/31/2011 on my next invoice.  I guess they will owe me four days. :-)

Here's a screenshot:


Is it really that hard to do date arithmetic correctly?  There are plenty of standard libraries that have solved nearly every date manipulation problem known to man, and they are available in every useful programming language.  Please, people, just use them!
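
To show how little it takes, here is a minimal Python sketch of month arithmetic for a billing period using only the standard library. The add_months helper is a hypothetical name for illustration, not the vendor's code, and clamping to the last day of the month is just one reasonable policy for dates like the 31st.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamped to the end of the target month."""
    # A zero-based month index turns year rollover into plain arithmetic.
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


if __name__ == "__main__":
    # A period starting 1/31/2011 ends on a date that actually exists.
    print(add_months(date(2011, 1, 31), 1))
```

With this, a period starting on 1/31/2011 ends on 2/28/2011, and one starting on 10/31/2010 ends on 11/30/2010, rather than on days that are not on any calendar.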

I Should Have Patented That!

posted Feb 23, 2011, 8:00 AM by Vick Khera

Today I received my regular list of network updates email from LinkedIn.  One of my friends from grad school had listed a patent in his profile, and I figured I should also enter the one I have. A quick search on Google's patent database for my name and I had found it: System and method for increasing click through rates of internet banner advertisements. That was easy!

Interestingly, it also turned up two patents issued to some other people that referenced a paper I co-wrote while in grad school at Duke, The Internet Programming Contest: A Report and Philosophy.  We described The Duke Internet Programming Contest which we invented and ran from 1990 through 1993.

The basic gist of one of these patents, Systems and methods for coding competitions, is how to run a coding (programming) contest over a communications network.  To my eyes, the bulk of the claims involve describing exactly how we ran the contest, with a few minor additions.  The biggest difference I see is they have a pre-qualifying test.  We sort of had that, too, but it was designed to ensure that the communications part of the system worked end to end for each contestant, not to qualify them for the contest itself.

The other patent, Apparatus and system for facilitating online coding competitions, seems to describe the software they used for their contest.  They describe a web based client/server system, whereas ours was driven by email and shell scripts.  Email was just the conduit through which the shell script client sent the request to the server, and how the server responded to the client.  We could not have used web browsers as our interface since our contest pre-dated the general availability of the web (or any Internet access) by several years.  The clever thing about email was that it worked over other channels as well, which permitted people from many parts of the world to participate.

Clearly, we should have patented our contest method. My personal opinion is that these two patents should never have been granted since they are substantially just a translation of our contest onto the web interface.  Our full contest and all code has been available for anyone to use since 1990, and I'm certain the developers of these patents were well aware of it.

Keeping Track of FreeBSD Kernels

posted Jan 25, 2011, 7:51 AM by Vick Khera   [ updated Mar 3, 2011, 8:20 AM ]

In March 2007, I wrote about this technique I invented for tracking the kernel configurations on the servers we use to run MailerMailer.  I recently nuked that old blog but I thought it would be useful to keep this information archived, so I'm recreating and updating it now.

I keep track of my kernel configurations in Subversion. I have a common component which applies to all systems under my control, and an architecture specific component that applies separately to i386 and amd64 systems.

In the common file I include all of the required kernel configuration lines, and devices which are on all systems, such as SCSI disks, our common Ethernet controllers (we have limited vendors, so there are only a handful of drivers needed), and any other options lines we need for every system. I also include a small set of modules to build for features or devices that are present on some servers, or that are rarely used, such as the floppy disk, CD-ROM, and USB serial adapters. This speeds up builds since we don't spend time making modules which will never be used, and keeps the kernel smaller and faster by not probing for such devices.

In the architecture-specific files are just the devices applicable to that architecture. For example, the 'pmtimer' device doesn't exist for amd64, so it is in the i386 config file.

In each configuration file, I take advantage of the fact that the file can have 'makeoptions' which are basically dumped right into the generated Makefile.

So in my common file, KCICOMMON, I have this at the top:

makeoptions KCICOMMONREV="$Revision: 2493 $"
makeoptions KCICOMMON="${KCICOMMONREV:C/[^0-9]//g}"

and in the i386 specific file, KCI32, I have this:

ident "KCI32@${KCI}+${KCICOMMON}"
makeoptions KCIREV="$Revision: 358 $"
makeoptions KCI="${KCIREV:C/[^0-9]//g}"

Since some of my systems are SMP enabled, I have a minor variant called KCI32SMP also, which is entirely this:

include KCI32
ident "KCI32SMP@${KCI}+${KCICOMMON}"
options SMP

What happens is that the lines inserted into the Makefile are evaluated at build-time, and those variables are made visible to the line that computes the kernel identifier string (also in the generated Makefile).  One of the final steps in building the kernel is to copy the identifier string into the compiled kernel.
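
To make the string handling concrete, here is a small Python sketch that mimics what those make variables do: the :C/[^0-9]//g modifier deletes every character that is not a digit, and the ident line joins the results. The function names are hypothetical, and this only imitates the make behavior for illustration; the real work happens in the generated Makefile.

```python
import re


def revision_number(keyword: str) -> str:
    """Keep only the digits, like the make modifier :C/[^0-9]//g."""
    return re.sub(r"[^0-9]", "", keyword)


def kernel_ident(name: str, arch_keyword: str, common_keyword: str) -> str:
    """Build an identifier of the form NAME@archrev+commonrev."""
    arch_rev = revision_number(arch_keyword)
    common_rev = revision_number(common_keyword)
    return f"{name}@{arch_rev}+{common_rev}"


if __name__ == "__main__":
    # Uses the expanded Subversion keywords shown in the config files above.
    print(kernel_ident("KCI32", "$Revision: 358 $", "$Revision: 2493 $"))
```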

The final step is to automatically select the proper kernel when I run make installkernel in the source tree.  Add this line into /etc/make.conf:

KERNCONF?=KCI32

Now, my kernel identifies itself with uname:

% uname -i
KCI32SMP@358+2493

So I know this is a 32-bit system running SMP with the version 358 i386 config and the version 2493 common configuration. A trivial look-up in Subversion tells me exactly what's in it. I can also quickly find my systems which are running kernels with older configurations on the rare occasion the configurations are updated.
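
Finding the stale systems can be scripted once the identifier is split back into its parts. This is a minimal Python sketch under the assumption that every identifier follows the NAME@archrev+commonrev pattern shown above; parse_ident and is_outdated are hypothetical helpers, and collecting the uname -i output from each server is left out.

```python
from typing import NamedTuple


class KernelIdent(NamedTuple):
    config: str      # kernel config name, e.g. KCI32SMP
    arch_rev: int    # revision of the architecture-specific file
    common_rev: int  # revision of the common file


def parse_ident(ident: str) -> KernelIdent:
    """Split 'NAME@archrev+commonrev' into its three parts."""
    config, _, revs = ident.partition("@")
    arch_rev, _, common_rev = revs.partition("+")
    return KernelIdent(config, int(arch_rev), int(common_rev))


def is_outdated(ident: str, arch_rev: int, common_rev: int) -> bool:
    """True when the kernel was built from older config revisions."""
    parsed = parse_ident(ident)
    return parsed.arch_rev < arch_rev or parsed.common_rev < common_rev


if __name__ == "__main__":
    print(parse_ident("KCI32SMP@358+2493"))
```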

Over the last four years of using this scheme, the architecture-specific files have not been updated once.  The common file has been updated several times, mostly for updating from FreeBSD 6 to 7 to 8, but also occasionally to add another module or two for features we decided to start using, such as ZFS.  Also with recent versions of FreeBSD, the SMP kernel will boot and run just fine on single-processor systems, so that extra set of SMP configuration files I use will probably go away at some point, and every kernel will be SMP capable.

For your reference, the live configuration files I use are attached below.  Feel free to tweak them to your needs.  Do remember to set the svn:keywords property in subversion to enable the "Revision" keyword on these files, else you'll be stuck with my revision numbers forever.  A final tip: the KCICOMMON file lives in my amd64 kernel directory, and the i386 kernel directory just contains a symlink to it.

Serial Console access in VirtualBox

posted Jan 18, 2011, 12:46 PM by Vick Khera

I tried today to use the VirtualBox trick I described earlier to set up my pfSense firewall CF card before deploying it.  The only problem was that the embedded version boots to the serial port of the device, and the normal console is entirely unused other than to show the "BTX Loader" version number.

Digging around I found out how to make VirtualBox connect the serial port to the host, from where I could access it.

First, configure the serial connection in the virtual machine: enable serial ports, set COM1 to "use host-pipe", check the create pipe box, and in the path, name a file.  I used "/Users/vivek/serialconsole".  Since pfSense already uses the serial port as its console, there was no configuration necessary for the software.

When the virtual machine starts, this file name will be a socket to which you can connect and interact with the serial port of the machine.  On MacOS, the command to use is "nc -U /Users/vivek/serialconsole".  This seems to lose some random characters, but for the most part is quite usable to watch the boot process and interact with it as necessary to do the initial configuration required on the console.
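
The same connection can be made from a script, since the host pipe is a Unix domain socket. Below is a minimal Python sketch of the client side, roughly what nc -U does; open_console and read_console are hypothetical helper names, the path is the one from my setup above, and a running VM is assumed.

```python
import socket


def open_console(path: str) -> socket.socket:
    """Connect to the Unix domain socket that VirtualBox created at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


def read_console(sock: socket.socket, limit: int = 4096) -> str:
    """Read one chunk of console output and decode it for display."""
    data = sock.recv(limit)
    # Serial output may contain stray bytes, so replace anything undecodable.
    return data.decode("ascii", errors="replace")


if __name__ == "__main__":
    console = open_console("/Users/vivek/serialconsole")
    print(read_console(console), end="")
    console.close()
```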

USB Stick/CF Card booting with VirtualBox

posted Jan 10, 2011, 12:38 PM by Vick Khera

Yesterday I was attempting to upgrade my FreeNAS server to the latest version.  The trick is that I built a custom box that boots from a CF card so I could leave all the drive bays free for real disk drives, and it doesn't have a CD-ROM drive on it either.  The only option here was to use a virtual machine emulator to boot the install CD and have it install the new version onto a CF card via USB adapter.

The first thing I did was download the install ISO, boot using VirtualBox, and install to the CF card plugged into a USB port on my Mac desktop.  Tricking VirtualBox into hijacking the USB device is an easy parlor trick explained in the VirtualBox manual and other places on the webs: basically, you plug in the drive, then add a filter to map that device into the VM using the menu on the Devices dialog of the VM itself (not the main VirtualBox window).

When I plugged the CF card back into the NAS box, it just sat there. Not having a handy dandy monitor either, I was stumped... until I figured "why not just boot the USB CF card adapter in VirtualBox?"

Here is how I did it.

First, plug in the USB card reader with the CF card inserted.  MacOS will warn you it is an unformatted drive.  Click the "ignore" option (if it is a fresh CF card that is recognized, when you open Disk Utility, click the "unmount" button, not the "eject" button).  Now open Disk Utility, select the drive and view the info for that volume.  Find the "Disk Identifier" for that drive, something like disk2.  This is basically the same way you find the device to use for the "dd" image copy of embedded USB bootable systems.

Once the image is written to the USB device, make a VirtualBox virtual drive to reference it.  This doesn't copy the image, but arranges for the raw device to be usable to VirtualBox:

VBoxManage internalcommands createrawvmdk -filename ~/usbdrive.vmdk -rawdisk /dev/disk2 -register

Remember to use the diskN device you discovered above.  This will register a virtual drive right into VirtualBox.  Now simply enter the configuration settings for your VM and go into the storage settings, and add a disk to the disk controller.  Select this virtual device and remove any other devices that may boot.

When you turn on the VM it will now boot from the USB stick attached to your host computer and you can pre-configure it as you wish before installing it on your actual embedded device.

What I discovered with my FreeNAS installation was that something botched the disk write and the boot loader was unable to find the boot volume.  Simply re-flashing directly on the Mac using the raw image file from the FreeNAS web site let me get it to work.  I think something with the VirtualBox USB connection was breaking it.

As a final trick of awesomeness, I cloned the MAC address of the file server onto the VirtualBox VM and treated it as the real thing (this allowed it to get the right IP address from the DHCP server), uploaded my saved config file, and did a shutdown.  Then when I plugged it into the real server it fired up straight away and was ready to run.

Finally, to clean up, I went into the virtual media manager in VirtualBox, deleted this new temporary vmdk file, and deleted the temporary VM I created for this purpose.  None of it is needed anymore.

I plan to update my pfSense embedded installation this way as well.  Newer versions allow self-upgrade, but to get to that version I have to do a full install one last time.   This is a simple way to prepare the CF card with the existing configuration to minimize the downtime.

Best Birthday Card Ever

posted Dec 2, 2010, 8:21 PM by Vick Khera

My oldest made this birthday card for me today.


How to Pronounce A Web Address

posted Oct 22, 2010, 7:16 AM by Vick Khera

Yesterday my five-year-old wanted to play a game on the computer.  She likes a particular games web site, so she told me to "go to H-T-T-P squishy-squishy games2girls DOT COM".

No Good Ice Cream

posted Oct 13, 2010, 7:21 AM by Vick Khera   [ updated Oct 13, 2010, 7:24 AM ]

Around here, it just became really hard to get good fresh ice cream.  The last of the locally made ice cream parlors, Giffords, just closed their main store in Rockville.  It was in a very prime location in the middle of town square where there's tons of foot traffic, so one has to wonder what they were doing wrong... I doubt it was the cold weather since you can find ice cream shops all over the place in cold cities like Boston.  I guess people around here just don't like premium ice cream.

One of my friends tells me there's a place a few miles north of here that still makes their own, so I'll have to try that out.
