geek


Things are different nowadays, when you can flash firmware updates even on your telly. But imagine my surprise when i peeked inside the firmware flasher. boot.img? u-boot.bin? vmlinux.ub?


Are there penguins living in my telly?


Here’s the story of how i rescued a Windows XP installation from a broken 160 GB SATA hard disk to an intact 60 GB SATA disk, illustrated in a few easy steps that will make my six and a half hours of creative hackery seem like a work (walk) in the park. I also sing high praise to the penguin.

But first a disclaimer, since my boss will probably be reading this. All this could probably have been done using suitable tools running on Windows. We just don’t have any. Also, you could normally do this using partimage, saving you a bucketload of work, but since you’re doing this from a broken disk, partimage will puke and fall over.

Here’s the brief background. A few days ago, i heard from a customer that one of their laptop hard disks had broken. Today, while waiting for the replacement HD, i got an update: the guy with the broken laptop is going on a business trip to see some customers, and he needs a laptop with him. So if either that one could be repaired, or if i could get a spare laptop of theirs in running order, that would be, well, critical. Deadline in 24 hours, preferably less.

This would have been easier if we actually had had a replacement hard disk for his machine, or had not the replacement laptop been “slow to boot” (ie either full of viruses/worms/crapware or just decomposed). Now it was a no-win in either direction.

Step 0: Sanity check

To successfully perform this trick, you need a spare hard disk, cannibalized from your demonstration station, an external HD, and a wonderful little distribution called System Rescue CD. Oh, and a lot of coffee. Optional extras, which would have been nice, would have been a SATA adapter so that you can have two laptop HDs plugged in at the same time, a second copy of System Rescue CD, and the same number of power bricks that you have laptops to work with. I did this with two laptops, one Rescue CD (stupid) and one power brick (equally stupid). If you have only one laptop to work with, be prepared to plug and unplug hard disks plenty of times, and try to reconcile my scribbling with your manifestation of reality. I could probably rewrite this article with a more optimal setup, but then it would seem even less heroic.

Oh, also a functioning computer that you can have for reference and to play music from is essential :)

Now before i let you get your hands in the mud, realize that the narrative that follows is just that. A narrative that follows. I can’t take any responsibility if you follow the story below to the comma and a small black hole appears in the middle of your living room that sucks everything into it and reality just ends and the whole thing just ruins your day. If you’re unsure of what i’ve written and the correctness of it, assume i’ve made a mistake and stop right there.

Now let’s get our hands in the mud.

Using, for instance, the laptop’s HD checking tool built into the BIOS, make sure that the hard disk actually is broken. Remember: “Patients lie.”

If your source disk fails, now would be a good time to label your disks (dymo, magic marker, whatever) and your computers, since on the outside they look very much alike when you can’t boot onto them to see which box really is which.

If your source disk actually hasn’t failed yet but only shows signs (or sounds) of age, i’ve added a way to do this in far fewer steps at the end.

Step 1: Make “just-in-case” backups

This step is completely optional, but since you’re soon going to do irreversibly damaging things to your source hard disk, it’s probably a Really Good Idea to follow. Also, you’re going to repeat this step soon, so why not practice now when it’s not irrevocably dangerous?

Boot the “broken” laptop with System Rescue CD. Plug in the external HD, which needs to have more free space than the HD you are going to rescue, and needs to be formatted in a way that supports gigantic files (ntfs, ext3). Mount the external hard disk as /mnt/brick (or whatever you like). Figure out, using fdisk -l /dev/sdX, which hard disk it is that you’re trying to rescue. Mine was /dev/sda and the brick was /dev/sdb.

Make a backup copy of the master boot record (MBR) using the following two commands (substituting paths where necessary):

dd if=/dev/sda of=/mnt/brick/backup-sda.mbr count=1 bs=512
sfdisk -d /dev/sda > /mnt/brick/backup-sda.sf

(tip taken from here). Without the MBR, the computer Just Won’t Boot even if everything else is restored. This i realized only after everything else was restored but hey, i’m nice and i’m writing it here where things are still simple.
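
Since everything later hinges on that 512-byte file, here is a minimal sketch of the same dd backup with a sanity check bolted on. It assumes bash and GNU coreutils on the rescue system. The function name backup_mbr is my own invention, not part of any tool. It copies the first sector and only reports success if the copy is exactly 512 bytes long and identical to what is on the disk.

```shell
#!/bin/bash
# backup_mbr SOURCE OUTFILE
# Hypothetical helper: copy the first 512-byte sector (the MBR) of SOURCE
# into OUTFILE, then verify the copy byte for byte.
backup_mbr() {
    local src="$1" out="$2"
    dd if="$src" of="$out" bs=512 count=1 2>/dev/null || return 1
    # The copy must be exactly one sector long.
    [ "$(wc -c < "$out")" -eq 512 ] || return 1
    # Compare only the first 512 bytes of the source with the backup.
    cmp -s -n 512 "$src" "$out"
}

# Usage (paths are assumptions, adjust to your own setup):
#   backup_mbr /dev/sda /mnt/brick/backup-sda.mbr && echo "MBR saved"
```

A failed comparison on a dying disk may simply mean the sector read differently twice, which is worth knowing before you trust the backup.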

The reason why you’re using dd and sfdisk to back up the MBR is that while the Windows XP restore disk has the very convenient tool fixmbr and was provided with your nice HP laptop, it does not include SATA drivers so it won’t see that you have a hard disk on your computer to fix the damn MBR on. Or in essence, it is a useless piece of compressed polycarbonate and it should be a criminal offence to ship it as such as a restore disk. Also, the Vista installation disk you have backstage will not bother running a restore console on an XP installation. Well, mine didn’t. (End rant)

Using ddrescue, make a backup of the b0rken hard disk. If your paths are like mine, the syntax is ddrescue /dev/sda1 /mnt/brick/sda1-backup /mnt/brick/sda1-backup.log and what it does is copy the first partition of the disk sda into a file named sda1-backup on the external hard drive, keeping a log file in case things go haywire. This will probably take an hour or two. Send St. Anthony some warm thoughts, just in case.

Nota bene: If you have the two laptops up and running at the same time (because you have two System Rescue CDs), remember to sync and umount the brick before pulling the plug and connecting it to the other lappie. If you’re on a gigabit network, screw USB hard disks and copy over the net instead. If you have just one of the lappies up at a time (because you have just one power brick :)) you’ll need to go through the mkdir /mnt/brick && mount /dev/sdb1 /mnt/brick hoop after each startup. Oh, and make sure /dev/sdb1 actually is your external HD brick :)

Step 2: Prepare the target disk

As mentioned, we had a spare disk that was smaller than the disk that had broken. Fortunately, the amount of stuff on the broken source disk was lesser in size than the capacity of the target disk. This is where the dangerous fun parts begin.

Boot a laptop with the target disk using System Rescue CD, or plug it into the system you got running in the previous steps using a SATA adapter/enclosure/doohickey/thingamajig. Give a sigh to the installation you have on it, back up the valuable stuff from it onto the external hard disk. If you haven’t yet done so, start XWindows using the command wizard. Plow through the options until you have a graphical user interface. Start GParted by clicking the icon with the disk symbol. Make really really really sure you are selecting the right disk unit (this is why it might be good to boot up the computer with only that disk connected, and to unmount and unplug the external HD before you commence with the following) and delete all partitions there are on the target disk. Create a new NTFS partition on the disk, filling all of it. Then, using the resize/move partition button, make a note (pen and paper, baby!) how many MBs the partition is. Then, just for good measure, using fdisk -l /dev/sda (assuming the disk you just repartitioned is sda) write down the size info you get there too.
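
The two sets of notes will not be in the same units: as far as i can tell, fdisk -l talks bytes and sectors while GParted talks MiB. Here is a small sketch for converting between them, assuming bash and 512-byte sectors unless you say otherwise. Both function names are made up for this example, and the numbers in the usage lines are just a typical 60 GB disk, not readings from my notes.

```shell
#!/bin/bash
# bytes_to_mib BYTES
# Hypothetical helper: whole MiB in BYTES, rounded down.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# sectors_to_mib SECTORS [SECTOR_SIZE]
# Hypothetical helper: whole MiB in SECTORS, assuming 512-byte sectors
# unless a second argument says otherwise.
sectors_to_mib() {
    local sector_size="${2:-512}"
    echo $(( $1 * sector_size / 1024 / 1024 ))
}

# Usage (example figures only):
#   bytes_to_mib 60011642880     # prints 57231
#   sectors_to_mib 117210240     # prints 57231
```

Rounding down is deliberate: when you shrink the source partition later, being a MiB too small is harmless, a MiB too large is not.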

And you think that was scary?

Step 3: Resize the source partition

Go back to the laptop with the broken hard disk. Get GParted running on it like in the step above. Grab that /dev/sda1 partition and resize it to the exact number of MB that your target disk’s partition is, the one you made notes of in the previous step. Breathe normally (if you can). Oh, and remember to run the computer on a power brick, not batteries, while you do this. It feels much better. I promise.

At this stage, half of your system probably thinks that the /dev/sda1 partition is still of the previous larger size. If you feel unsure, run fdisk -l /dev/sda to check. Or reboot. Or something.

Step 4: Back up the resized partition

Again, using ddrescue, back up the partition you just resized to the external HD. You’ll probably need to run through the mkdir /mnt/brick and mount /dev/sdb1 /mnt/brick hoop again if you’re running with just one System Rescue CD (and one power brick). In case you have both lappies running, i suppose now is a little too late to remind you that you need to sync and umount the /mnt/brick before swapping it between laptops. If you didn’t, your data is probably fried at this stage, so start from the top. Don’t say i didn’t tell you before, because i just added that bit (see, i can write in a nonlinear fashion even if you’re probably reading this from top to bottom). Then back up the MBR as outlined in step 1.

Thinking of it, you might as well first back up the MBR and then back up the data, since backing up the data is going to take a lot longer than backing up the boot record. Still, since you just made the data partition smaller, it’s not going to take as long as in the previous data backup phase. If you’re running short on disk space on the external brick, it’s probably faster to run down to the chip shop and get a new disk than trying to gzip the original image, even if the chip shop is closed. OK, down to business.

Suggested syntax:
dd if=/dev/sda of=/mnt/brick/resized-sda.mbr count=1 bs=512
sfdisk -d /dev/sda > /mnt/brick/resized-sda.sf
ddrescue /dev/sda1 /mnt/brick/sda1-resized /mnt/brick/sda1-resized.log

Again, be sure of yer paths yadda yadda (hey, we’re all grown ups so we can take care of ourselves so i’ll stop warning you at this stage).

Step 5: All pieces fall together nicely

Right then, time to put all your pieces together. The partimage manual (linked to in step 1) suggests now would be a good time to restore your resized partition table to the empty disk. I didn’t, because i only realized later that copying the MBR is a mandatory step if you want the target box to boot. So it will probably work if you do it in the wrong order too. But i’ll document the procedure here in the supposedly correct(er) order.

Boot the computer with the blank NTFS-formatted hard disk (which we suppose is /dev/sda — oh that’s right, i said i wouldn’t be warning about paths anymore) and the external USB brick plugged in.

dd if=/mnt/brick/resized-sda.mbr of=/dev/sda
sfdisk /dev/sda < /mnt/brick/resized-sda.sf

…and a fdisk -l /dev/sda, a sync and/or a reboot if you feel wobbly. Could be the coffee at this stage though.

Finally, restore the resized partition image onto the new disk:

ddrescue /mnt/brick/sda1-resized /dev/sda1 /mnt/brick/sda1-resized.restore-log
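
If you would rather know than hope that the restore landed intact, a check along these lines could be run before moving the disk. This is a sketch, assuming bash and GNU cmp. The function name verify_restore is hypothetical and this is not something i ran that night. It compares the image file with the same number of bytes at the start of the target, so a target partition that is slightly larger than the image does not cause a false alarm.

```shell
#!/bin/bash
# verify_restore IMAGE TARGET
# Hypothetical helper: succeed only if the first size-of-IMAGE bytes of
# TARGET are identical to IMAGE. TARGET may be longer than IMAGE.
verify_restore() {
    local image="$1" target="$2" size
    size=$(wc -c < "$image") || return 1
    # cmp fails if the contents differ or if TARGET ends before IMAGE does.
    cmp -s -n "$size" "$image" "$target"
}

# Usage (paths are assumptions, adjust to your own setup):
#   verify_restore /mnt/brick/sda1-resized /dev/sda1 && echo "restore matches image"
```

It reads both the image and the partition from start to end, so expect it to take about as long as the restore itself.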

Step 6: The resurrection

Place the restored hard disk in the laptop which used to house the broken disk. Boot that laptop. Be very, very satisfied. Buy yourself a chocolate, because you’re worth it.

Post mortem

I could probably re-write this article using a more optimized setup. But then again, i started with a way more complicated question which was “how can i resize the backup image i’d taken and fit it on the target disk?”. Turned out it was easier to just resize the broken partition and dump that on the new disk. Also, backing up my 160 gig backup image (i’d rather be careful than sorry) from and to the same external USB hard disk took sooooooo long that i was going to see sunrise before a complete copy.

An easier solution that wouldn’t have worked

Here’s how to do this whole trick if your hard disks aren’t broken just yet. Or if you’re migrating to a larger/smaller HD and don’t want to install everything anew. I’m going to assume this time that you’re doing it on a computer where you can have both disks plugged in at the same time. I’m also going to assume you’re only going to move/rescue a disk with one partition. If there are more partitions there, you’ll have to improvise a bit. They’ll all be copied though, but i’ll leave the particulars to you, the enlightened reader.

Finally, i’m assuming that you’ve read the whole article down until here because i’m not going to repeat how you’re going to do it here. If you haven’t, start from the top and i’ll be waiting right here until you’re through, okay?

Case 1: Identical source and target disks

Plug in both hard disks. Boot with System Rescue CD. Verify that your source disk is /dev/sda and your target disk is /dev/sdb (and not the other way around, or your data will be forever fried; you might consider making a backup at this stage :)), e.g. by mounting one of them and checking what’s inside.

ddrescue /dev/sda /dev/sdb transfer.log

Wait. Reboot. Rejoice. Piece of cake.
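
A whole-disk copy like the one above only makes sense if the target is at least as big as the source, which is also a cheap way to catch having the two device names the wrong way around when the disks differ in size. Here is a sketch of such a guard, assuming bash with blockdev (from util-linux) available on the rescue system. The function names size_of and target_fits are made up for this example.

```shell
#!/bin/bash
# size_of PATH
# Hypothetical helper: size in bytes of a block device or a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        wc -c < "$1"
    fi
}

# target_fits SOURCE TARGET
# Hypothetical helper: succeed if TARGET is at least as large as SOURCE.
target_fits() {
    local src_size dst_size
    src_size=$(size_of "$1") || return 1
    dst_size=$(size_of "$2") || return 1
    [ "$dst_size" -ge "$src_size" ]
}

# Usage (device names are assumptions, check yours with fdisk -l first):
#   target_fits /dev/sda /dev/sdb && ddrescue /dev/sda /dev/sdb transfer.log
```

The guard says nothing about which disk holds the data you care about. Mounting and looking inside is still on you.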

Case 2: Target disk is larger than source disk

Plug in both disks. Boot with System Rescue CD. Verify /dev/sda is your source disk and /dev/sdb is your target disk as above.

ddrescue /dev/sda /dev/sdb transfer.log

Wait.

Start XWindows. Start GParted. Select target disk from the less-than-obvious drop down at the near top right corner of the GParted window. Resize target disk to maximum. Apply.

Reboot. Rejoice. Cake with crusting.

Case 3: Target disk is smaller than source disk

This is what i should have done (see, now i spoiled my own thunder) and is more or less a more efficient re-write of this whole article up until now.

Plug in both disks. Boot with System Rescue CD. Plug in external HD brick. Mount as above to /mnt/brick. Make a backup of the source disk’s MBR if you’re nervous/careful/pedantic. Back up the source disk, just in case (optional for the brave/foolish).

ddrescue /dev/sda /mnt/brick/sda-backup backup.log

Start XWindows. Start GParted. Select source disk. Resize the partition so that it’ll fit on the target disk. Move your pr0n/mp3s/dvdrips to external brick first if required. Exit GParted. Take a deep breath.

ddrescue /dev/sda /dev/sdb transfer.log

Wait. Restart GParted. Resize your newly transferred /dev/sdb1 to fill all of the disk. Apply. Sync. Reboot. Rejoice.

And that’s about the size of it! Oh, and these tricks would probably have worked equally well for backing up other Windowsen, Linuces and OSXen. I just didn’t try.


My CDN provider changed services without my noticing, so this site has seemed down for a while

Things should be back to normal now. Thanks to me dear wife (who otherwise never reads this blog) for informing!

(<geek>In fact, it wasn’t the site, but rather that images and CSS files never were delivered. I’ve done the necessary changes on the backline, but it’ll take a while before the DNS updates have propagated around the intertubes, so this blog will have to stand alone until then.</geek>)


A while ago, i caught a jaiku from Jonasl regarding Simple CDN, a Content Delivery Network which is (duh) Simple. Simple to register to, simple to implement and, perhaps most importantly, simple to maintain.

So why use a CDN? Well, basically there are three reasons:

  • Efficiency. Let your back end server do the processing-intensive bits and leave serving out static content to a dedicated workhorse with a phat pipe.
  • Money. It’s/you’re cheap.
  • Geeky. It can be done, so why not. And your next door blogger hasn’t got CDN yet, unless s/he’s on a hosted blog.

Me? I went for the geeky and cheap. And here’s how i enabled CDN for my Wordpress blog:

  1. Register an account at SimpleCDN and jump through the one step confirmation mail hoop.
  2. Register a “bucket” which will represent your stuff served from the CDN. I chose a Mirror bucket with a S3plus back end. A mirror bucket will populate itself and maintain its content without your intervention and the S3plus option is the cheaper one.
  3. Optional fluff: Create a DNS CNAME record to point cdn.yoursite.net to yourbucket-options.simplecdn.net. In my case, cdn.navelfluff.org points to navelfluff-s1.simplecdn.net, where the s1 option enables gzip and fairly sensible expiration headers.
  4. Get and install Yejun Yang’s MyCDN Wordpress plugin. Configure CSS, themes and Javascript with the “pre-url” above, i.e. http://yourbucket-options.simplecdn.net (or the prettier cdn.yoursite.net). No trailing slash in the pre-url!
  5. Hit Reload on your site. View the site’s source code to make sure everything’s in place.
  6. Rejoice.

That’s it. Your CSS, theme files and Javascript will be automagically slurped to SimpleCDN and served from there. Nothing further to do but rejoice in your increased geekdom.

Sure the solution could be improved. I’ve read that Amazon’s Cloudfront is way more efficient than Simple CDN, but with my traffic, who cares. In my world, this is just enough.


I’m not sure if i should post this or not. Not because it’s got any information that is secret, but just because it isn’t very elegant. But i’m posting.

Scenario: The Customer has a server in their DMZ. It’s a Windows server and it’s running Terminal services (RDP). A custom application needs to be installed onto this server. For that, the firewall must be configured so that a list of addresses, including the party installing the application, can access RDP and the port the custom application will answer on. I’m on the Inside net doing the firewall configuration.

So how can i test that RDP actually works from the outside, when i am on the inside? That would probably be easy if i had a Windows box i could RDP into and then RDP out of it to the customer’s server. But i don’t.

Enter (cough) Linux. And (cough cough) Cygwin.

  1. Install Cygwin on your Windows laptop. To install X-Windows, choose to install “xinit” from the X section. The rest of the files will follow.
  2. Run Cygwin. Exit Cygwin (it’s voudou, don’t question it).
  3. As administrator, run Cygwin and start X (or XWin or startx). Click away errors (more voudou).
  4. Start PuTTY and enable X forwarding.
  5. ssh into Linux box on the Outside you have access to.
  6. Start tsclient on the Linux box. Its graphical stuff will tunnel over ssh and end up on your X-Windows which is running on Cygwin/X which is, in fact, running on your Windows box. I think we have two or three layers of tunnelling here, but i’m not sure.
  7. Connect to the server in the basement, going through an improbable chain of loosely coupled and technically incompatible loops.
  8. Marvel.

So there. Didn’t say it was elegant. I’m not particularly proud of the solution, but at least i showed it worked. The elegant way would probably have been to use my cell phone to hook my laptop up to the Internet and get to the DMZ server from there… but where’s the fun in that? ;)


Navelfluff is a trilingual blog. Sometimes i pour myself out in Swedish, sometimes in Finnish and sometimes in English, depending a bit on what i have to say and which audience i’m addressing. I annotate the language with Wordpress categories, which i also use for categorizing the content itself (geek, Timor-Leste, pynja, etc). It’s debatable whether it can be considered syntactically correct to use the same kind of metadata to annotate both the content and the language of a post, but i’m a geek and not a linguist, so i approve of my own behaviour in this case.

Wordpress, on the other hand, is not multilingual, which perhaps isn’t so strange considering that most blogs are written in one and the same language. Sure, you can get WP to speak another language, but it’s still only one language at a time. Which i thought was a bit dull.

Until a little earlier today, you see, Navelfluff said “please leave a comment” on every post, regardless of which language the post was written in. And prompted by a comment i got about the Tarski theme’s somewhat laconic “no comments” status text on posts that lack comments (as opposed to posts that don’t want comments), i went and did something about it.

The following plugin localizes the leave-a-comment or no-comments text. The plugin is based on the change_no_comments_text plugin by Tony Trainor. Explanation follows after the code.

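// Localize the comment-count text according to the post's language category.
function change_comments_text($text, $number) {
    global $post;
    $lang = "any";  // fallback: treat the post as English

    // Category IDs of the current post, flipped so that the IDs become keys.
    $cats = wp_get_post_categories( $post->ID );
    $stac = array_flip( $cats );

    // Category 3 = Swedish, category 4 = Finnish (IDs specific to this blog).
    if( array_key_exists( 3, $stac )) $lang = "sv";
    if( array_key_exists( 4, $stac )) $lang = "fi";

    // Index 0 = no comments, 1 = exactly one comment, 2 = two or more.
    $text = array( "any" => array( 0 => "Please leave a comment! ",
                                  1 => "1 comment ",
                                  2 => "$number comments "),
                  "sv" => array(  0 => "Skriv en kommentar! ",
                                  1 => "1 kommentar ",
                                  2 => "$number kommentarer "),
                  "fi" => array(  0 => "Anna palautetta! ",
                                  1 => "1 palaute ",
                                  2 => "$number palautetta "));
    // Clamp the comment count to the highest index in the table.
    $num = $number;
    if( $num > 2 ) $num = 2;
    $ret = $text[$lang][$num];

    return $ret;
}

add_filter('comments_number', 'change_comments_text', 10, 2);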

Every post has a number of categories (zero or more), and they are represented as an array whose values are the category IDs. After some reverse-engineering of the category array (actually reading the documentation is boring :)) i realized that my Swedish posts have category ID 3 and my Finnish ones ID 4. [0]

The code uses a slightly ugly trick, namely array_flip plus array_key_exists, to check whether the article has category ID three or four. I could probably have written the same thing more elegantly with array_search, but for some reason i couldn’t get that function to function. Then again, i’m a beginner, so i have an excuse.

Finally, let it be noted that i have no idea what the parameters 10 and 2 at the end of it all do.

If you have a multilingual blog (hi skrubu! hi nikc!) and categorize your posts by language, you “only” need to guess which category IDs you use (hint: replace return $ret; in the code above with return "$ret <!-- $lang $num " . implode( "|", array_values( $cats )) . " -->";), insert a comment block at the top of the code with the comment fields WP requires, wrap the snippet in <?php ... ?> tags, and finally save the whole lot in your wp-content/plugins directory.

Should there, against all sensible expectation, be popular demand for this plugin in a more human-friendly format, then … please leave a comment!

[0] Other posts (and those with no language category) are treated as if they were English, and those with several language categories get the last localization in the list. I so rarely write a post in several languages that i can’t be bothered to recode the plugin to localize into all the listed languages.


I have torn enough hair from my skull to make a rug, trying to install Ubuntu Server 8.10 on a HP ProLiant DL360 server. The short answer is that it will not work, and the quick solution is to install Ubuntu Server 8.04.1 LTS instead.

The longer answer is that it has to do with the disks. The DL360 (and supposedly its sibling servers) use a RAID that Ubuntu 8.10 does not understand. It doesn’t matter if i tell it to enable or disable SATA RAID, or to use or not use LVM. The system installs nicely but after that, it just won’t boot. Same goes for both the x64 and x86 versions of Ubuntu Server 8.10. Since the RAID is enabled in hardware, i am supposing that my disks are mirrored and that i’m protected on that plane. The 8.10 setting probably just allowed me to actually see that we have a RAID going on. Transparency is always nice.

I’ve read incoherent (at least to me) explanations that you should go and poke with Grub to get things right, but i couldn’t find an explanation comprehensive enough that i would know exactly what i was doing. So i decided not to be bothered. And then i read in another article that thou shalt screw the latest version and just go with the previous one, and things are nice and fine. You should even be able to update to the latest version over the command line, so you’ll get virtual machine support and all the other goodies that 8.10 provides.

There are two implications. One: install 8.04 and you’re up and running before your coffee gets cold (even in a well ventilated server room), or two: if you know exactly how to actually get 8.10 up and running with the RAID discovered, please tell me in the comments. Thank you.


We had a set of tickets to win at work today, and since i didn’t think it was fair that the first one to answer would win them, i rather had a raffle between the folks who responded to my mail within an hour.

But how do you draw a name from a hat in an impartial, elegant, fair and accountable way, rather than resorting to actual pen, paper and a hat? Enter Linux.


llauren@echkilon:~/tmp$ cat -> thehat
Alice
Bob

Carol
Dave
Eve
Isaac
Justin
Mallory
Oscar
Peggy
Steve
Trent
Walter
Zoe
llauren@echkilon:~/tmp$ shuf thehat | head -1
Eve
llauren@echkilon:~/tmp$ shuf thehat | head -1
Justin
llauren@echkilon:~/tmp$ shuf thehat | head -1
Walter
llauren@echkilon:~/tmp$

So there! The winner is Eve (scary!) with runners-up Justin and Walter. If we had three equal prizes, the corresponding line would be shuf thehat | head -3.

I’m sure you could make the routine even more accountable by calculating a cryptographically valid hash of the result or at least signing the lottery session (since you can’t really store the evanescent value of /dev/urandom used for shuffling the hat), but i’ll leave that to a more critical use case.
Ah how i love Open source :)
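
For a more critical use case, here is a sketch of a draw that can be replayed afterwards. It assumes bash and GNU shuf, and uses shuf’s --random-source option to read its randomness from a file instead of the system’s generator. The function name draw_winners and the file name raffle.seed are made up for this example. It also drops blank lines first, so that a stray empty line in the hat cannot win.

```shell
#!/bin/bash
# draw_winners HAT COUNT SEEDFILE
# Hypothetical helper: print COUNT distinct names drawn from HAT.
# Blank lines are dropped so that an empty entry cannot win, and the
# randomness comes from SEEDFILE so the same draw can be replayed later.
draw_winners() {
    local hat="$1" count="$2" seed="$3"
    grep -v '^[[:space:]]*$' "$hat" | shuf -n "$count" --random-source="$seed"
}

# Usage (file names are assumptions):
#   head -c 1024 /dev/urandom > raffle.seed   # keep this file with the results
#   draw_winners thehat 3 raffle.seed
```

Anyone holding the same hat file, the same seed file and the same version of shuf should get the same winners, and running sha256sum on both files before the draw would give you something to sign.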


I had a less than ideal experience with Hyper-V the other day. We finally got back a server that we had lent to a customer, and it was time to install it for ourselves. Windows server 2008, here we come.

Since i think it’s a wrong-doing of nearly criminal proportions to have a server to run just one instance of an operating system these days, it was only natural to me to run a hypervisor layer on the server and stack servers on that hypervisor layer. And since Windows server 2008 Enterprise comes with Hyper-V and licenses to run four instances of itself on it [0] that seemed the logical path. After all we’re a Microsoft shop, so screw VMware and screw open source solutions like the Xen hypervisor.

Installing Windows Server 2008 was nearly painless. Since the hardware i was working on is a HP pizza box, the correct working order is to boot the box with HP’s SmartStart package. This tool helps to configure the RAID, set up the partitions, install Windows Server and all the platform specific drivers (and keep them up to date!). Sadly, the version of SmartStart i had would not support Windows Server 2008, but at least i got the RAID configured with it and after that i could boot with the Windows Server installation disc while i was downloading a newer SmartStart. After Windows Server 2008 was installed, i was able to “patch” the system with SmartStart version eight point something. This added, among other things, the HP network interface teaming which adds redundancy and throughput to the network connectivity.

Adding Hyper-V support for Windows Server was also nearly painless. From the server administration console, i just added the Hyper-V Role and with the obligatory system reboot, i changed the BIOS settings to allow “Intel Virtualization technology”.

But from there, it was all downhill.

The Hyper-V console dryly informed me that the virtual machine management bits were unable to start (the two other components for Hyper-V started without a hitch). And to make things worse, the computer suddenly refused all incoming connections. I checked the firewall, but nothing there seemed to suggest that the system now was in some kind of hardened mode. Outgoing connections were okay. The firewall explicitly allowed remote desktop connections, but nothing would come in. Not even a ping.

Today i removed the Hyper-V role and the server came back to normal. It is possible that the HP network interface teaming bit is incompatible with Hyper-V, as you’re supposed to leave one NIC reserved for the host and assign the remaining NIC(s) for the virtualized machines but even when i dissolved the teamed network interface cards, Hyper-V just wouldn’t budge.

I suppose we’re going to have to wait for a HP/Hyper-V update. If it really is true that you can install true hypervisor support on an already deployed Windows Server 2008 platform, then we can wait. Otherwise, it just feels like a waste. Or time to give Xen another look…

[0] i know i’m starting to sound recursive here; i was intending to run four instances of Windows Server 2008 on the Hyper-V built into the base installation of Windows Server 2008


I got a service call from our biggest customer on Sunday. The girl at the check in desk told me that she couldn’t get to the reservation system, so she couldn’t check customers in or out. She also could not open her email. And there was something about cooling equipment in the engine room that had failed.

That last bit made me worry that a reboot of the workstation might not do the trick this time.

Last Sunday was also Fathers’ day. Not a good day for an emergency. I am happy that the call didn’t come before my kids had the chance to “wake me up” and deliver their congratulations and prezzies. They had really been waiting for it. In fact, i even had the time for a proper breakfast. But the rest of the day seemed headed for the dumps. My wife was also on call that weekend, and the kids and i were supposed to show up at my parents-in-law’s at midday. Doom was impending.

I called the customer’s site security manager and got the news. There had been a power failure in a transformer a few blocks away. The on-site UPS was sucked dry, and the generators had failed to start. It was a cascade failure, and it was not good. But hey, they are a big customer. Maybe the servers would come back once power had been restored.

Power was back at about eleven-thirty. I did a bunch of phone calls to the customer’s different sites to ask whether their reservation systems were down or up, while the kids were growing louder. They were all dressed up and ready to leave and did an excellent job getting on each others’ nerves.

The reports from the sites were contradictory to say the least. The reservation system was up, no down, no it was up but now it’s not. Email was still down. And the lunch at the in-laws was about to start. So i gave them too a call and said that we’re going to be a few minutes late but that i’d probably have to set up a remote office at their place and do some phone calls and use my computer to take a remote connection to the customer. If all was really bad, i might have to skip lunch and visit the customer’s site, but the kids would be there anyway. And it surely wouldn’t take very long.

I felt the first grain of bad karma fall on me.

From my remote office, i was able to talk with the firewall, but the mail server didn’t respond to pings. And with the site manager on the phone suggesting that i should maybe stop mucking about with remote help and get my servicing arse over there instead, i concurred. Since i don’t have access to the servers’ ILO management system (which works even if the server is off and through which i would be able to remotely switch on the server), i thought i might as well look good in the customer’s eyes and drive down town to push the damn power button and be back in time for dessert. Or coffee, if it was more than one server.

On the way down town, i had another chat with the customer’s IT manager and he decided he too would come to the disaster area. At the time, i thought it might be overkill. It’s probably just a flick of the switch on a server and we’re back up and running.

Boy was i wrong.

Things were a bit more silent in the engine room than usual. The air conditioning was okay, which was the first good bit of work related news for the day. We proceeded to fire up the servers. The domain controller was off. The file server was off. The mail server had hung, or it was off, or just b0rked. The intranet was down. The virtual server server (for lack of a better term) was off, and with it, the virtual servers. The disk array was on but one of the virtual servers could not connect to it. The reservation system was off for this site but up for another. The billing system, it turned out, was off. The orders printer in the kitchen was blown. The applications to operate and monitor telephone calls, wake-ups, keys and (oh!) the mini bars were off. Also, our management PC was off. And to top things off, the console thingy that one would operate half the servers with had suddenly decided that it wanted a password which nobody had. And all this was by no means apparent at a glance. Problems oozed in as others were solved. On site, three fathers: the site security chief, the IT manager, and me. How could things be better?

We started with the most critical systems. By this time, i had mobilized half of the Infra crew, most notably Niko, who got the virtual servers and the disk array into order, and Tero, who was on a beach in Spain and remote-instructed us from there. Had it not been for their expertise, the customer’s systems would probably still be down. Soon, we had the check-in system up, and the three systems that need to run in tandem (trindem?) to take care of billing were slowly back in operation. Email required an extra reboot, but it also came back.

Seldom had i wished more for proper documentation of the system than now. An inventory of equipment and servers and how to get everything running, even for a guy like me who doesn’t spend most of his billable hours at this customer… would have saved the day.

By this time, lunch, dessert and coffee were nothing but a pressing, sad memory. Every hour, i had to tell my wife that this wouldn’t take much longer and we just needed this one system back up, after which it turned out that that one system really was a whole bunch of subsystems that first needed to be physically located before they could be brought into operation. I felt the bad karma pile up in massive quantities.

At this point i should probably tell you about the third server room on site. The first two are like proper server rooms. There’s loud air conditioning. There are a few more monitors, cables, power supplies, cardboard boxes and junk lying around than there should be. There are racks with loud expensive technical equipment having lots of lights that blink. There’s a crapload of cables going in front of the boxes that blink most, so you can’t really access the equipment without a jungle machete or a lot of patience (the second option is preferred). Many of the servers are tightly crammed because at the time, nobody thought you really would need to get to the other side of the servers. Say, to plug in one of those bulky CRT monitors lying around because the console demands a password which, as i probably mentioned, nobody knew. And you couldn’t use remote desktop, because the stickers on the computers failed to mention the hostname or IP address of the box. And you would need to get to the computer to see if the apps on it are running. And just to really top it off, a few of the machines refused to start without a keyboard plugged in, and since the console was off-line because nobody knew the password, it wasn’t considered a working keyboard, at least not by the computer.

Compared to the two main server rooms, the third server room is a mess. The non-techie people working around there use the room for ad-hoc storage of audiovisual equipment (speakers, cables, microphones, amps, cables, more cables…) and junk. I had to step around a cardboard box of miscellanea just to get into the room. A ghetto blaster was obstructing half of the entrance. A snake pit of cables was lying on the so-called operator table, partly on top of and partly under the keyboard, mouse and KVM switch.

Above the operator table are a few shelves with servers. Well, actually they aren’t servers of the kind you would call servers. They are more like old workstations on server duty, in part because it’s cheaper that way and in part because nobody seems to know whether an application on one “server” will play nice with the application on another. Thus, there is one box per application. Per critical application, i might add. The workstations are five years old or more, and they live in a cramped space on the second to top shelf in a room filled with snakes, audiovisual trash and a ghetto blaster. I really should have taken a picture.

Since nobody thought of it at installation time, the “servers” were not set to start automatically once they got power. In fact, this held true for nearly all computers, be they proper servers or workstations working as servers. And even if they had started, many of the critical applications still needed somebody to actually log in to the computer and start the application in question. Here, the computers were not part of any site-wide Windows domain, so we had to guess the passwords, just to keep things interesting.

It was a quarter past four when i headed back towards the remains of the fathers’ day reception. The other guests had looked after our kids, who had been a bit confused about the absence of their father from that fathers’ day reception. I gave my kids a big hug, apologized to the company present, and hoped that i’d never have to see a computer again.

Boy was i wrong.

