Sunday, September 4, 2011

Whatever Can Go Wrong, Will Go Wrong (So Think Ahead)

I've talked a bit about my side project/venture over the past few weeks. I've also mentioned some bumps along the way. I've had two more in the past twenty-four hours. Nothing major, and I'm continuing to make good progress, but the number of problems has been higher than normal. I feel a bit like my resolve is being tested. If that's the case, I'm doing well. Adversity isn't a whole lot of fun but it's inevitable.

I'd been doing my main development and testing on a Linux VM running on my laptop. It's a beefy machine with eight gigs of memory and a quad-core Intel i7 processor, but I'm doing a lot of database-related work and the IO was really killing performance on both the VM and my Win 7 host OS. So I made my first semi-large purchase for the business: a low-end Acer desktop with a quad-core AMD processor, four gigs of RAM and a 1TB disk. The new desktop/server is arguably a bit slower than the laptop, but it gets to use all four CPUs and doesn't have the overhead of running in a VM. Based on the performance I'm seeing, I'm also fairly sure the disk IO is significantly faster, which is what I'd expected.

The first problem I ran into was installing the server edition of Ubuntu Linux 11.04. I downloaded the image, burned it to a rewritable DVD, and it booted to the initial install screen just fine. Soon after I told it to start the install, though, it would lock up. I tried disabling various options but the result was always the same. It turned out to be a bad image write. Somehow I'd missed the error message on my laptop when I did the burn. Not a huge deal, but I lost an hour or two to debugging the issue.
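In hindsight, a quick check of the burned disc against the downloaded image would have caught this before the install ever started. Here is a minimal sketch of that idea in Python, not what I actually ran. The function names are made up for illustration, and the device path in the usage comment (/dev/sr0) is an assumption about where the DVD drive shows up on a typical Linux box.

```python
import hashlib
import os


def sha256_of(path, limit=None, chunk_size=1024 * 1024):
    """Hash a file (or device) in chunks, optionally only its first `limit` bytes."""
    digest = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            size = chunk_size if remaining is None else min(chunk_size, remaining)
            block = f.read(size)
            if not block:
                break
            digest.update(block)
            if remaining is not None:
                remaining -= len(block)
    return digest.hexdigest()


def burn_matches_image(image_path, device_path):
    """Compare the image with the same number of bytes read back from the disc.

    A burned disc can carry padding after the image data, so only the
    first len(image) bytes of the device are hashed.
    """
    image_size = os.path.getsize(image_path)
    return sha256_of(image_path) == sha256_of(device_path, limit=image_size)


# Hypothetical usage (paths are assumptions, and reading the device
# usually needs appropriate permissions):
#   burn_matches_image("ubuntu-server.iso", "/dev/sr0")
```

This only compares the disc with the file on disk. It says nothing about whether the download itself was good, which is what the published checksums are for.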

Once the install was done I was able to transfer everything over to my new server without issue. I had to install a few additional optional packages to get things to work but I was up and running by midnight. With a sigh of relief I decided to call it a night since it had been a long day and I was beat.

Another problem I've had along the way is my wireless router. It is less than a year old but it loses its mind periodically and needs to be power cycled. The server wasn't plugged into it, but my laptop does use it to get to the network. About thirty minutes after I called it a day, the wireless decided to stop working. In theory this shouldn't have been a problem. Unfortunately, I had started everything up while remotely logged in from my laptop, and when the laptop dropped off the network the jobs I was running to gather and process data died. Luckily I woke up at 6:30AM and was able to get things working again quickly, but I lost six-plus hours of data gathering. That isn't fatal, but it is painful since the project I'm working on is highly dependent on data.

The tough thing about any task is that the devil is in the details. In both these cases I neglected to take into account a potential failure scenario and I got bit. I could argue that chance didn't favor me but at the end of the day that doesn't matter; what matters is results.

I've reduced the odds of my second failure repeating by running all of my tasks in the background via a handy program called "screen" that lets me detach my virtual terminals and reattach from any place I can log into the server. The screen program is extremely venerable but still useful in circumstances like this.
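For anyone who wants to script this rather than type it, here is a rough sketch in Python of starting a job inside a detached screen session. It assumes screen is installed on the server. The session name "gather" and the script "gather.py" are hypothetical stand-ins, not the actual jobs I run.

```python
import shlex
import subprocess


def screen_command(session_name, job):
    """Build the argv that starts `job` in a detached, named screen session.

    -d -m  start the session detached (no terminal attached)
    -S     give the session a name so it can be found again later
    """
    return ["screen", "-d", "-m", "-S", session_name] + shlex.split(job)


def reattach_command(session_name):
    """Build the argv that reattaches to a named session."""
    return ["screen", "-r", session_name]


def start_detached(session_name, job):
    """Launch the job; it keeps running even if the SSH login that started it drops."""
    subprocess.run(screen_command(session_name, job), check=True)


# Hypothetical usage on the server:
#   start_detached("gather", "python gather.py")
```

After logging back in from anywhere, running "screen -r gather" at the shell reattaches to the session, and "screen -ls" lists whatever sessions are still alive.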

Time to get back to work. Number one on the agenda is some sort of reliable data and code backup solution. Chance clearly prefers those who ask it few favors.
