
Using VMware ESX 3.0 in a SATA drive environment

Posted in Computers on November 4th, 2007 by chris

About a year ago I wanted to set up a VMware ESX 3.0 server to test out their new (at the time) release. Unfortunately, I didn’t have the funds needed to set up a true production environment for it. I was able to find a fantastic alternative that is great to learn on at a much more cost-effective price point. It seems that the included LSI Logic driver is compatible with both SCSI and SATA controllers, which is great news for us small folks wanting to check out this virtualization environment.

I won’t get into a step-by-step installation but will jump ahead to the post-installation changes needed. I’m probably going to miss or misstate a technical term here or there, so if any corrections are needed or you have questions, let me know.

The hardware (at least the parts that matter for this writeup) I ended up using is as follows:
Motherboard: Tyan Transport GX28 (B2881)
Controller: LSI MegaRaid
Hard Drives: Seagate 320GB SATA 2

Keep in mind that ESX itself can be installed on any drive, including IDE. It is the datastores that need to be on a supported device such as a SAN, iSCSI, etc., or in this case a budget SATA setup. After the initial installation is complete you’ll need to check and modify a file. It took some rummaging around at the time, and after some good old trial and error along with a couple of installations I was able to narrow it down to the following steps.

#cat /etc/modules.conf

alias eth0 e100
alias eth1 tg3
alias eth2 tg3
alias scsi_hostadapter megaraid2
alias usb-controller usb-ohci

Looking at the lines above, I think I ran this either to check that “alias scsi_hostadapter megaraid2” was there or because I had added it myself.
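The check-or-add step above can be sketched as a small shell function. This is only an illustration: ensure_alias is a made-up helper name, and the sketch runs against a scratch file rather than the real /etc/modules.conf.

```shell
# ensure_alias is a hypothetical helper, not an ESX tool.
# It appends the megaraid2 alias line only when the file lacks it.
ensure_alias() {
    _conf=$1
    _line='alias scsi_hostadapter megaraid2'
    if grep -qx "$_line" "$_conf"; then
        echo "already present"
    else
        echo "$_line" >> "$_conf"
        echo "added"
    fi
}

# Demonstrate on a scratch copy, not the real /etc/modules.conf.
conf=$(mktemp)
printf 'alias eth0 e100\nalias eth1 tg3\n' > "$conf"
ensure_alias "$conf"   # first call appends the line
ensure_alias "$conf"   # second call finds it and changes nothing
```

Running it twice is harmless, which is handy when you can't remember whether you already made the change.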

#lspci | grep LSI

020:0e.0 RAID bus controller: LSI Logic / Symbios Logic: Unknown device 0409 (rev 0a)

This is to find out the device number for the controller, which in this case is “0409”.
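If you would rather not read the ID off by eye, a rough sketch like the following pulls it out of the lspci line. The helper name extract_device_id is made up for illustration, and it assumes the controller shows up as “Unknown device” the way mine did.

```shell
# Sample line copied from the lspci output above. On a live system you
# would feed in the output of `lspci | grep LSI` instead.
line='020:0e.0 RAID bus controller: LSI Logic / Symbios Logic: Unknown device 0409 (rev 0a)'

# extract_device_id is a hypothetical helper, not part of ESX.
# It prints the four hex digits that follow "Unknown device".
extract_device_id() {
    printf '%s\n' "$1" | sed -n 's/.*Unknown device \([0-9a-fA-F]\{4\}\).*/\1/p'
}

device_id=$(extract_device_id "$line")
echo "0x$device_id"
```

The 0x prefix is added only so the value matches the notation used in the device list further down.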

#cat /etc/vmware/ | grep LSI

vendor ,0x1000,Symbios,LSI Logic / Symbios Logic
device 0x1000,0x0050,scsi,LSI1064,mptscsi_2xx.o
device 0x1000,0x0054,scsi,LSI1068,mptscsi_2xx.o
device 0x1000,0x0056,scsi,LSI1064E,mptscsi_2xx.o
device 0x1000,0x0058,scsi,LSI1068E,mptscsi_2xx.o
device 0x1000,0x005a,scsi,LSI1066E,mptscsi_2xx.o
device 0x1000,0x005c,scsi,LSI1064A,mptscsi_2xx.o
device 0x1000,0x005e,scsi,LSI1066,mptscsi_2xx.o
device 0x1000,0x0060,scsi,LSI1078,mptscsi_2xx.o
device 0x1000,0x0407,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x0408,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x0411,scsi,LSI Logic MegaRAID SAS1064R,megaraid_sas.o
device 0x1000,0x1960,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x9010,scsi,LSI Logic MegaRAID ,megaraid2.o
device 0x1000,0x9060,scsi,LSI Logic MegaRAID ,megaraid2.o

The spacing in there is exactly how it was returned, without the word wrapping. I’m generally pretty organized with files and code, so seeing this I needed to keep myself from fixing it :) Take note of the device line with 0x0408 in it; this will be changed to 0x0409, the device number we found with the previous command.

From: device 0x1000,0x0408,scsi,LSI Logic MegaRAID,megaraid2.o
To: device 0x1000,0x0409,scsi,LSI Logic MegaRAID,megaraid2.o
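For anyone who prefers not to hand-edit, the From/To change can be sketched with sed. MAP_FILE here is a stand-in of my own: the sketch builds a scratch file with two sample lines instead of touching anything under /etc/vmware/, so point it at your real file only after you are happy with the result.

```shell
# MAP_FILE is a stand-in for the real device list; this sketch works on a
# temporary file holding two of the sample lines from the listing above.
MAP_FILE=$(mktemp)
cat > "$MAP_FILE" <<'EOF'
device 0x1000,0x0407,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x0408,scsi,LSI Logic MegaRAID,megaraid2.o
EOF

# Keep a backup before changing anything.
cp "$MAP_FILE" "$MAP_FILE.bak"

# Rewrite only the 0x0408 entry so it carries the ID that lspci reported.
sed 's/^device 0x1000,0x0408,/device 0x1000,0x0409,/' "$MAP_FILE.bak" > "$MAP_FILE"

# Show the updated line.
grep '0x0409' "$MAP_FILE"
```

Writing from the backup into the original avoids relying on sed's in-place option, which an older service console may not have.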

Now that the file has been updated to recognize the controller card, we’ll need to update ESX and reboot. I’m not fully sure that each of these is needed (I would think that at least the first is), but I have run all three before as a safe practice to make sure the system is up to date.

#esxcfg-boot -p (reloads PCI data)
#esxcfg-boot -i (reloads initrd information)
#esxcfg-boot -b (sets up boot information)

Upon rebooting and logging into the VMware Virtual Infrastructure Client you should be able to access that datastore and begin to create virtual machines. Be careful about keeping snapshots around too long; I talked about this in a previous post. I now also recall watching the various services start on boot and seeing one particular service fail until this was fixed. I didn’t write down which one, but I’ll try to find out and add it here.

VMware: Watch out for those snapshots

Posted in Computers on October 12th, 2007 by chris

I consider myself to be fairly cautious when implementing a new server-level application, especially at the level where ESX runs, given how much depends on it being stable. Something escaped my attention, or perhaps it wasn’t talked about much in the docs I read. Either way it’s too late now and I’m not about to dig through all of those PDFs to see if I missed it.

The ESX 3.0 server was set up just over a year ago. In my previous trials with ESX and VMware’s other virtualization products, snapshots had been fantastic, and the ability to revert to them in scenarios such as a recent OS or application patch going sour saved people a lot of time. Unfortunately I forgot the saying “if it looks too good to be true, then it probably is”, and that turns out to be the case with snapshots.

Over the past couple of weeks I’ve noticed that the free space in the datastore was decreasing at a faster rate than usual. My curiosity piqued, I poked around and attributed it to a recent influx of data, but I didn’t tally precise numbers to see if they balanced. It being a busy time with a lot to do, I took it for what it was and moved on. A couple of days later I woke up to one of the virtual machines no longer being accessible. Checking the datastore again, I saw that it had dropped over 30 GB during those couple of days, leaving only 2 MB of free space. I was surprised, confused, and aggravated that a machine went down. I’m sure many admins have experienced this at one point or another. You can tell whether files come from a snapshot by looking for the word “delta” in their names, either by browsing to the virtual machine’s folder in the service console or by right-clicking the datastore and browsing it. I no longer have the exact error message that was displayed at that point, but I was given the option to “Retry” or “Abort”. I clicked Retry and was then faced with:

There is no more space for the redo log of ComputerName-000002.vmdk. You may be able to continue this session by freeing disk space on the relevant partition, and clicking Retry. Otherwise click Abort to terminate this session.
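The “look for delta in the file name” check mentioned above is easy to script from the service console. Here is a rough sketch: list_deltas is a made-up helper, VM_DIR stands in for the virtual machine’s folder on the datastore, and the file names are illustrative ones created in a scratch directory so the example runs anywhere.

```shell
# VM_DIR stands in for the VM's folder on the datastore. This sketch fills
# a scratch directory with empty files named the way snapshot files
# typically look; the names are illustrative only.
VM_DIR=$(mktemp -d)
touch "$VM_DIR/ComputerName.vmdk" \
      "$VM_DIR/ComputerName-flat.vmdk" \
      "$VM_DIR/ComputerName-000001-delta.vmdk" \
      "$VM_DIR/ComputerName-000002-delta.vmdk"

# list_deltas is a hypothetical helper: it prints every snapshot delta
# file found under the given folder, one per line.
list_deltas() {
    find "$1" -name '*-delta.vmdk' | sort
}

list_deltas "$VM_DIR"
delta_count=$(list_deltas "$VM_DIR" | wc -l | tr -d ' ')
echo "snapshot delta files found: $delta_count"
```

Had I run something like this every so often, the growing pile of delta files would have shown up long before the datastore hit 2 MB free.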

I proceeded to run various commands in both the console and the virtual client to work the problem. One thread in the VMware Communities came up more than once in my searches. I tried various suggestions from it, and I think some additional ones as well. I first tried to remove the snapshot listed in the snapshot manager (by clicking Delete, not Delete All); after several hours of processing it removed the snapshot from the GUI, but the vmdk files were certainly still there. After that I tried:

vmware-cmd <cfg> hassnapshot
hassnapshot() =

Yes, that’s a blank. For whatever reason it wasn’t detecting that any snapshot existed, even though there were numerous delta files in the virtual machine’s directory. I then created a snapshot in the snapshot manager and deleted it, this time with Delete All. Still no luck; they were all still there. I continued by removing the VM from the inventory (not from the drive! Be careful, there’s a big difference) and re-adding it. No dice. With the outlook becoming gloomier I tried creating the snapshot in the service console with:

vmware-cmd <cfg> createsnapshot <name of snapshot>

and got this error:

VMControl error -11: No such virtual machine

I checked the path, then checked it again; it was correct. I searched around Google for a while too and didn’t find anything helpful about the message. My guess was that it may have been a somewhat generic message that could have meant several things.

In the end I resorted to removing a recently built VM to clear up enough space to boot the failed one. Thankfully the removed VM had not been configured yet, so not much time was lost there. The plan is to copy the data off the failed VM so that it can be removed completely and a fresh VM built. This particular server was used as storage for the network and to hold backups, so I am thankful there isn’t much configuration that needs to occur once it’s rebuilt. As the data was transferring, all 75 GB or so, I thought I would write this article up. There sure is a lesson learned here: regardless of how much you may trust a piece of software to work right, it can always turn on you. This goes for the Mac users out there too.

On a side note, the <cfg> placeholder above is a common abbreviation used in VMware’s documentation for the full path and file name of the vmx file. For example, in this scenario mine is similar to: