Posts Tagged ‘vmware’

LSISAS3442E-R BIOS and Firmware v1.28.02.00 and ESXi 3.5u5

Posted in Computers on June 27th, 2009 by chris

This is partly a continuation of a previous post in which I described my ESXi 3.5u5 I/O problems. Since that post I’ve searched Google and read the VMware community forums for any known performance issues with RAID 1E in a four-drive configuration compared to RAID 10, checked whether there is a known problem with the controller card I’m using (LSI SAS3442E), and looked into proper disk alignment under virtualization, which is something I’ll probably revisit once other matters are taken care of.

A point I didn’t mention in my previous post is that when I first purchased the RAID card I checked the firmware and updated it, as people should. This was around the first of the year, and it was version 1.26 of the firmware and version 6.24 of the BIOS. LSI Logic packages them together for ease of upgrade for the user; good job to LSI on that one. A couple of days ago I went back to their site to check for updates and was happy to see there was one, firmware v1.28.02. I’m not sure what the new BIOS version is since I didn’t write it down from the screen after flashing and it’s not readily available from LSI’s support site, but since the two are packaged together, as I mentioned, it’s in with the firmware download. Anyhow, v1.28.02 came out May 5th, and scraping bottom as to what to do next, I wasted no time in downloading it and preparing for the update.

I created a standard DOS boot disk from Windows on one floppy and put the firmware files on another. I don’t have a floppy drive in the server but I do use a handy USB floppy drive. I plugged it in, updated the boot options in the motherboard BIOS, and it was up and running. It did error a couple of times on not finding the ‘command.com’ file on the firmware disk, but I worked around that by copying the one from the DOS boot disk to the c:\ drive that gets created in memory. Then when the error came up I could leave the firmware disk in the drive and reference the file on the c:\ drive. A couple of adapter/chip selections and confirmations later, the new firmware was loaded.

Once the server had booted I found the nearest somewhat sizable file on my Windows Home Server virtual machine to copy to my local machine, just to see if there was any noticeable gain. I found a 1GB file and saw the Windows transfer window report about 25-35MB/sec. A smile instantly came to my face. Network utilization was hovering around 45% on a gigE adapter. About halfway through the file the speed dropped to a piddly 2-3MB/sec. I think that had something to do with my local machine, since the hard drive light was still ticking pretty steadily. I cancelled the transfer and moved on to something a little more scientific.

I booted up a separate VM I use for software testing, which is an XP SP3 32-bit build set to use the LSI Logic controller type in the VM settings. I re-ran the same tests (again referencing my previous post), and for ease of reference below are the before results, taken from the previous post, and the after results, which are the latest.

Before

Test Name                      | Total I/O/sec | Total MB/sec | Avg IO Resp ms | Max IO Resp ms | % CPU
Max Throughput 100% Read       | 12,392.23     | 387.26       | 4.598          | 291.9839       | 36.33
Real Life – 60% Rand, 65% Read | 358.07        | 2.8          | 162.0115       | 1,115.5989     | 9.73
Max Throughput – 50% Read      | 243.33        | 7.60         | 240.3390       | 2,361.3908     | 8.71
Random 8k – 70% Read           | 390.20        | 3.05         | 147.6751       | 927.4924       | 9.53

After

Test Name                      | Total I/O/sec | Total MB/sec | Avg IO Resp ms | Max IO Resp ms | % CPU
Max Throughput 100% Read       | 3,119.41      | 97.48        | 1.0043         | 150.996        | 23.23
Real Life – 60% Rand, 65% Read | 347.95        | 2.72         | 162.15         | 394.88         | 9.51
Max Throughput – 50% Read      | 240.38        | 7.51         | 240.24         | 628.91         | 9.61
Random 8k – 70% Read           | 387.74        | 3.03         | 146.42         | 442.96         | 12.66

The first thing that jumps out at me is how much the ‘Max Throughput 100% Read’ test dipped. The others are about equal, and since I only ran these once (I don’t have time to average out three 5-minute tests on each) the differences could be within the margin of error. I do know one thing: so far, in the real-world test of transferring files, the difference is as clear as day, and it is a great improvement.
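To put a number on how far each result moved between the two runs, a small helper along these lines can compute the percent change. It is only a sketch: the function name pct_change is made up for illustration, and it assumes awk is available in whatever shell it is run from.

```shell
#!/bin/sh
# pct_change OLD NEW
# Prints the percent change from OLD to NEW with two decimals.
# (Hypothetical helper name; not part of IOMeter or ESXi.)
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - old) / old * 100 }'
}
```

For example, `pct_change 12392.23 3119.41` compares the ‘Max Throughput 100% Read’ I/O rate before and after the firmware update, which works out to a drop of roughly 75%.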

I’ll give it a little more time and general usage before I fully make up my mind on this, but I am very hopeful. I think the next change would be upgrading to ESXi 4.0 (part of vSphere 4), since I believe the LSI Logic drivers it uses have been updated.

Do you want BusLogic or LSI Logic? Choose wisely.

Posted in Computers on April 20th, 2009 by chris

I’ve been setting up a new home server to host a few VMs and along the way noticed poor performance from the server. For an initial rundown of the specs:

Motherboard: Gigabyte GA-P35-DS3L
CPU: Penryn E8200 2.6GHz
Memory: A-DATA 8GB DDR2 800
Storage: LSI SAS3442E
Network: gigE

At this time the RAID configuration I was running was RAID 1E with three 750GB WD WD7500AACS hard drives. The “poor performance” I mentioned was a combination of the following: a transfer speed of about 3.5 MB/s from a physical Vista (64-bit) machine on the network to a Windows XP SP3 32-bit virtual machine; downloading updates for another Windows XP virtual machine; MP3s played from a virtual machine to a physical machine whose buffer would cut out, sometimes ending the song altogether and moving to the next; and data transfers that would sometimes stop mid-transfer and simply “complete” without an error even though the data did not actually copy or move.

Having spent a fair amount of time troubleshooting hardware and performance issues on desktops and servers, I turned to the usual suspects of CPU, memory, network, and hard drives (aka I/O). I cut all activities (moving files, listening to MP3s, etc.), then tested them one at a time, adding one systematically to see if a particular one was causing the trouble. I used the very handy ESXi Performance tab within the ‘Infrastructure Client’ to gauge system performance. I/O stood out as the culprit, and I chalked it up to the hard drives I was running, which you may have noticed are from Western Digital’s “Green” line. This means that rather than running at a full 7200rpm they stay mostly at around 5400rpm, increasing as needed. The benefit here is that they draw a lot less electricity; I already had them on hand when I built the server and was hoping they would be sufficient.

Having narrowed it down to I/O, I turned to the knowledgeable VMware Communities, where I came across a thread (a newer thread was started since it was getting so long, accessible here) of people posting specific I/O performance numbers from IOMeter. Based on the IOMeter configuration file found there I ran some tests of my own. (These tests, as well as the other IOMeter results shown later, were all performed in Windows XP SP3 32-bit.)

Test Name                      | Total I/O/sec | Total MB/sec | Avg IO Resp ms | Max IO Resp ms | % CPU
Max Throughput 100% Read       | 1392.61       | 43.52        | 43.99          | 808.31         | 19.51
Real Life – 60% Rand, 65% Read | 113.25        | 0.88         | 528.28         | 1594.38        | 14.10
Max Throughput – 50% Read      | 94.96         | 2.97         | 628.82         | 2266.18        | 14.57
Random 8k – 70% Read           | 75.55         | 0.59         | 788.23         | 5399.25        | 13.85
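
As a sanity check on a table like the one above, the MB/sec column should simply be the I/O rate multiplied by the size of each request. The figures here are consistent with 32 KiB requests for the two ‘Max Throughput’ tests and 8 KiB requests for the other two, but that is inferred from the numbers rather than read out of the IOMeter configuration file, so treat the sizes as an assumption. The helper name mb_per_sec is hypothetical.

```shell
#!/bin/sh
# mb_per_sec IOPS BLOCK_KIB
# Converts an I/O rate and a per-request size in KiB to MB/sec
# (1 MB taken as 1024 KiB), printed with two decimals.
mb_per_sec() {
    awk -v iops="$1" -v kib="$2" \
        'BEGIN { printf "%.2f\n", iops * kib / 1024 }'
}
```

Running `mb_per_sec 1392.61 32` should land close to the 43.52 MB/sec reported for the first row, and `mb_per_sec 113.25 8` close to the 0.88 MB/sec in the second.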

The numbers above caused my jaw to drop. Over the following couple of days, while I was pondering what to do, I saw a sale Dell was having on Western Digital “Black” hard drives, model WD1001FALS. Unlike the Green drives, these were made with performance in mind. So I ordered four of the 1TB drives with the intention of configuring them in RAID 10.

Days pass, I get them in, install them, and set them up in RAID 10.  In short order I move the same virtual machines to the new datastore and re-run the same IOMeter test to see the results.

Test Name                      | Total I/O/sec | Total MB/sec | Avg IO Resp ms | Max IO Resp ms | % CPU
Max Throughput 100% Read       | 743.40        | 23.23        | 81.7536        | 495.4186       | 15.03
Real Life – 60% Rand, 65% Read | 148.70        | 1.16         | 401.6210       | 2453.4196      | 13.60
Max Throughput – 50% Read      | 125.92        | 3.94         | 475.4601       | 1919.5262      | 13.35
Random 8k – 70% Read           | 139.83        | 1.09         | 426.9836       | 2789.4462      | 14.05

I was astounded, and rather frustrated at this point.

I turned again to the VMware Communities and saw a couple of scattered posts about choosing a different ‘SCSI Controller Type’ based on the SCSI/RAID card being used. Checking the type in use I saw it was ‘BusLogic’, which I thought was odd since I’m most certainly using an LSI card. The BusLogic setting is what ESXi created for me when setting up a new virtual machine, not something I was prompted to choose. I proceeded to change the SCSI Controller Type from ‘BusLogic’ to ‘LSI Logic’ for the previously configured VM, but that resulted in nothing but blue screens, with an instantaneous reboot as soon as the blue screen appeared. I resigned myself to creating a new VM and reinstalling Windows XP SP3, changing the SCSI Controller Type after the new virtual machine wizard had finished, and the results were…

Test Name                      | Total I/O/sec | Total MB/sec | Avg IO Resp ms | Max IO Resp ms | % CPU
Max Throughput 100% Read       | 12392.23      | 387.26       | 4.598          | 291.9839       | 36.33
Real Life – 60% Rand, 65% Read | 358.07        | 2.8          | 162.0115       | 1115.5989      | 9.73
Max Throughput – 50% Read      | 243.33        | 7.60         | 240.3390       | 2361.3908      | 8.71
Random 8k – 70% Read           | 390.20        | 3.05         | 147.6751       | 927.4924       | 9.53

Certainly better than what I was getting previously, and I don’t have the issues mentioned earlier. I am not thrilled by any stretch about the write performance, and the read transfer rate doesn’t get that high outside of the test. I don’t have a number offhand but it’s not even close to that; it’s under 10MB/s. At this point I’m not sure what the next step may be. Is it a limitation in ESXi since it is the free version? Is it the RAID card? Physical-to-virtual data transfers (Vista to XP) cap out at about 6 MB/s with a more consistent speed of about 4.35 MB/s. Certainly nothing to get excited over, but it is stable and I’m not having the same issues.

The point of this is that if you are deciding between BusLogic and LSI Logic, there is most certainly a difference, and one should test before adding virtual machines to make sure the best one is chosen.
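
One way to check which virtual controller an existing VM uses, without clicking through the client, is to look at its .vmx file. The sketch below assumes the controller is the first SCSI adapter (scsi0) and that its type is recorded under the scsi0.virtualDev key, with values such as "buslogic" or "lsilogic". If the key is absent the function prints nothing, which most likely means the default for that guest OS type is in effect rather than that something is wrong. The function name is made up for illustration.

```shell
#!/bin/sh
# scsi_controller_type VMX_FILE
# Prints the value of scsi0.virtualDev from a .vmx file, if present.
# (Hypothetical helper; assumes the usual key = "value" line layout.)
scsi_controller_type() {
    sed -n 's/^scsi0\.virtualDev *= *"\([^"]*\)".*$/\1/p' "$1"
}
```

It would be called with the full path to the config file, for example `scsi_controller_type /vmfs/volumes/storage1/vmname/vmname.vmx`.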

Using VMware ESX 3.0 in a SATA drive environment

Posted in Computers on November 4th, 2007 by chris

About a year ago I wanted to set up a VMware ESX 3.0 server to test out their new (at the time) release. Unfortunately, I didn’t have the funds needed to set up a true production environment with it. I was able to find a fantastic alternative that is great to learn on at a much more cost-effective price point. It seems that the included LSI Logic driver is compatible with both SCSI and SATA controllers, which is great news for us small folks wanting to check out this virtualization environment.

I won’t get into the details of a step by step setup installation but will jump ahead to the post installation changes needed. I’m probably going to miss or incorrectly state a technical term here or there but if there are any corrections needed or questions let me know.

The hardware (at least the parts that matter for this writeup) I ended up using is as follows:
Motherboard: Tyan Transport GX28 (B2881)
Controller: LSI MegaRaid
Hard Drives: Seagate 320GB SATA 2

Keep in mind that ESX itself can be installed on any drive, including IDE; it is the datastores that need to be on a supported device such as a SAN, iSCSI, etc., or in this case a budget SATA setup. After the initial installation is complete you’ll need to check on and modify a file. It took some rummaging around at the time, and after some good old trial and error along with a couple of installations I was able to narrow it down to the following steps.

#cat /etc/modules.conf

alias eth0 e100
alias eth1 tg3
alias eth2 tg3
alias scsi_hostadapter megaraid2
alias usb-controller usb-ohci

From the above lines, I think I did this either to check that “alias scsi_hostadapter megaraid2” was there or to add it in.

#lspci | grep LSI

020:0e.0 RAID bus controller: LSI Logic / Symbios Logic: Unknown device 0409 (rev 0a)

This is to find out the device ID for the controller, which in this case is “0409”.
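
If you would rather not pick the ID out by eye, the four hex digits can be pulled from the lspci line with sed. This sketch assumes the controller shows up as an "Unknown device" entry exactly as in the output above; the function name is hypothetical.

```shell
#!/bin/sh
# unknown_device_id
# Reads lspci output on stdin and prints the 4-digit hex ID from any
# "Unknown device XXXX" entries, one per line.
unknown_device_id() {
    sed -n 's/.*Unknown device \([0-9a-fA-F]\{4\}\).*/\1/p'
}
```

On the service console it would be used as `lspci | grep LSI | unknown_device_id`.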

#cat /etc/vmware/vmware-devices.map | grep LSI

vendor ,0x1000,Symbios,LSI Logic / Symbios Logic
device 0x1000,0x0050,scsi,LSI1064,mptscsi_2xx.o
device 0x1000,0x0054,scsi,LSI1068,mptscsi_2xx.o
device 0x1000,0x0056,scsi,LSI1064E,mptscsi_2xx.o
device 0x1000,0x0058,scsi,LSI1068E,mptscsi_2xx.o
device 0x1000,0x005a,scsi,LSI1066E,mptscsi_2xx.o
device 0x1000,0x005c,scsi,LSI1064A,mptscsi_2xx.o
device 0x1000,0x005e,scsi,LSI1066,mptscsi_2xx.o
device 0x1000,0x0060,scsi,LSI1078,mptscsi_2xx.o
device 0x1000,0x0407,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x0408,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x0411,scsi,LSI Logic MegaRAID SAS1064R,megaraid_sas.o
device 0x1000,0x1960,scsi,LSI Logic MegaRAID,megaraid2.o
device 0x1000,0x9010,scsi,LSI Logic MegaRAID ,megaraid2.o
device 0x1000,0x9060,scsi,LSI Logic MegaRAID ,megaraid2.o

The spacing in there is exactly how it was returned, without the word wrapping. I’m generally pretty organized with files and code, so seeing this I needed to keep myself from fixing it :) Take note of the device line with 0x0408 in it; this will be changed to 0x0409, the ID we found with the previous command.

From: device 0x1000,0x0408,scsi,LSI Logic MegaRAID,megaraid2.o
To: device 0x1000,0x0409,scsi,LSI Logic MegaRAID,megaraid2.o
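
The edit itself can be done by hand in a text editor, or with something like the sketch below, which keeps a backup copy first. The function name and the .bak suffix are choices made here for illustration, and the vendor ID 0x1000 is taken from the map file output above. Note that it rewrites the existing line rather than adding a new one, mirroring the From/To change.

```shell
#!/bin/sh
# remap_device MAP_FILE OLD_ID NEW_ID
# Backs up MAP_FILE to MAP_FILE.bak, then rewrites the device line for
# vendor 0x1000 so that OLD_ID becomes NEW_ID (IDs given without "0x").
# (Hypothetical helper; not a VMware-supplied tool.)
remap_device() {
    map_file="$1"
    old_id="$2"
    new_id="$3"
    cp -p "$map_file" "$map_file.bak" || return 1
    sed "s/^device 0x1000,0x${old_id},/device 0x1000,0x${new_id},/" \
        "$map_file.bak" > "$map_file"
}
```

For the change described here the call would be `remap_device /etc/vmware/vmware-devices.map 0408 0409`, after which the original file is still available as vmware-devices.map.bak.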

Now that we’ve updated the vmware-devices.map file so ESX can see the controller card, we’ll need to update ESX and reboot. I’m not fully sure each of these is needed (I would think at least the first is), but I have run them all before as a safe practice to make sure the system is up to date.

#esxcfg-boot -p (reloads PCI data)
#esxcfg-boot -i (reloads initrd information)
#esxcfg-boot -b (sets up boot information)
#reboot

Upon rebooting and logging into the VMware Virtual Infrastructure Client you should be able to access that datastore and begin creating virtual machines. Watch out for keeping snapshots around too long; I talked about this some in a previous post. I now also recall watching the various services start on boot and that one particular service would fail until this was fixed. I didn’t write down which one, but I’ll try to find out what it is and add it here.

VMware: Watch out for those snapshots

Posted in Computers on October 12th, 2007 by chris

I consider myself to be fairly cautious when implementing a new server-level application, especially at the level where ESX runs, given how much depends on it being stable. Something escaped my attention, or perhaps it wasn’t talked about much in the docs I read. Either way it’s too late now and I’m not about to dig through all of those PDFs to see if I missed it.

The ESX 3.0 server was set up just over a year ago. In my previous trials with ESX and VMware’s other virtualization products, snapshots have been fantastic, and the ability to revert to them in various scenarios, such as when a recent OS or application patch went sour, saved people a lot of time. Unfortunately I forgot the saying “if it looks too good to be true, then it probably is”, and such is the case, it turns out, with the snapshot.

Over the past couple of weeks I’ve noticed that the free space in the datastore was decreasing at a faster rate than usual. With my curiosity piqued I poked around, and since there had been a recent influx of data I attributed it to that, but I didn’t tally precise numbers to see if they balanced. It being a busy time with a lot to do, I took it for what it was and moved on. A couple of days later I woke up to one of the virtual machines no longer being accessible, and checking the datastore again I saw that it had dropped over 30GB during those couple of days, reducing the free space to 2MB. I was surprised, confused, and aggravated that a machine went down. I’m sure many admins have experienced this at one point or another. You can tell whether files come from a snapshot by looking for the word “delta” in their names, either by browsing to the virtual machine’s folder in the service console or by right-clicking the datastore and browsing it. I no longer have the exact error message that was displayed at that point, but I was given the option to “Retry” or “Abort”. I clicked Retry and then was faced with:

There is no more space for the redo log of ComputerName-000002.vmdk. You may be able to continue this session by freeing disk space on the relevant partition, and clicking Retry. Otherwise click Abort to terminate this session.
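
As an aside on spotting leftover snapshot files from the service console, a find over the VM’s folder is enough. This is a sketch under the assumption that snapshot data files carry "delta" in their names (for example ComputerName-000002-delta.vmdk); the function name is hypothetical.

```shell
#!/bin/sh
# list_snapshot_deltas VM_DIR
# Prints every regular file under VM_DIR whose name contains "delta".
list_snapshot_deltas() {
    find "$1" -type f -name '*delta*'
}
```

A call such as `list_snapshot_deltas /vmfs/volumes/storage1/vmname` lists the files; running `ls -lh` on the results then shows how much space each one is taking.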

I proceeded to run various commands in both the service console and the Virtual Infrastructure Client to work the problem. One of the threads in the VMware Communities that came up more than once in my searches is http://communities.vmware.com/message/510545#510545. I tried various suggestions from it, and I think some additional ones as well. I first tried to remove the snapshot that was in the snapshot manager (by clicking Delete, not Delete All), and after several hours of processing it removed the snapshot from the GUI, but the vmdk files were certainly still there. After that I tried:

vmware-cmd <cfg> hassnapshot
hassnapshot() =

Yes, that’s a blank. For whatever reason it wasn’t detecting that any snapshot existed even though there were numerous delta files in the virtual machine’s directory. I then proceeded to create a snapshot in the snapshot manager and then delete it, this time with Delete All. Still no luck; they were all still there. I continued by removing the VM from the inventory (not from the drive! – be careful there, there’s a big difference) and re-adding it. No dice. With the outlook becoming more gloomy I tried creating the snapshot in the service console with:

vmware-cmd <cfg> createsnapshot <name of snapshot>

and got back the error:

VMControl error -11: No such virtual machine

I checked the path, then checked it again; it was correct. I searched around Google for a while too and didn’t find anything helpful about the message. I suspect it may be a somewhat generic message that could mean several things.

In the end I resorted to removing a VM that was recently built to clear up enough space to boot the affected one (thankfully it had not been configured yet, so not much time was lost there) and to moving the data off the affected VM so that it can be completely removed and a fresh VM built. This particular server was used as storage for the network and to hold backups, so I am thankful there isn’t much configuration that needs to occur once it’s rebuilt. I thought that as the data was transferring, all 75GB or so, I would write this article up. There sure is a lesson learned here: regardless of how much you may trust a piece of software to work right, it can always turn on you. This goes for the Mac users out there too.

On a side note, the <cfg> tag above is a common placeholder used in VMware’s documentation for the full path and file name of the vmx file. For example, in this scenario mine is similar to:

/vmfs/volumes/storage1/vmname/vmname.vmx
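
When vmware-cmd answers "No such virtual machine", one thing worth ruling out is a mismatch between the path typed and the path the host has registered. As far as I know, `vmware-cmd -l` lists the registered config files, and the sketch below does an exact, whole-line comparison against that list. The function name is hypothetical, and whether such a mismatch was the cause of the error described earlier is not something that can be confirmed from here.

```shell
#!/bin/sh
# is_registered VMX_PATH
# Reads a list of config paths on stdin (one per line) and succeeds
# only if VMX_PATH matches one of them exactly.
is_registered() {
    grep -Fxq -- "$1"
}
```

It would be used as `vmware-cmd -l | is_registered /vmfs/volumes/storage1/vmname/vmname.vmx && echo registered`. If the listing shows the same VM under a differently spelled volume path, using the path exactly as listed is the safer choice.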