Showing posts with label vSphere 4. Show all posts

Friday, April 19, 2013

Update Manager 4.1 failed to remediate


I came across another nice puzzle after updating one of our customers' vCenter and Update Manager to 4.1U3 (yeah, some customers hold off on updating forever, I know..): When trying to update some ESX hosts attached to the vCenter, I got the following failure:

The host returns esxupdate error codes: 7. Check the Update Manager log files and esxupdate log files for more details

The logfiles contained the following info:

Remediation did not succeed for XXXXesx01: SingleHostRemediate: esxupdate error, version: 1.30, operation: 7: ('http://XXXX:9084/vci/hostupdates/hostupdate/vmw/vibs/cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib', '/var/cache/esxupdate/3375545638666279871', '[Errno 14] HTTP Error 404: Not Found') . error 4/11/2013 3:18:13 PM 

This turned out to be a known issue. Apparently, before Update Manager 4.1U2 the web server serving the patches was case-insensitive, and from U2 on it is case-sensitive. Certain patches that had been downloaded earlier were stored with the wrong capitalization, so the now case-sensitive web server can no longer find them and returns a 404.

The KB article is here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011656

Because redownloading was not an option at that moment, we had to rename the files to the proper capitalization. Fortunately there are not that many of them:

From:
  • bind-libs-9.3.6-4.p1.el5_5.3.i386.vib 
  • bind-libs-9.3.6-4.p1.el5_5.3.x86_64.vib 
  • bind-utils-9.3.6-4.p1.el5_5.3.x86_64.vib 
  • bind-libs-9.3.6-4.p1.el5_5.3.i386.vib 
  • cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1oem.vib 
  • cross_oem-vmware-esx-drivers-scsi-3w-9xxx_400.2.26.08.036vm40-1oem.vib 
  • vmware-esx_swmgmt_provider-4x.1.0.1-1.4.348481.vib

To (note the changed capitalization: P1, OEM and swMgmt):
  • bind-libs-9.3.6-4.P1.el5_5.3.i386.vib 
  • bind-libs-9.3.6-4.P1.el5_5.3.x86_64.vib 
  • bind-utils-9.3.6-4.P1.el5_5.3.x86_64.vib 
  • bind-libs-9.3.6-4.P1.el5_5.3.i386.vib 
  • cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib 
  • cross_oem-vmware-esx-drivers-scsi-3w-9xxx_400.2.26.08.036vm40-1OEM.vib 
  • vmware-esx_swMgmt_provider-4x.1.0.1-1.4.348481.vib 
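The rename can also be scripted. Below is a minimal sketch in shell, assuming the directory that holds the .vib files is reachable from a POSIX shell (for example through Cygwin on the Update Manager server, or on a copy of the patch store). The function name fix_vib_case and its directory argument are made up for illustration. The wrong names are simply the correct names in lowercase, so the script derives them instead of listing both.

```shell
#!/bin/sh
# Hypothetical helper: rename cached VIB files to the capitalization that
# Update Manager 4.1U2 and later expects. Make a backup before running it.

# The correct ("To") names from the list above.
CORRECT_NAMES='bind-libs-9.3.6-4.P1.el5_5.3.i386.vib
bind-libs-9.3.6-4.P1.el5_5.3.x86_64.vib
bind-utils-9.3.6-4.P1.el5_5.3.x86_64.vib
cross_oem-vmware-esx-drivers-net-vxge_400.2.0.28.21239-1OEM.vib
cross_oem-vmware-esx-drivers-scsi-3w-9xxx_400.2.26.08.036vm40-1OEM.vib
vmware-esx_swMgmt_provider-4x.1.0.1-1.4.348481.vib'

fix_vib_case() {
    dir=$1
    echo "$CORRECT_NAMES" | while read -r right; do
        # The wrongly stored name is the correct name in all lowercase.
        wrong=$(echo "$right" | tr 'A-Z' 'a-z')
        # Exact, case-sensitive match against the directory listing, so this
        # also behaves on case-insensitive filesystems.
        if ls "$dir" | grep -Fxq -- "$wrong"; then
            # Two-step rename: a direct case-only rename can be refused
            # on case-insensitive filesystems.
            mv "$dir/$wrong" "$dir/$wrong.tmp" &&
                mv "$dir/$wrong.tmp" "$dir/$right" &&
                echo "renamed $wrong -> $right"
        fi
    done
}
```

Calling it as fix_vib_case followed by the path to the directory with the .vib files renames only the files from the list and leaves everything else alone.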




Tuesday, November 29, 2011

Citrix Xenapp Best Practices

A while ago I talked about Citrix Xenapp on VMWare, but now I came across best practices, both from VMWare as well as Citrix. So without further ado:


http://www.vmware.com/files/pdf/solutions/vmware-citrix-xenapp-best-practices-EN.pdf

http://support.citrix.com/article/CTX129761

Both say roughly the same as I said in my first post, with some more detail.

Friday, November 25, 2011

Invalid configuration for device '0' when enabling a NIC in vSphere

Suddenly a VM was unreachable on all interfaces. Upon investigation, I saw that none of the NICs were set to connected:







When I tried to enable them again I got Invalid configuration for device '0':




Some googling led to a simple solution, which is to restart the management agents:

service mgmt-vmware restart

and

service vmware-vpxa restart

Good to know. Thanks go to the cupfighters for investigating it deeper and leading me to the answer.
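If this comes up more often, the two restarts can be wrapped in a small function. This is only a sketch for the classic ESX 4.x service console; the function name and the SERVICE_CMD variable are made up, and SERVICE_CMD is there purely so the sequence can be dry-run (for example with SERVICE_CMD=echo) before touching a real host.

```shell
#!/bin/sh
# Restart both management agents in the same order as above.
# SERVICE_CMD is a hypothetical override for dry runs; by default the
# real "service" command is used.
restart_mgmt_agents() {
    svc=${SERVICE_CMD:-service}
    for agent in mgmt-vmware vmware-vpxa; do
        echo "Restarting $agent"
        # Stop at the first failure so a broken agent does not go unnoticed.
        $svc "$agent" restart || return 1
    done
}
```

Running the VMs are not affected by this; only the management connection to the host drops for a moment.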

Thursday, November 17, 2011

NFS advanced settings for ESX/ESXi

Netapp has this wonderful Best Practices document available online, which is well worth the read. It used to include the CLI commands, but these have been moved to a new document.

As part of the best practices for an ESX/ESXi installation, you need to change some settings for NFS. I keep forgetting which ones they are, so this is a reminder to myself (and anyone that reads this blog ;-) )

The easiest thing to do is to start up the Remote Tech Support (SSH) service in the security profile, SSH to the host, and copy/paste this into the SSH window:


/usr/sbin/esxcfg-advcfg -s 30 /Net/TcpipHeapSize 
/usr/sbin/esxcfg-advcfg -s 120 /Net/TcpipHeapMax 
/usr/sbin/esxcfg-advcfg -s 10 /NFS/HeartbeatMaxFailures 
/usr/sbin/esxcfg-advcfg -s 12 /NFS/HeartbeatFrequency 
/usr/sbin/esxcfg-advcfg -s 5 /NFS/HeartbeatTimeout 
/usr/sbin/esxcfg-advcfg -s 64 /NFS/MaxVolumes


Note that the last setting says MaxVolumes: 64. The default is set to 8, which means that the maximum number of NFS volumes is 8 by default. Setting the maximum to 64 works for ESX 4.x, but used to be 32 for ESX 3.x. ESX 5.x can even go to 128.

If you want to follow what VMWare says instead of the Netapp Best Practices, you can set the TCPIP Heap Size to 32 in ESX4/5 (see here).
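To check afterwards what a host is actually set to, the same tool can read values back with -g. A small sketch, to be pasted into the same SSH session; the function name and the ADVCFG variable are made up, and ADVCFG only exists so the loop can be dry-run on a machine without the tool.

```shell
#!/bin/sh
# Print the current value of each advanced setting touched above.
# ADVCFG is a hypothetical override for dry runs; it defaults to the real tool.
show_nfs_settings() {
    advcfg=${ADVCFG:-/usr/sbin/esxcfg-advcfg}
    for opt in /Net/TcpipHeapSize /Net/TcpipHeapMax \
               /NFS/HeartbeatMaxFailures /NFS/HeartbeatFrequency \
               /NFS/HeartbeatTimeout /NFS/MaxVolumes; do
        # -g reads a setting, where -s above writes it
        $advcfg -g "$opt"
    done
}
```

Running show_nfs_settings before and after the change makes it easy to see which hosts still have the defaults.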

Update for 5.1:

/usr/sbin/esxcfg-advcfg -s 32 /Net/TcpipHeapSize
/usr/sbin/esxcfg-advcfg -s 128 /Net/TcpipHeapMax
/usr/sbin/esxcfg-advcfg -s 10 /NFS/HeartbeatMaxFailures
/usr/sbin/esxcfg-advcfg -s 12 /NFS/HeartbeatFrequency
/usr/sbin/esxcfg-advcfg -s 5 /NFS/HeartbeatTimeout
/usr/sbin/esxcfg-advcfg -s 256 /NFS/MaxVolumes

Friday, August 19, 2011

Updating VMWare Tools on all powered on Windows VM's

Quick PowerCLI line to update the VMWare Tools on all Windows VM's that are powered on, without rebooting them:

get-vm | where {$_.PowerState -eq "PoweredOn" -and $_.Guest.OSFullName -match "Microsoft Windows" } | Update-Tools -NoReboot

Oh how I love PowerCLI...

Monday, August 15, 2011

There are errors during the remediation operation

"There are errors during the remediation operation". I stared blankly at the screen. Why was this happening?

I was trying to update an older vSphere 4.0 setup with update manager. The server was already in maintenance mode, but after clicking "Remediate" the error message came up almost immediately. A retry did not help. Quick googling came up with the following:

http://www.vmware.com/support/vsphere4/doc/vsp_vum_40_rel_notes.html

Host Patch and Upgrade Remediation Might Fail
Host patch and upgrade remediation might fail with the message There are errors during the remediation operation if an inaccessible virtual machine exists on the host. The reason for this failure might be that the virtual machine files reside on a disconnected network storage.
Workaround: Connect the disconnected network storage or remove the inaccessible virtual machine from the vSphere inventory.


I checked the host, and indeed there was an inaccessible VM! It was a no longer used one, so I removed it from the inventory, and updating went fine after that.

Where would we be without Google, right? ;-)

Monday, August 1, 2011

vm (invalid) and greyed out

A number of vm's in a vSphere setup were showing VMWare Tools not running. Checking the vm's themselves showed that the tools were in fact running. I tried reinstalling VMWare Tools, but when I clicked "Install/Upgrade VMWare Tools" the vm turned gray, and "(invalid)" was appended to its name.

Some googling showed up this: http://communities.vmware.com/message/861038

  • Log on to the host the vm is running on
  • type: "service vmware-vpxa restart"
  • type: "service mgmt-vmware restart"

The host and vm's that were running on there became disconnected in the interface for a few seconds, but after that came back, and the vm's VMWare Tools became OK. It turned out the vm's didn't need updating anyway.

Monday, February 21, 2011

Citrix XenApp on VMWare

A few years ago, there was a lot of discussion regarding Citrix XenApp/Terminal Server running on VMWare. Our personal experience was that it wasn't that good: with anything above a few (4-5) concurrent users, it performed terribly when doing day to day tasks. We even tried the tweaks that were going around (stuff like only using 1 vcpu on your Citrix server). Our general feeling was to not do it anymore, and we left it at that.

However, time goes on, and technology moves forward. But the articles remained online, and no one really talked about the advancements that have been made. Then, at VMWorld 2009, there was a session about running XenApp using vSphere. If you create a (free) account on vmworld.com, you can watch it for free.

Basically, a few best practices are given:
  1. Use the newest CPU's in your hosts (Nehalem architecture or higher). So if you are still using older hosts (like HP G5 series servers) then think again. The newer CPU's alone would give about 30% better performance (according to the video, ymmv)
  2. Use vSphere, mainly because it supports MMU virtualization which gives a good performance boost for Xenapp (but if you are still on ESX3.5, you *really* should be thinking of upgrading anyway, instead of thinking of virtualizing your Xenapp servers)
  3. The usercount on a VM will never be the same as on a physical server. The idea is to be running multiple smaller VM's and thereby getting more users in total per physical box.
  4. Don't use p2v'd systems. It's much better to start with a clean OS, but if you must, remove old hardware, hardware management agents, and unused OS features (wallpaper, menu animations, and animated system tray items such as network indicators and the system clock). Of course there are other tweaks you always need to do for any terminal server environment. The sweet spot for the Xenapp VM's is often 2 vCPU's and 4GB RAM. More vCPU's will usually give less performance.
  5. Old blogposts are no longer valid, so tweaks like "disable page sharing and memory ballooning" are no longer necessary.
  6. Use realistic tests. It's no good if your environment performs well in synthetic tests if the applications that the users will use are not performing up to par.
The video also has someone from eBay giving a talk about their environment, and they came to 6 to 8 users per VM, each with 2vCPU's and 4GB RAM, running 10 VM's on one DL380G6. Like I said, the video is free to watch, and highly informative.

Thursday, February 3, 2011

Pushing Host Profiles via PowerCLI.

Nice.. I had to push a host profile to a whole number of hosts, but they were all running a lot of VM's. Anyone who's seen vSphere knows that via the interface, you need to set a server in maintenance mode, apply the host profile, then take it out of maintenance mode. This is fine for one or two machines, but by the time you get to the 5th or 6th server, you get a bit bored, especially when you know you've got 20 to do.. So in comes Powershell (I should say PowerCLI) again, to save me some time.. I thought.

First things first: Figure out how to get a host profile:

$MyHostProfile = Get-VMHostProfile -Name "coolprofile"

Fail. Fail? Yes, fail. Somehow I ran into a wall, with PowerCLI telling me that my profile can't be found. Some googling tells me that there's a bug in PowerCLI, but there's a workaround:

Get-VMHostProfile -Entity *

Woohoo! It shows me a hostprofile that was already pushed to the client before. So my statement would now become:

$MyHostProfile = Get-VMHostProfile -Entity * 

Note that this workaround would only work with one host profile. I'll figure out how to get a specific hostprofile implemented some day, but I've got one, so I got lucky (this time).

Now to apply a host profile to an esxhost:

Apply-VMHostProfile -Entity $esxhost -Profile $MyHostProfile -Confirm:$false

Damn, that was easy. Especially since the first part was so difficult.. OK, now to get a host in maintenance mode:

Set-VMHost -VMHost $esxhost -State maintenance 


Great, that works! Now how do I get it out? 

Set-VMHost -VMHost $esxhost -State connected 


Cool, works too! Now I have to put the whole thing together:

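# Hosts to process; use Get-VMHost instead to process every host
$hosts = "esx1","esx2","esx5","esx6"
# Workaround for the lookup-by-name bug; only works with a single host profile
$MyHostProfile = Get-VMHostProfile -Entity *
foreach ($esxhost in $hosts) {
    # Enter maintenance mode, apply the profile, then bring the host back
    Set-VMHost -VMHost $esxhost -State maintenance
    Apply-VMHostProfile -Entity $esxhost -Profile $MyHostProfile -Confirm:$false
    Set-VMHost -VMHost $esxhost -State connected
}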



As you can see, I used an array of hosts in this case (the first line of code; I didn't want all hosts in my case), but I could have changed that to all hosts by just doing a get-vmhost.

Running this now puts each host in maintenance mode, applies the host profile, and takes it out again.. Now imagine that for 20, 30 or even 100 hosts...... Yes, I *like* PowerCLI...

Migrating/upgrading a 32bit vCenter 4.0 installation to a 64bit 4.1 installation



With the coming of vSphere 4.1, vCenter now has a prerequisite of needing a 64bit Windows operating system. Many customers that upgraded from ESX3.5 to vSphere were using a 32bit Windows installation, but also many vSphere installations were built using 32bit Windows. vCenter 4.0 even needed 32bit DSN's so it appeared it was easier to just install vCenter using a 32bit installation. To make things worse, many vCenter implementations have SQL Server installed on the same server as vCenter itself. This article will be about the upgrade process of upgrading windows 2003 32bit with SQL 2005 32 bit to a 64bit environment, and migrating the data.

During testing, this upgrade was quite difficult because of little snags, so it is very wise to do a dry run of this using a P2V’d system and testing the upgrade thoroughly before you do this on a live system. You wouldn't want to reinstall your OS and find out your migration data is useless.

Needed
-    Windows 2003/8/8R2 x64 Standard
-    SQL Server 2005/8 x64 Standard (2008R2 is NOT supported yet)
-    VMWare vSphere 4.1 vCenter installation media
-    External source, such as an external harddisk or fileshare.

Basic steps
  1. Make sure all data is safely backed up, and you have all the information off of your 32bit installation
  2. Stop virtual center service, update manager service
  3. Run the backup.bat from VMWare's datamigration tool
  4. Copy the datamigration complete with data to an external source
  5. Stop SQL Server service and SQL Server agent, copy the vCenter and Update Manager databases to the external source
  6. Wipe server and install server with 64 bit OS, using the same name and IP as the previous one
  7. Install Microsoft SQL Server
  8. Copy vCenter and Updatemanager database to the newly installed server
  9. Set permissions on the databases, and on msdb
  10. Set compatibility mode of the db to SQL 2005 if your SQL Server came from SQL 2000.
  11. Create the ODBC links
  12. Copy the datamigration directory back to the server
  13. Make sure you have the vCenter installation media on dvd or on the server in a directory as well
  14. Run install.bat from the datamigration directory
  15. Answer the installation questions from the installer
Detailed Steps

So you want to migrate a 32bit system to a 64bit system. This means you will be doing a complete re-installation of the operating system. I will assume only the database (in my tests SQL2005) is installed and standard users have been created, and no extra software has been installed.

 
Once the first VMWare services are stopped, management through vCenter will be impossible, but the VM’s will keep on running. Services like HA and DRS will be unavailable then, but VM’s should not notice anything of this. 


Collect information and backup the system
Note the following:
-    IP address(es)/subnetmask(s)
-    Hostname
-    Workgroup (if the machine is part of a domain, take note of that as well)
-    Routes
-    Host file (c:\windows\system32\drivers\etc)
-    Usernames (and passwords, if you have them)
-    Possible created groups
 

For SQL look at the following things:
-    Which databases are configured
-    Where are the databases stored
-    Which Service Pack is the database currently
-    Any extra settings made in SQL, such as maintenance and backup plans.
-    The ODBC DSN’s used to connect the database to the Virtual Center.

 
Of course, if you have installed extra software, make sure you have copied any settings and data from that application. Finally, make sure you have a working backup, or pull one of the disks of the raid 1 set. This will ensure you can easily go back to the 32bit environment.
 

Stop vCenter services
In computer management, stop the following services
-    VMware VirtualCenter Server
-    VMware VirtualCenter Management Webservices (will be stopped along with the first service)
-    VMware Update Manager Service
 

Datamigration
Fortunately, VMWare has made a datamigration tool to help with the migration from 32bit to 64 bit. On the installation media, there is a directory called datamigration. In that directory is a zipfile. Extract that zipfile to your harddisk, or to the external harddisk, as this extracted directory will also hold the data from the migration. I have tried to run the tool from a network share, but it wouldn't work properly, so my advice is to not do that.
 

Open a command prompt, change to the datamigration directory and type “backup.bat”. The script will prompt you if the data needs to be backed up. Type Y and press enter to continue.
 
New files and directories will be created in the datamigration directory. The “data” directory will hold the configuration data from vCenter, Update Manager and vCenter Orchestrator (the directories vc, vum and vco will be created). There will also be a "log" directory to show the results of the migration.
 

When the backup is done, check the vc and vum directories for data (and if vco has been used, check if there is data there too). The vc directory should have a vc_ssl with keys in them, and also a vc_data file, which is kind of small (in my test environment of 4 servers, it was 24kb).  The vum will have data there too, as well as update files downloaded from the VMWare site. This will be quite big (several GB).
 

Copy the data
Copy the entire datamigration directory to the external source (if you haven’t extracted it to the external source beforehand).
 

Stop SQL services and copy the databases
Stop the following services for SQL:
-    SQL Server
-    SQL Server Agent
You can now copy the MDF and LDF files belonging to vCenter and the Update Manager from the server to the external source. To speed up the copy, you could shrink the database before you copy the files. My test environment database shrank from 2GB to 200MB.
 

“Nuke and repave”
Reinstall the server with the 64bit operating system, using the same name and same IP information.  There will be issues with vCenter if other IP information is used. Patch the system as you would normally patch a system. Create routes, possible hosts file entries, users and groups. Re-install a SQL 64bit edition, and install the same service pack as the 32bit edition. You can go from SQL 2005 to SQL 2008 without needing extra configuration. I like to configure SQL so that it installs the databases to another disk from the SQL installation.


Copy databases back and configure SQL
Copy the MDF and LDF to the newly installed server, in the directory where the other databases are stored.  In the SQL Server Management Studio, attach the databases.

Make sure the vCenter user has been assigned “db_owner” for both the vCenter and Update Manager databases, as well as the MSDB database. This is necessary to be able to upgrade the database during the vCenter installation.

Another important thing is that the compatibility level be set to SQL 2005 (90). If the SQL instance you came from used to be SQL 2000 (e.g. when you have an old vCenter that has been upgraded from VI3) it is likely that the compatibility level is set to SQL 2000. Not setting this properly will make the upgrade fail!  

Configure other settings as needed, such as re-implement maintenance plans, and set the recovery model to Simple again. Do the same for the Update manager database if you use that.


Recreate ODBC connections
To recreate the database connections you will need DSN’s. Go to the Data Sources (ODBC) tool in Control Panel -> Administrative Tools and add a system DSN for the vCenter database and Update Manager. If you plan to migrate update manager as well, you need to use a 32 bit DSN for that. Run c:\windows\syswow64\odbcad32.exe to get the 32 bit version, and add a system DSN there.


With me so far? Good, then the fun part can start: Getting vCenter back up and running.

Installation/Upgrade of vCenter server 4.1
Copy the datamigration directory back to the server, and for speed sake, copy/extract the vCenter installation directory to the local harddrive. In any case you need to have this available when the datamigration installer is started.
 
Open a command prompt and change to the datamigration directory. If you are using Windows 2008, open a command prompt by "run as administrator". Type “install” to start. The installer script will check to see if the vCenter data is available, and tell you if all data is available or not. If it is available, it will use these settings during installation. It will ask for the directory of the installation media (e.g. D:\vim410).
After this it immediately checks for the Update Manager upgrade data and asks for the installation media again. This is the same directory as before.

As a sidenote: I have made three or four attempts at testing this in a test environment, and the first few times the installer said there was no upgrade data available. I needed to run the backup script on the source again. That is why it is important to check for the existence of files during the backup and to test this upgrade before you try this on a live environment.

After the question for Update Manager has been answered, the installer for vCenter 4.1 will start.  Select the vCenter DSN and answer “yes” to the question if the database should be upgraded. At some point you will be asked whether the host agents can be upgraded automatically. I answer no.

In vCenter the ESX hosts will be in a disconnected state, but the VM information will be seen, and you should be able to see the performance data of the VM’s. I select no, because you can still fall back to the old installation until the complete installation finishes (remember that disk you took from the raid 1 set, or the backup of the previous installation? The ESX hosts can then still be managed by vCenter 4).

After the installation, the installation for Update Manager will immediately start. Follow the instructions as they appear on screen.
Install the updated Virtual Infrastructure Client from the vpx directory in the installation media (vi-client.exe) and then connect to the vCenter. The hosts are in a disconnected state, but you can simply rightclick the hosts, and click "Connect". The agent will update, and HA will be re-enabled.


This should be all you need to complete a successful migration.


Note to self: Make some screenshots next time...