Monday, February 25, 2013

HP Firmware versions with PowerCLI

For documentation purposes, I needed the BIOS and iLO versions of the vSphere environment.

I got the script from http://vnugglets.com/2011/11/get-hp-firmware-info-for-hp-vmware.html and folded some of the updates from the comments into it, so that it also shows which NetXen firmware and NetXen driver are on the system. I also adapted the script to report on all hosts in all clusters, not just a single cluster.

Foreach ($strHostsClusterName in Get-Cluster) {
  Get-View -ViewType HostSystem -Property Name, Runtime.HealthSystemRuntime.SystemHealthInfo.NumericSensorInfo -SearchRoot (Get-View -ViewType ClusterComputeResource -Property Name -Filter @{"Name" = "^$([RegEx]::escape($strHostsClusterName.Name))$"}).MoRef | %{
    $arrNumericSensorInfo = @($_.Runtime.HealthSystemRuntime.SystemHealthInfo.NumericSensorInfo)

    # HostNumericSensorInfo for BIOS, iLO, array controller, NetXen NICs
    $nsiBIOS       = $arrNumericSensorInfo | ? {$_.Name -like "*System BIOS*"}
    $nsiArrayCtrlr = $arrNumericSensorInfo | ? {$_.Name -like "HP Smart Array Controller*"}
    $nsiILO        = $arrNumericSensorInfo | ? {$_.Name -like "Hewlett-Packard BMC Firmware*"}
    $nsiNXdev      = $arrNumericSensorInfo | ? {$_.Name -like "nx_nic device*"}
    $nsiNXdrv      = $arrNumericSensorInfo | ? {$_.Name -like "nx_nic driver*"}

    # assume all NetXen NICs are at the same level: take the first, or "n/a" when absent
    if ($nsiNXdev.Count -gt 0) { $nsiNXdevice = $nsiNXdev[0].Name } else { $nsiNXdevice = "n/a" }
    if ($nsiNXdrv.Count -gt 0) { $nsiNXdriver = $nsiNXdrv[0].Name } else { $nsiNXdriver = "n/a" }

    New-Object PSObject -Property @{
      VMHost          = $_.Name
      "SystemBIOS"    = $nsiBIOS.Name
      "HPSmartArray"  = $nsiArrayCtrlr.Name
      "iLOFirmware"   = $nsiILO.Name
      "nx_nic device" = $nsiNXdevice
      "nx_nic driver" = $nsiNXdriver
    } ## end New-Object
  } ## end Foreach-Object
}


If I feel like it, I'll someday add LSI Logic cards and such as well. For now, it got me the info I needed, without logging on to 20 servers ;-)

Run the script from PowerCLI by first connecting to the vCenter (Connect-VIServer &lt;vcenter&gt;) and then running: .\fwscript.ps1 | Export-CSV d:\fwlist.csv

Wednesday, February 20, 2013

CDP info from vSphere platform

For documentation purposes, I wanted an overview of which switch, and which port exactly, each host NIC is connected to. As luck would have it, VMware has already made some scripts to do this.

I adapted it to suit my needs:

$Hosts = Get-VMHost | Sort-Object -Property Name
foreach ($vmh in $Hosts) {
  if ($vmh.State -ne "Connected") {
    Write-Output "Host $($vmh) state is not connected, skipping."
  }
  else {
    Get-View $vmh.ID |
      % { $esxname = $_.Name; Get-View $_.ConfigManager.NetworkSystem } |
      % {
        foreach ($physnic in $_.NetworkInfo.Pnic) {
          $pnicInfo = $_.QueryNetworkHint($physnic.Device)
          foreach ($hint in $pnicInfo) {
            # Write-Host $esxname $physnic.Device
            if ($hint.ConnectedSwitchPort) {
              $hint.ConnectedSwitchPort |
                Select-Object @{n="VMHost";e={$esxname}}, @{n="VMNic";e={$physnic.Device}}, DevId, PortId
            }
          }
        }
      }
  }
}
To run this, simply copy the script to a file, open PowerCLI, connect to vCenter, and run it with:

.\VMHostCDPInfo.ps1 | Format-Table -AutoSize | Out-File cdpinfo.txt

It skips any NICs for which no CDP info can be retrieved, but other than that you get a nice list to put with your documentation, which looks like:

VMHost          VMNic   DevId    PortId
------          -----   -----    ------
server01        vmnic0  switch01 GigabitEthernet1/25
server01        vmnic10 switch02 GigabitEthernet2/0/31

Monday, December 10, 2012

How to make ESXi crash

While watching a VMware KBTV video, I saw a cool trick to make your ESXi host crash (which can be useful for testing purposes):

From SSH, type: vsish -e set /reliability/crashMe/Panic 1

Apparently, I'm telling nothing new: searching for that string on Google, I found the links below with more info:

http://www.seancrookston.com/2012/01/09/forcing-a-kernel-dump-on-a-vsphere-host-the-purple-screen-of-death/

http://www.ntpro.nl/blog/archives/1388-Lets-create-some-Kernel-Panic-using-vsish.html

Monday, November 12, 2012

Unable to remove ESXi host from vCenter

I needed to remove an ESXi host from vCenter so I could re-use it, but I was unable to: after disconnecting the server, the "Remove" option was greyed out.

Some searching turned up the tip to remove it via PowerCLI (Remove-VMHost), but that did not work either.

It turned out the ESXi host could not be removed because it was still part of a cluster.

Solution:

Move the disconnected ESXi host entry to the top of your vCenter tree, out of any cluster, then re-run the Remove-VMHost command. (Most likely the Remove option becomes available in the GUI as well, but I just pressed up and enter in my PowerCLI window and it started the removal.)
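The two steps can be sketched in PowerCLI; this assumes you are already connected to the vCenter, and the host name below is a hypothetical placeholder:

```powershell
# Hypothetical name; replace with your disconnected host
$esx = Get-VMHost -Name "esx01.example.local"

# Move the host entry out of its cluster, to the datacenter root
Move-VMHost -VMHost $esx -Destination (Get-Datacenter -VMHost $esx)

# With the host outside the cluster, the removal is no longer blocked
Remove-VMHost -VMHost $esx -Confirm:$false
```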

Friday, August 3, 2012

HP Servers disconnecting

We came across an issue lately with several types of HP servers that have QLogic/NetXen NC375i network cards in them: the cards disconnect, causing a disruption of service. You can imagine that having an NFS mount or iSCSI target on such a card is less than desirable; it has caused Windows clusters to fail over and ESX/ESXi hosts to go crazy. The problem is temporarily resolved by rebooting the host, and it is very much OS independent!

In the Windows event log you may see entries like:
DEVICE: HP NC375i Integrated Quad Port Multifunction Gigabit Server Adapter #4
PROBLEM: Tx path is hung. The device is being reset.

In ESX you see entries in /var/log/vmkernel like:
Jul 31 21:02:12 server01 vmkernel: 165:01:40:09.914 cpu19:4295)<5>nx_nic[vmnic8]: Device is DOWN. Fail count[8]
Jul 31 21:02:12 server01 vmkernel: 165:01:40:09.915 cpu19:4295)<3>nx_nic[vmnic8]: Firmware hang detected. Severity code=0 Peg number=2 Error code=1 Return address=0


HP has released an advisory confirming the problem:

Network Adapters and Affected Firmware Versions

Network Adapter                                                    Affected Firmware Versions
CN1000Q Dual Port Converged Network Adapter                        earlier than 4.8.22
NC375i Integrated Quad Port Multifunction Gigabit Server Adapter   earlier than 4.0.585
NC375T PCI Express Quad Port Gigabit Server Adapter                earlier than 4.0.585
NC522m Dual Port Flex-10 10GbE Multifunction BL-c Adapter          earlier than 4.0.585
NC522SFP Dual Port 10GbE Server Adapter                            earlier than 4.0.585
NC523SFP 10Gb 2-port Server Adapter                                earlier than 4.9.81
The NC375i adapter is integrated on the following servers and storage systems:
  • ProLiant DL370 G6 Server
  • ProLiant DL580 G7 Server
  • ProLiant DL585 G7 Server
  • ProLiant DL980 G7 Server
  • HP Business Data Warehouse Appliance
  • StorageWorks D2D4312 Backup System
  • StorageWorks D2D4324 Backup System

Servers manufactured after April 1, 2012 are not affected, but check the firmware level if you suffer from this issue: an older card may still be present in a newer machine.

How to check the firmware version:

Windows:
Open the HP network utilities, select the network interface you are having issues with, and click Properties. The Information tab shows the Boot Code, which is the firmware version:


Alternatively, you can run the update tool, which will also tell you which version you are currently running.


Linux:

Type "modinfo netxen_nic" and look for the firmware line.
[user@server-01 ~]$ modinfo netxen_nic | grep firmware
firmware: phanfw-4.0.579.bin   <--------  version 4.0.579, so needs an update

ESX/ESXi:
VMware has released a KB article on how to get the firmware and driver version, available here.
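As a rough sketch of the kind of check the KB describes, the driver and firmware version can usually be read straight from the NIC on the host's console; the vmnic number below is an example:

```shell
# ESX 4.x console: the nx_nic driver exposes the firmware version via ethtool
ethtool -i vmnic8

# ESXi 5.x equivalent: look for the "Firmware Version" line in the driver info
esxcli network nic get -n vmnic8
```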

Resolution:

The resolution is to update the firmware of the network cards; the advisory lists the latest drivers and firmware. For Windows and Linux there are proper update tools, but unfortunately for VMware no firmware update utility is provided, and the Linux firmware utility does not work.

On ESX/ESXi you have to boot from a Linux LiveCD (put the ESX server in maintenance mode and reboot). In our case we used the Novell SLES 11 CD (a free ISO download; registration required), as the rescue CD for RHEL 5 gave several errors when running the firmware update utility. Perhaps an OpenSUSE, Fedora, Ubuntu or other distro LiveCD can be used as well, but we haven't tested those.

Many thanks go to my colleague Sven for the info :-)

Thursday, August 2, 2012

Getting an overview of patches for your ESX hosts

A customer asked for an overview of exactly which patches were needed for his ESX hosts. He wanted to review the patches before they were applied with Update Manager. Unfortunately, you can't get that list from the GUI in a nice way, so some PowerCLI goodness was needed.

First off, you need to have the Update Manager cmdlets installed to be able to use the Get-Compliance cmdlet in the code.

The following script will go over each host and output the severity, patch ID, release date, additional info (a link to the KB article) and a short description of what the patch is for.

It will look something like this:
HostSecurity,ESX410-201204402-SG,"4/26/2012 10:00:00 AM","For more information, see http://kb.vmware.com/kb/2014988.","Updates libxml2"

Now for the code itself:

$ComplianceResult = @()
ForEach ($HostToCheck in Get-VMHost) {
  $Details = Get-Compliance $HostToCheck -Detailed |
    Select-Object -ExpandProperty NotCompliantPatches |
    Select-Object @{N="Hostname";E={$HostToCheck}}, Severity, IdByVendor, ReleaseDate, Description, Name
  $ComplianceResult += $Details
}

$ComplianceResult | Export-CSV -Path c:\NeededPatches.CSV -NoTypeInformation

Monday, June 18, 2012

Cancelling a stuck VMware Tools installation

I had an issue where I needed to put a host in maintenance mode, but some VMs did not want to vMotion off of it, because they were busy with a VMware Tools installation. These machines needed a reboot to finish the installation, but my customer did not want to do that just yet.

I tried an "End VMware Tools Install" via the menu, but was greeted by the message "The operation is not allowed in the current state."

At first I thought I needed to wait for my customer to reboot the machines at his convenience, but a colleague of mine knew the solution:

  • Enable Remote Tech Support for the ESXi host where the VM is located (Configuration > Security Profile)
  • Connect via SSH with the root account
  • Run the command: /usr/bin/vim-cmd vmsvc/getallvms
  • Take the VMid of the VM with the hanging install and use a second command to cancel it: /usr/bin/vim-cmd vmsvc/tools.cancelinstall <vmid>
  • Start an interactive VMware Tools installation and manually run it from the mounted CD
Hey presto, the VMs could be vMotioned off, and the host could be put in maintenance mode.
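The two vim-cmd steps from the list above might look like this on the host; the VMid 42 is a made-up example:

```shell
# List all registered VMs and note the Vmid of the one with the stuck install
/usr/bin/vim-cmd vmsvc/getallvms

# Cancel the hanging Tools installation for that Vmid
/usr/bin/vim-cmd vmsvc/tools.cancelinstall 42
```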
 
Thanks Paul ;-)