Today I found an interesting error on an ESX server service console. The ESX 4.0 U1 server displayed the following error:

Uhhuh. NMI received for unknown reason 21.

Do you have a strange power saving mode enabled?

Dazed and confused, but trying to continue

The physical server itself (a Dell PowerEdge 2950) was displaying an error on the LCD:

E171F PCIE Fatal Err Slot 3

 

Pressing F12 on the ESX service console to view the log displayed the following:

 
bnx2 Chip not in correct endian mode
bnx2 vmnic3 BUG! Tx ring full when queue awake.
WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic3: transmit timed out

These bnx2 (Broadcom NetXtreme II driver) messages for vmnic3 confirmed the cause: some VMs had lost their network connection because a physical PCI-E NIC had failed.

Even though a NIC had failed, the ESX server continued to run, but the failure was not reported in the hardware tab or by an email alert from vCenter. It only came to light because a colleague noticed the orange flashing light on the physical server.
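
Since nothing alerted on this failure, one option is to scan a copy of the console log for the messages quoted above. The following is only a rough Python sketch, not something that was run on the ESX host: the function name find_nic_faults is made up for illustration, the patterns are taken purely from the messages in this post, and the log lines are passed in by the caller because the log file location is not assumed here.

```python
import re

# Signatures copied from the messages quoted in this post.
# They are illustrative, not an exhaustive list of NIC failure messages.
PATTERNS = [
    re.compile(r"NMI received for unknown reason"),
    re.compile(r"bnx2.*Chip not in correct endian mode"),
    re.compile(r"bnx2.*Tx ring full"),
    re.compile(r"NETDEV WATCHDOG: \S+ transmit timed out"),
]


def find_nic_faults(lines):
    """Return (line_number, text) for each line matching a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


# Example run against the log lines shown earlier in this post.
sample = [
    "bnx2 Chip not in correct endian mode\n",
    "bnx2 vmnic3 BUG! Tx ring full when queue awake.\n",
    "WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic3: transmit timed out\n",
]
for number, text in find_nic_faults(sample):
    print(number, text)
```

To check a real log, open the file and pass the file object to find_nic_faults in place of the sample list.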