Maintaining availability during Virtual Connect interconnect firmware upgrades

Virtual Connect (VC) interconnects reboot during the activation stage of the firmware update process, interrupting server connectivity to these modules. You can minimize the impact of module firmware activation by ensuring a redundant hardware configuration, redundantly connected networks and uplink sets, and properly configured NIC teaming on the servers themselves. Hewlett Packard Enterprise recommends using these network design methodologies. When updating HPE FlexFabric interconnects, you must also configure SAN connectivity redundantly to avoid application outages.

When designing network connectivity, consider all the dependencies that can influence the ability of the server applications to continue to pass traffic without interruption during the VC interconnect firmware update process. Verify the following aspects of a redundant design before you update firmware in downtime-sensitive environments:

Configuration

Description

Stacking links

Configure stacking links between VC interconnects to ensure network reachability for any server blade to any network or uplink set within the logical interconnect, regardless of the server location. Stacking links play a major role in the ability of the remaining VC interconnects to keep traffic flowing while an individual interconnect is down during a firmware upgrade.
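
The reachability requirement can be checked on paper before an upgrade. The following is a minimal sketch, not an HPE tool: it assumes you describe the stacking links yourself as pairs of module identifiers (the bay numbers and the function names reachable and modules_cut_off are hypothetical), and it reports which running modules would lose every path to an uplink while a given module reboots.

```python
from collections import deque

def reachable(start, links, down=frozenset()):
    """Return the modules reachable from start over stacking links,
    never passing through a module listed in down."""
    if start in down:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for a, b in links:
            if current == a:
                peer = b
            elif current == b:
                peer = a
            else:
                continue
            if peer not in seen and peer not in down:
                seen.add(peer)
                queue.append(peer)
    return seen

def modules_cut_off(modules, links, uplink_modules, down):
    """List running modules that cannot reach any module with an uplink
    while the modules in down are rebooting."""
    uplinks = set(uplink_modules)
    cut_off = []
    for module in sorted(modules):
        if module in down:
            continue
        if not (reachable(module, links, down) & uplinks):
            cut_off.append(module)
    return cut_off

# Hypothetical layout: four modules, uplinks on modules 1 and 2.
full_mesh = [(1, 2), (3, 4), (1, 3), (2, 4)]
sparse = [(1, 2), (1, 3)]
print(modules_cut_off({1, 2, 3, 4}, full_mesh, {1, 2}, down={1}))
print(modules_cut_off({1, 2, 3}, sparse, {1, 2}, down={1}))
```

In the first layout every module keeps a path to an uplink while module 1 reboots. In the second, module 3 depends on module 1 alone and is reported as cut off.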

Firmware activation

Activate firmware manually or script the activation using the REST APIs to minimize network outage. In this case, the order of module activation determines whether network and storage connectivity is interrupted or preserved during a firmware update. Hewlett Packard Enterprise recommends alternating the activation of VC interconnect firmware.

  • If the server network and storage connectivity are redundant across horizontally adjacent VC interconnects, then alternating the activation between the left and right (odd and even) side modules can minimize disruptions of network and storage connectivity.

  • If the server network and storage connectivity are redundant across vertically adjacent VC interconnects, alternate the activation order so that a server never loses connectivity to both adapter ports at the same time. This minimizes disruptions of network and storage connectivity.
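
Both cases above reduce to the same rule: two modules that back each other up must never be activated together. The following is a minimal planning sketch under that assumption. It makes no REST calls. The function name plan_activation_waves and the bay numbers are hypothetical, and you supply the redundant pairs from your own design.

```python
from collections import deque

def plan_activation_waves(redundant_pairs):
    """Split interconnect bays into two activation waves so that the two
    bays of every redundant pair land in different waves.

    Raises ValueError if the pairs cannot be separated into two waves.
    """
    neighbors = {}
    for a, b in redundant_pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    wave_of = {}
    for start in sorted(neighbors):
        if start in wave_of:
            continue
        wave_of[start] = 0
        queue = deque([start])
        while queue:
            bay = queue.popleft()
            for peer in sorted(neighbors[bay]):
                if peer not in wave_of:
                    # Put the redundant partner in the opposite wave.
                    wave_of[peer] = 1 - wave_of[bay]
                    queue.append(peer)
                elif wave_of[peer] == wave_of[bay]:
                    raise ValueError(
                        f"bays {bay} and {peer} are redundant partners "
                        "but would be activated together"
                    )

    waves = [[], []]
    for bay in sorted(wave_of):
        waves[wave_of[bay]].append(bay)
    return waves

# Horizontally adjacent redundancy (odd and even bays paired).
print(plan_activation_waves([(1, 2), (3, 4), (5, 6)]))
# Vertically adjacent redundancy (hypothetical pairing of bays 1-3 and 2-4).
print(plan_activation_waves([(1, 3), (2, 4)]))
```

Activate the first wave, wait until those modules are back in service and passing traffic, and only then activate the second wave.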

A and B side connectivity

Create Ethernet and Fibre Channel networks with both A and B side connectivity to allow either all uplinks in the uplink set to be in an active state at all times or to provide for a controlled failover.

Redundant network connections

Configure NIC teaming and vSwitch settings to ensure redundancy of the network connectivity, fast network path failure detection, and timely failover to a redundant path, if available.

The following operating system settings allow faster link failure detection and failover initialization:

  • Under normal operating conditions, the Smart Link setting alters the individual NIC port state in the vSwitch, vDS (vNetwork Distributed Switch), or teaming software by turning off the corresponding server NIC port. This causes the operating system to detect a failure and direct traffic to an alternate path. For Smart Link to operate as designed, valid DCC (Device Control Channel)-compatible NIC firmware and drivers must be installed on the server blade. However, when VC interconnects are reset for activation during the firmware update process, Smart Link and the DCC protocol cannot send a message to the NIC indicating that the link went down, because the interconnect management processor is being restarted. During a firmware update, the NIC and host OS must therefore detect link failures themselves. To allow this, set the vSwitch or vDS network failover detection option to Link Status Only in the VMware ESXi Server network configuration.

  • In vSphere environments, Hewlett Packard Enterprise recommends either turning off the high availability (HA) mode or increasing the vSphere HA timeout from the default of 13 seconds to 30-60 seconds. When one of these options is configured, all guest operating systems will be able to survive the upgrade and the expected network outage caused by stacking link reconvergence and optimal network path recalculation.

  • For environments where changing network failover detection options or HA settings is not possible, use the Stage firmware for later activation option of the firmware update. VC interconnects will be updated but not activated. You can then manually activate the firmware by rebooting VC modules with the Onboard Administrator (OA) or by navigating in the UI to Logical Interconnects > Actions > Update firmware and selecting Activate firmware.

  • The Spanning Tree Edge Port feature of some switches allows a switch port to bypass the ‘listening’ and ‘learning’ stages of spanning tree and quickly transition to the ‘forwarding’ stage. By enabling this feature, edge devices immediately begin communication on the network instead of having to wait on Spanning Tree to determine if it needs to block the port to prevent a loop – a process that can take over 30 seconds with default Spanning Tree timers. Since VC interconnects are edge devices, this feature allows server NICs to begin immediate communication on the network rather than waiting for the additional 30 seconds for the spanning tree algorithm to recalculate.
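
The 30-second figure in the last item follows from the classic spanning tree timers: a port spends one forward-delay interval in the listening stage and another in the learning stage. The following sketch works through that arithmetic. It assumes the classic IEEE 802.1D default forward delay of 15 seconds, treats an edge port as forwarding immediately (an idealization), and uses a hypothetical function name. Check the timers configured on your own switches.

```python
# Assumption: classic IEEE 802.1D default forward delay, in seconds.
DEFAULT_FORWARD_DELAY_S = 15

def seconds_until_forwarding(edge_port, forward_delay_s=DEFAULT_FORWARD_DELAY_S):
    """Estimate how long a switch port waits after link-up before it
    forwards traffic.

    An edge port skips the listening and learning stages; any other port
    spends one forward-delay interval in each stage.
    """
    if edge_port:
        return 0
    listening = forward_delay_s
    learning = forward_delay_s
    return listening + learning

print(seconds_until_forwarding(edge_port=False))
print(seconds_until_forwarding(edge_port=True))
```

With the default timer, a non-edge port waits 30 seconds after a VC interconnect comes back up, while an edge port forwards at once.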

More information