Troubleshooting virtual machine instances

TIP: For additional troubleshooting information, enable console access on the Foundation base appliance using the CLI, and then review the following log on that appliance. To enable access, see the Enable console access recommendation in Basic troubleshooting techniques.

/var/log/nova/scheduler.log

You can also access the following logs on the Proxy appliance:

/var/log/nova/compute.log
/var/log/sdn/isc-neutron-agent.log
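
As a rough aid for reviewing these logs, the following Python sketch prints the most recent lines that contain a chosen keyword. It assumes a Python 3 interpreter is available wherever the log files can be read, which this topic does not state. The keyword list and the function name `find_matching_lines` are illustrative assumptions, not part of CloudSystem. Paths that do not exist on the current appliance are skipped.

```python
from pathlib import Path

# Log paths come from the lists above. scheduler.log is on the Foundation
# base appliance; the other two are on the Proxy appliance.
LOG_PATHS = [
    "/var/log/nova/scheduler.log",
    "/var/log/nova/compute.log",
    "/var/log/sdn/isc-neutron-agent.log",
]


def find_matching_lines(path, keywords=("ERROR", "WARNING"), limit=20):
    """Return up to `limit` of the last lines in `path` containing a keyword.

    The keywords are an assumption about what is worth looking for; adjust
    them to the message you are investigating.
    """
    log = Path(path)
    if not log.is_file():
        # The file is not present on this appliance; nothing to report.
        return []
    with log.open(errors="replace") as handle:
        matches = [
            line.rstrip("\n")
            for line in handle
            if any(word in line for word in keywords)
        ]
    return matches[-limit:]


if __name__ == "__main__":
    for path in LOG_PATHS:
        for line in find_matching_lines(path):
            print(f"{path}: {line}")
```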

Delete instance action only partially completes when compute node is unresponsive

Symptom Possible cause and recommendation

You see Warning: This instance is on a host that is not responding

CloudSystem has lost communication with the server that is hosting the virtual machine instance

Deleting an instance triggers an action on both the CloudSystem Foundation base appliance and the compute node. If the compute node is unresponsive, deleting an instance is only partially completed and requires manual cleanup.

Click Cancel to return to the previous screen without deleting the instance.
Try to restore communication by rebooting the host shown in the “Hosted on” field of the instance. If the host recovers, the instance can be cleanly deleted using the “Delete” action on the Instances screen. A clean delete removes the instance from both the Foundation appliance database and the host.

If the host is in an unrecoverable state (on the Compute Nodes screen, the state of the host is “Error”), return to the Instances screen. Select the instance, select the “Delete” action, and click Yes, delete when the warning message is displayed.


IMPORTANT: This performs a partial delete, which removes the instance from the Foundation appliance database but does not remove the instance from the compute node. After a partial delete, you must clean up the environment as follows:

Manually delete all instances on the compute node.
From the Compute Nodes screen, select the compute node and deactivate it. See Deactivate a compute node.
Check the size of the log files in the `/var/log/nova` directory and delete them if they are consuming disk space needed for the next activation or for other use.
Remove the server blade from the cloud.
Reinstall the operating system and reactivate the host, if desired.

See Activate a compute node
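
The log-size check in the cleanup steps above can be sketched in Python as follows. This is a minimal example that assumes Python 3 is available on the compute node and that the account running it can read `/var/log/nova`. The function name `log_file_sizes` is hypothetical. The script only reports sizes; it does not delete anything.

```python
import os


def log_file_sizes(directory):
    """Return (size_in_bytes, path) pairs for files under `directory`,
    largest first. Symbolic links and unreadable files are skipped."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # The file disappeared or cannot be read; ignore it.
                continue
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    entries = log_file_sizes("/var/log/nova")
    total = sum(size for size, _path in entries)
    print(f"total: {total / 1024 ** 2:.1f} MiB")
    for size, path in entries[:10]:
        print(f"{size / 1024 ** 2:8.1f} MiB  {path}")
```

Reviewing the largest files first makes it easier to decide which logs to remove before the next activation.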

Launch of first instance provisioned from ESX does not complete


Symptom

Possible cause and recommendation

The first attempt to launch an instance provisioned from ESX does not complete

Virtual machine is created on the hypervisor but provisioning fails due to vSwitch configuration issue

Log in to vCenter Server.
Select the compute hypervisor and click the Configuration tab.
Click Networking in the left menu.
Make sure the standard or distributed vSwitch has a unique name in vCenter Server. You cannot have two vSwitches in vCenter Server with the same name.

Insufficient resources available on the hypervisor

Log in to vCenter Server.
Check the Tasks and Events log for the hypervisor.
Add additional resources, if needed.

Datastore does not have enough space

Log in to vCenter Server.
Check the available space on the datastore supporting the compute hypervisor.
Add additional space, if needed.

Datacenter, hypervisor or vSwitch names have white space

Log in to vCenter Server.
Check the names of the datacenter, hypervisor and vSwitch.
If the name has white space, update the name to remove the white space.
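
The two naming rules above (vSwitch names must be unique, and datacenter, hypervisor, and vSwitch names must not contain white space) can be checked with a short Python sketch. It assumes you have already copied the names from vCenter Server into a list by hand; it does not query vCenter Server. The function names are hypothetical, and the duplicate check compares names exactly, which is an assumption about how names are matched.

```python
from collections import Counter


def names_with_whitespace(names):
    """Return the names that contain a space, tab, or other white space."""
    return [name for name in names if any(ch.isspace() for ch in name)]


def duplicate_names(names):
    """Return each name that appears more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Example values only; replace them with the names shown in vCenter Server.
    vswitches = ["vSwitch0", "cloud switch", "vSwitch0"]
    print("rename (white space):", names_with_whitespace(vswitches))
    print("rename (duplicate):", duplicate_names(vswitches))
```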

Deployed instance does not boot

Symptom Possible cause and recommendation

You see Unable to create instance. No available host can provide the specified resources

No compute nodes in active state

Navigate to the Compute Nodes screen and ensure that at least one compute node is in the Active state.

See Activate a compute node.

Insufficient resources available on active compute nodes

Ensure that you have sufficient cloud resources on the compute node. On the Compute Nodes screen, verify the number of hosted VMs, CPU, memory, and storage usage and compare those values to the resources that will be allocated to the instance. If sufficient resources are not available:

Add compute resources to the compute node.
Free space on the compute node by deleting existing cloud instances.
Verify the physical to virtual oversubscription rates.

See Calculating the number of instances that can be provisioned to a compute node

NOTE: The Compute Nodes screen displays the percent of resources in use and the total amount of resources. However, the actual available resources of a compute node are calculated by subtracting allocated resources (the virtual machine instances already provisioned to a host) from the capacity of the compute node.

Count every hosted virtual machine instance when evaluating whether resources on a particular compute node are available, even if an instance is powered down or is not consuming all of its allocated resources. If one or more instances on a host are powered down, the compute node appears to have a high percentage of free resources, but those resources are already allocated to the powered-down instances.
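
The calculation described in the note above, available resources equal capacity minus allocated resources, can be illustrated with a small Python sketch. The `Resources` type, its field names, and the example figures are assumptions made for illustration. Oversubscription is not modeled here; treat `capacity` as whatever figure the referenced topic yields for the compute node.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    vcpus: int
    memory_mb: int
    disk_gb: int


def available(capacity, instances):
    """Subtract the allocation of every hosted instance from the capacity.

    Powered-down instances are included, because their resources remain
    allocated even though they are not being consumed.
    """
    return Resources(
        vcpus=capacity.vcpus - sum(i.vcpus for i in instances),
        memory_mb=capacity.memory_mb - sum(i.memory_mb for i in instances),
        disk_gb=capacity.disk_gb - sum(i.disk_gb for i in instances),
    )


def fits(request, capacity, instances):
    """Return True if `request` fits within the remaining resources."""
    free = available(capacity, instances)
    return (
        request.vcpus <= free.vcpus
        and request.memory_mb <= free.memory_mb
        and request.disk_gb <= free.disk_gb
    )


if __name__ == "__main__":
    node = Resources(vcpus=16, memory_mb=65536, disk_gb=500)
    hosted = [
        Resources(vcpus=4, memory_mb=16384, disk_gb=100),  # running
        Resources(vcpus=4, memory_mb=16384, disk_gb=100),  # powered down
    ]
    print(available(node, hosted))
    print(fits(Resources(vcpus=8, memory_mb=32768, disk_gb=300), node, hosted))
```

In this example the powered-down instance still reduces the free vCPUs from 12 to 8, which is why usage percentages alone can overstate what a compute node can accept.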

Create image action is unsuccessful

Symptom Possible cause and recommendation

You see an indication that the compute node capacity has been exceeded

Oversubscription rates might be incorrect for the load on the virtual and physical servers

Verify the physical to virtual oversubscription rates.

See Calculating the number of instances that can be provisioned to a compute node.

Start instance action is unsuccessful

Symptom Possible cause and recommendation

You cannot start the instance

The instance might be in Shutoff state

Click Actions→Reboot.

See Reboot instance.

See also

Troubleshooting
