Troubleshooting compute nodes

Compute nodes do not appear on overview screen

Symptom Possible cause and recommendation

No KVM compute nodes are visible on the Compute Nodes overview screen

No KVM compute nodes are configured for use

  1. Make sure that all prerequisites have been met for each target compute node.

    See also Activate a compute node.

The KVM compute node network configuration is incorrect

  1. Run ifconfig to check the network configuration on the compute node. The DHCP server on the CloudSystem Console appliance must recognize the compute node.

  2. If the compute node is connected to the appliance (eth1 has a valid DHCP IP address), edit the file /etc/sysconfig/network on the compute node so that it contains the following entries:

    NETWORKING=yes
    HOSTNAME=<change-name>
    DHCP_HOSTNAME=<NON-FQDN>
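
As a rough illustration of what the entries above are expected to look like, the following Python sketch parses KEY=value text and flags the most likely mistakes. The helper names `parse_sysconfig` and `check_network_entries` and the sample host name are hypothetical and are not part of CloudSystem. The rule that a non-FQDN name contains no dots is an assumption made for this sketch.

```python
def parse_sysconfig(text):
    """Parse KEY=value lines, skipping blank lines and comments."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip().strip('"').strip("'")
    return entries


def check_network_entries(entries):
    """Return a list of problems found in the parsed entries."""
    problems = []
    if entries.get("NETWORKING", "").lower() != "yes":
        problems.append("NETWORKING is not set to yes")
    if not entries.get("HOSTNAME"):
        problems.append("HOSTNAME is missing or empty")
    dhcp_name = entries.get("DHCP_HOSTNAME", "")
    if not dhcp_name:
        problems.append("DHCP_HOSTNAME is missing or empty")
    elif "." in dhcp_name:
        # Assumption: a short (non-FQDN) host name contains no dots.
        problems.append("DHCP_HOSTNAME looks like an FQDN; use the short name")
    return problems


# Made-up sample content; on a real node, read the file named in step 2.
sample = """\
NETWORKING=yes
HOSTNAME=kvm01.example.com
DHCP_HOSTNAME=kvm01
"""
print(check_network_entries(parse_sysconfig(sample)))
```

An empty list means that none of the checked problems were found. It does not confirm that the DHCP server on the appliance recognizes the node.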
    

Local yum repositories are causing issues

  1. Remove any locally defined yum repositories that may be interfering with activation.

  2. When reactivating a compute node, make sure any leftover active yum repositories are removed.
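
To see which repositories are still active before removing anything, a sketch along the following lines may help. It assumes that repository definitions live in `*.repo` files under /etc/yum.repos.d (the usual yum default, not stated in this guide) and that a repository counts as enabled unless its `enabled` option is set to 0, false, or no. The function name `enabled_repos` is hypothetical.

```python
import configparser
import glob
import os


def enabled_repos(repo_dir="/etc/yum.repos.d"):
    """Return (file, repo_id) pairs for every repository that looks enabled."""
    found = []
    for path in sorted(glob.glob(os.path.join(repo_dir, "*.repo"))):
        # interpolation=None keeps '%' characters in URLs from raising errors.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        try:
            parser.read(path)
        except configparser.Error:
            continue  # skip files this simple parser cannot read
        for repo_id in parser.sections():
            value = parser.get(repo_id, "enabled", fallback="1").strip().lower()
            # Assumption: a missing 'enabled' option means the repo is enabled.
            if value not in ("0", "false", "no"):
                found.append((path, repo_id))
    return found


if __name__ == "__main__":
    for path, repo_id in enabled_repos():
        print(f"{repo_id}\t{path}")
```

The script only lists repositories. Deciding which ones are interfering with activation, and removing them, remains a manual step.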

The CloudSystem Console is displaying cached data

  1. Refresh the Compute Nodes screen.


TIP: Check the following logs for additional information:

  • Foundation base appliance: /var/log/isc/activity <yourhostname>.log

  • KVM compute node: /var/log/secure




Import cluster action does not complete


TIP: For additional troubleshooting information, enable console access on your Foundation base appliance using the CLI, and then review the following files. To enable access, see Basic troubleshooting techniques and read the Enable console access recommendation.

  • /etc/pavmms/deployer.conf

  • ci/logs/ciDebug.01.log

  • ci/logs/jetty-PulsarAVMManager/server.log


Symptom Possible cause and recommendation

Import cluster action cannot complete in CloudSystem Console

DHCP server is not defined on the Data Center Management Network

  1. Make sure the Data Center Management Network configured on the management hypervisor is set to use DHCP for IP address assignment.

  2. Log in to the CloudSystem Console and make sure the proxy appliance registered in Integrated Tools is set to receive IP addresses from DHCP.

    • From CloudSystem Console main menu select Integrated Tools.

    • Find the VMware vCenter Server panel on the screen.

    • Make sure the line IP addresses for proxy appliances is set to DHCP.

    • If DHCP is not set, click the link to open the Plan vCenter Access screen.

    • Select DHCP.

  3. Retry the Import cluster action.

Activate compute node action is unsuccessful


TIP: For additional troubleshooting information, enable console access on your Foundation base appliance using the CLI and then find the following logs. To enable access, see Basic troubleshooting techniques and read the Enable console access recommendation.


Symptom Possible cause and recommendation

You see an error on the Activity screen when you try to activate a cluster or compute node

Prerequisites for activation have not been met

  1. Make sure that all prerequisites have been met.

    See also Activate a compute node.

The cluster or compute node might be rebooting

  1. Check the status on the Compute Nodes screen.

  2. Wait for the reboot to complete. When the reboot has finished, the status icon turns green.

The user name and password for the operating system on the KVM compute node might be incorrect

  1. Make sure that the user name and password are entered correctly in the Activate dialog when activating the KVM compute node.

Operating system installation on the KVM compute node may be incorrect

  1. Make sure that the activation base image was installed correctly using the Red Hat Package Manager (RPM).

Retry the activate action.

KVM compute node activation fails or hangs and no error message is displayed

A dependency on the KVM compute node was not met

  1. Log on to the KVM compute node and pull a support dump. The detailed activation log is in /var/log/isc/activation<hostname>.log

  2. Scroll through the log and find the start of the rollback procedure. The error details appear just above that point.
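
Instead of scrolling manually, the search in step 2 can be sketched as a small Python helper that prints the lines just before the first line mentioning the rollback. The exact wording the activation log uses for the start of the rollback is not given here, so the marker string `"rollback"`, the function name `lines_before_marker`, and the sample log lines are all assumptions for illustration only.

```python
def lines_before_marker(lines, marker, context=20):
    """Return up to `context` lines preceding the first line that contains marker.

    The match is case-insensitive. Returns an empty list if the marker
    never appears.
    """
    needle = marker.lower()
    for index, line in enumerate(lines):
        if needle in line.lower():
            return lines[max(0, index - context):index]
    return []


# Made-up log lines; a real run would read the activation log from step 1.
sample_log = [
    "INFO installing packages",
    "ERROR dependency check failed",
    "INFO starting rollback",
    "INFO rollback complete",
]
for line in lines_before_marker(sample_log, "rollback", context=2):
    print(line)
```

If the helper returns nothing, try a different marker string, because the log may describe the rollback with other wording.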

Deactivate compute node action is unsuccessful

Symptom Possible cause and recommendation

You see an error on the Activity screen when you try to deactivate a compute node

The managed compute node has not been activated

  1. Ensure that the compute node is activated. An active compute node displays a green icon.

One or more virtual machines are running on the compute node

  1. Make sure there are no virtual machine instances running on the compute node. If an instance is running on the compute node, you must delete the instance before you can deactivate the compute node.

    To delete an instance, select Instances from the main menu, then select Actions > Delete.

  2. Retry the deactivate action. See Deactivate a compute node.

The compute node fails when you try to deactivate it

Deactivation log files might have exhausted the compute node disk volume space

  1. Bring the compute node back up.

  2. Make sure that the log files are written to a physical volume other than the boot disk. You should have assigned a log location prior to activating the host.

  3. Check the size of the log files in the /var/log/nova directory. Delete the files if they are consuming too much space on the disk.

  4. When sufficient log file space is available, try again to deactivate the compute node. See Deactivate a compute node.
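
The size check in step 3 can be sketched in Python as follows. The function name `directory_usage` is hypothetical, and the script only reports sizes. It does not delete anything, so deciding which files are safe to remove stays with the administrator.

```python
import os


def directory_usage(log_dir):
    """Return (total_bytes, [(size, path), ...]) with the largest files first."""
    sizes = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    sizes.sort(reverse=True)
    return sum(size for size, _ in sizes), sizes


if __name__ == "__main__":
    total, sizes = directory_usage("/var/log/nova")
    print(f"total: {total / 1024**2:.1f} MiB")
    for size, path in sizes[:10]:
        print(f"{size / 1024**2:10.1f} MiB  {path}")
```

A missing directory simply yields a total of 0, so an empty report can also mean that the path does not exist on that node.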

Delete compute node action is unsuccessful

Symptom Possible cause and recommendation

You see an error on the Activity screen when you try to delete a compute node

The compute node has not been deactivated

  1. Ensure that the compute node is deactivated. A deactivated compute node displays a red icon. See Deactivate a compute node.

  2. Retry the delete action. See Delete a compute node.

See also