Troubleshooting Your System

Dell™ PowerEdge™ M905, M805, M600, and M605 Hardware Owner's Manual

  Safety First—For You and Your System

  Start-Up Routine

  Checking the Equipment

  Troubleshooting External Connections

  Responding to a Systems Management Alert Message

  Troubleshooting a Wet Enclosure

  Troubleshooting a Damaged Enclosure

  Troubleshooting Enclosure Components

  Troubleshooting Blade Components



Safety First—For You and Your System

To perform certain procedures in this document, you must remove the system cover and work inside the system. While working inside the system, do not attempt to service the system except as explained in this guide and elsewhere in your system documentation.

CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.

Start-Up Routine

Look and listen during the system's start-up routine for the indications described in Table 5-1.

Table 5-1. Start-Up Routine Indicators

Look/listen for:                                      Action
----------------------------------------------------  ---------------------------------------------------
An error message displayed on the monitor             See System Messages.
Alert messages from the systems management software   See the systems management software documentation.
The monitor's power indicator                         See Troubleshooting Video.
The keyboard indicators                               See Troubleshooting the Keyboard.
The USB diskette drive activity indicator             See Troubleshooting USB Devices.
The USB optical drive activity indicator              See Troubleshooting USB Devices.
The hard-drive activity indicator                     See Troubleshooting Hard Drives.


Checking the Equipment

This section provides troubleshooting procedures for external devices attached to the system, such as the monitor, keyboard, or mouse. Before you perform any of the procedures, see Troubleshooting External Connections.


Troubleshooting External Connections

Loose or improperly connected cables are the most likely source of problems for the system, monitor, and other peripherals (such as a keyboard, mouse, or other external device). Ensure that all external cables are securely attached to the external connectors on your system. See Figure 1-6 for the front-panel connectors on your system and Figure 1-9 for the back-panel connectors.

Troubleshooting Video

Problem
  • Loss of video, or poor video quality

Possible Cause
  • Faulty monitor or monitor cable

  • Video port disabled

  • Faulty iKVM module

  • Blade connection to midplane

Action
  1. Check the connection to the iKVM module.

Try swapping cables if another monitor cable is available.

  2. Verify that the iKVM firmware revision is current.

  3. Check the monitor connection to either the front-panel connector on the blade or the back-panel iKVM module.

  4. Ensure that the port is not disabled by the CMC or by redirection to another port.

  5. If two or more blades are installed in the enclosure, select a different blade.

If the monitor is connected to the back-panel iKVM module and works with another blade, the first blade may need to be reseated. See Removing and Installing a Blade. If reseating the blade does not help, the blade may be faulty. See Getting Help.

  6. Swap the monitor with a known-working monitor.

If the monitor does not work when connected to the blade front-panel connector, the blade may be faulty. See Getting Help.

If the monitor does not work when connected to the iKVM module, the iKVM module may be faulty. See Getting Help.

Troubleshooting the Keyboard

Problem
  • No keyboard input

Possible Cause
  • Faulty keyboard or keyboard cable

  • iKVM module

  • Blade connection to midplane

  • Faulty SIP (KVM dongle, used with an external KVM)

Action
  1. Ensure that the blade(s) is turned on.

  2. Verify that the iKVM firmware revision is current.

  3. Check the keyboard connection to either the front-panel connector on the blade or to the back-panel iKVM module.

  4. If the keyboard is connected to an external KVM using a SIP, check that the SIP is compatible with the KVM.

  5. If two or more blades are installed in the enclosure, select a different blade.

If the keyboard is connected to the back-panel iKVM module and works with another blade, the first blade may need to be reseated. See Removing and Installing a Blade. If reseating the blade does not help, the blade may be faulty. See Getting Help.

  6. Swap the keyboard with a known-working keyboard and repeat step 3 and step 5. If the keyboard does not work with any blade, see Getting Help.

Troubleshooting the Mouse

Problem
  • No mouse input

Possible Cause
  • Mouse or mouse cable

  • Blade

  • SIP (KVM dongle, used with an external KVM)

Action
  1. Ensure that the blade(s) is turned on.

  2. Verify that the iKVM firmware revision is current.

  3. Check the mouse connection to either the front-panel connector on the blade or to the back-panel iKVM module.

  4. If the mouse is connected to an external KVM using a SIP, check that the SIP is compatible with the KVM.

  5. If two or more blades are installed in the enclosure, select a different blade.

If the mouse is connected to the back-panel iKVM module and works with another blade, the first blade may need to be reseated. See Removing and Installing a Blade. If reseating the blade does not help, the blade may be faulty. See Getting Help.

  6. Swap the mouse with a known-working mouse and repeat step 3 and step 5. If the mouse does not work with any blade, see Getting Help.

Troubleshooting USB Devices

NOTE: USB devices can be connected only to the blade front panel. Total length of a USB cable should not exceed 3 m (9.8 ft).
Problem
  • USB device or USB device cable

  • Multiple devices connected directly to blade (powered USB hub not used)

  • Blade

Action
  1. Ensure that the blade(s) is turned on.

  2. Check the USB device connection to the blade.

  3. Swap the USB device with a known-working USB device.

  4. Connect the USB devices to the blade using a powered USB hub.

  5. If another blade is installed, connect the USB device to that blade. If the USB device works with a different blade, the first blade may be faulty. See Getting Help.


Responding to a Systems Management Alert Message

The CMC management applications monitor critical system voltages and temperatures, and the cooling fans in the system. For information about the CMC alert messages, see the Configuration Guide.


Troubleshooting a Wet Enclosure

Problem
  • Liquid spills

  • Splashes

  • Excessive humidity

Action
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
  1. Turn off the system.

  2. Disconnect the power supplies from the PDU.

CAUTION: Wait until all of the indicators on the power supplies turn off before proceeding.
  3. Remove all the blades. See Removing a Blade.

  4. Remove the CMC module. See Removing a CMC Module.

  5. Remove the iKVM module. See Removing an iKVM Module.

  6. Remove all I/O modules installed in the system. See Removing an I/O Module.

  7. Remove all the fan modules. See Removing a Fan Module.

  8. Remove all the power supply modules. See Removing a Power Supply Module.

  9. Let the system dry thoroughly for at least 24 hours.

  10. Install all the power supply modules. See Installing a Power Supply Module.

  11. Install all the fan modules. See Installing a Fan Module.

  12. Install all the I/O modules. See Installing an I/O Module.

  13. Install the CMC module(s). See Installing a CMC Module.

  14. Install the iKVM module. See Installing an iKVM Module.

  15. Install all the blades. See Installing a Blade.

  16. Reconnect the power supply modules to their PDU and start up the system.

If the system does not start up properly, see Getting Help.

  17. Run the Server Administrator diagnostics to confirm that the system is working properly (see Running System Diagnostics).

If the tests fail, see Getting Help.


Troubleshooting a Damaged Enclosure

Problem
  • System was dropped or damaged

Action
  1. Ensure that the following components are properly installed and connected:

    • CMC module

    • iKVM module

    • I/O modules

    • Power supply modules

    • Fan modules

    • Blades

  2. Ensure that all cables are properly connected.

  3. Ensure that all components are properly installed and free from damage.

  4. Run the online diagnostics. See Running System Diagnostics.

If the tests fail, see Getting Help.


Troubleshooting Enclosure Components

The procedures in this section describe how to troubleshoot these components:

  • Power supply modules

  • Fan modules

  • CMC module

  • iKVM module

  • Network switch module

Troubleshooting Power Supply Modules

Problem
  • A power supply module is not operating properly

Action
NOTICE: The power-supply modules are hot-pluggable. Remove and replace only one power-supply module at a time in a system that is turned on. Leave a failed power-supply module installed in the enclosure until you are ready to replace it. Operating the system with a power-supply module removed for extended periods of time can cause the system to overheat.
NOTE: The 2360-W power supply modules require a 200–240 V power source to operate. If they are plugged into 110-V electrical outlets, the power supply modules do not power up.
  1. Locate the faulty power supply module and check the indicators. See Figure 1-11. The power supply's AC indicator is green if AC power is available. The power supply's fault indicator is amber if the power supply is faulty. If no indicators are lit, ensure that 208 V AC power is available from the PDU and that the power cable is properly connected to the power supply module.

  2. Install a new power supply. See Installing a Power Supply Module.

NOTE: After installing a new power supply, allow several seconds for the system to recognize the power supply and determine whether it is working properly. The power supply DC power indicator turns green if the power supply is functioning properly. See Figure 1-11.
  3. If none of the power supplies show a fault LED and the blades will not power on, check the LCD display or CMC for status messages.

  4. If the problem is not resolved, see Getting Help for information about obtaining technical assistance.
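
The indicator checks in this procedure can be summarized as a small decision helper. The following Python sketch is illustrative only: the function names and returned strings are hypothetical, and it encodes just the indicator meanings and the 200–240 V requirement stated above. It does not read any hardware.

```python
# Hypothetical helper names; thresholds and indicator meanings come from
# the power supply procedure above, not from any Dell software interface.
MIN_INPUT_VOLTS = 200
MAX_INPUT_VOLTS = 240


def input_voltage_ok(volts: float) -> bool:
    """Return True if the source voltage is within the 200-240 V range
    that the 2360-W power supply modules require."""
    return MIN_INPUT_VOLTS <= volts <= MAX_INPUT_VOLTS


def power_supply_action(ac_green: bool, dc_green: bool, fault_amber: bool) -> str:
    """Map observed indicator states to the next step in the procedure."""
    if fault_amber:
        # An amber fault indicator means the module is faulty.
        return "Install a new power supply module."
    if not ac_green and not dc_green:
        # No indicators lit: check the AC source and the cable first.
        return "Check AC power from the PDU and the power cable connection."
    if ac_green and dc_green:
        return ("No fault shown; if the blades will not power on, "
                "check the LCD or CMC for status messages.")
    # Any other combination is not described in the procedure.
    return "See Getting Help."
```

For example, `input_voltage_ok(110)` returns False, which matches the note that modules plugged into 110-V outlets do not power up.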

Troubleshooting Fan Modules

Problem
  • A fan is not operating properly

Action
NOTICE: The fan modules are hot-pluggable. Remove and replace only one fan module at a time in a system that is turned on. Operating the system without all six fan modules for extended periods of time can cause the system to overheat.
  1. Locate the faulty fan.

Each fan module has indicators that identify a faulty fan. See Figure 1-12.

  2. Remove the fan module. See Removing a Fan Module.

  3. Examine the fan blades for debris. If debris is present, carefully remove it.

  4. Reseat the faulty fan. See Installing a Fan Module.

  5. If none of the fans show a fault LED and the blades will not power on, check the LCD display or CMC for status messages.

  6. If the problem is not resolved, install a new fan.

  7. If the new fan does not operate, see Getting Help.

Troubleshooting the CMC Module

Problem
  • CMC module is not operating properly

  • System message indicates a problem with the CMC module

  • CMC module does not fail over or fail back

  • CMC module cable connections

Action
NOTE: To eliminate the possibility of a hardware problem with the module or its attaching devices, first ensure that the module is properly initialized and configured. See the Configuration Guide and the documentation that came with the module before performing the following procedure.
  1. Verify that the latest firmware is installed on the CMC module.

See support.dell.com for the latest firmware and refer to the release notes for firmware compatibility and update information.

  2. Verify that the CMC(s) have valid IP addresses for the subnet. Verify using the ICMP ping command.

  3. Reseat the CMC module and see if the CMC module fault indicator turns off. See CMC Module. See Figure 1-14 for more information about the module's indicators.

  4. If another CMC module is available, swap the two modules.

  5. If the fault indicator is off, but the serial device connected to the serial port is not properly operating, go to step 6. If the fault indicator is off, but the network management device connected to the network interface connector port is not properly operating, go to step 9.

  6. Reseat the serial cable to the serial connector on the CMC module and to the serial device communicating with it.

  7. Connect a known-working null-modem serial cable between the CMC module and the serial device.

  8. Connect a known-working serial device to the CMC module.

If the serial device and CMC module still do not communicate with each other, see Getting Help.

  9. Reseat the network cable to the network connector on the CMC module and to the network device.

  10. Connect a known-working network cable between the CMC module and the network device.

NOTE: If the CMC is connected to another CMC in an adjacent enclosure and there is no failover, check the network cable connected to port Gb2. If there is no external management connection to the CMC, check the cable connected to port Gb1. See Figure 1-14.
  11. Connect a known-working network device to the CMC module.

If the network device and CMC module still do not communicate with each other, see Getting Help.
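
Before pinging a CMC, it can help to confirm that its configured address actually belongs to the management subnet. The following Python sketch uses the standard `ipaddress` module to do that. The function name and the example addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration, not values from this manual.

```python
import ipaddress


def is_valid_host_on_subnet(address: str, subnet: str) -> bool:
    """Return True if address is a usable host address inside subnet.

    Hypothetical helper: subnet is written in CIDR form, for example
    "192.0.2.0/24". Very small networks (/31, /32) are not handled
    specially in this sketch.
    """
    try:
        ip = ipaddress.ip_address(address)
        net = ipaddress.ip_network(subnet, strict=False)
    except ValueError:
        # Malformed address or subnet string.
        return False
    if ip.version != net.version:
        return False
    # The network and broadcast addresses cannot be assigned to a host.
    if ip == net.network_address or ip == net.broadcast_address:
        return False
    return ip in net


# Example with placeholder values:
# is_valid_host_on_subnet("192.0.2.120", "192.0.2.0/24") returns True
# is_valid_host_on_subnet("192.0.3.120", "192.0.2.0/24") returns False
```

An address that passes this check can still be unreachable, so the ping test in step 2 remains necessary.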

Troubleshooting the iKVM Module

Problem:

When using iDRAC video/console redirection, you cannot see video through the iKVM when you switch to a blade running Linux.

Likely Cause and Solution:

A monitor or KVM appliance with a lower resolution has recently been added.

Example:

A blade running X Windows under Linux is inserted and powered on. A user connects to the blade in OS GUI mode through the iDRAC, and a video resolution is detected and hard-set for that session. A monitor or KVM appliance is then attached to the front or rear iKVM interface on the M1000e enclosure. The monitor or KVM appliance is configured with a resolution lower than the resolution currently configured in the X Windows session on the Linux blade.

When you select the Linux blade using the front or rear port on the iKVM, the iDRAC circuit adopts the lower resolution of the externally connected devices. Video is not displayed on the lower-resolution monitor or KVM appliance until X Windows is restarted (iDRAC video should still be viewable).

Solution:
  1. From the iDRAC session, exit and re-enter GUI mode. The lower resolution will be communicated and utilized.

  2. Set all monitors or KVM appliances connected to the M1000e enclosure to the same resolution as, or a higher resolution than, the one configured on the Linux blades in GUI mode.

  3. From the lower-resolution monitor (no video displayed), press <CTRL><ALT><F3> to change to the non-GUI login screen.

  4. Restart X Windows to detect and utilize the lower resolution.
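
The resolution condition described above can be checked ahead of time if you know the resolution of each attached display. The following Python sketch is a hypothetical illustration: the function name, display labels, and resolution values are assumptions, and nothing here queries the iKVM or iDRAC.

```python
from typing import Dict, List, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def displays_below_blade_resolution(
    blade: Resolution, displays: Dict[str, Resolution]
) -> List[str]:
    """Return the names of displays whose resolution is lower than the
    resolution configured in the blade's GUI session.

    A display counts as lower if either its width or its height is
    smaller than the blade's.
    """
    blade_width, blade_height = blade
    return sorted(
        name
        for name, (width, height) in displays.items()
        if width < blade_width or height < blade_height
    )


# Example with placeholder values:
attached = {"front-monitor": (1024, 768), "rear-kvm": (1280, 1024)}
low = displays_below_blade_resolution((1280, 1024), attached)
# low is ["front-monitor"], so that monitor would show no video
# until X Windows is restarted or its resolution is raised.
```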

Troubleshooting a Network Switch Module

Problem
  • System cannot communicate with the network

  • Network cable connections

  • Network switch module and hub configuration settings

Action
NOTE: To eliminate the possibility of a hardware problem with the module or its attaching devices, first ensure that the module is properly initialized and configured. See the Configuration Guide and the documentation that came with the module before performing the following procedure.
  1. Check that you have installed the module in an I/O slot that matches its fabric type. See Supported I/O Module Configurations.

  2. Check that the passthrough module or switch ports are cabled correctly.

A given mezzanine card in a full-height blade connects to two I/O ports on the two associated I/O modules. See I/O Module Port Assignments - Full-Height Blades.

  3. Verify that the correct firmware revision is installed and that the module is properly initialized and configured.

  4. Verify that the switch module has a valid IP address for the subnet. Verify using the ICMP ping command.

  5. Check the network connector indicators on the network switch module.

    • If the link indicator displays an error condition, check all cable connections.

See I/O Connectivity for the link indicator error conditions for your particular network switch module.

    • Try another connector on the external switch or hub.

    • If the activity indicator does not light, replace the network switch module. See I/O Modules.

  6. Using the switch management interface, verify the switch port properties. If the switch is configured correctly, back up the switch configuration and replace the switch. See the switch module documentation for details.

  7. If the blade requires a mezzanine card for a particular network switch module, ensure that the appropriate mezzanine card is installed. If so, reseat the mezzanine card. See I/O Module Mezzanine Cards.

If the network link indicator on the blade is green, then the blade has a valid link to the appropriate network switch module.

  8. Ensure that the appropriate operating system drivers are installed and that the protocol settings are configured to ensure proper communication.
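
The ICMP ping check in this procedure can be scripted from a management station. The following Python sketch wraps the operating system's `ping` command. The function names are hypothetical, the address in the example is a placeholder from the documentation range, and the sketch assumes a `ping` executable is on the path.

```python
import platform
import subprocess
from typing import List


def build_ping_command(host: str, count: int = 3) -> List[str]:
    """Build a ping command line for the current operating system.

    Windows uses -n for the echo request count; most other systems use -c.
    """
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host: str, count: int = 3, timeout_s: float = 15.0) -> bool:
    """Return True if the ping command exits successfully."""
    try:
        result = subprocess.run(
            build_ping_command(host, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping is missing, could not be started, or took too long.
        return False
    return result.returncode == 0


# Example with a placeholder address:
# if not is_reachable("192.0.2.10"):
#     print("Switch module did not answer; check cabling and IP settings.")
```

Exit-code behavior of `ping` varies somewhat between platforms, so treat the result as a hint and confirm with the link and activity indicators.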


Troubleshooting Blade Components

The procedures in this section describe how to troubleshoot the following blade components. See Figure 3-4 for the location of the components inside the blade.

  • Memory

  • Hard drives

  • Microprocessors

  • Blade system board

  • Battery

Troubleshooting Blade Memory

Problem
  • Faulty memory module

  • Faulty blade board

Action
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
NOTE: Before performing the following procedure, ensure that you have installed the memory modules according to the memory installation guidelines for the blade. See System Memory.
  1. Restart the blade.

    a. Press the power button once to turn off the blade.

    b. Press the power button again to apply power to the blade.

If no error messages appear, go to step 8.

  2. Enter the System Setup program and check the system memory setting. See Using the System Setup Program.

If the amount of memory installed matches the system memory setting, go to step 8.

  3. Remove the blade. See Removing a Blade.

  4. Open the blade. See Opening the Blade.

CAUTION: The memory modules are hot to the touch for some time after the blade has been powered down. Allow time for the memory modules to cool before handling them. Handle the memory modules by the card edges and avoid touching the components.
  5. Reseat the memory modules in their sockets. See Installing Memory Modules.

  6. Close the blade. See Closing the Blade.

  7. Install the blade. See Installing a Blade.

  8. Run the system memory test in the system diagnostics. See Running System Diagnostics.

If the test fails, see Getting Help.

Troubleshooting Hard Drives

Problem
  • Device driver error

  • Improperly seated hard drive carrier

  • Faulty hard drive or hard-drive carrier

  • Device drivers

Action
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
NOTICE: This troubleshooting procedure can destroy data stored on the hard drive. Before you proceed, back up all the files on the hard drive, if possible. Refer to the RAID controller documentation for rebuilding and servicing a RAID array.
  1. Run the appropriate controllers test and the hard drive tests in system diagnostics. See Running System Diagnostics.

If the tests fail, continue to step 3.

  2. Take the hard drive offline and wait until the hard-drive indicator codes on the drive carrier signal that the drive may be removed safely, then remove and reseat the drive carrier in the blade. See Hard Drives.

  3. Restart the blade, enter the System Setup program, and confirm that the drive controller is enabled. See Integrated Devices Screen.

  4. Ensure that any required device drivers are installed and are configured correctly.

NOTICE: Installing a hard drive into another bay will break the mirror if the mirror state is optimal.
  5. Remove the hard drive and install it in the other drive bay. See Hard Drives.

  6. If the problem is resolved, reinstall the hard drive in the original bay.

If the hard drive functions properly in the original bay, the drive carrier could have intermittent problems. Replace the drive carrier.

  7. If the hard drive is the boot drive, ensure that the drive is configured and connected properly. See Configuring the Boot Drive.

  8. Partition and logically format the hard drive.

  9. If possible, restore the files to the drive.

If the problem persists, see Getting Help.

Troubleshooting Microprocessors

Problem
  • System message indicates a problem with the microprocessor or HyperTransport (HT) bridge cards

  • Heat sink is not installed for the microprocessor

  • (PowerEdge M805 systems only) – Missing or incorrectly installed HT cards in sockets CPU3 and CPU4.

Action
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
  1. Remove the blade. See Removing a Blade.

  2. Open the blade. See Opening the Blade.

CAUTION: The processor and heat sink can become extremely hot. Be sure the processor has had sufficient time to cool before handling.
  3. Ensure that the microprocessor(s) and heat sink(s) are properly installed. See Processors.

  4. If your system only has one microprocessor installed, ensure that it is installed in the primary processor socket. See Figure 7-3 or Figure 7-4.

  5. For a PowerEdge M805 system, check that HyperTransport (HT) bridge cards are installed in sockets CPU3 and CPU4, and that both cards are fully seated in the processor sockets. See HT Bridge Card (Service Only).

  6. Close the blade. See Closing the Blade.

  7. Install the blade. See Installing a Blade.

  8. Run Quick Tests in the system diagnostics. See Running System Diagnostics.

If the tests fail or the problem persists, see Getting Help.

Troubleshooting the Blade Board

Problem
  • System message indicates a problem with the blade board.

Action
CAUTION: Only trained service technicians are authorized to remove the system cover and access any of the components inside the system. Before you begin this procedure, review the safety instructions that came with the system.
  1. Turn off the blade.

  2. Clear the blade NVRAM.

See Blade System Board Jumper Settings for the location of the NVRAM_CLR jumper.

  3. If there is still a problem with the blade, remove and reinstall the blade. See Installing a Blade.

  4. Turn on the blade.

  5. Run the system board test in the system diagnostics. See Running System Diagnostics.

If the tests fail, see Getting Help.

Troubleshooting the NVRAM Backup Battery

Problem
  • System message indicates a problem with the battery

  • System Setup program loses system configuration information

  • System date and time do not stay current

Each blade contains a battery which maintains the blade configuration, date, and time information in NVRAM when you turn off the blade. You may need to replace the battery if an incorrect time or date is displayed during the boot routine.

You can operate the blade without a battery; however, the blade configuration information maintained by the battery in NVRAM is erased each time you remove power from the blade. Therefore, you must re-enter the system configuration information and reset the options each time the blade boots until you replace the battery.

Action
CAUTION: Many repairs may only be done by a certified service technician. You should only perform troubleshooting and simple repairs as authorized in your product documentation, or as directed by the online or telephone service and support team. Damage due to servicing that is not authorized by Dell is not covered by your warranty. Read and follow the safety instructions that came with the product.
  1. Re-enter the time and date through the System Setup program. See Using the System Setup Program.

  2. Remove the blade for at least one hour. See Removing a Blade.

  3. Install the blade. See Installing a Blade.

  4. Enter the System Setup program.

If the date and time are not correct in the System Setup program, replace the battery. See Blade System Board NVRAM Backup Battery.

If the problem is not resolved by replacing the battery, see Getting Help.

NOTICE: If the blade is turned off for long periods of time (for weeks or months), the NVRAM may lose its system configuration information. This situation is caused by a defective battery.
NOTE: Some software may cause the blade's time to speed up or slow down. If the blade seems to operate normally except for the time kept in the System Setup program, the problem may be caused by software rather than by a defective battery.

