The Server Administrator generates events that result in Simple Network Management Protocol (SNMP) traps or operating system event logs. The Remote Access Controller (RAC) and Baseboard Management Controller (BMC) can also generate SNMP traps in response to hardware events. This section describes the traps, also known as alerts, generated by the Server Administrator, RAC, and BMC.
The Server Administrator generates events in response to changes in the status of sensors and other monitored parameters. When an event with predefined characteristics occurs on your system, the SNMP subagent sends information about the event, along with trap variables, to the management console.
Each status change event generates a unique identifier called the trap ID and a trap description that describes the event. The trap ID and message uniquely describe the severity and cause of the event, and provide other relevant information such as the location of the event and the monitored item's previous state.
"Instrumentation Traps" lists all Server Administrator Instrumentation trap IDs in numerical order and includes each trap ID's corresponding description, severity level, and cause. Description text in brackets (for example, <State>) describes the event-specific information provided by Server Administrator.
"RAC Traps" lists RAC trap IDs in numerical order and includes each trap ID's corresponding description, severity level, and cause.
"BMC Traps" lists BMC trap IDs and includes each trap ID's corresponding description and severity level.
Trap Variables
This section describes the variables that are sent to the management console to provide additional information about a trap or alert generated by an event on your system. The trap variables presented here apply to all Instrumentation and RAC traps. Trap variables are sent in the order listed and are reserved for use only in traps. When a varbind is created for a trap variable, a zero is appended to the object ID (OID) to create the OID for the varbind.
System
Variable Name
alertSystem
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.1
Description
Identifies the system generating the alert.
Syntax
DisplayString (SIZE (0..255))
Table Index OID
Variable Name
alertTableIndexOID
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.2
Description
Gives the object identifier for the index attribute in the table that contains the object causing the alert. Uniquely identifies the object causing the alert and can be used to correlate different alerts caused by the same object.
Syntax
OBJECT IDENTIFIER
Message
Variable Name
alertMessage
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.3
Description
Describes the alert.
Syntax
DisplayString (SIZE (0..1024))
Current Status
Variable Name
alertCurrentStatus
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.4
Description
Gives the current status of the object causing the alert.
Syntax
DellStatus
Previous Status
Variable Name
alertPreviousStatus
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.5
Description
Gives the previous status of the object causing the alert.
Syntax
DellStatus
Data
Variable Name
alertData
Object ID
1.3.6.1.4.1.674.10892.1.5000.10.6
Description
Provides Server Administrator-defined data related to the alert.
Syntax
OCTET STRING (SIZE (0..1024))
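The rule that a zero is appended to a trap variable's object ID to form the varbind OID can be illustrated with a short sketch. The OIDs and variable names below are the ones listed in this section; the helper names (varbind_oid, name_varbinds) and the shape of the input (a list of OID-string and value pairs) are assumptions made for illustration and are not part of Server Administrator.

```python
# Trap variables as listed in this section, in the order they are sent.
TRAP_VARIABLES = [
    ("alertSystem", "1.3.6.1.4.1.674.10892.1.5000.10.1"),
    ("alertTableIndexOID", "1.3.6.1.4.1.674.10892.1.5000.10.2"),
    ("alertMessage", "1.3.6.1.4.1.674.10892.1.5000.10.3"),
    ("alertCurrentStatus", "1.3.6.1.4.1.674.10892.1.5000.10.4"),
    ("alertPreviousStatus", "1.3.6.1.4.1.674.10892.1.5000.10.5"),
    ("alertData", "1.3.6.1.4.1.674.10892.1.5000.10.6"),
]


def varbind_oid(object_id: str) -> str:
    """Append the trailing zero that turns a trap variable OID into a varbind OID."""
    return object_id + ".0"


# Reverse lookup: varbind OID -> trap variable name.
VARBIND_NAMES = {varbind_oid(oid): name for name, oid in TRAP_VARIABLES}


def name_varbinds(varbinds):
    """Label (oid, value) pairs with their trap variable names.

    `varbinds` is a hypothetical list of (oid_string, value) tuples, as a
    trap receiver might hand them over. OIDs not listed above are skipped.
    """
    return {
        VARBIND_NAMES[oid]: value
        for oid, value in varbinds
        if oid in VARBIND_NAMES
    }
```

For example, varbind_oid("1.3.6.1.4.1.674.10892.1.5000.10.3") yields the varbind OID under which the alertMessage value would be expected to arrive.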
Understanding the Trap Description
Table 25-1 lists in alphabetical order each line item that may appear in the trap description.
Table 25-1. Trap Description Reference
Description Line Item
Explanation
Action performed was: <Action>
Specifies the automatic server recovery action that was performed, for example:
Action performed was: Power cycle
Action requested was: <Action>
Specifies the user-initiated host control action that was requested, for example:
Action requested was: Reboot, shutdown OS first
Additional details: <Additional details for the events>
Specifies possible additional details about the specified device, for example:
Additional details:
Memory device: DIMM_1A Serial number: 11111111
Memory device: DIMM_1B Serial number: 22222222
<Additional power supply status information>
Specifies any additional power supply information pertaining to the event, for example:
Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off
Battery sensor status: <status>
Specifies the status reported by the battery sensor, for example:
Battery sensor status: Predictive failure
Chassis intrusion state: <Intrusion state>
Specifies the chassis intrusion state (open or closed), for example:
Chassis intrusion state: Open
Chassis location: <Name of chassis>
Specifies the name of the chassis that generated the message, for example:
Chassis location: Main System Chassis
Configuration error type: <type of configuration error>
Specifies the type of configuration error that occurred, for example:
Configuration error type: Revision mismatch
Current sensor value (in Amps): <Reading>
Specifies the current sensor value in amps, for example:
Current sensor value (in Amps): 7.853
Date and time of action: <Date and time>
Specifies the date and time that an automatic server recovery action was performed, for example:
Date and time of action: Fri May 30 23:55:44 2003.
Device location: <Location in chassis>
Specifies the location of the device in the specified chassis, for example:
Device location: Mem Card A
Discrete current state: <State>
Specifies the state of the current sensor, for example:
Discrete current state: Good
Discrete temperature state: <State>
Specifies the state of the temperature sensor, for example:
Discrete temperature state: Good
Discrete voltage state: <State>
Specifies the state of the voltage sensor, for example:
Discrete voltage state: Good
Fan sensor value: <Reading>
Specifies the fan speed in revolutions per minute (RPM) or On/Off, for example:
Fan sensor value (in RPM): 2600
Fan sensor value: Off
Log type: <Log type>
Specifies the type of hardware log, for example:
Log type: Embedded Server Management (ESM)
Memory device bank location: <Bank name in chassis>
Specifies the name of the memory bank in the system that generated the message, for example:
Memory device bank location: Bank_1
Memory device location: <Device name in chassis>
Specifies the location of the memory module in the chassis, for example:
Memory device location: DIMM_A
Number of devices required for full redundancy: <Number>
Specifies the number of power supply or cooling devices required to achieve full redundancy, for example:
Number of devices required for full redundancy: 4
Possible memory module event cause: <list of causes>
Specifies a list of possible causes for the memory module event, for example:
Possible memory module event cause: Single bit warning error rate exceeded
Single bit error logging disabled
Power Supply type: <type of power supply>
Specifies the type of power supply, for example:
Power Supply type: VRM
Pre-failure state was: <State>
Specifies the status of the previous memory message, for example:
Pre-failure state was: Failed
Previous redundancy state was: <State>
Specifies the status of the previous redundancy message, for example:
Previous redundancy state was: Lost
Previous state was: <State>
Specifies the previous state of the sensor, for example:
Previous state was: OK (Normal)
Processor sensor status: <status>
Specifies the status of the processor sensor, for example:
Processor sensor status: Configuration error
Redundancy unit: <Redundancy location in chassis>
Specifies the location of the redundant power supply or cooling unit in the chassis, for example:
Redundancy unit: Fan Enclosure
Sensor location: <Location in chassis>
Specifies the location of the sensor in the specified chassis, for example:
Sensor location: CPU1
Temperature sensor value (in degrees Celsius): <Reading>
Specifies the temperature in degrees Celsius, for example:
Temperature sensor value (in degrees Celsius): 30
Voltage sensor value (in Volts): <Reading>
Specifies the voltage sensor value in volts, for example:
Voltage sensor value (in Volts): 1.693
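The line items in Table 25-1 mostly follow a "label: value" pattern, so a management script can split a trap description into its parts. The following is a minimal sketch, assuming that the first line of the description is a summary and that the remaining line items are separated by newline characters. The function name parse_trap_description is hypothetical. Items without a label, such as additional power supply status information, are collected separately, and continuation lines that themselves contain a colon (for example, the memory device lines under "Additional details:") are not handled specially.

```python
def parse_trap_description(message: str) -> dict:
    """Split a trap description into a summary, labeled items, and unlabeled lines.

    Labels are lowercased so that "Sensor location" and "Sensor Location"
    map to the same key.
    """
    lines = [line.strip() for line in message.split("\n") if line.strip()]
    result = {
        "summary": lines[0] if lines else "",
        "items": {},
        "unlabeled": [],
    }
    for line in lines[1:]:
        # Split on the first ": " only, so values such as timestamps
        # (23:55:44) stay intact.
        label, separator, value = line.partition(": ")
        if separator:
            result["items"][label.lower()] = value
        else:
            result["unlabeled"].append(line)
    return result


# Illustrative sample assembled from the example values in Table 25-1.
sample = (
    "Temperature sensor detected a warning value\n"
    "Sensor location: CPU1\n"
    "Chassis location: Main System Chassis\n"
    "Previous state was: OK (Normal)\n"
    "Temperature sensor value (in degrees Celsius): 30"
)
parsed = parse_trap_description(sample)
```

Here parsed["items"]["sensor location"] holds the sensor location and parsed["summary"] holds the first line of the description.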
Understanding Trap Severity
Traps often contain information about values recorded by probes or sensors. Probes and sensors monitor critical components for values such as amperage, voltage, and temperature. When an event occurs on your system, the Server Administrator sends information about one of the following event types to the system management console:
Information/Informational: An event that describes the successful operation of a unit, such as a power supply turning on or a sensor reading returning to normal.
Warning: An event that is not necessarily significant, but may indicate a possible future problem, such as crossing a warning threshold.
Critical/Error: A significant event that indicates actual or imminent loss of data or loss of function, such as crossing a failure threshold or a hardware failure.
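Because each event type above is known by two labels (the trap tables use "Information", "Informational", "Warning", and "Error"), a script that consumes severity text may want to normalize the labels first. The following is a small sketch of one way to do that; the Severity enumeration and the normalize_severity function are hypothetical names, and the numeric values are arbitrary rather than taken from any MIB.

```python
from enum import Enum


class Severity(Enum):
    # Numeric values are arbitrary ordering aids, not MIB values.
    INFORMATION = 1
    WARNING = 2
    CRITICAL = 3


# Both labels for each event type map to the same member.
_SEVERITY_ALIASES = {
    "information": Severity.INFORMATION,
    "informational": Severity.INFORMATION,
    "warning": Severity.WARNING,
    "critical": Severity.CRITICAL,
    "error": Severity.CRITICAL,
}


def normalize_severity(label: str) -> Severity:
    """Map a severity label to a Severity member; raises KeyError if unknown."""
    return _SEVERITY_ALIASES[label.strip().lower()]
```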
Instrumentation Traps
This section describes the traps that are generated by the Instrumentation service of the Server Administrator. All of the traps documented in this section belong to the MIB enterprise identified by OID 1.3.6.1.4.1.674.10892.1 and are sent with all of the trap variables documented in the section, "Trap Variables." The trap variables are sent in the order in which they are listed. The messages in the Description fields below show the format of the message that is sent in the alertMessage varbind. If a message in a Description field has multiple lines, the message contains newline (0Ah) characters that are part of the value in the alertMessage varbind.
Miscellaneous Traps
Table 25-2 lists Miscellaneous traps that inform you that certain alert systems are up and working.
Table 25-2. Miscellaneous Traps
Trap ID
Description
Severity
Cause
System Up
1001
Server Administrator startup complete
Information
Server Administrator completed its initialization.
Thermal Shutdown
1004
Thermal shutdown protection has been initiated
Error
This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.
Automatic System Recovery
1006
Automatic System Recovery (ASR) action was performed
Action performed was: <Action>
Date and time of action: <Date and time>
Error
This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the date and time of the action are provided.
Host System Reset
1007
User initiated host system control action
Action requested was: <Action>
Information
A user requested a host system control action to reboot, power off, or power cycle the system. Alternatively, another event, such as thermal shutdown protection, initiated a power-off or an operating system shutdown.
Temperature Probe Traps
Temperature probes help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature probe traps use additional variables: sensor location, chassis location, previous state, and temperature sensor value reported in degrees Celsius.
Table 25-3. Temperature Probe Traps
Trap ID
Description
Severity
Cause
Temperature Probe Normal
1052
Temperature sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete: Discrete temperature state: <State>
Information
A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Temperature Probe Warning
1053
Temperature sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete: Discrete temperature state: <State>
Warning
A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Temperature Probe Failure
1054
Temperature sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete: Discrete temperature state: <State>
Error
A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Temperature Probe Nonrecoverable
1055
Temperature sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete: Discrete temperature state: <State>
Error
A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Cooling Device Traps
Cooling device traps monitor how well a fan is functioning.
Table 25-4. Cooling Device Traps
Trap ID
Description
Severity
Cause
Cooling Device Normal
1102
Fan sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Information
A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Cooling Device Warning
1103
Fan sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Warning
A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Cooling Device Failure
1104
Fan sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Error
A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.
Cooling Device Nonrecoverable
1105
Fan sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Error
A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided.
Voltage Probe Traps
Voltage probes monitor the number of volts across critical components.
Table 25-5. Voltage Probe Traps
Trap ID
Description
Severity
Cause
Voltage Probe Normal
1152
Voltage sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>
If sensor type is discrete: Discrete voltage state: <State>
Information
A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Voltage Probe Warning
1153
Voltage sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>
If sensor type is discrete: Discrete voltage state: <State>
Warning
A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Voltage Probe Failure
1154
Voltage sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>
If sensor type is discrete: Discrete voltage state: <State>
Error
A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Voltage Probe Nonrecoverable
1155
Voltage sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>
If sensor type is discrete: Discrete voltage state: <State>
Error
A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Amperage Probe Traps
Amperage probes measure the amount of current (in amperes) that is traversing critical components.
Table 25-6. Amperage Probe Traps
Trap ID
Description
Severity
Cause
Amperage Probe Normal
1202
Current sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Current sensor value (in Amps): <Reading>
If sensor type is discrete: Discrete current state: <State>
Information
A current sensor on the power supply for the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Amperage Probe Warning
1203
Current sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Current sensor value (in Amps): <Reading>
If sensor type is discrete: Discrete current state: <State>
Warning
A current sensor on the power supply for the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Amperage Probe Failure
1204
Current sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Current sensor value (in Amps): <Reading>
If sensor type is discrete: Discrete current state: <State>
Error
A current sensor on the power supply for the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Amperage Probe Nonrecoverable
1205
Current sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete: Current sensor value (in Amps): <Reading>
If sensor type is discrete: Discrete current state: <State>
Error
A current sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
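The trap IDs in Tables 25-3 through 25-6 follow a regular numbering: each probe family starts at a multiple of 50, and the offsets 2, 3, 4, and 5 correspond to the Normal, Warning, Failure, and Nonrecoverable traps. This is an observation drawn from those four tables only, and other trap tables in this section do not all follow it. The sketch below builds a lookup table on that basis; the names PROBE_FAMILIES, STATE_OFFSETS, and describe_probe_trap are hypothetical.

```python
# Base values observed in Tables 25-3 through 25-6.
PROBE_FAMILIES = {
    1050: "Temperature Probe",
    1100: "Cooling Device",
    1150: "Voltage Probe",
    1200: "Amperage Probe",
}

# Offset from the base -> (state name, severity), as listed in those tables.
STATE_OFFSETS = {
    2: ("Normal", "Information"),
    3: ("Warning", "Warning"),
    4: ("Failure", "Error"),
    5: ("Nonrecoverable", "Error"),
}

# Trap ID -> (trap name, severity) for the four probe families.
PROBE_TRAPS = {
    base + offset: (f"{family} {state}", severity)
    for base, family in PROBE_FAMILIES.items()
    for offset, (state, severity) in STATE_OFFSETS.items()
}


def describe_probe_trap(trap_id: int):
    """Return (trap name, severity) for a probe trap ID, or None if not listed."""
    return PROBE_TRAPS.get(trap_id)
```

A lookup table is used instead of arithmetic on the trap ID so that IDs outside these four tables return None and are never misclassified.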
Chassis Intrusion Traps
Chassis intrusion traps are a security measure. Chassis intrusion means that someone is opening the cover to a system's chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.
Table 25-7. Chassis Intrusion Traps
Trap ID
Description
Severity
Cause
Chassis Intrusion Normal
1252
Chassis intrusion returned to normal
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Chassis intrusion state: <Intrusion state>
Information
A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
Chassis Intrusion Detected
1254
Chassis intrusion detected
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Chassis intrusion state: <Intrusion state>
Error
A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
Redundancy Unit Traps
Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a computer system that a chassis may have "extra" fans or power supplies installed. Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold.
The number of devices required for full redundancy is provided as part of the trap message when applicable for the redundancy unit and the platform. For details on how redundancy is computed, see the documentation for your platform.
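The three redundancy states described above can be expressed as a simple comparison. The following is only a sketch of that description, assuming the number of functioning devices, the number required for full redundancy, and the redundancy threshold are all known. The actual computation is platform-specific, and the function name and parameters are hypothetical.

```python
def redundancy_state(functioning: int, required_for_full: int, threshold: int) -> str:
    """Classify a redundancy unit as Normal, Degraded, or Lost.

    functioning       -- number of devices currently operating
    required_for_full -- number of devices required for full redundancy
    threshold         -- assumed minimum number of functioning devices
                         for the unit to still count as redundant
    """
    if functioning >= required_for_full:
        return "Normal"
    if functioning >= threshold:
        # A component has failed, but the unit is still redundant.
        return "Degraded"
    return "Lost"
```

For instance, with four devices required for full redundancy and an assumed threshold of three, losing one device would give Degraded and losing two would give Lost.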
Table 25-8. Redundancy Unit Traps
Trap ID
Description
Severity
Cause
Redundancy Normal
1304
Redundancy regained
Redundancy unit: <Redundancy location in chassis>
Chassis location: <Name of chassis>
Previous redundancy state was: <State>
Number of devices required for full redundancy: <Number>
Information
A redundancy sensor in the specified system detected that a "lost" redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, and previous redundancy state are provided.
Redundancy Degraded
1305
Redundancy degraded
Redundancy unit: <Redundancy location in chassis>
Chassis location: <Name of chassis>
Previous redundancy state was: <State>
Number of devices required for full redundancy: <Number>
Warning
A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, and previous redundancy state are provided.
Redundancy Lost
1306
Redundancy lost
Redundancy unit: <Redundancy location in chassis>
Chassis location: <Name of chassis>
Previous redundancy state was: <State>
Number of devices required for full redundancy: <Number>
Warning or Error (depending on the number of units that are functional)
A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, and previous redundancy state are provided.
Power Supply Traps
Power supply traps provide status and warning information for power supplies present in a particular chassis.
Table 25-9. Power Supply Traps
Trap ID
Description
Severity
Cause
Power Supply Normal
1352
Power supply returned to normal
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Power Supply type: <type of power supply>
<Additional power supply status information>
If in configuration error state: Configuration error type: <type of configuration error>
Information
A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, and additional information about the power supply event are provided.
Power Supply Warning
1353
Power supply detected a warning
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Power Supply type: <type of power supply>
<Additional power supply status information>
If in configuration error state: Configuration error type: <type of configuration error>
Warning
A power supply sensor has detected a warning condition. The sensor location, chassis location, previous state, and additional power supply status information are provided.
Power Supply Failure
1354
Power supply detected a failure
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Power Supply type: <type of power supply>
<Additional power supply status information>
If in configuration error state: Configuration error type: <type of configuration error>
Error
A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional information about the power supply event are provided.
Memory Device Traps
Memory device messages provide status and warning information for memory modules present in a particular system. Memory devices determine health status by counting the number of ECC memory corrections.
NOTE: A value of failure or non-recoverable does not indicate a system failure or loss of data, but rather that the specified system exceeded the specified ECC correction threshold. Although the system continues to function, you should perform system maintenance as described in Table 25-10.
Table 25-10. Memory Device Messages
Trap ID
Description
Severity
Cause
1403
Memory device status is <status>
Memory device location: <Location in chassis>
Possible memory module event cause: <list of causes>
Warning
A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.
1404
Memory device status is <Status>
Memory device location: <Location in chassis>
Possible memory module event cause: <list of causes>
Error
A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or an Uncorrectable Memory Event occurred. The system continues to function normally (except in the case of an Uncorrectable Memory Event). If an Uncorrectable Memory Event occurred, clear the memory error. Replace the memory module identified in the message during the system's next scheduled maintenance. The memory device status and location are provided.
Fan Enclosure Traps
Some systems are equipped with a protective enclosure for fans. Fan enclosure traps report whether foreign objects are present in an enclosure and how long a fan enclosure has been absent from a chassis.
Table 25-11. Fan Enclosure Traps
Trap ID
Description
Severity
Cause
Fan Enclosure Insertion
1452
Fan enclosure inserted into system
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Information
A fan enclosure has been inserted into the specified system. The sensor location and chassis location are provided.
Fan Enclosure Removal
1453
Fan enclosure removed from system
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Warning
A fan enclosure has been removed from the specified system. The sensor location and chassis location are provided.
Fan Enclosure Extended Removal
1454
Fan enclosure removed from system for an extended amount of time
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Error
A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor location and chassis location are provided.
AC Power Cord Traps
The AC power cord sensor monitors the presence of AC power for an AC power cord. AC power cord traps provide status and warning information for power cords that are part of an AC power switch, if your system supports AC switching.
Table 25-12. AC Power Cord Traps
Trap ID
Description
Severity
Cause
AC Power Cord No Power Nonredundant
1501
AC power cord is not being monitored
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Information
The AC power cord status is not being monitored. This occurs when a system's expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided.
AC Power Cord Normal
1502
AC power has been restored
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Information
An AC power cord that did not have AC power has had the power restored. The sensor location and chassis location information are provided.
AC Power Cord Failure
1504
AC power has been lost
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Error
An AC power cord has lost its power. The sensor location and chassis location information are provided.
Hardware Log Traps
Hardware logs provide hardware status messages to systems management software. On certain systems, the hardware log is implemented as a circular queue. When the log becomes full, the oldest status messages are overwritten when new status messages are logged. On some systems, the log is not circular. On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware log sensor messages provide status and warning information about the noncircular logs that may fill up, resulting in lost status messages.
Table 25-13. Hardware Log Traps
Trap ID
Description
Severity
Cause
Hardware Log Normal
1552
Log size is no longer near or at capacity
Log type: <Log type>
Information
The hardware log on the specified system is no longer near or at its capacity, usually as the result of clearing the log. The log type information is provided.
Hardware Log Warning
1553
Log size is near or at capacity
Log type: <Log type>
Warning
The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.
Hardware Log Full
1554
Log size is full
Log type: <Log type>
Error
The size of a hardware log on the specified system is at the capacity of the hardware log. The log type information is provided.
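The difference between circular and noncircular hardware logs, described at the start of this section, can be modeled in a few lines. The following sketch is purely illustrative and does not reflect how any system actually stores its hardware log; the class names CircularLog and NonCircularLog are hypothetical.

```python
from collections import deque


class CircularLog:
    """When full, the oldest message is overwritten by each new message."""

    def __init__(self, capacity: int):
        # A deque with maxlen discards items from the opposite end when full.
        self._entries = deque(maxlen=capacity)

    def append(self, message: str) -> bool:
        self._entries.append(message)
        return True

    def entries(self):
        return list(self._entries)


class NonCircularLog:
    """When full, new messages are lost until the log is cleared."""

    def __init__(self, capacity: int):
        self._capacity = capacity
        self._entries = []

    def is_full(self) -> bool:
        return len(self._entries) >= self._capacity

    def append(self, message: str) -> bool:
        if self.is_full():
            return False  # the message is dropped
        self._entries.append(message)
        return True

    def clear(self) -> None:
        self._entries.clear()

    def entries(self):
        return list(self._entries)
```

Only the noncircular variant can lose new messages, which is why the hardware log traps warn about logs of that kind before they fill up.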
Processor Device Status Traps
The BMC on some systems reports the status of processor devices. On those systems, processor device status traps provide status and warning information for the processor devices present.
Table 25-14. Processor Device Status Traps
Trap ID
Description
Severity
Cause
Processor Device Status Normal
1602
Processor sensor returned to a normal value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Processor sensor status: <status>
Information
A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state and processor sensor status are provided.
Processor Device Status Warning
1603
Processor sensor detected a warning value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Processor sensor status: <status>
Warning
A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state and processor sensor status are provided.
Processor Device Status Failure
1604
Processor sensor detected a failure value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Processor sensor status: <status>
Error
A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state and processor sensor status are provided.
Pluggable Device Traps
Server Administrator monitors the addition and removal of pluggable devices such as memory cards. Device traps provide information about the addition and removal of such devices.
Table 25-15. Pluggable Device Traps
Trap ID
Description
Severity
Cause
Pluggable Device Addition
1651
Device added to system
Device Location: <Location in chassis>
Chassis Location: <Name of chassis>
Additional Details: <Additional details for the events>
Information
A device was added to the specified system. The device location, chassis location, and additional event details, if available, are provided.
Pluggable Device Removal
1652
Device removed from system
Device Location: <Location in chassis>
Chassis Location: <Name of chassis>
Additional Details: <Additional details for the events>
Information
A device was removed from the specified system. The device location, chassis location, and additional event details, if available, are provided.
Pluggable Device Configuration Error
1653
Device configuration error detected
Device Location: <Location in chassis>
Chassis Location: <Name of chassis>
Additional Details: <Additional details for the events>
Error
A configuration error was detected for a pluggable device in the specified system. The device may have been added to the system incorrectly. The device location, chassis location, and additional event details, if available, are provided.
Battery Traps
The BMC on some systems reports the status of batteries. On those systems, battery traps provide status and warning information for the batteries present.
Table 25-16. Battery Traps
Trap ID
Description
Severity
Cause
Battery Normal
1702
Battery sensor returned to a normal value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Battery sensor status: <status>
Informational
A battery sensor in the specified system detected that a battery transitioned back to a normal state. The sensor location, chassis location, previous state, and battery sensor status are provided.
Battery Warning
1703
Battery sensor detected a warning value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Battery sensor status: <status>
Warning
A battery sensor in the specified system detected that a battery is in a predictive failure state. The sensor location, chassis location, previous state, and battery sensor status are provided.
Battery Failure
1704
Battery sensor detected a failure value
Sensor Location: <Location in chassis>
Chassis Location: <Name of chassis>
Previous state was: <State>
Battery sensor status: <status>
Critical
A battery sensor in the specified system detected that a battery has failed. The sensor location, chassis location, previous state, and battery sensor status are provided.
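The descriptions in the Instrumentation trap tables above share one layout: a summary line followed by labeled detail lines such as Sensor Location and Chassis Location. As a minimal sketch, assuming the management console receives that text with one item per line and a colon after each label, the following Python function splits a description into its summary and its labeled fields. The function name and the sample values are hypothetical and are not part of Server Administrator.

```python
def parse_trap_description(text):
    """Split a trap description into a summary line and labeled fields."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return "", {}
    summary, fields = lines[0], {}
    for line in lines[1:]:
        # Each detail line is assumed to look like "Label: value".
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return summary, fields


# Hypothetical values filled into the Battery Warning (1703) layout.
sample = """Battery sensor detected a warning value
Sensor Location: System Board CMOS Battery
Chassis Location: Main System Chassis
Previous state was: OK (Normal)
Battery sensor status: Predictive failure"""

summary, fields = parse_trap_description(sample)
print(summary)
print(fields["Previous state was"])
```

Because the labels are kept as dictionary keys exactly as they appear in the message, the same function works for the processor, pluggable device, and battery layouts without per-table code.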
RAC Traps
This section describes the traps that are generated by the SNMP agent of the Remote Access Controller (RAC). All of the enterprise-specific traps documented in this section belong to the MIB enterprise identified by OID 1.3.6.1.4.1.674.10892.2 and are sent with all of the trap variables documented in the section "Trap Variables". The trap variables are sent in the order in which they are listed.
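A trap receiver that works with SNMPv2-style notifications usually identifies a trap by a single OID rather than by an enterprise OID plus a trap ID. The sketch below assumes the conversion rule from RFC 3584, in which the notification OID of an enterprise-specific trap is the enterprise OID followed by .0 and the trap ID. Whether a given RAC or receiver performs this conversion is not stated in this guide, and the function names are hypothetical.

```python
RAC_ENTERPRISE_OID = "1.3.6.1.4.1.674.10892.2"  # enterprise OID given in this section


def v2_trap_oid(enterprise_oid, specific_trap):
    """Build the notification OID: enterprise OID + ".0." + trap ID (RFC 3584 rule)."""
    return f"{enterprise_oid}.0.{specific_trap}"


def rac_trap_id(trap_oid):
    """Return the RAC trap ID encoded in trap_oid, or None if it is not a RAC trap."""
    prefix = RAC_ENTERPRISE_OID + ".0."
    if not trap_oid.startswith(prefix):
        return None
    tail = trap_oid[len(prefix):]
    return int(tail) if tail.isdigit() else None


print(v2_trap_oid(RAC_ENTERPRISE_OID, 1002))
print(rac_trap_id("1.3.6.1.4.1.674.10892.2.0.1016"))
```

Generic traps are not covered by this rule; RFC 3584 maps them to fixed OIDs under 1.3.6.1.6.3.1.1.5 instead.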
Table 25-17. Generic Traps
Trap ID
Name
Description
Severity
Category
Cause
Supported by RAC Platform
0
coldStart
SNMP agent is initializing itself
Information
Status
RAC power on or reset.
All
4
Authentication Failure
Request received with an invalid community name
Critical
Error
SNMP request with an invalid community name.
All
Table 25-18. Enterprise-specific Traps
Trap ID
Name
Description
Severity
Category
Cause
Supported by RAC Platform
1001
alertDrscTestTrapEvent
The RAC generated a test trap event in response to a user request
Information
Status
A test SNMP trap generated by a RAC.
All
1002
alertDrscAuthError
RAC Authentication failures during a time period have exceeded a threshold
Minor
Error
RAC login failure caused by an authentication failure, the number of concurrent logins exceeding the limit, or permission being denied.
All
1003
alertDrscLostESM
The RAC cannot communicate with the baseboard management controller (ESM)
Critical
Error
RAC lost communication with ESM.
Dell Remote Access Controller (DRAC) III
1004
alertDrscFoundESM
The RAC is communicating normally with the baseboard management controller (ESM)
Information
Error
RAC recovered communication with ESM.
DRAC III
1005
alertDrscPowerOff
The RAC has detected a system power state change to powered-off
Critical
Error
RAC detected a system power state change to power-off.
DRAC III
1006
alertDrscPowerOn
The RAC has detected a system power state change to powered-on
Information
Error
RAC detected a system power state change to power-on.
DRAC III
1007
alertDrscWatchdogExpired
The RAC has detected that the system watchdog has expired, indicating a system hang
Critical
Event
RAC has detected the system watchdog expired (normally indicating a system hang).
DRAC III
1008
alertDrscBattLow
The RAC battery charge is below 25%, indicating that the battery may only be able to power the DRSC for 8-10 minutes
Minor
Error
RAC detected its battery charge is below 25% full.
DRAC III
1009
alertDrscTempNormal
The RAC Temperature probe has returned to a normal value
Information
Status
RAC temperature probe reading returned to normal.
DRAC III
1010
alertDrscTempWarning
The RAC Temperature probe has detected a Warning value
Minor
Status
RAC temperature probe reading exceeded warning threshold.
DRAC III
1011
alertDrscTempCritical
The RAC Temperature probe has detected a failure (or critical) value
Critical
Error
RAC temperature probe reading exceeded critical threshold.
DRAC III
1012
alertDrscVoltNormal
The RAC voltage has returned to a normal value
Information
Error
RAC voltage probe reading returned to normal.
DRAC III
1013
alertDrscVoltWarning
The RAC voltage probe has detected a warning value
Minor
Error
RAC voltage probe reading exceeded warning threshold.
DRAC III
1014
alertDrscVoltCritical
The RAC voltage probe has detected a failure (or critical) value
Critical
Error
RAC voltage probe reading exceeded critical threshold.
DRAC III
1015
alertDrscSELWarning
The RAC has detected a new event in the System Event Log with Severity: Warning
Major
Error
RAC detected a new system event log entry with warning severity (detailed log info is in the drsAlertMessage varbind).
All
1016
alertDrscSELCritical
The RAC has detected a new event in the System Event Log with Severity: Critical
Critical
Error
RAC detected a new system event log entry with critical severity (detailed log info is in the drsAlertMessage varbind).
All
1017
alertDrscSEL80percentFull
The RAC system event log is 80% full
Major
Status
RAC detected system event log is 80% full.
All
1018
alertDrscSEL90percentFull
The RAC system event log is 90% full
Major
Status
RAC detected system event log is 90% full.
All
1019
alertDrscSEL100percentFull
The RAC system event log is 100% full
Major
Status
RAC detected system event log is 100% full.
All
1020
alertDrscSELNormal
The RAC has detected a new event in the System Event Log with Severity: Normal
Information
Error
RAC detected a new system event log entry with normal severity (detailed log info is in the drsAlertMessage varbind).
All
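A management console can use the Severity column of the RAC trap tables to decide which traps need attention. The following Python sketch copies a few rows of Table 25-18 into a lookup table and filters by a minimum severity. The numeric ranking of the severity names, and the names SEVERITY_RANK, RAC_TRAPS, and needs_attention, are assumptions made for illustration; the tables list severities but do not define an ordering.

```python
# Severity names as used in the RAC trap tables; the ranking is an assumption.
SEVERITY_RANK = {"Information": 0, "Minor": 1, "Major": 2, "Critical": 3}

# A subset of Table 25-18: trap ID -> (severity, category).
RAC_TRAPS = {
    1002: ("Minor", "Error"),
    1005: ("Critical", "Error"),
    1006: ("Information", "Error"),
    1009: ("Information", "Status"),
    1017: ("Major", "Status"),
}


def needs_attention(trap_id, minimum="Major"):
    """Return True when a known trap meets or exceeds the minimum severity."""
    entry = RAC_TRAPS.get(trap_id)
    if entry is None:
        return False  # trap IDs missing from the lookup are ignored in this sketch
    severity, _category = entry
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[minimum]


for trap_id in (1002, 1005, 1017):
    print(trap_id, needs_attention(trap_id))
```

Returning False for unknown IDs keeps the sketch short; a real receiver would more likely log them so that new trap IDs are not silently dropped.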
BMC Traps
The BMC monitors the system for critical events by communicating with various sensors on the system board. It sends alerts and logs events when certain parameters exceed their preset thresholds. All of the traps documented in this section belong to the MIB enterprise identified by OID 1.3.6.1.4.1.3183.1.1.1.
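The large BMC trap IDs in this section, such as 262402, are easier to read when split into bytes. The sketch below assumes that the BMC follows the IPMI Platform Event Trap (PET) layout, in which the trap number packs a sensor type, an event type, and an event offset whose high bit marks a deassertion. This guide does not state that encoding, so treat the decoding as an assumption to check against your platform. The function name and the short sensor-type table are illustrative only.

```python
# A few IPMI sensor type codes; only the ones needed for the examples here.
SENSOR_TYPES = {0x01: "Temperature", 0x02: "Voltage", 0x04: "Fan"}


def decode_pet_trap_id(trap_id):
    """Split a PET-style trap number into its assumed byte fields."""
    sensor_type = (trap_id >> 16) & 0xFF   # bits 23..16
    event_type = (trap_id >> 8) & 0xFF     # bits 15..8
    offset_byte = trap_id & 0xFF           # bits 7..0
    return {
        "sensor": SENSOR_TYPES.get(sensor_type, hex(sensor_type)),
        "event_type": event_type,
        "event_offset": offset_byte & 0x0F,
        "deassertion": bool(offset_byte & 0x80),
    }


for trap_id in (262402, 262530, 131330):
    print(trap_id, decode_pet_trap_id(trap_id))
```

Under this assumption, 262402 and 262530 decode to the same fan event and differ only in the deassertion bit, which matches their descriptions as a failure and its cleared counterpart.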
Table 25-19. BMC Traps
Trap ID
Description
Severity
262402
Generic Critical Fan Failure
Critical
262530
Generic Critical Fan Failure Cleared
Informational
131330
Under-Voltage Problem (Lower Critical - going low)