System Event Log Messages for IPMI Systems: Messages Reference Guide

System Event Log Messages for IPMI Systems

Dell™ OpenManage™ Messages Reference Guide

  Temperature Sensor Events

  Voltage Sensor Events

  Fan Sensor Events

  Processor Status Events

  Power Supply Events

  Memory ECC Events

  BMC Watchdog Events

  Memory Events

  Hardware Log Sensor Events

  Drive Events

  Intrusion Events

  BIOS Generated System Events

  R2 Generated System Events

  Cable Interconnect Events

  Battery Events

  Power And Performance Events

  Entity Presence Events


The following tables list the system event log (SEL) messages, their severity, and cause.

NOTE: For corrective actions, see the appropriate documentation.

Temperature Sensor Events

The temperature sensor event messages help protect critical components by alerting the systems management console when the temperature rises inside the chassis. These event messages use additional variables, such as sensor location, chassis location, previous state, and temperature sensor value or state.

Event Message

Severity

Cause

<Sensor Name/Location> temperature sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "PROC Temp" or "Planar Temp."

Reading is specified in degrees Celsius. For example, 100 C.

Critical

Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the critical threshold.

<Sensor Name/Location> temperature sensor detected a warning <Reading>.

Warning

Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the non-critical threshold.

<Sensor Name/Location> temperature sensor returned to warning state <Reading>.

Warning

Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned from critical state to non-critical state.

<Sensor Name/Location> temperature sensor returned to normal state <Reading>.

Information

Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned to normal operating range.
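
The message layout in the table above can be checked mechanically. The following is a minimal Python sketch, not part of any Dell tool. It assumes each message arrives as a single line of the form "<Sensor Name/Location> temperature sensor <phrase> <Reading>" with the reading written like "100 C". The function name parse_temperature_event and the pattern are hypothetical. The severities are copied from the table.

```python
import re

# Severity for each message wording, copied from the temperature table.
TEMPERATURE_SEVERITY = {
    "detected a failure": "Critical",
    "detected a warning": "Warning",
    "returned to warning state": "Warning",
    "returned to normal state": "Information",
}

# Assumed layout: "<sensor> temperature sensor <phrase> <reading> C".
# The reading and the trailing period are treated as optional.
_PATTERN = re.compile(
    r"^(?P<sensor>.+?) temperature sensor "
    r"(?P<phrase>detected a failure|detected a warning|"
    r"returned to warning state|returned to normal state)"
    r"(?: (?P<reading>-?\d+(?:\.\d+)?) C)?\.?$"
)


def parse_temperature_event(message):
    """Return (sensor, severity, reading_celsius), or None if the text does not match."""
    match = _PATTERN.match(message.strip())
    if match is None:
        return None
    reading = match.group("reading")
    return (
        match.group("sensor"),
        TEMPERATURE_SEVERITY[match.group("phrase")],
        float(reading) if reading is not None else None,
    )
```

For example, passing "PROC Temp temperature sensor detected a failure 100 C" yields the sensor name "PROC Temp", the severity "Critical", and the reading 100.0. Text that does not follow the assumed layout returns None rather than a guess.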


Voltage Sensor Events

The voltage sensor event messages monitor the number of volts across critical components. These messages provide status and warning information for voltage sensors for a particular chassis.

Event Message

Severity

Cause

<Sensor Name/Location> voltage sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring.

Reading is specified in volts. For example, 3.860 V.

Critical

The voltage of the monitored device has exceeded the critical threshold.

<Sensor Name/Location> voltage sensor state asserted.

Critical

The voltage specified by <Sensor Name/Location> is in a critical state.

<Sensor Name/Location> voltage sensor state de-asserted.

Information

The voltage of a previously reported <Sensor Name/Location> has returned to a normal state.

<Sensor Name/Location> voltage sensor detected a warning <Reading>.

Warning

Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold.

<Sensor Name/Location> voltage sensor returned to normal <Reading>.

Information

The voltage of a previously reported <Sensor Name/Location> has returned to a normal state.


Fan Sensor Events

The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning, and failure information for the fans in a particular chassis.

Event Message

Severity

Cause

<Sensor Name/Location> Fan sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "BMC Back Fan" or "BMC Front Fan."

Reading is specified in RPM. For example, 100 RPM.

Critical

The speed of the specified <Sensor Name/Location> fan is not sufficient to provide enough cooling to the system.

<Sensor Name/Location> Fan sensor returned to normal state <Reading>.

Information

The fan specified by <Sensor Name/Location> has returned to its normal operating speed.

<Sensor Name/Location> Fan sensor detected a warning <Reading>.

Warning

The speed of the specified <Sensor Name/Location> fan may not be sufficient to provide enough cooling to the system.

<Sensor Name/Location> Fan Redundancy sensor redundancy degraded.

Information

The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy has been degraded.

<Sensor Name/Location> Fan Redundancy sensor redundancy lost.

Critical

The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy that was degraded previously has been lost.

<Sensor Name/Location> Fan Redundancy sensor redundancy regained.

Information

The fan specified by <Sensor Name/Location> may have started functioning again and hence, the redundancy has been regained.


Processor Status Events

The processor status messages monitor the functionality of the processors in a system. These messages provide processor health and warning information of a system.

Event Message

Severity

Cause

<Processor Entity> status processor sensor IERR, where <Processor Entity> is the processor that generated the event. For example, PROC for a single-processor system and PROC # for a multiprocessor system.

Critical

IERR internal error generated by the <Processor Entity>.

<Processor Entity> status processor sensor Thermal Trip.

Critical

The processor generates this event before it shuts down because of excessive heat caused by lack of cooling or heat synchronization.

<Processor Entity> status processor sensor recovered from IERR.

Information

This event is generated when a processor recovers from the internal error.

<Processor Entity> status processor sensor disabled.

Warning

This event is generated for all processors that are disabled.

<Processor Entity> status processor sensor terminator not present.

Information

This event is generated if the terminator is missing on an empty processor slot.

<Processor Entity> presence was deasserted.

Critical

This event is generated when the system could not detect the processor.

<Processor Entity> presence was asserted.

Information

This event is generated when the earlier processor detection error was corrected.

<Processor Entity> thermal tripped was deasserted.

Information

This event is generated when the processor has recovered from an earlier thermal condition.

<Processor Entity> configuration error was asserted.

Critical

This event is generated when the processor configuration is incorrect.

<Processor Entity> configuration error was deasserted.

Information

This event is generated when the earlier processor configuration error was corrected.

<Processor Entity> throttled was asserted.

Warning

This event is generated when the processor slows down to prevent overheating.

<Processor Entity> throttled was deasserted.

Information

This event is generated when the earlier processor throttled event was corrected.


Power Supply Events

The power supply sensors monitor the functionality of the power supplies. These messages provide status and warning information for power supplies for a particular system.

Event Message

Severity

Cause

<Power Supply Sensor Name> power supply sensor removed.

Critical

This event is generated when the power supply sensor is removed.

<Power Supply Sensor Name> power supply sensor AC recovered.

Information

This event is generated when the power supply has been replaced.

<Power Supply Sensor Name> power supply sensor returned to normal state.

Information

This event is generated when a power supply that failed or was removed has been replaced and its state has returned to normal.

<Entity Name> PS Redundancy sensor redundancy degraded.

Information

Power supply redundancy is degraded if one of the power supply sources is removed or failed.

<Entity Name> PS Redundancy sensor redundancy lost.

Critical

Power supply redundancy is lost if only one power supply is functional.

<Entity Name> PS Redundancy sensor redundancy regained.

Information

This event is generated if the power supply has been reconnected or replaced.

<Power Supply Sensor Name> predictive failure was asserted

Warning

This event is generated when the power supply is about to fail.

<Power Supply Sensor Name> input lost was asserted

Critical

This event is generated when the power supply is unplugged.

<Power Supply Sensor Name> predictive failure was deasserted

Information

This event is generated when the power supply has recovered from an earlier predictive failure event.

<Power Supply Sensor Name> input lost was deasserted

Information

This event is generated when the power supply is plugged in.


Memory ECC Events

The memory ECC event messages monitor the memory modules in a system. These messages monitor the ECC memory correction rate and the type of memory events that occurred.

Event Message

Severity

Cause

ECC error correction detected on Bank # DIMM [A/B].

Information

This event is generated when there is a memory error correction on a particular Dual Inline Memory Module (DIMM).

ECC uncorrectable error detected on Bank # [DIMM].

Critical

This event is generated when the chipset is unable to correct the memory errors. Usually, a bank number is provided and DIMM may or may not be identifiable, depending on the error.

Correctable memory error logging disabled.

Critical

This event is generated when the ECC error correction rate in the chipset exceeds a predefined limit.


BMC Watchdog Events

The BMC watchdog operations are performed when the system hangs or crashes. These messages monitor the status and occurrence of these events in a system.

Event Message

Severity

Cause

BMC OS Watchdog timer expired.

Information

This event is generated when the BMC watchdog timer expires and no action is set.

BMC OS Watchdog performed system reboot.

Critical

This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.

BMC OS Watchdog performed system power off.

Critical

This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.

BMC OS Watchdog performed system power cycle.

Critical

This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
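
The relationship between the configured timeout action and the logged message can be expressed as a small lookup. The following Python sketch copies the messages and severities from the table above. The action keys ("none", "reboot", "power_off", "power_cycle") and the function name watchdog_event are hypothetical labels chosen for illustration, not identifiers from the BMC firmware.

```python
# (message, severity) logged for each watchdog timeout action, copied from the table.
# The dictionary keys are hypothetical labels for the configured action.
WATCHDOG_EVENTS = {
    "none": ("BMC OS Watchdog timer expired.", "Information"),
    "reboot": ("BMC OS Watchdog performed system reboot.", "Critical"),
    "power_off": ("BMC OS Watchdog performed system power off.", "Critical"),
    "power_cycle": ("BMC OS Watchdog performed system power cycle.", "Critical"),
}


def watchdog_event(action):
    """Return the (message, severity) pair expected for a configured timeout action."""
    try:
        return WATCHDOG_EVENTS[action]
    except KeyError:
        raise ValueError(f"unknown watchdog action: {action!r}") from None
```

Only the case where no action is set is logged as Information. Every action that changes the power state of the system is logged as Critical.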


Memory Events

The memory modules can be configured in different ways in particular systems. These messages monitor the status, warning, and configuration information about the memory modules in the system.

Event Message

Severity

Cause

Memory RAID redundancy degraded.

Information

This event is generated when there is a memory failure in a RAID-configured memory configuration.

Memory RAID redundancy lost.

Critical

This event is generated when redundancy is lost in a RAID-configured memory configuration.

Memory RAID redundancy regained.

Information

This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.

Memory Mirrored redundancy degraded.

Information

This event is generated when there is a memory failure in a mirrored memory configuration.

Memory Mirrored redundancy lost.

Critical

This event is generated when redundancy is lost in a mirrored memory configuration.

Memory Mirrored redundancy regained.

Information

This event is generated when the redundancy lost or degraded earlier is regained in a mirrored memory configuration.

Memory Spared redundancy degraded.

Information

This event is generated when there is a memory failure in a spared memory configuration.

Memory Spared redundancy lost.

Critical

This event is generated when redundancy is lost in a spared memory configuration.

Memory Spared redundancy regained.

Information

This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration.
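
All three memory configurations in the table above follow the same pattern: degraded and regained are Information, and lost is Critical. The following Python sketch classifies a message on that basis. It assumes the message text matches the table wording exactly, with an optional trailing period. The function name classify_memory_redundancy is hypothetical.

```python
import re

# Severity by redundancy state, as listed in the memory events table.
REDUNDANCY_SEVERITY = {
    "degraded": "Information",
    "lost": "Critical",
    "regained": "Information",
}

# Assumed wording: "Memory <RAID|Mirrored|Spared> redundancy <state>".
_MEMORY_REDUNDANCY = re.compile(
    r"^Memory (?P<mode>RAID|Mirrored|Spared) redundancy "
    r"(?P<state>degraded|lost|regained)\.?$"
)


def classify_memory_redundancy(message):
    """Return (mode, state, severity), or None if the message is not a redundancy event."""
    match = _MEMORY_REDUNDANCY.match(message.strip())
    if match is None:
        return None
    state = match.group("state")
    return match.group("mode"), state, REDUNDANCY_SEVERITY[state]
```

Keeping the severity keyed on the state alone, rather than on the full message, reflects that the table assigns the same severity to a given state in every configuration.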


Hardware Log Sensor Events

The hardware logs provide hardware status messages to the system management software. On particular systems, the subsequent hardware messages are not displayed when the log is full. These messages provide status and warning messages when the logs are full.

Event Message

Severity

Cause

Log full detected.

Critical

This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.

Log cleared.

Information

This event is generated when the SEL is cleared.


Drive Events

The drive event messages monitor the health of the drives in a system. These events are generated when there is a fault in the drives indicated.

Event Message

Severity

Cause

Drive <Drive #> asserted fault state.

Critical

This event is generated when the specified drive in the array is faulty.

Drive <Drive #> de-asserted fault state.

Information

This event is generated when the specified drive recovers from a faulty condition.

Drive <Drive #> drive presence was asserted

Informational

This event is generated when the drive is installed.

Drive <Drive #> predictive failure was asserted

Warning

This event is generated when the drive is about to fail.

Drive <Drive #> predictive failure was deasserted

Informational

This event is generated when the earlier predictive failure of the drive is corrected.

Drive <Drive #> hot spare was asserted

Warning

This event is generated when the drive is configured as a hot spare.

Drive <Drive #> hot spare was deasserted

Informational

This event is generated when the drive is no longer configured as a hot spare.

Drive <Drive #> consistency check in progress was asserted

Warning

This event is generated when a consistency check of the drive starts.

Drive <Drive #> consistency check in progress was deasserted

Informational

This event is generated when the consistency check of the drive is completed.

Drive <Drive #> in critical array was asserted

Critical

This event is generated when the drive is placed in a critical array.

Drive <Drive #> in critical array was deasserted

Informational

This event is generated when the drive is removed from a critical array.

Drive <Drive #> in failed array was asserted

Critical

This event is generated when the drive is placed in a failed array.

Drive <Drive #> in failed array was deasserted

Informational

This event is generated when the drive is removed from a failed array.

Drive <Drive #> rebuild in progress was asserted

Informational

This event is generated when the drive is rebuilding.

Drive <Drive #> rebuild aborted was asserted

Warning

This event is generated when the drive rebuilding process is aborted.
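
Most drive events come in asserted and deasserted pairs, so replaying them in order gives the conditions that are still active on each drive. The following Python sketch does this under the assumption that each message arrives as one line of the form "Drive <Drive #> <condition> was asserted" or "... was deasserted". The function name replay_drive_events is hypothetical, and the fault state messages, which use a different wording, are skipped.

```python
import re
from collections import defaultdict

# Assumed layout: "Drive <id> <condition> was asserted|deasserted".
_DRIVE_EVENT = re.compile(
    r"^Drive (?P<drive>\S+) (?P<condition>.+?) was "
    r"(?P<change>asserted|deasserted)\.?$"
)


def replay_drive_events(messages):
    """Return {drive: set of conditions still asserted} after replaying messages in order."""
    active = defaultdict(set)
    for message in messages:
        match = _DRIVE_EVENT.match(message.strip())
        if match is None:
            continue  # wording not covered by this sketch, e.g. "asserted fault state"
        conditions = active[match.group("drive")]
        if match.group("change") == "asserted":
            conditions.add(match.group("condition"))
        else:
            conditions.discard(match.group("condition"))
    return dict(active)
```

A drive that logged "predictive failure was asserted" followed by "predictive failure was deasserted" ends with no active predictive failure, which matches the Informational severity given to the deasserted message.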


Intrusion Events

The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.

Event Message

Severity

Cause

<Intrusion sensor Name> sensor detected an intrusion.

Critical

This event is generated when the intrusion sensor detects an intrusion.

<Intrusion sensor Name> sensor returned to normal state.

Information

This event is generated when the earlier intrusion has been corrected.

<Intrusion sensor Name> sensor intrusion was asserted while system was ON

Critical

This event is generated when the intrusion sensor detects an intrusion while the system is on.

<Intrusion sensor Name> sensor intrusion was asserted while system was OFF

Critical

This event is generated when the intrusion sensor detects an intrusion while the system is off.


BIOS Generated System Events

The BIOS-generated messages monitor the health and functionality of the chipsets, I/O channels, and other BIOS-related functions.

Event Message

Severity

Cause

System Event I/O channel chk.

Critical

This event is generated when a critical interrupt is generated in the I/O channel.

System Event PCI Parity Err.

Critical

This event is generated when a parity error is detected on the PCI bus.

System Event Chipset Err.

Critical

This event is generated when a chip error is detected.

System Event PCI System Err.

Information

This event indicates historical data, and is generated when the system has crashed and recovered.

System Event PCI Fatal Err.

Critical

This event is generated when a fatal error is detected on the PCI bus.

System Event PCIE Fatal Err.

Critical

This event is generated when a fatal error is detected on the PCIE bus.

POST Err: POST fatal error #<number> or <error description>

Critical

This event is generated when an error occurs during system boot. See the system documentation for more information on the error code.

Memory Spared: redundancy lost

Critical

This event is generated when memory spare is no longer redundant.

Memory Mirrored: redundancy lost

Critical

This event is generated when memory mirroring is no longer redundant.

Memory RAID: redundancy lost

Critical

This event is generated when memory RAID is no longer redundant.

Err Reg Pointer: OEM Diagnostic data event was asserted

Information

This event is generated when an OEM event occurs.

System Board PFault Fail Safe state asserted

Critical

This event is generated when the system board voltages are not at normal levels.

System Board PFault Fail Safe state deasserted

Information

This event is generated when the system board voltages that caused the earlier PFault Fail Safe state return to normal levels.

Memory Add: (BANK# DIMM#) presence was asserted

Information

This event is generated when memory is added to the system.

Memory Removed: (BANK# DIMM#) presence was asserted

Information

This event is generated when memory is removed from the system.

Memory Cfg Err: configuration error (BANK# DIMM#) was asserted

Critical

This event is generated when memory configuration is incorrect for the system.

Mem Redun Gain: redundancy regained

Information

This event is generated when memory redundancy is regained.

Mem ECC Warning: transition to non-critical from OK

Warning

This event is generated when correctable ECC errors have increased from a normal rate.

Mem ECC Warning: transition to critical from less severe

Critical

This event is generated when correctable ECC errors reach a critical rate.

Mem CRC Err: transition to non-recoverable

Critical

This event is generated when CRC errors enter a non-recoverable state.

Mem Fatal SB CRC: uncorrectable ECC was asserted

Critical

This event is generated when CRC errors occur while storing to memory.

Mem Fatal NB CRC: uncorrectable ECC was asserted

Critical

This event is generated when CRC errors occur while removing from memory.

Mem Overtemp: critical over temperature was asserted

Critical

This event is generated when system memory reaches critical temperature.

USB Over-current: transition to non-recoverable

Critical

This event is generated when the USB exceeds a predefined current level.

Hdwr version err: hardware incompatibility (BMC/iDRAC Firmware and CPU mismatch) was asserted

Critical

This event is generated when there is a mismatch between the BMC and iDRAC firmware and the processor in use or vice versa.

Hdwr version err: hardware incompatibility (BMC/iDRAC Firmware and CPU mismatch) was deasserted

Information

This event is generated when the earlier mismatch between the BMC and iDRAC firmware and the processor is corrected.

Hdwr version err: hardware incompatibility (BMC/iDRAC Firmware and CPU mismatch) was deasserted

Information

This event is generated when an earlier hardware mismatch is corrected.

SBE Log Disabled: correctable memory error logging disabled was asserted

Critical

This event is generated when the ECC single bit error rate is exceeded.

CPU Protocol Err: transition to non-recoverable

Critical

This event is generated when the processor protocol enters a non-recoverable state.

CPU Bus PERR: transition to non-recoverable

Critical

This event is generated when the processor bus PERR enters a non-recoverable state.

CPU Init Err: transition to non-recoverable

Critical

This event is generated when the processor initialization enters a non-recoverable state.

CPU Machine Chk: transition to non-recoverable

Critical

This event is generated when the processor machine check enters a non-recoverable state.

Logging Disabled: all event logging disabled was asserted

Critical

This event is generated when all event logging is disabled.

LinkT/FlexAddr: Link Tuning sensor, device option ROM failed to support link tuning or flex address (Mezz XX) was asserted

Critical

This event is generated when the PCI device option ROM for a NIC does not support link tuning or the Flex addressing feature.

LinkT/FlexAddr: Link Tuning sensor, failed to program virtual MAC address (<location>) was asserted.

Critical

This event is generated when BIOS fails to program virtual MAC address on the given NIC device.

PCIE NonFatal Er: Non Fatal IO Group sensor, PCIe error(<location>)

Warning

This event is generated in association with a CPU IERR.

I/O Fatal Err: Fatal IO Group sensor, fatal IO error (<location>)

Critical

This event is generated in association with a CPU IERR and indicates which device caused the CPU IERR.

Unknown system event sensor: unknown system hardware failure was asserted

Critical

This event is generated when an unknown hardware failure is detected.


R2 Generated System Events

Table 3-13. R2 Generated Events

Description

Severity

Cause

System Event: OS stop event OS graceful shutdown detected

Information

The OS was shut down or restarted normally.

OEM Event data record (after OS graceful shutdown/restart event)

Information

Comment string accompanying an OS shutdown/restart.

System Event: OS stop event runtime critical stop

Critical

The OS encountered a critical error and was stopped abnormally.

OEM Event data record (after OS bugcheck event)

Information

OS bugcheck code and parameters.


Cable Interconnect Events

The cable interconnect messages are used for detecting errors in the hardware cabling.

Table 3-14. Cable Interconnect Events

Description

Severity

Cause

<Cable sensor Name/Location> Configuration error was asserted.

Critical

This event is generated when the cable is not connected or is incorrectly connected.

<Cable sensor Name/Location> Connection was asserted.

Information

This event is generated when the earlier cable connection error was corrected.


Battery Events

Table 3-15. Battery Events

Description

Severity

Cause

<Battery sensor Name/Location> Failed was asserted

Critical

This event is generated when the sensor detects a failed or missing battery.

<Battery sensor Name/Location> Failed was deasserted

Information

This event is generated when the earlier failed battery was corrected.

<Battery sensor Name/Location> is low was asserted

Warning

This event is generated when the sensor detects a low battery condition.

<Battery sensor Name/Location> is low was deasserted

Information

This event is generated when the earlier low battery condition was corrected.


Power And Performance Events

The power and performance events are used to detect degradation in system performance caused by a change in the power supply.

Table 3-16. Power And Performance Events

Description

Severity

Cause

System Board Power Optimized: Performance status sensor for System Board, degraded, <description of why> was deasserted

Normal

This event is generated when system performance was restored.

System Board Power Optimized: Performance status sensor for System Board, degraded, <description of why> was asserted

Warning

This event is generated when a change in the power supply degrades system performance.


Entity Presence Events

The entity presence messages are used for detecting different hardware devices.

Table 3-17. Entity Presence Events

Description

Severity

Cause

<Device Name> presence was asserted

Information

This event is generated when the device was detected.

<Device Name> absent was asserted

Critical

This event is generated when the device was not detected.

