Maintenance covers a broad spectrum of activities. Its goal is to keep a storage array operational and available to all hosts. This chapter provides descriptions of command line interface (CLI) and script commands that you can use to perform storage array maintenance. The commands are organized into four sections:
Routine maintenance
Performance tuning
Troubleshooting and diagnostics
Recovery operations
The organization is not a rigid approach, and you can use the commands as appropriate for your storage array. The commands listed in this chapter are not the full set of commands you can use for maintenance. Other commands, particularly the set commands, can provide diagnostic or maintenance capabilities.
Routine Maintenance
Routine maintenance involves those tasks you might perform periodically to ensure that the storage array is running as well as possible or to detect conditions before they become problems.
Running a Media Scan
Media scan provides a method of detecting physical disk media errors before they are found during a normal read from or write to the physical disks. Any errors detected are reported to the Major Event Log (MEL). Media scan provides an early indication of a potential drive failure and reduces the possibility of encountering a media error during host operations. A media scan is performed as a background operation and scans all data and consistency information in defined user virtual disks. A media scan runs on all virtual disks in the storage array that meet the following conditions:
An Optimal status
No modification operations in progress
Media scan enabled
Errors detected during a scan of a user virtual disk are reported to the MEL and handled as:
Unrecovered media error: The physical disk could not read the requested data on its first attempt or on any subsequent retries. For virtual disks with redundancy protection, the data could not be reconstructed from the redundant copy. The error is not corrected, but it is reported to the MEL.
Reconstructed media error: The physical disk could not read the requested data on its first attempt or on any subsequent retries. The data is reconstructed from the redundant copy, rewritten to the drive, and verified, and the error is reported to the MEL.
Recovered media error: The physical disk could not read the requested data on its first attempt. The data is rewritten to the physical disk and verified, and the error is reported to the MEL.
Consistency mismatches: Consistency errors are found, and a media error is forced on the block stripe so that it is found when the physical disk is scanned again. If consistency is repaired, this forced media error is removed. The first ten consistency mismatches found on a virtual disk are reported to the MEL.
Unfixable error: The data could not be read, and consistency information could not be used to regenerate it. For example, consistency information cannot be used to reconstruct data on a degraded virtual disk. The error is reported to the MEL.
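The three media error categories above can be read as a simple decision sequence. The following Python sketch is an illustration only, not the logic of the RAID controller module firmware. The function name and its two inputs are hypothetical, and consistency mismatches and unfixable errors are left out.

```python
def classify_media_error(retry_succeeded, reconstructed):
    """Name the media error category for a read that failed on its first attempt.

    retry_succeeded: True if a later retry by the physical disk read the data.
    reconstructed:   True if the data was rebuilt from the redundant copy.
    Both inputs are hypothetical; they restate the descriptions in the list above.
    """
    if retry_succeeded:
        # The data was read after all; it is rewritten and verified.
        return "Recovered media error"
    if reconstructed:
        # Every attempt failed, but redundancy supplied the data.
        return "Reconstructed media error"
    # Every attempt failed and no redundant copy could supply the data.
    return "Unrecovered media error"
```

In all three cases the error is reported to the MEL; only the unrecovered case leaves the data uncorrected.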
The script command set provides two commands to define media scan properties:
set virtualDisk
set storageArray
The set virtualDisk command enables a media scan for the virtual disk. The following syntax is the general form of the command:
The set storageArray command defines how frequently a media scan is run on a storage array. The following syntax is the general form of the command:
set storageArray mediaScanRate=(disabled | 1-30)
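If you generate script commands from another program, it can help to check the value before the command is issued. The following Python sketch builds the command text from the general form shown above. The function name is hypothetical, and the sketch only produces a string; it does not run the command.

```python
def media_scan_rate_command(rate):
    """Return the text of a set storageArray mediaScanRate command.

    rate must be the string "disabled" or an integer from 1 to 30,
    which are the values allowed by the general form of the command.
    """
    if rate == "disabled":
        return "set storageArray mediaScanRate=disabled"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, int):
        raise ValueError("rate must be 'disabled' or an integer from 1 to 30")
    if not 1 <= rate <= 30:
        raise ValueError("rate must be in the range 1-30")
    return f"set storageArray mediaScanRate={rate}"


print(media_scan_rate_command(15))
```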
Running a Consistency Check
Consistency checks are performed when media scans are run, if consistency check is enabled on the virtual disk. (See Running a Media Scan for an explanation about setting up and running media scans.) During a consistency check, all data blocks in a virtual disk are scanned, and deteriorated data is corrected. The method of correction depends on the redundant array of independent disks (RAID) level:
RAID 5 and RAID 6 virtual disks: Consistency is checked and repaired.
RAID 1 virtual disks: The data is compared between the mirrored physical disks, and data inconsistencies are repaired.
RAID 0 virtual disks: No redundancy exists.
Before attempting a consistency check, you must enable the process with the set virtualDisk command, which uses the following general form:
NOTICE: When you reset a RAID controller module, the RAID controller module is not available for I/O operations until the reset is complete. If a host is using virtual disks owned by the RAID controller module being reset, the I/O directed to the RAID controller module is rejected. Before resetting the RAID controller module, ensure that a multipath driver is installed on all hosts using these virtual disks. If a multipath driver is not installed, the virtual disks will not be available.
Resetting a RAID controller module is the same as rebooting the RAID controller module processors. To reset a RAID controller module, run the following command:
reset controller [(0 | 1)]
Enabling RAID Controller Module Data Transfer
At times, a RAID controller module might become quiescent while running diagnostics. If this occurs, the RAID controller module might become unresponsive. To revive a RAID controller module, run the following command:
enable controller [(0 | 1)] dataTransfer
Resetting Battery Age
NOTE: A smart battery module does not require the battery age to be reset.
After replacing the batteries in the storage array, you must reset the age of the battery, either for an entire storage array or one battery in a specific RAID controller module. To reset the age to zero days, run the following command:
Persistent reservations preserve virtual disk registrations and prevent hosts, other than the host defined for the virtual disk, from accessing the virtual disk. You must remove persistent reservations before you perform the following changes to your configuration:
Change or delete logical unit number (LUN) mappings on a virtual disk holding a reservation.
Delete virtual disk groups or virtual disks that have any reservations.
To determine which virtual disks have reservations, run the following command:
To synchronize the clocks on both RAID controller modules in a storage array with the host clock, run the following command:
set storageArray time
Locating Physical Disks
At times, you might need to locate a specific physical disk, which can be awkward in very large storage array configurations. To find the physical disk, turn on the indicator LED on the front of the physical disk by running the following command:
start physicalDisk [enclosureID,slotID] blink
To turn off the indicator LED after locating the physical disk, run the following command:
stop physicalDisk blink
Performance Tuning
Over time, as a storage array exchanges data between the hosts and physical disks, its performance can degrade. Monitor the performance of a storage array and make adjustments to the storage array operational settings to improve performance.
Monitoring Performance
Monitor the performance of a storage array by using the save storageArray performanceStats command. This command saves performance information to a file that you can review to determine how well the storage array is running. Table 6-1 lists the performance information saved to the file.
Table 6-1. Storage Array Performance Information
Type of Information | Description
Devices | RAID Controller Modules (the RAID controller module in slot 0 or 1 and a list of the virtual disks owned by the RAID controller module); Virtual Disk (a list of the virtual disk names); Storage Array Totals (a list of the totals for both RAID controller modules in an active-active RAID controller module pair, regardless of whether one, both, or neither is selected for monitoring)
Total I/Os | Number of total I/Os performed since the storage array was started
Read Percentage | Percentage of total I/Os that are read operations (calculate the write percentage by subtracting the read percentage from 100 percent)
Cache Hit Percentage | Percentage of reads that are fulfilled by data from the cache rather than requiring an actual read from a physical disk
Current KB/second | Current transfer rate in kilobytes per second (current means the number of kilobytes per second since the last time the polling interval elapsed, causing an update to occur)
Maximum KB/second | Highest data transfer value achieved in the current kilobyte-per-second statistic block
Current IO/second | Current number of I/Os per second (current means the number of I/Os per second since the last time the polling interval elapsed, causing an update to occur)
Maximum IO/second | Highest number of I/Os achieved in the current I/O-per-second statistic block
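The percentages in Table 6-1 can be combined with the total I/O count to estimate how many operations of each kind took place. The following Python sketch shows the arithmetic. The function name, the dictionary keys, and the sample values are made up for illustration, and the sketch does not read the saved statistics file.

```python
def derived_io_stats(total_ios, read_pct, cache_hit_pct):
    """Derive write and cache figures from the values described in Table 6-1."""
    write_pct = 100.0 - read_pct          # write percentage = 100 - read percentage
    reads = total_ios * read_pct / 100.0  # estimated number of read operations
    writes = total_ios - reads            # the remaining I/Os are writes
    cache_hits = reads * cache_hit_pct / 100.0  # reads served from the cache
    return {
        "write_pct": write_pct,
        "reads": reads,
        "writes": writes,
        "cache_hits": cache_hits,
        "disk_reads": reads - cache_hits,  # reads that went to a physical disk
    }


# Sample values only: 200,000 I/Os, 75 percent reads, 40 percent cache hits.
stats = derived_io_stats(200000, 75.0, 40.0)
print(stats["write_pct"], stats["cache_hits"], stats["disk_reads"])
```

With these sample values, the write percentage is 25, and 60,000 of the 150,000 reads are served from the cache.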
The general form of the command is:
save storageArray performanceStats file="filename"
where file is the name of the file in which you want to save the performance statistics. You can use any file name your operating system can support. The default file type is .csv. The performance information is saved as a comma-delimited file.
Before using the save storageArray performanceStats command, run the set session performanceMonitorInterval and set session performanceMonitorIterations commands to specify how often statistics are collected.
Changing RAID Levels
When creating a disk group, define the RAID level for the virtual disks in that group. You can later change the RAID level to improve performance or provide more secure protection for your data. To change the RAID level, run the following command:
set diskGroup [diskGroupNumber] raidLevel=(0|1|5|6)
where diskGroupNumber is the number of the disk group for which to change the RAID level.
Changing Segment Size
When creating a new virtual disk, define the segment size for that virtual disk. You can later change the segment size to optimize performance. In a multi-user database or file system storage environment, set your segment size to minimize the number of physical disks needed to satisfy an I/O request. Use larger values for the segment size. Using a single physical disk for a single request leaves other disks available to simultaneously service other requests. If the virtual disk is in a single-user large I/O environment, performance is maximized when a single I/O request is serviced with a single data stripe; use smaller values for the segment size. To change the segment size, run the following command:
set virtualDisk ([virtualDiskName] | <wwid>) segmentSize=segmentSizeValue
where segmentSizeValue is the new segment size you want to set. Valid segment size values are 8, 16, 32, 64, 128, 256, and 512. You can identify the virtual disk by name or World Wide Identifier (WWID) (see Set Virtual Disk).
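The effect of segment size on the number of physical disks that serve a request can be estimated with simple arithmetic. The following Python sketch is a rough model under stated assumptions: the I/O size is expressed in the same units as the segment size (the units are not stated here), the request starts on a segment boundary, and consistency data and the actual number of physical disks in the disk group are ignored. The function name is hypothetical.

```python
import math

# Valid segment size values as listed above.
VALID_SEGMENT_SIZES = (8, 16, 32, 64, 128, 256, 512)


def disks_per_request(io_size, segment_size):
    """Estimate how many segments, and so how many physical disks, one request spans."""
    if segment_size not in VALID_SEGMENT_SIZES:
        raise ValueError("segment size must be one of %s" % (VALID_SEGMENT_SIZES,))
    if io_size <= 0:
        raise ValueError("I/O size must be positive")
    # An aligned request of io_size covers ceil(io_size / segment_size) segments.
    return math.ceil(io_size / segment_size)


# A request of size 64 touches 8 disks with segment size 8,
# but only 1 disk with segment size 128.
print(disks_per_request(64, 8), disks_per_request(64, 128))
```

A larger segment size keeps a small request on one physical disk, which suits the multi-user case. A smaller segment size spreads a large request across the stripe, which suits the single-user large I/O case.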
Defragmenting a Disk Group
When you defragment a disk group, you consolidate the free capacity in the disk group into one contiguous area. Defragmentation does not change the way in which the data is stored on the virtual disks. As an example, consider a disk group with five virtual disks. If you delete virtual disks 1 and 3, your disk group is configured in the following manner:
space, virtual disk 2, space, virtual disk 4, virtual disk 5, original unused space
When you defragment this group, the space (free capacity) is consolidated into one contiguous location after the virtual disks. After being defragmented, the disk group is:
virtual disk 2, virtual disk 4, virtual disk 5, consolidated unused space
To defragment a disk group, run the following command:
start diskGroup [diskGroupNumber] defragment
where diskGroupNumber is the identifier for the disk group.
NOTE: Defragmenting a disk group starts a long-running operation.
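The consolidation described above can be modeled as moving every free extent to the end of the disk group while keeping the virtual disks in their original order. The following Python sketch illustrates the idea only. The layout representation, the "free" label, and the capacity numbers are made up for the example, and the sketch does not describe how the storage array moves data internally.

```python
FREE = "free"  # hypothetical label for a free-capacity extent


def defragment(layout):
    """Return a new layout with all free capacity merged into one trailing extent.

    layout is a list of (name, capacity) tuples in on-disk order.
    """
    virtual_disks = [(name, cap) for name, cap in layout if name != FREE]
    free_total = sum(cap for name, cap in layout if name == FREE)
    if free_total:
        virtual_disks.append((FREE, free_total))
    return virtual_disks


# The example from the text: virtual disks 1 and 3 were deleted.
before = [
    (FREE, 10), ("virtual disk 2", 20), (FREE, 15),
    ("virtual disk 4", 20), ("virtual disk 5", 30), (FREE, 5),
]
print(defragment(before))
```

The virtual disks keep their order and capacity; only the free capacity changes position and is reported as one extent.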
Troubleshooting and Diagnostics
If a storage array exhibits abnormal operation or failures, you can use the commands described in this section to determine the cause of the problems.
Collecting Physical Disk Data
To gather information about all the physical disks in a storage array, run the save allPhysicalDisks command. This command collects sense data from all the physical disks in a storage array and saves the data to a file. The sense data consists of statistical information maintained by each of the physical disks in the storage array.
Diagnosing a RAID Controller Module
The diagnose controller command's testID parameter takes the following options, which you can use to verify that a RAID controller module is functioning correctly:
1 Performs the read test
2 Performs the data loopback test
3 Performs the write test
The read test initiates a read command as it would be sent over an I/O data path. The read test compares data with a known, specific data pattern, checking for data integrity and errors. If the read command is unsuccessful or the data compared is not correct, the RAID controller module is considered to be in error and is placed offline.
Run the data loopback test only on RAID controller modules that have connections between the RAID controller module and the physical disks. The test passes data through each RAID controller module physical disk-side channel out onto the loop and back again. Enough data is transferred to determine error conditions on the channel. If the test fails on any channel, this status is saved so that it can be returned if all other tests pass.
The write test initiates a write command as it would be sent over an I/O data path to the diagnostics region on a specified physical disk. This diagnostics region is then read and compared to a specific data pattern. If the write fails or the data compared is not correct, the RAID controller module is considered to be in error, and it is failed and placed offline.
For best results, run all three tests at initial installation. Also, run the tests any time you make changes to the storage array or to components connected to the storage array (such as hubs, switches, and host adapters).
A custom data pattern file called diagnosticsDataPattern.dpf is included in the Utility directory of the installation CD. You can modify this file, but the file must have the following properties to work correctly for the tests:
The file values must be entered in hexadecimal format (00 to FF) with only one space between the values.
The file must be no larger than 64 bytes in size. Smaller files can be used, but larger files can cause an error.
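Before using a modified pattern file, you can check it against the two properties above. The following Python sketch is one possible check, not a tool supplied with the storage array. It assumes that the 64-byte limit applies to the size of the file as stored, that the values sit on a single line, and that one trailing line ending is acceptable; none of these details is confirmed here. The function name is hypothetical.

```python
import re

# Two hex digits, then any number of (single space + two hex digits).
_HEX_VALUES = re.compile(r"[0-9A-Fa-f]{2}( [0-9A-Fa-f]{2})*")


def check_pattern_text(text):
    """Return a list of problems found in the contents of a data pattern file."""
    problems = []
    # Assumption: the limit is on the stored file size, counted in bytes.
    if len(text.encode("utf-8")) > 64:
        problems.append("file is larger than 64 bytes")
    # Assumption: ignore one trailing line ending before checking the format.
    body = text.rstrip("\r\n")
    if not _HEX_VALUES.fullmatch(body):
        problems.append("values must be hex pairs (00 to FF) separated by one space")
    return problems


print(check_pattern_text("00 FF a5 5A"))  # an empty list means no problems found
```

To check a file on disk, read its contents as text and pass the result to the function.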
The test results contain a generic, overall status message and a set of specific test results. Each test result contains the following information:
Test (read/write/data loopback)
Port (read/write)
Level (internal/external)
Status (pass/fail)
Events are written to the MEL when diagnostics are started and when testing is completed. These events help you to evaluate whether diagnostics testing was successful or failed and the reason for the failure.
Recovery Operations
Recovery operations involve replacing failed RAID controller modules and physical disks, restoring data, and restoring the storage array to operation.
Setting RAID Controller Module Operational Mode
A RAID controller module has three operational modes:
Online
Offline
Service
NOTICE: Placing a RAID controller module offline can cause loss of data.
Placing a RAID controller module online sets it to the Optimal state and makes it active and available for I/O operations. Placing a RAID controller module offline makes it unavailable for I/O operations and moves its disk groups to the other RAID controller module if failover protection is enabled.
Taking a RAID controller module offline can seriously impact data integrity and storage array operation.
If you take a RAID controller module offline, the second RAID controller module in the pair takes over. Disk groups and their associated virtual disks that were assigned to the offline RAID controller module are automatically reassigned to the remaining RAID controller module.
NOTICE: Place a RAID controller module in Service mode only under the direction of Technical Support.
Use Service mode when you want to perform an operation, such as replacing a RAID controller module. Placing a RAID controller module in Service mode makes it unavailable for I/O operations. Placing a RAID controller module in Service mode also moves the disk groups from the RAID controller module to the second RAID controller module without affecting the disk groups' preferred path. Moving disk groups might significantly reduce performance. The disk groups are automatically transferred back to the preferred RAID controller module when it is placed back online.
NOTICE: A multipath driver is required on all hosts and is the only supported configuration. If the multipath driver is not installed, the virtual disks will not be accessible.
Before you place a RAID controller module in Service mode, ensure that a multipath driver is installed on all hosts using these virtual disks.
To change the operational mode of a RAID controller module, run the following command:
set controller [(0 | 1)] availability=(online | offline | serviceMode)
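Because taking a RAID controller module offline or placing it in Service mode affects I/O, a script that generates this command might validate its arguments first. The following Python sketch builds the command text from the general form above. The function name is hypothetical, and the sketch only returns a string; it does not contact the storage array.

```python
VALID_MODES = ("online", "offline", "serviceMode")


def set_controller_availability(slot, mode):
    """Return the text of a set controller availability command.

    slot is the RAID controller module slot (0 or 1);
    mode is one of online, offline, or serviceMode.
    """
    # bool compares equal to 0/1 in Python, so reject it explicitly.
    if isinstance(slot, bool) or slot not in (0, 1):
        raise ValueError("slot must be 0 or 1")
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of: " + ", ".join(VALID_MODES))
    return f"set controller [{slot}] availability={mode}"


print(set_controller_availability(0, "online"))
```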
Changing RAID Controller Module Ownership
You can change which RAID controller module owns a virtual disk by using the set virtualDisk command. The following syntax is the general form of the command:
NOTICE: When you initialize a physical disk, all data on the physical disk is lost.
You must initialize a physical disk when you have moved physical disks that were previously part of a disk group from one storage array to another. If you do not move the entire set of physical disks, the disk group and virtual disk information on the physical disks that you move is incomplete. Each physical disk that you move contains only part of the information defined for the virtual disk and disk group. To be able to reuse the physical disks to create a new disk group and virtual disk, you must erase all old information from the physical disks by initializing the physical disk.
When you initialize a physical disk, all old disk group and virtual disk information is erased, and the physical disk is returned to an unassigned state. Returning a physical disk to an unassigned state adds unconfigured capacity to a storage array. You can use this capacity to create additional disk groups and virtual disks.
To initialize a physical disk, run the following command:
where enclosureID and slotID are the identifiers for the physical disk.
Reconstructing a Physical Disk
If two or more physical disks in a disk group have failed, the virtual disk shows a status of Failed, and none of the virtual disks in the disk group are operating. To return the disk group to an Optimal status, you must replace the failed physical disks. After replacing the physical disks, reconstruct the data on the physical disks. The reconstructed data is the data as it would appear on the failed physical disks.
To reconstruct a physical disk, run the following command:
where enclosureID and slotID are the identifiers for the physical disk.
NOTE: You can use this command only when the physical disk is assigned to a RAID 1, 5, or 6 disk group.
Initializing a Virtual Disk
NOTICE: When you initialize a virtual disk, all data on the virtual disk and all information about the virtual disk are destroyed.
A virtual disk is automatically initialized when you first create it. If the virtual disk starts exhibiting failures, you might be required to re-initialize the virtual disk to correct the failure condition.
The initialization process cannot be cancelled once it has begun. This option cannot be used if any modification operations are in progress on the virtual disk or disk group. To initialize a virtual disk, run the following command:
start virtualDisk [virtualDiskName] initialize
where virtualDiskName is the identifier for the virtual disk.
Redistributing Virtual Disks
Redistributing virtual disks returns the virtual disks to their preferred RAID controller module owners. The preferred RAID controller module ownership of a virtual disk or disk group is the RAID controller module of an active-active pair that is designated to own the virtual disks. The preferred owner for a virtual disk is initially designated when the virtual disk is created. If the preferred RAID controller module is being replaced or undergoing a firmware download, ownership of the virtual disks is automatically shifted to the second RAID controller module. The second RAID controller module becomes the current owner of the virtual disks. This change is considered to be a routine ownership change and is reported in the MEL.
NOTICE: Ensure that a multipath driver is installed, or the virtual disks will not be accessible.
To redistribute virtual disks to their preferred RAID controller modules, run the following command:
reset storageArray virtualDiskDistribution
NOTE: You cannot run this command if all virtual disks are currently owned by their preferred RAID controller module or if the storage array does not have defined virtual disks.
Under some host operating systems, you must reconfigure the multipath host driver. You might also need to make operating system modifications to recognize the new I/O path to the virtual disk.