Interpreting and using performance data: Dell OpenManage™ Data Analyzer Installation and Operation Guide

Interpreting and using performance data: Dell OpenManage Data Analyzer


Overview

This chapter describes how to interpret and use the performance data displayed by the Data Analyzer. Major topics in this chapter are

  • Storage-system array components
  • Performance and I/O load as seen by the default formula sheet
  • Performance and I/O load as seen by the SP formula sheet
  • Performance and I/O load as seen by the LUN formula sheet
  • Primary performance data
  • Creating custom performance formula sheets
  • Cache tuning

Storage-system array components

The logger records and the Data Analyzer displays information on the following storage-system components:

Table 3-1. Storage-system array components

SPs The SP controls array operation by implementing RAID features. It manages the server-array communication and controls the disk modules in the array. Each array can have one or two SPs, SP A and SP B. Some SP performance information is included in the default formula sheet file, default.pfs; there is also a formula sheet file designed specifically for SPs, sp.pfs.
LUNs A LUN is one or more disks seen as a unit (Logical Unit, LUN) by the server operating system. The array sees a LUN as a single entity, though it may include up to 16 disk modules. Some LUN performance information is included in the default formula sheet file, default.pfs; there is also a formula sheet file designed specifically for LUNs, lun.pfs.
Disks The disks (also known as disk modules) connect to the SP. The maximum number of disks depends on the type of disk-array storage system. Disk performance information is included in the default formula sheet file, default.pfs.

Performance and I/O load as seen by the default performance formula sheet

The default performance formula sheet gives each disk, LUN, and SP its own data categories. The pathname of the default formula sheet file is drive:\Program Files\Dell\Dell OpenManage™ Data Analyzer\Data\default.pfs. For each device, the information displayed includes

  • Utilization
  • Total throughput, read throughput, and write throughput
  • Total I/O, read I/O, and write I/O
  • Read size and write size (average)
  • Queue lengths, average and busy
  • Service time
  • Response time and seek distance

The Data Analyzer computes data items by observing the system over a time interval called the observation period. You can set the observation period to the entire period covered by the log file or to a shorter period using filters.

One approach to performance analysis is to time applications over both a long period (such as a work week, with an interval of 10 minutes or more) and a short peak period (such as 10 minutes, with an interval of 10 seconds). Poll rates of less than 10 seconds are not recommended because errors introduced by sampling might distort the results.

The following sections explain the data categories.

NOTE: The color of the display value tells you whether it is relatively high, normal, or low. Data values within a normal range are displayed in black, high values in blue, and low values in green. The range between high and low is about 10%, with normal values between. Thresholds are based on standard deviation.

Utilization

Utilization is the ratio of busy time to available processing time. It shows how close the system is to maximum performance. In most systems, the utilization of all components rises consistently with throughput from 0 to 100%. The component that reaches 100% utilization first determines the maximum throughput of the system.

Figure 3-1. Utilization statistics

Using Utilization bar charts to analyze loads

The utilization percentage of one device tells only part of the story. You can compare the utilization figures for all devices in a hierarchy - SPs and LUNs, or SPs, LUNs, and disks - by selecting the utilization cells for the SP and the devices it owns, and then clicking the bar chart button.

You can determine the owner SP for each LUN, and the disks in the LUN, by selecting the LUN(s) in the device column and then selecting Data > Device Information.

For example, if SP A owns LUN-1, LUN-2, and LUN-3, you can compare the utilization of the SP with the LUNs it controls by selecting SP A, LUN-1, LUN-2, and LUN-3 and creating a bar chart. The resulting display shows which device is most likely to become the bottleneck. A sample utilization bar chart follows.

Figure 3-2. Utilization bar chart

You can also create bar charts to compare the utilization of similar devices - LUNs or disks in a LUN - to determine the workload distribution among them.

Acting on Utilization data

In most applications, no component will exceed 50% utilization. This means the system can support more load without a significant increase in response time. The component with the highest utilization number will be the bottleneck when maximum performance is reached.

Often, the bottleneck device type is the disks, especially if they are full of data and seek distances are long. If the I/O size is small and the seek distances are short, then the SP may be the bottleneck component.

If a component has a high utilization, try to reduce the workload of that component. There are three ways to reduce workload:

  1. Reduce the demand on the component. This might be as simple as scheduling some operations to another time of day. You can reduce disk workload by using caching so that data can be accessed in the cache rather than from disk. You may also be able to reduce workload by choosing a more appropriate RAID level.
  2. Move some workload to another component. You can do this by moving files and databases (or index tables of databases) to reside on different LUNs. If the disk module utilizations within a striped LUN are not equal, the distribution may be improved by changing the stripe size. Reducing the stripe size increases the chances of localizing I/O to one disk, but it can also have the detrimental effect of increasing the number of stripe-crossing I/Os.
  3. Add components. If the Data Analyzer shows disks as the bottleneck, then you can improve performance by adding more disks or LUNs. But if the Data Analyzer shows an SP as a bottleneck, adding more disks will not help performance. For example, adding disks will not help if disk utilization is 60% and the SP utilization is 65%, since the SP is the limiting factor.
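
The comparison in step 3 can be sketched in a few lines of Python. This is only an illustration and is not part of the Data Analyzer; the function name and the dictionary of utilization percentages are hypothetical, and the sample values are the 60% and 65% figures quoted above.

```python
def find_bottleneck(utilization_pct):
    """Return the (name, utilization) pair with the highest utilization.

    utilization_pct maps a component name to its utilization in percent.
    The names and values are illustrative, not Data Analyzer output.
    """
    name = max(utilization_pct, key=utilization_pct.get)
    return name, utilization_pct[name]


# Figures from the example above: disks at 60%, SP at 65%.
sample = {"Disks": 60.0, "SP A": 65.0}
component, pct = find_bottleneck(sample)
print(f"{component} is the likely bottleneck at {pct:.0f}% utilization")
```

Because the SP has the higher figure, adding disks would not raise the maximum throughput in this case.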

Before drawing conclusions from utilization data, consider the time dependencies. Performance might be acceptable most of the time but unacceptable at certain times of the day (or week, or month). The Data Analyzer can record samples over long periods and identify peaks for detailed analysis.

Total Throughput, Read Throughput, and Write Throughput (MB/s)

Throughput is the rate at which data is read and written by the system. The Data Analyzer calculates data throughput for reads and writes to each component of the system. A sample throughput display follows.

Figure 3-3. Sample throughput statistics

The importance of throughput depends on I/O size. Small I/Os of fewer than eight blocks use more resources to process the request than to transfer data. As the I/O size increases, the resources needed to transfer data become more significant. For example,

  • A 0.5-KByte transfer to a disk takes 8 ms for seek and rotation, and only 0.125 ms to transfer data.
  • A 64-KByte transfer takes the same 8 ms for seek and rotation, but requires 16 ms to transfer the data.
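
The arithmetic behind these two examples can be reproduced with a short Python sketch. It assumes a fixed 8 ms for seek and rotation and a transfer cost of 0.25 ms per KByte, which is the rate implied by both figures above; the function and constant names are hypothetical, and real disks will differ.

```python
SEEK_AND_ROTATION_MS = 8.0   # positioning time from the example above
TRANSFER_MS_PER_KB = 0.25    # implied by 0.125 ms per 0.5 KByte and 16 ms per 64 KBytes


def io_time_ms(size_kb):
    """Return (total time in ms, fraction of that time spent transferring data)."""
    transfer_ms = size_kb * TRANSFER_MS_PER_KB
    total_ms = SEEK_AND_ROTATION_MS + transfer_ms
    return total_ms, transfer_ms / total_ms


for size_kb in (0.5, 64):
    total_ms, share = io_time_ms(size_kb)
    print(f"{size_kb:>5} KBytes: {total_ms:.3f} ms total, "
          f"{share:.1%} spent transferring data")
```

Under these assumptions the small transfer spends under 2% of its time moving data, while the large transfer spends about two thirds, which is why throughput matters more as I/O size grows.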

Total I/O, Read I/O, and Write I/O (per second)

The Data Analyzer uses three columns to display the total I/O, read I/O, and write I/O rates per second. A sample I/O display follows.

Figure 3-4. Sample I/O display

The number of I/Os displayed for a LUN is the sum for all disks in the LUN. The distribution of I/Os among a LUN's disks should be relatively even; if not, possibly another RAID type would be more efficient.

The number of I/Os processed by the disks depends upon these rules:

  • Each read operation within stripe boundaries results in a single read operation from a disk.
  • If the read operation crosses a stripe boundary, one disk read is issued for each stripe.
  • I/Os that are greater than 64 KBytes are divided into 64-KByte pieces.
  • RAID-5 write operations require two reads and two writes to the disks.
  • RAID-5 write optimizations reduce the number of disk I/Os when stripe crossings take place.
  • Read and write cache can reduce disk I/Os, depending on the hit rate.
  • Server I/O requests are broken up to maximize efficiency and keep the server, SP, buses, and disks working in parallel.
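
The first two rules can be illustrated with a rough Python estimate of how many disk reads one uncached server read produces. This is a simplified sketch, not the array's actual algorithm: the function name is hypothetical, the 128-block (64-KByte) stripe element size is an assumed value, and the sketch ignores caching, the 64-KByte split, and RAID-5 write handling.

```python
def disk_reads_for_read(start_block, block_count, stripe_element_blocks=128):
    """Estimate the disk reads issued for one uncached server read.

    A read that stays inside one stripe element costs a single disk read;
    each boundary it crosses adds one more. Blocks are 512-byte sectors.
    """
    if block_count <= 0:
        return 0
    first_element = start_block // stripe_element_blocks
    last_element = (start_block + block_count - 1) // stripe_element_blocks
    return last_element - first_element + 1


# An 8-block read that fits inside one element, and one that straddles a boundary.
print(disk_reads_for_read(0, 8))
print(disk_reads_for_read(124, 8))
```

The second call covers blocks 124 through 131, so it touches two elements and counts as two disk reads.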

There is a complex relationship between I/O requests made by the server (which includes user I/O) and the physical I/O performed by the disks. The Data Analyzer lets you plot physical I/O over time and identify problems caused by poor distribution or excessive I/O.

Read Size and Write Size (Average) in KBytes

The Average Read and Write Size columns show the average sizes of the actual reads and writes. A sample Read/Write size display follows.

Figure 3-5. Sample Read/Write Size display

For a disk, large differences between I/O sizes might result from too many stripe boundary crossings - caused by an inappropriate RAID-5 stripe element size.

Queue lengths, Average and Busy

Queue length information helps you to delve further into the load on the system and see which components are responsible for holding up jobs. A sample queue length display follows.

Figure 3-6. Sample queue length display

The following figure shows how the queue length components interrelate.

Figure 3-7. The interrelationship between queue length components

The queuing figure above shows arrivals while jobs 2 and 3 are queued. When job 1 is done, job number 2 is processed. The Data Analyzer displays two items of queuing information: the average queue length over the entire observation period, and the queue length when the device is busy. Queue length when busy (AvgBusyQueue) is always greater than 1 and is used to calculate response time:

ResponseTime = ServiceTime * AvgBusyQueue
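
As a worked instance of this formula, the following Python sketch multiplies a service time by a busy queue length. The function name and the sample values (10 ms service time, busy queue length of 2.5) are hypothetical and chosen only to show the arithmetic.

```python
def response_time(service_time, avg_busy_queue):
    """ResponseTime = ServiceTime * AvgBusyQueue, in the units of service_time."""
    return service_time * avg_busy_queue


# Hypothetical values: 10 ms service time with 2.5 requests queued when busy.
print(response_time(10.0, 2.5), "ms")
```

With these values, a request waits on average 25 ms, of which only 10 ms is actual service.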

Queuing models use the average queue length over the entire observation period (including zero length when the device is idle) to predict how the system will react when additional load is applied to the system.

Service Time

Service time is the average time taken by a disk or SP to process an I/O. For a disk, this means reading or writing, not including time spent in a queue waiting for service. For an SP, the software computes service time using utilization and throughput data. A sample service time display follows.

Figure 3-8. Sample service time display

The accuracy of the service times depends upon counting enough busy samples. Service time numbers are accurate only if the observation period is at least 30 seconds and utilization is at least 10%.

The service time of a component depends upon the kind of workload applied. You can use service time information to help identify abnormal behavior. For example, long disk seeks caused by fragmentation produce longer service times.

Response Time and Seek Distance

Response time is the total time a disk and/or SP need to answer a server request. For a disk, it includes SP and disk seek time; for an SP, it includes SP processing time, I/O time, and caching overhead. A sample Response Time/Seek Distance display follows.

Figure 3-9. Sample Response Time/Seek distance display

Response time is in microseconds; seek distance is in units of 1,000 sectors (a sector is a disk block, 512 bytes).

Seek distance is the sum of the distances the disk's read/write heads travelled for all I/O requests during the interval. The seek distance lets you gauge the locality of reference of your disk-based applications. In many applications, seek distance affects I/O response time more than any other I/O component. Some common ways to reduce seek distance follow:

  • Delete unused data from the disk — Provided your system has effective space management, this will help consolidate the useful data.
  • Reduce fragmentation on the LUN(s) — If your software does not have an effective defragmentation tool, you may need to delete unused files, back up the remaining files, delete all files, and then reload the files. This makes the data contiguous and reduces seeks. Software products are available that perform data compacting (defragmentation) without requiring you to unload and reload the disk. Some have optimization features that let you consolidate the most frequently used data.
  • Concentrate your most frequently accessed data — Do not distribute the most frequently accessed data across lots of partitions. If possible, choose partitioning that has most of the frequently accessed data in one partition near the middle of the LUN. Place less used partitions near the beginning and end of the LUN.
  • Use more disks — If your data is spread thinly over lots of disks, then each disk has less distance to seek; consequently response time is better. This also means that if your applications are multi-tasked, disks can work in parallel, increasing system throughput.

Over time, data on a LUN tends to become more fragmented, and this leads to longer seek times.


Performance and I/O load as seen by the SP performance formula sheet

With the SP performance formula sheet, seven categories of SP-oriented information appear before the default categories (Utilization, Total Tput, Read Tput, and so on). The pathname of the SP formula sheet file is drive:\Program Files\Dell\Dell OpenManage™ Data Analyzer\Data\sp.pfs. The SP information pertains to the write cache only, so it is useful only if write caching is enabled and is occurring. Information displayed includes

  • High watermark flush on — Number of times a write flush operation occurred because the cache contained its High Watermark percentage of modified pages. When write caching is enabled, the SP begins to flush data to disk when the cache contains more than the High Watermark percentage of modified pages.
  • Idle flush on — Number of times a write flush operation occurred because a unit was idle. If a LUN is idle for a period of time, the SP flushes modified pages to the LUN's disks.
  • Low watermark flush off — Number of times a write flush operation stopped because the level reached the Low Watermark percentage of modified pages. When the SP begins flushing the cache, it continues until the number of modified unflushed pages reaches the Low Watermark.
  • Blocks (512 bytes) flushed per second
  • Write cache flushes per second
  • Flush ratio
  • Percentage of dirty pages
  • Utilization, Total Tput, Read Tput, and so on. These are explained in the section on the default performance formula sheet, starting with "Utilization."

The Data Analyzer computes data items by observing the system over a time interval called the observation period. You can set the observation period to the entire period covered by the log file or to a shorter period using filters.

NOTE: The color of the display value tells you whether it is relatively high, normal, or low. Data values within a normal range are displayed in black, high values in blue, and low values in green. The range between high and low is about 10%, with normal values between. Thresholds are based on standard deviation.

Performance and I/O load as seen by the LUN performance formula sheet

With the LUN performance formula sheet, seven categories of LUN-oriented information are displayed between the default categories Utilization and Total Tput. The pathname of the LUN formula sheet file is drive:\Program Files\Dell\Dell OpenManage™ Data Analyzer\Data\lun.pfs. Some LUN information pertains to caching and some to stripe crossings. For caching details, see "Cache tuning." The display includes

  • Utilization
  • Read Cache Hit Ratio and Write Cache Hit Ratio
  • Percent of used prefetches
  • Forced flushes

The Data Analyzer computes data items by observing the system over a time interval called the observation period. You can set the observation period to the entire period covered by the log file or to a shorter period using filters. The LUN performance formula sheet is as follows.

Figure 3-10. LUN Performance formula sheet

NOTE: The color of the display value tells you whether it is relatively high, normal, or low. Data values within a normal range are displayed in black, high values in blue, and low values in green. The range between high and low is about 10%, with normal values between. Thresholds are based on standard deviation.

Utilization

Utilization is described under "Utilization" in the section on the default performance formula sheet.

RC (read cache) Hit Ratio and WC (write cache) Hit Ratio

The LUN performance sheet has two columns that show how often sought data was found in the read or write cache, avoiding a disk access.

RC hit ratio Percentage of total read requests that were satisfied by the read cache, avoiding a disk access.
WC hit ratio Percentage of total write requests that were satisfied by the write cache, avoiding a disk access. The write occurs later and will be counted by a flush counter. Some writes may require a disk read for part of their data; such a write does not count as a write-cache hit.
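
One plausible way to derive these two percentages from the LUN primary data counters is sketched below in Python. The exact formulas in lun.pfs are not shown in this chapter, so treat the choice of Host_Read_Requests and Host_Write_Requests as denominators as an assumption; the function name and the sample counter values are hypothetical.

```python
def hit_ratio_pct(hits, requests):
    """Return the percentage of requests satisfied by the cache."""
    if requests == 0:
        return 0.0  # no requests in the observation period
    return 100.0 * hits / requests


# Hypothetical counter deltas for one LUN over an observation period.
lun = {
    "Host_Read_Requests": 2000,
    "Read_Cache_Hits": 1500,
    "Host_Write_Requests": 800,
    "Write_Cache_Hits": 600,
}

rc_hit_ratio = hit_ratio_pct(lun["Read_Cache_Hits"], lun["Host_Read_Requests"])
wc_hit_ratio = hit_ratio_pct(lun["Write_Cache_Hits"], lun["Host_Write_Requests"])
print(f"RC hit ratio: {rc_hit_ratio:.1f}%  WC hit ratio: {wc_hit_ratio:.1f}%")
```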

Percent of used prefetches

This column shows the percentage of prefetched information (data the SP read from disk in advance) that was actually needed. If component utilization is high (over 50%), tuning the prefetch efficiency is important. To tune the read cache, use the controls that set the read cache memory size, the prefetch size, and the prefetch type. You can tune the cache for each LUN according to its workload. Use the following guidelines when you tune your read cache:

  • More memory allocated to read cache increases the chance of read hits on cached data.
  • Prefetches are always queued to disks at low priority. This means that small prefetches will not interfere with writes or random reads.
  • Large prefetches can slow writes and random read requests by increasing disk utilization.
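
Prefetch efficiency can be estimated from the LUN counters Blocks_Prefetched and Unused_Prefetched_Blocks. The Python sketch below assumes that the percentage of used prefetches is the share of prefetched blocks that were later accessed; the formula actually used by lun.pfs may differ, and the function name and sample values are hypothetical.

```python
def used_prefetch_pct(blocks_prefetched, unused_prefetched_blocks):
    """Return the percentage of prefetched blocks that were later accessed."""
    if blocks_prefetched == 0:
        return 0.0  # nothing was prefetched in the observation period
    used = blocks_prefetched - unused_prefetched_blocks
    return 100.0 * used / blocks_prefetched


# Hypothetical deltas: 4000 blocks prefetched, 1000 of them never read.
print(f"{used_prefetch_pct(4000, 1000):.1f}% of prefetched blocks were used")
```

A low figure suggests that the prefetch size is too large for the workload and is adding disk utilization without saving reads.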

Forced flushes

A forced flush happens when the cache is full of modified pages that have not yet been flushed. An incoming page forces a page to be flushed from cache. Forced flushes are undesirable since they incur the overhead of cache management with little chance of optimizing writes. As a general rule, if the number of forced flushes is more than half the number of writes, performance will be better without the write cache.
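
The rule of thumb above reduces to a single comparison, sketched here in Python. The function name is hypothetical, and the sketch assumes the inputs are the Forced_Flushes and Host_Write_Requests deltas for the same LUN and observation period.

```python
def write_cache_hurts(forced_flushes, host_write_requests):
    """Apply the rule of thumb: more forced flushes than half the writes."""
    return forced_flushes > host_write_requests / 2


# Hypothetical deltas for two LUNs.
print(write_cache_hurts(forced_flushes=700, host_write_requests=1000))
print(write_cache_hurts(forced_flushes=100, host_write_requests=1000))
```

The first LUN exceeds the threshold and would be a candidate for running without the write cache; the second would not.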

Stripe crossings and percentage of stripe crossings

Stripe crossings pertain only to RAID-5, RAID-1/0, and RAID-0 LUNs. The number includes all crossings that occurred during reads and writes for all disk modules in the group. Stripe crossings are undesirable, since each one requires an additional I/O. As a rough rule of thumb, the ideal stripe element size is the smallest size that will not cause an additional I/O to one disk module.

The percentage of stripe crossings shows the percentage of I/Os that required a stripe boundary crossing. A relatively low percentage indicates a relatively efficient stripe element size.


Primary performance data

Primary performance data consists of the raw statistics from which the performance formula sheets are computed. To display this data, select a device (disk, LUN, or SP) on the current sheet, and then select Data > Primary Data. Each component has different characteristics, so each device type has its own set of data tags that describe its activity.

Most values are based on counters that record the occurrences of an event. The Data Analyzer calculates the difference in the counters over the observation period and displays these in the delta column. You can change the start and end times of the delta period using the "Performance Values at" and the "Delta with Values at" pull-down lists. When you change the "Performance Values at" setting, the "Delta with Values at" value also changes.
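
The delta calculation amounts to subtracting one counter sample from a later one. The Python sketch below shows the idea; it does not read Data Analyzer log files, and the function name and sample values are hypothetical.

```python
def counter_deltas(earlier, later):
    """Return later - earlier for every counter tag present in both samples."""
    return {tag: later[tag] - earlier[tag] for tag in earlier if tag in later}


# Hypothetical SP counter samples taken at the start and end of the delta period.
sample_start = {"FE_Reads": 10_000, "FE_Writes": 4_000}
sample_end = {"FE_Reads": 16_000, "FE_Writes": 5_500}

deltas = counter_deltas(sample_start, sample_end)
print(deltas)
```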

NOTE: The case of characters in data tag names is important. For example, for the number of SP idle ticks, you must use SP_Idle_Ticks, not SP_idle_ticks.

SP primary data

A sample SP primary data display, with details on each data tag, follows.

Figure 3-11. Sample SP primary data display

Table 3-2. SP primary data tags

Time stamp Time at which the values you see were written to the log. (The values displayed in the time stamp column are not the same as the start and end values you can set using the "Performance Values at" and "Delta with Values at" pull-down lists.) Delta is the number of seconds between sample values; it shows the difference between the setting of the "Performance Values at" and the "Delta with Values at" pull-down lists.
SP_Idle_Ticks Number of real-time clock ticks at which the SP was idle. The Data Analyzer uses this, and Busy Ticks, to calculate utilization. The clock ticks every 100 ms.
SP_Busy_Ticks Number of real-time clock ticks at which the SP was busy.
FE_Reads Number of read requests made by the server processor. (FE stands for front end, which means the part of the SP that communicates with the server.)
FE_Writes Number of write requests made by the server processor.
FE_Blocks_Read Number of 512-byte read blocks requested by the server processor. (FE stands for front end, which means the part of the SP that communicates with the server.)
FE_Blocks_Written Number of 512-byte write blocks requested by the server processor.
Sum_Queue_Lengths Sum of the lengths of the queues seen by arriving requests during the delta interval.
Arrivals_to_Non_Zero_Queue Arrivals to a busy queue. This counter increments when an arriving request finds that the SP is already processing a request. The Data Analyzer uses it to calculate average queue length.
FE_Idle_Ticks Number of clock ticks during which the host bus was idle (the server was not using the array). As above, FE stands for front end, which means the part of the SP that communicates with the server.
FE_Busy_Ticks Number of clock ticks during which the host bus was busy (the server was using the array).
BE_Idle_Ticks_Busn Number of clock ticks during which the SP was not communicating with disks over BE (back end) bus n. The back end is the part of the SP that communicates with the disks.
BE_Busy_Ticks_Busn Number of clock ticks during which the SP was communicating with its disks over back-end bus n.
High_Watermark_Flush_On Number of times a write flush operation occurred because the cache contained its High Watermark percentage of modified pages. When write caching is enabled, the SP begins to flush data to disk when the cache contains more than that High Watermark percentage of modified pages.
Idle_Flush_On Number of times a write flush operation occurred because a unit was idle. If a LUN is idle for a period of time, the SP flushes modified pages to the LUN's disks.
Low_Watermark_Flush_Off Number of times a write flushing operation stopped because the level reached the Low Watermark percentage of modified pages. When the SP begins flushing pages, it continues until the number of modified unflushed pages in the cache reaches the Low Watermark percentage.
WC_Flush_Count Number of write-cache flush operations. The write-cache counters involve data that cannot be attributed to a particular LUN. They deal with the flushing of cache, which the SP does without regard to which LUNs are participating in caching.
WC_Blocks_Flushed Number of 512-byte blocks flushed from the write cache. As with WC_Flush_Count, this data cannot be attributed to a particular LUN.
Hard_errors The number of hard errors reported to the server.
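
The tick counters in this table can be turned into a utilization figure and an elapsed time, as the Python sketch below shows. It assumes the 100 ms tick stated for SP_Idle_Ticks and uses hypothetical counter deltas; the function names are illustrative and are not Data Analyzer functions.

```python
TICKS_PER_SECOND = 10  # the clock ticks every 100 ms


def utilization_pct(idle_ticks, busy_ticks):
    """Return busy time as a percentage of total observed time."""
    total = idle_ticks + busy_ticks
    if total == 0:
        return 0.0  # no ticks recorded in the observation period
    return 100.0 * busy_ticks / total


def ticks_to_seconds(ticks):
    """Convert a tick count to seconds."""
    return ticks / TICKS_PER_SECOND


# Hypothetical SP_Idle_Ticks and SP_Busy_Ticks deltas.
sp_idle_ticks, sp_busy_ticks = 4500, 1500
print(f"SP utilization: {utilization_pct(sp_idle_ticks, sp_busy_ticks):.1f}%")
print(f"Observed time: {ticks_to_seconds(sp_idle_ticks + sp_busy_ticks):.0f} s")
```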

LUN primary data

A sample LUN primary data display, with details on each data tag, follows.

Figure 3-12. Sample LUN primary data display

A LUN appears to the server system as an individual disk, but it is really a group of disks that are bound into a RAID group. The LUN data window lets you observe the detailed performance of each LUN.

Table 3-3. LUN primary data tags

Time stamp Time at which the values you see were written to the log. (The values displayed in the time stamp column are not the same as the start and end values you can set using the "Performance Values at" and "Delta with Values at" pull-down lists.) Delta is the number of seconds between sample values; it shows the difference between the setting of the "Performance Values at" and the "Delta with Values at" pull-down lists.
Host_Read_Requests Number of reads from this LUN.
Host_Write_Requests Number of writes to this LUN.
Host_Blocks_Read Number of disk blocks (512 bytes) read from this LUN.
Host_Blocks_Written Total number of blocks written to this LUN.
Read_Cache_Hits Number of times a read request found its data in the read cache, avoiding a disk access.
Read_Cache_Misses Number of times a read request did not find its data in the read cache.
Blocks_Prefetched Number of disk blocks prefetched by the read cache.
Unused_Prefetched_Blocks Number of disk blocks prefetched by the read cache but never accessed after the read that caused the prefetch.
Write_Cache_Hits Number of times a write operation was completely satisfied by the cache, avoiding a disk access. The write occurs later and will be counted by a flush counter. Some writes might require a disk read for part of their data; such a write does not count as a write-cache hit.
Forced_Flushes Number of times a write forced the SP to flush a page to make room in the cache (this counts as a cache miss). See "Cache tuning" for more information.
Stripe_Crossings Number of times an I/O crossed a stripe boundary on a RAID-5, RAID-0 or RAID-1/0 LUN.
Read_Histo_n Data for read-size histograms, where n is 0 through 9. You can create a histogram from the current sheet by selecting a LUN and then clicking the Histogram button.
Read_Histo_Overflows The number of reads larger than 1023 blocks.
Write_Histo_n Data for write-size histograms, where n is 0 through 9. You can create a histogram from the current sheet by selecting a LUN and then clicking the Histogram button.
Write_Histo_Overflows The number of writes larger than 1023 blocks.

Histograms

Histograms let you see how often I/Os of differing sizes occur with a LUN. You can create a histogram from the current sheet by selecting a LUN and then clicking the Histogram button. A sample histogram follows.

Figure 3-13. Sample histogram

The preceding histogram indicates that during about 26 seconds, the following number and sizes of writes occurred to LUN 1:

Writes of 1 sector (512 bytes) of data: 0
Writes of more than 1 up to 2 sectors of data (2 sectors): 0
Writes of more than 2 up to 4 sectors of data (4 sectors): 0
Writes of more than 4 up to 8 sectors of data (8 sectors): 650
Writes of more than 8 up to 16 sectors: 0
Writes of more than 16 up to 32 sectors (32 sectors): 10
Writes of more than 32 up to 64 sectors (64 sectors): 60
Writes of more than 64 up to 128 sectors (128 sectors): 420
Writes of more than 128 sectors: 0

Most writes were 8 sectors (4,096 bytes) and 128 sectors (65,536 bytes). The latter number is the default RAID-5 stripe size.

Disk primary data

A sample disk primary data display, with details on each data tag, follows.

Figure 3-14. Sample disk primary data display

Table 3-4. Disk primary data tags

Time stamp Time at which the values you see were written to the log. (The values displayed in the time stamp column are not the same as the start and end values you can set using the "Performance Values at" and "Delta with Values at" pull-down lists.) Delta is the number of seconds between sample values; it shows the difference between the setting of the "Performance Values at" and the "Delta with Values at" pull-down lists.
Number_Hard_Read_Errors Number of times a read from this disk failed through 14 retries. Hard read errors may involve data loss.
Number_Hard_Write_Errors Number of times a write to this disk failed through 14 retries.
Number_Soft_Read_Errors Number of times a read from this disk failed but succeeded after a retry. A growing number of errors may mean that the disk is nearing the end of its useful life.
Number_Soft_Write_Errors Number of times a write to this disk failed but succeeded after a retry.
Average_Disk_Request_Service_Time Average time taken by the disk to process an I/O. This does not include time spent in a queue waiting for service.
Average_Address_Difference Average number of 512-byte sectors the disk read/write heads had to move to execute a request.
Number_Reads Number of reads from the disk.
Number_Writes Number of writes to the disk.
Number_Read_Retries Number of read retries after a soft or hard I/O error on the disk.
Number_Write_Retries Number of write retries after a soft or hard error on the disk.
Maximum_Requests_in_Queue Largest number of I/O requests on the disk queue. With the Average number (next), this gives some measure of relative disk load.
Average_Requests_in_Queue Average number of I/O requests on the disk queue.
Number_Blocks_Read Number of 512-byte disk blocks (sectors) read from the disk.
Number_Blocks_Written Number of 512-byte disk blocks (sectors) written to the disk.
Sum_Blocks_Seeked_Hi Combined with the next entry, the sum of the disk blocks covered by seek operations from one I/O to the next. This sum can be a very large number so we allocate 64 bits for it - 32 bits for Sum_Blocks_Seeked_Hi and 32 bits for Sum_Blocks_Seeked_Lo (next).
Sum_Blocks_Seeked_Lo See previous entry.
Sum_Queue_Lengths_on_Arrival Sum of the length of all queues to this disk seen by arriving requests. When a request arrives for SP processing, the Data Analyzer measures the current queue length and adds 1 to the length. The Data Analyzer uses this value to calculate the Average queue length for a LUN.
Number_Arrivals_With_Non_Zero_Queues Arrivals to a busy queue. This counter is incremented when an arriving request finds that the SP is already processing a request. The Data Analyzer uses this value to calculate the Average queue length for a LUN.
Idle_Ticks Number of ticks during which the disk was idle.
Busy_Ticks Number of ticks during which the disk was busy.
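
The two halves of the seek sum can be recombined into one 64-bit value as sketched below in Python. The sketch assumes that Sum_Blocks_Seeked_Hi holds the upper 32 bits and Sum_Blocks_Seeked_Lo the lower 32 bits, both unsigned; the function name and the sample values are hypothetical.

```python
def combine_hi_lo(hi, lo):
    """Join two unsigned 32-bit halves into one 64-bit integer."""
    if not (0 <= hi < 2**32 and 0 <= lo < 2**32):
        raise ValueError("each half must fit in 32 unsigned bits")
    return (hi << 32) | lo


# Hypothetical counter values for one disk.
sum_blocks_seeked = combine_hi_lo(hi=2, lo=5)
print(sum_blocks_seeked)
```

Dividing the combined sum by the number of I/Os in the same period would give an average seek distance in blocks.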

Creating custom performance formula sheets

You can create your own custom performance formula sheets, using the default, SP, or LUN performance sheet as a basis and specifying your own custom formulas.

Data Analyzer formulas

In a formula you can use any of the valid data tags (variable names) shown in the Primary Data displays.

Data tags you can use for SPs appear in Table 3-2, data tags you can use for LUNs appear in Table 3-3, and data tags you can use for disks appear in Table 3-4.

You can combine a data tag with the appropriate specifier for its device type using the device specifier, a period (.) and the data tag; for example,

SP A.Number_Reads
Disk 0-5.Busy_Ticks

NOTE: Since spaces (blanks) might be part of a device specifier (as shown in the examples above), you cannot use spaces freely in expressions. For example, the following two expressions are different:

3 * 2 and 3*2

You can also include real numbers or numbers in exponential format. For example, the following numbers are valid:

1, 14.32, -1.45, -5E34

Period length (Period_Length)

An important data tag name that pertains to all device types is Period_Length, which yields the length of the observation period in seconds. For example, the expression Number_Blocks_Read/Period_Length yields blocks read per second.
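
The same per-second conversion can be checked by hand. The Python sketch below simply divides a counter by the period length; the sample numbers and the function name are hypothetical.

```python
def per_second(counter_value, period_length_s):
    """Convert a counter accumulated over one period into a per-second rate."""
    if period_length_s <= 0:
        raise ValueError("Period_Length must be positive")
    return counter_value / period_length_s


# Hypothetical sample: 12000 blocks read during a 60-second period.
blocks_per_second = per_second(12000, 60)
print(blocks_per_second)  # 200.0
```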

Operators

Table 3-5. Operator descriptions and examples

Operator Description Example
+ Add 100*Busy_Ticks/(Idle_Ticks+Busy_Ticks) yields the percentage of time a disk was busy
- Subtract Maximum_Requests_in_Queue-Average_Requests_in_Queue yields the difference between the maximum and the average number of requests in the queue
* Multiply by 100*Busy_Ticks/(Idle_Ticks+Busy_Ticks)
/ Divide by Busy_Ticks/500
( ) (Parentheses) Change the precedence of operations. 100*Busy_Ticks/(Idle_Ticks+Busy_Ticks)
{ } (Curly brackets) Delimit a set. {Busy_Ticks,Idle_Ticks}
[ ] (Square brackets) Delimit a set that uses a device specifier. [Busy_Ticks:Lun.Disks] yields the Busy_Ticks of all disks in a LUN
, (Comma) Separates values in a set. {Busy_Ticks,Idle_Ticks}
> Yields the maximum value in a set. >{Busy_Ticks,Idle_Ticks}
< Yields the minimum value in a set. <{Busy_Ticks,Idle_Ticks}
~ (Tilde) Yields the average value in a set. ~{Busy_Ticks,Idle_Ticks}
& (Ampersand) Yields the sum of values in a set. &{Busy_Ticks,Idle_Ticks}
# (Pound sign) Yields the number of values in a set. #{Busy_Ticks,Idle_Ticks}
: (Colon) Yields all entities in a device specifier in a set. ~[100*Busy_Ticks/(Idle_Ticks+Busy_Ticks):Lun.Disks] yields the average percentage of busy time for all disks in a LUN
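
To make the set operators concrete, the following Python sketch maps each one to an ordinary function and reproduces the last example in the table. It is an analogy under stated assumptions, not the Data Analyzer's evaluator; the names SET_OPERATORS, apply_set_operator, and busy_percent are hypothetical, and the tick values are made up.

```python
from statistics import mean

# Python stand-ins for the set operators in Table 3-5.
SET_OPERATORS = {
    ">": max,   # maximum value in a set
    "<": min,   # minimum value in a set
    "~": mean,  # average value in a set
    "&": sum,   # sum of values in a set
    "#": len,   # number of values in a set
}


def apply_set_operator(symbol, values):
    """Apply one set operator to a non-empty list of numbers."""
    if not values:
        raise ValueError("the set must contain at least one value")
    return SET_OPERATORS[symbol](values)


def busy_percent(busy_ticks, idle_ticks):
    """Mirror 100*Busy_Ticks/(Idle_Ticks+Busy_Ticks) for one disk."""
    total = idle_ticks + busy_ticks
    if total == 0:
        raise ValueError("no ticks recorded")
    return 100 * busy_ticks / total


# Hypothetical (Busy_Ticks, Idle_Ticks) pairs for the three disks of one LUN.
lun_disks = [(250, 750), (500, 500), (750, 250)]

# Analogue of ~[100*Busy_Ticks/(Idle_Ticks+Busy_Ticks):Lun.Disks]
average_busy = apply_set_operator("~", [busy_percent(b, i) for b, i in lun_disks])
print(average_busy)  # 50.0
```

The colon operator corresponds to the list comprehension: the expression is evaluated once per disk in the LUN, and the tilde then averages the results.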

Formula error messages

To display error messages, either turn on the error message display using View Show Errors or select Performance Menu Evaluate Formula.

The Data Analyzer displays one of the following error messages when it cannot evaluate an expression:

Bad token: The expression contains an unknown token (character).

Unexpected symbol: The shown symbol was found in the expression at a position where a different symbol was expected.

']' missing: There is an open parenthesis without a matching closing one.

Unknown identifier: The listed identifier cannot be resolved in the given context. See SP primary data through Creating custom performance formula sheets or use the device Data Primary Data window to find the valid identifier.

Creating a custom performance formula sheet

To create a custom performance formula sheet

  1. Plan the new work sheet: Decide which of the original data categories you want to keep, the order in which you want them, and any new data categories you want to create.
  2. Use an existing sheet that is closest to the one you want as a base. The supplied sheets (in directory drive:\Program Files\Dell\Dell OpenManage™ Data Analyzer\Data) are default.pfs, sp.pfs, and lun.pfs.
    1. Remove all data categories you don't want. You can do this by selecting the top of the column(s) and choosing Performance Remove Data Category. You can also rename any column by selecting it and choosing Performance Rename Data Category.
    2. Save the new performance sheet with a new name using File Save As. Repeat this step periodically — using File Save As with the same name — so your work won't be lost if system problems occur before you have a chance to save.
    3. Move any data categories you want to the left or right by selecting the category(ies) and then selecting Performance Move Category to Right or Move Category to Left.
  3. Add each data category you want as follows.
    1. Create the data category using Performance Add Data Category; then type the data category name and click OK. For clarity when you use the new data category in expressions, avoid using spaces in the name. For example, assume you want to create a data category called Read_Throughput[Kb/s].
    2. The program then creates the column under your specified name and places it at the far right of the sheet. To see the new column, you must scroll to the right side of the sheet.

    3. Define the formula for your new category by selecting any cell in the column and then using Performance Edit Formula. Before doing this, you might find it helpful to display formulas for the existing categories (using View Show Formulas). In the Show Formula display the columns are very wide; you must scroll to see any column except the first one.
    4. Enter the new formula and click OK. For example, for the Read_Throughput[Kb/s] formula above, you can enter any of the following, to create a new data category for disks, LUNs, or SPs.
    5. For disks: Number_Blocks_Read/(2*Period_Length)

      For LUNs: Host_Blocks_Read/(2*Period_Length)

      For SPs: FE_Blocks_Read/(2*Period_Length)

      NOTE: The case of characters in data tag names is important. A formula will not work if the case of any character is wrong.
    6. Verify that your formula works by selecting Performance Evaluate Formula. Error messages are explained in Formula error messages. Usually they result from improper case of characters, typing errors, or use of the wrong symbol type for the device. If there is an error, fix it using Performance Edit Formula and click OK.
    7. When the Data Analyzer accepts your formula, you will see your newly created data category column fill with values.
    8. Move your new column as desired.

  4. Save the formula sheet using File Save As, explained in step Creating a custom performance formula sheet. You can access it at any time in the future using File Open Formula Sheet and specifying the name.
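
The Read_Throughput[Kb/s] example in step 3 turns a block count into kilobytes per second. The Python sketch below works through that arithmetic. It assumes 512-byte blocks (which the division by 2 suggests, though this guide does not state the block size) and conventional left-to-right evaluation of / and *, under which the divisor 2*Period_Length must be grouped in parentheses. The function name and sample values are hypothetical.

```python
BYTES_PER_BLOCK = 512  # assumption: two blocks per kilobyte


def read_throughput_kb_per_s(blocks_read, period_length_s):
    """Kilobytes read per second, i.e. blocks / (2 * Period_Length)."""
    if period_length_s <= 0:
        raise ValueError("Period_Length must be positive")
    blocks_per_kb = 1024 // BYTES_PER_BLOCK
    return blocks_read / (blocks_per_kb * period_length_s)


# Hypothetical sample: 2400 blocks read in a 60-second period.
grouped = read_throughput_kb_per_s(2400, 60)  # 2400 / (2 * 60)
ungrouped = 2400 / 2 * 60                     # evaluates as (2400 / 2) * 60
print(grouped, ungrouped)  # 20.0 72000.0
```

The ungrouped form multiplies by the period length instead of dividing by it, so a quick hand calculation like this is a useful check on any new formula before you rely on the column values.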

Cache tuning

The storage-system array cache is designed to minimize the number of disk I/Os. For best performance with most applications, each SP should have its maximum amount of cache memory and you should use the default settings for the cache parameters. The Data Analyzer shows how the cache affects the array and lets you tune the cache parameters to best suit your application.

A storage-system array cache has two parts: a write cache and a read cache. The write cache buffers writes and optimizes them by absorbing peak loads, combining small writes, and eliminating rewrites. The read cache uses a read-ahead mechanism that lets the array prefetch data from the disk, so the data will be ready in the cache when the application needs it.

Write cache

The write cache optimizes I/O traffic when data is flushed to disk. To do this it uses parameters called High Watermark and Low Watermark. Flushing starts when a LUN is idle or the number of modified pages in the cache has reached the High-Watermark percentage. Flushing stops when the Low-Watermark percentage is reached.

The default values for High Watermark and Low Watermark provide good all-around performance. You can tune these to provide better performance in specific cases.

The following guidelines will help you optimize the Watermark controls.

  • It is better to have High Watermark processing provide free (clean) cache pages than to run your cache in contention where an incoming request must pend while it waits for memory to be freed by a forced flush. Each page that is flushed reduces the chances of optimizing or eliminating a write operation. Lowering High Watermark will maintain a higher count of clean pages; however, it will also increase the number of disk I/Os.
  • Lowering Low Watermark will similarly increase the number of pages flushed and reduce the chances of optimization and I/O elimination. As a benefit, however, lowering Low Watermark makes more room for incoming data to find free pages.
  • A major benefit of the write cache is its ability to absorb writes and provide low response times (about 3 ms) as long as there are free cache pages. Once the cache is full of modified, unflushed pages, response time reverts to its uncached level. For RAID-5 writes on a relatively unloaded system, this can be as high as 30 ms. When High Watermark and Low Watermark are set far apart, the amount of free cache available is maximized, which allows bigger bursts at full cache speed. This benefits an application that does bursts of write activity; for example, a database that uses a 15-minute checkpoint will do a burst of writing every 15 minutes. If the data fits in the free cache pages, overall system performance will not suffer because of these bursts of activity.
  • The write cache provides sustained write speed by combining sequential RAID-5 write operations and writing them in RAID-3 mode. This eliminates the need to read old data and parity before writing the new data. To take advantage of this feature, the cache must have enough space for one entire stripe of sequential data (typically 64 KBytes * 4 = 256 KBytes) before starting to flush. Note that the sequential stream can be contained in other streams of sequential or random data.
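
The interaction of the two watermarks can be pictured as a simple hysteresis rule. The Python sketch below is a toy model under stated assumptions: it ignores idle-LUN flushing and forced flushes, the watermark values 80 and 60 are hypothetical and are not the product defaults, and the function names are invented for illustration.

```python
def flushing_active(dirty_percent, currently_flushing, high_watermark, low_watermark):
    """Decide whether watermark flushing is active for one observation.

    Flushing starts at or above the High Watermark, stops at or below the
    Low Watermark, and otherwise keeps its previous state.
    """
    if low_watermark >= high_watermark:
        raise ValueError("Low Watermark must be below High Watermark")
    if dirty_percent >= high_watermark:
        return True
    if dirty_percent <= low_watermark:
        return False
    return currently_flushing


def full_stripe_kbytes(element_kbytes=64, data_disks=4):
    """Cache space needed to hold one full stripe (64 KBytes * 4 = 256 KBytes)."""
    return element_kbytes * data_disks


# Hypothetical sequence of modified-page percentages.
HIGH, LOW = 80, 60
state = False
trace = []
for dirty in [50, 79, 80, 70, 61, 60, 65]:
    state = flushing_active(dirty, state, HIGH, LOW)
    trace.append(state)
print(trace)
print(full_stripe_kbytes())  # 256
```

In this model, moving the watermarks farther apart lengthens each flushing run, which is what frees more clean pages to absorb the next burst.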

The Data Analyzer displays write cache hit ratios and the number of hits on a per-LUN basis. (It displays the hit ratio on the LUN performance formula spreadsheet and the number of hits in the LUN Primary Data display.) You need both the ratio and the number to judge whether your cache is effective. The hit ratio shows cache efficiency, and the number of hits tells you how many I/Os you are saving. Even if the hit ratio is 99%, if you are saving only 20 I/Os per interval (two I/Os per second), you may decide to disable caching for this LUN and use the memory more effectively elsewhere.
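
The following Python sketch shows why the ratio and the count have to be read together. It is only a hand calculation; the function name and the sample counts are hypothetical.

```python
def cache_hit_summary(hits, requests, interval_seconds):
    """Return (hit ratio in percent, hits saved per second) for one interval."""
    if requests <= 0 or interval_seconds <= 0:
        raise ValueError("requests and interval_seconds must be positive")
    return 100 * hits / requests, hits / interval_seconds


# Hypothetical LUN: 198 write cache hits out of 200 writes in a 100-second interval.
ratio_percent, saved_per_second = cache_hit_summary(198, 200, 100)
print(ratio_percent, saved_per_second)  # 99.0 1.98
```

A 99% ratio looks excellent, but fewer than two saved I/Os per second may not justify the cache memory this LUN consumes.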

The LUN Primary Data display shows the number of forced flushes. A forced flush happens when the cache is full of modified pages that have not yet been flushed. An incoming page forces a page to be flushed from cache. Forced flushes are undesirable since they incur the overhead of cache management with little chance of optimizing writes. As a general rule, if the number of forced flushes is more than half the number of writes, performance will be better without the write cache.
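
The rule of thumb above reduces to a single comparison. This Python sketch encodes it as stated; the function name and the sample counts are hypothetical.

```python
def write_cache_worthwhile(forced_flushes, writes):
    """Apply the rule of thumb: the write cache helps unless forced
    flushes exceed half the number of writes."""
    if writes < 0 or forced_flushes < 0:
        raise ValueError("counts must not be negative")
    return forced_flushes <= writes / 2


print(write_cache_worthwhile(forced_flushes=100, writes=1000))  # True
print(write_cache_worthwhile(forced_flushes=600, writes=1000))  # False
```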

Read cache

The read cache helps applications that use a single task to read sequential data. When a read operation occurs, the data is read into a buffer and passed back to the server. If the system detects that the disk address of the buffer is the next in sequence to a buffer already in memory, then it initiates a read-ahead operation. Read-ahead size is either fixed or a multiple of the read request. Memory is reused by overwriting the oldest buffers first. You can extend the retention period by increasing the prefetch priority. The read cache also provides read-again caching for applications that are designed carelessly enough to read the same data twice (there are more than you might think).

The Data Analyzer displays the read cache (RC) and write cache (WC) hit ratios on the Performance spreadsheet. It displays the number of cache hits on the LUN Primary Data display. The hit ratio is a measure of the efficiency of your tuning and should exceed 20%. The hit rate shows the impact of caching on your application. Fewer than 5 hits per second means that the cache will have no impact on your application.
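
The two thresholds in the paragraph above (a hit ratio above 20% and at least 5 hits per second) can be checked together. The Python sketch below is a minimal illustration; the function name, dictionary keys, and sample counts are hypothetical.

```python
def read_cache_check(hits, reads, interval_seconds,
                     min_ratio_percent=20.0, min_hits_per_second=5.0):
    """Compare read cache hits against the two rule-of-thumb thresholds."""
    if reads <= 0 or interval_seconds <= 0:
        raise ValueError("reads and interval_seconds must be positive")
    ratio = 100 * hits / reads
    rate = hits / interval_seconds
    return {
        "hit_ratio_percent": ratio,
        "hits_per_second": rate,
        "ratio_ok": ratio > min_ratio_percent,    # ratio should exceed 20%
        "rate_ok": rate >= min_hits_per_second,   # fewer than 5/s has no impact
    }


# Hypothetical LUN: 30 read cache hits out of 100 reads in a 10-second interval.
result = read_cache_check(30, 100, 10)
print(result)
```

Here the ratio passes at 30%, but 3 hits per second falls below the rate threshold, so the cache is efficient yet has little effect on the application.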

The LUN Primary Data display shows the number of prefetch requests and the number of blocks that were prefetched but not used. The ratio of these is the prefetch efficiency. If component utilization is high (over 50%), tuning the prefetch efficiency is important. To tune the read cache, use the controls that set the read cache memory size, the prefetch size, and the prefetch type. You can tune the cache for each LUN according to its workload. Use the following guidelines when you tune your read cache:

  • More memory allocated to read cache increases the chance of read hits on cached data.
  • Prefetches are always queued to disks at low priority. This means that small prefetches will not interfere with writes or random reads.
  • Large prefetches can slow writes and random read requests by increasing disk utilization.

