SarCheck version 6.01 reference guide for HP-UX systems


Table of Contents:

Introduction
Features
Restrictions
Known limitations
How to install SarCheck (diskette)
How to install SarCheck (electronic copies)
How to set up sar
How to set up ps -elf data collection
How to set up gnuplot
How to activate the software
How to deinstall SarCheck
How to run SarCheck, menu driven
How to run SarCheck, command driven
How to run SarCheck, crontab entry
How to analyze multiple sar reports
How to analyze real time data
How to change the menu defaults
How to change SarCheck's algorithms
How to create graphs from a sar report
Examples of how to use the switches
How to access the online instructions/help text
How to move SarCheck to another directory
How to produce the most accurate analysis
How to get the most from SarCheck
How to interpret the analysis
The summary section
The recommendations section
The resource analysis section
The capacity planning section
The custom settings section
Tuning strategies specific to HP-UX
How to order SarCheck
How to get technical support for SarCheck
How to get technical support for gnuplot
Files included in this release
An example of a SarCheck report
The summary section
The recommendations section
The resource analysis section
The capacity planning section
The custom settings section
Tabular summary
Frequently Asked Questions (The FAQ)
Bibliography
Special thanks
Appendix A: SarCheck parms file keywords
Appendix B: Options available when running 'analyze9000'


Introduction:

The SarCheck utility analyzes your system for possible performance bottlenecks such as memory shortages and ‘leaks’, disk load imbalances, CPU bottlenecks, runaway processes, and improperly set tunable parameters. It also tells you approximately how much additional workload the system can support at peak times of the day. It does this by analyzing a user-specified sar report and the output of ps -elf, scanning various kernel structures, and producing a plain English report. The report explains the resource bottlenecks seen and makes recommendations which can improve system performance.

SarCheck will convert sar reports into CSV format, which is used by many popular graphing tools. It will also produce graphs if you have a version of gnuplot installed that supports PNG or JPEG output. If you ask for HTML output and the production of graphs, SarCheck will insert the graphs into the HTML document. The graphs are inserted using <img> tags, each with a description in the tag's alt attribute in order to meet accessibility requirements.

Important: SarCheck has not been designed to analyze sar data from one system on another. This is because not all of the data needed is in the sar and ps reports, and SarCheck has to look for additional data in the kernel. If SarCheck uses sar data from one system and kernel data from another system, it will issue a warning.

SarCheck's recommendations are designed to produce incremental improvements, so SarCheck should be run regularly. No attempt is made to guess the ultimately correct value for any parameter based on a single day's sar data. Instead, SarCheck will recommend that you increase or decrease values based on the data available, and will continue to recommend changes until there is no more room for improvement. Performance tuning is, by definition, a process of trial and error. SarCheck will not only help you to make those changes, but will also explain the reasons for each recommendation.

SarCheck is different from other performance tools because it does not monitor system activity. In much the same way that a UNIX performance expert would approach the problem, SarCheck uses the sar utility along with several other tools, and analyzes the data available. Since sar is included with the operating system, we didn't see a need to create yet another monitor for you to buy.

SarCheck can be run from the command line, or from a menu-driven front end script. For reasons of safety and security, SarCheck will not attempt to change tunable parameters or anything else in the kernel.


Features:

The following conditions are identified by SarCheck Version 6:

  1. CPU bottleneck detection
  2. I/O bottleneck detection
  3. Detection of improper disk load balancing
  4. Detection of unusually slow disk devices
  5. Memory bottleneck detection
  6. Inefficient system buffer cache sizing
  7. Improper system table and inode cache sizes
  8. Inefficient sizing of many other tunable parameters
  9. Limited capacity for an increase in workload or users
  10. Impossible data, such as negative CPU utilization
  11. Runaway process detection
  12. Memory leak detection
  13. Inefficient PATH variables
  14. The relationship between free memory and gpgslim
  15. The actual size of the dynamic buffer cache

Based on its analysis of the resources and statistics described above, SarCheck may recommend a variety of steps which can be taken to improve system performance.


Restrictions:

SarCheck™ is designed to work with HP-UX versions 10.10 and up, including HP-UX 11i. Support for pre-10.20 operating systems is likely to be discontinued at some point later this year. SarCheck is also available for most versions of Solaris SPARC, Linux x86, and AIX.


Known limitations:

  1. On systems with 12 disks or fewer, SarCheck will recommend disk balancing based solely on disk activity. If a disk is nearly full but idle, SarCheck may recommend that you move a filesystem to that disk. This limitation is due to the fact that disk space is not reported by sar. Because sar does not break down disk statistics by filesystem, SarCheck does not have enough information to recommend moving specific filesystems from one disk to another.
  2. An analysis of the load from an unusually quiet day, such as a holiday, will produce recommendations that may be inappropriate. For this reason, we recommend analyzing activity only from times and days when the system is busy. Please refer to the section entitled "How to produce the most accurate analysis" for details.
  3. SarCheck will occasionally report slightly different averages than the sar report, primarily due to rounding errors. SarCheck calculates its own averages because errors are common in the averages calculated by many implementations of sar.
  4. SarCheck's recommendations are not listed in order of significance or potential for performance improvement.
  5. The default formulas used by the operating system to calculate the values of some tunable parameters have the potential to cause some unanticipated side effects. We recommend replacing the formulas with specific values whenever possible.


How to install SarCheck (diskette):

Note: This is a PC-formatted diskette and the software is a compressed tar archive. If your HP 9000 has a diskette drive, the doscp utility can be used to copy the software to /tmp or wherever you choose. If not, you should be able to easily FTP the software from a PC or other device that can read PC-formatted diskettes.

To install the software, log in as root, put the compressed file in /tmp, then uncompress and detar it. This only takes a few seconds.

  1. Log in as root.
  2. Change the working directory to /tmp

    cd /tmp

  3. Make sure that the /opt/sarcheck/bin and /opt/sarcheck/etc directories exist, and make them if they're not there.
  4. Now uncompress & detar:

    zcat < schp.taz | tar xvf -

This will install SarCheck on your system. See the section entitled "Files included in this release" for details. The installation of SarCheck does not require rebuilding the kernel. This is important because it means that SarCheck will not increase the size of your kernel, and you won't have to reboot your system. Setting up sar may require a reboot.
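
The directory check in step 3 can be handled with a small helper. This is a minimal sketch and not part of the SarCheck distribution: the function name ensure_sarcheck_dirs is hypothetical, and it assumes that your mkdir supports the -p option.

```shell
# Create the bin and etc directories under the given root if they are missing.
# ensure_sarcheck_dirs is a hypothetical helper name used for illustration.
ensure_sarcheck_dirs() {
    for d in "$1/bin" "$1/etc"; do
        # Only call mkdir when the directory does not exist yet
        [ -d "$d" ] || mkdir -p "$d" || return 1
    done
}
```

As root, you would call it as ensure_sarcheck_dirs /opt/sarcheck before extracting the archive.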

To test SarCheck, type

/opt/sarcheck/bin/analyze9000 hpsar22 | more

Warning: Do not implement the recommendations produced by analyzing the test file hpsar22! This file has been included for test purposes only.

To reduce typing, you may want to add /opt/sarcheck/bin to root’s PATH.
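
One way to do this, assuming root uses a POSIX-style shell, is to add lines like the following to root's .profile. The check is there so that the directory is not added a second time if the file is read again.

```shell
# Append /opt/sarcheck/bin to PATH only if it is not already present
case ":$PATH:" in
    *:/opt/sarcheck/bin:*)
        ;;
    *)
        PATH=$PATH:/opt/sarcheck/bin
        export PATH
        ;;
esac
```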


How to install SarCheck (electronic copies):

In some cases, usually for those with a software subscription, we will email SarCheck to customers. Emailed software is a compressed tar archive.

To install the software, log in as root, put the compressed file in /tmp, then uncompress, and detar it. This only takes a few seconds.

  1. Log in as root.
  2. Change the working directory to /tmp

    cd /tmp

  3. Make sure that the /opt/sarcheck/bin and /opt/sarcheck/etc directories exist, and make them if they're not there.
  4. Now uncompress & detar:

    zcat < schp.taz | tar xvf -

This will install SarCheck on your system. See the section entitled “Files included in this release” for details. The installation of SarCheck does not require rebuilding the kernel. This is important because it means that SarCheck will not increase the size of your kernel, and you won’t have to reboot your system. Setting up sar may require a reboot.

To test SarCheck, type

/opt/sarcheck/bin/analyze9000 hpsar22 | more

Warning: Do not implement the recommendations produced by analyzing the test file hpsar22! This file has been included for test purposes only.

To reduce typing, you may want to add /opt/sarcheck/bin to root’s PATH.


How to set up sar:

First, make sure the /var/adm/sa directory exists, and make it if it doesn't.

Here are some recommended cron entries. These entries will capture data once an hour at non-peak times and every 20 minutes during the system’s busiest times. Feel free to modify these entries to best capture statistics from your system’s busiest times. We recommend capturing sar data every 10 to 60 minutes.

Add these to your /usr/spool/cron/crontabs/root file using SAM or crontab -e, and type them exactly as seen:

#collect sar data
0 * * * * /usr/lbin/sa/sa1
20,40 8-17 * * 1-5 /usr/lbin/sa/sa1

#reduce the sar data
5 18 * * * /usr/lbin/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A

In some cases, it may be necessary to reboot the system for the changes to take effect.

NOTE: We found a bug in early versions of the HP-UX 11.00 implementation of sar that prevents the -s switch from working. If the above crontab entries do not produce /var/adm/sa/sar* files or you get mail messages with the strange error message "sar: Starting time must be more than ending time", try changing the crontab entry for running sa2 to the following:

#reduce the sar data
5 18 * * * /usr/lbin/sa/sa2 -A

If it's important to have starting and ending times defined in your sar reports, you'll have to adjust the crontab entries for running the /usr/lbin/sa/sa1 script.
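
As one possible adjustment, assuming you want reports that cover roughly 08:00 to 18:00, sa1 can be limited to those hours so that sa2 -A only has that window to report. The sketch below writes candidate entries to a scratch file for review and does not install anything. The scratch file name and the chosen hours are assumptions; adapt them to your own busy periods.

```shell
# Scratch file for the proposed entries (hypothetical name; nothing is installed)
proposed=${TMPDIR:-/tmp}/sar_cron.proposed

cat > "$proposed" <<'EOF'
#collect sar data only between 08:00 and 18:00
0,20,40 8-17 * * * /usr/lbin/sa/sa1
0 18 * * * /usr/lbin/sa/sa1

#reduce the sar data
5 18 * * * /usr/lbin/sa/sa2 -A
EOF

# Show the proposed entries so they can be reviewed
cat "$proposed"
```

Once you are satisfied with the entries, merge them into root's crontab with SAM or crontab -e.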


Setup of ps -elf data collection:

One of SarCheck's most powerful features is its ability to analyze other data in addition to its analysis of sar. The ps1 script will collect data from ps -elf and data relating to memory utilization, and the vg1 script will collect data from the vgdisplay and pvdisplay utilities. A few simple steps are required to take advantage of this powerful new feature:

  1. Log in as root

  2. Make the directory /opt/sarcheck/ps:
    mkdir /opt/sarcheck/ps

  3. Add the following entries to root's crontab file for typical 8:00AM to 5:00PM, Monday through Friday monitoring:
    0,20,40 8-17 * * 1-5 /opt/sarcheck/bin/ps1
    5 17 * * 1-5 /opt/sarcheck/bin/ps2
    30 12 * * 1-5 /opt/sarcheck/bin/vg1

As an alternative, the following cron entries are oriented towards the 24x7 monitoring that many administrators prefer:

0,20,40 * * * * /opt/sarcheck/bin/ps1
45 23 * * * /opt/sarcheck/bin/ps2
30 12 * * * /opt/sarcheck/bin/vg1

We recommend using SAM or crontab -e to modify the crontab file.

The ps1 script calls the program /opt/sarcheck/bin/freemem in addition to running ps -elf. The freemem program collects data related to memory utilization including the values of freemem, lotsfree, gpgslim, and both the size of the buffer cache and the limits set by the dbc_min_pct and dbc_max_pct parameters.

The vg1 script calls the program /opt/sarcheck/bin/vgparse. The vgparse program collects data from the vgdisplay and pvdisplay programs. This data is used by SarCheck to find disks which are inefficient because of fragmented logical volumes. The vg1 script only needs to be run once and should be run at a time when the data will be included in SarCheck's analysis.

The ps -elf data collected on large systems can take up a considerable amount of space. If you want to store this data somewhere other than /opt/sarcheck/ps, you can specify a different directory with the PSELFDIR keyword in the sarcheck_parms file.

WARNING: If you choose to specify a different directory, be sure to pick a directory that is not used for anything else. The purpose of the ps2 script is to remove any file in the ps -elf directory which is more than 14 days old and you don't want to accidentally remove files which contain something other than ps -elf data.
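
Before pointing PSELFDIR at a new directory, it may help to preview what an age-based cleanup would match there. The sketch below is not the ps2 script itself. It is an assumption about what such a cleanup looks like, using the -mtime test of find, and it only lists files rather than removing them. The function name list_old_ps_files is hypothetical.

```shell
# List regular files under the given directory that were last modified
# more than 14 days ago. Nothing is removed; this is a preview only.
list_old_ps_files() {
    find "$1" -type f -mtime +14 -print
}
```

Running list_old_ps_files against the candidate directory should print nothing, or only old ps -elf data, before you trust an automatic cleanup with it.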


How to set up gnuplot:

If you ask SarCheck to produce PNG or JPEG graphs, it will look for gnuplot and will try to use it to generate graphs. There are a number of different places where you can find gnuplot and it is generally available as source code or source code with a precompiled binary.

For the sake of consistency and convenience, we use the precompiled binary available from Ready to Run Software, Inc. Their URL is http://www.rtr.com/ and they sell precompiled binaries for all of the RISC platforms that SarCheck runs on.

To set up gnuplot, please follow the instructions that come with the source or binary.
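
After installation, a quick check can confirm that a gnuplot binary is reachable. This is a minimal sketch that only looks on the current PATH; whether SarCheck searches the same locations is not covered here, and the helper name have_tool is hypothetical.

```shell
# Succeeds if the named program can be found by the shell
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool gnuplot; then
    echo "gnuplot found at: $(command -v gnuplot)"
else
    echo "gnuplot was not found on PATH"
fi
```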


How to activate the software:

Evaluation software is typically shipped 'live' and an activation key is not necessary. Feel free to install evaluation software on as many HP 9000 systems as you want.

Purchasers of a SarCheck software license should note that the software will expire shortly after installation. To permanently activate the software, please run /opt/sarcheck/bin/analyze9000 -o to produce the registration form and fax or email it to us. Registration will remove the expiration date and will lock the software to a system’s machine ID. If you wish to move the software from one system to another, please call us and we’ll help you. SarCheck was not designed to be regularly moved from one system to another, in an effort to provide 'quick fixes' to a number of systems. Quick fixes will not allow you to take advantage of the long term iterative tuning which SarCheck makes possible.

To reactivate or change the expiration date of evaluation software, we will need the SarCheck serial number and the machine ID. These can be most easily found with the following command:

/opt/sarcheck/bin/analyze9000 -s


How to deinstall SarCheck:

Remove the files which are described in the section entitled "Files included in this release". If you are loading a new version of SarCheck over an old one, deinstallation is not necessary.


How to run SarCheck (menu driven):

First, log in as root. To analyze sar statistics from a menu, type:

/opt/sarcheck/bin/sarcheck

To reduce typing, you may want to add /opt/sarcheck/bin to root’s PATH.

A series of choices will appear on the screen. If you accept all the defaults by pressing the Enter key, the previous day's sar data will be analyzed, and this is the easiest way to get started. For security reasons, your account must have permission to access the sar data or report files that you wish to analyze.

The first question will ask you whether you want to analyze sar data or a sar report. Sar data is usually found in /usr/adm/sa/sann or /var/adm/sa/sann, where nn is the day of the month. Sar reports are already reduced into a readable form and are usually found in /usr/adm/sa/sarnn or /var/adm/sa/sarnn.

Analyzing the reports will be marginally faster than analyzing the data, but an advantage to analyzing the data is that you can control the start and end times by changing sarcheck's defaults. To change any of the defaults, see the section "How to change the menu defaults".

Analyze what?     d   sar data (sa files)
                  r   A sar report (sar files)
                  c   Concatenate existing sar reports
                  *   Accept all defaults
                  x   Exit SarCheck
(keyword = DR, default = d): _

After you pick the d or r options, you will be prompted to enter the name of the data or report file. In either case the default will be the statistics from the previous business day. The c option will concatenate all of the sar reports present and will not ask you for the name of a file. The sarcheck script will change your working directory, so you do not have to use the absolute path of the file. To accept all defaults, enter an asterisk; to exit sarcheck, enter an x. To change any of the defaults, see the section "How to change the menu defaults".

Sar data is usually found in /var/adm/sa/sann. Based on user-definable defaults, data from 08:00 to 17:00 will be analyzed. Enter the name of the sar data file that you wish to analyze.

Available data files in /var/adm/sa:
sa20, sa21, sa25, sa26, sa27, sa28, sa29, sa30
(default = sa29):

Note that if you run sarcheck on a Saturday, Sunday or Monday, Friday's statistics will be analyzed by default. This is because weekend statistics are usually not representative of a loaded system, and there is a possibility that misleading recommendations would be generated. To change the default of excluding the analysis of weekend data, see the section "How to change the menu defaults".

The next option allows you to pick formatting. The default will produce a report with page numbers and page breaks (ctrl-L) included. For users that prefer to paginate the report with another tool, such as pg, the p option will suppress these page breaks. You can also choose to produce an HTML document at this point, and can decide whether to format the disk analysis in table form. HTML documents are best viewed with a web browser. If you wish to exit sarcheck, enter an x.

Pick formatting:   n   Normal, with page breaks
                   p   Page breaks suppressed
                   h   Create HTML document, no disk 
                       table
                   t   Create HTML document, with disk 
                       table
                   o   Create HTML document, disk 
                       table only
                   *   Accept remaining defaults
                   x   exit sarcheck
(keyword = OPT, default = n): _

The verbosity option controls how verbose the SarCheck report is. The default verbose mode may produce a report 5 pages long, while Superquiet mode may only contain 5 lines of text. Please note that instructions for implementing recommendations, explanations, and alternate tuning strategies may be suppressed by the quiet modes. When you’re first using SarCheck, we recommend using the verbose mode so that you don’t miss anything. The superquiet mode will automatically suppress page breaks.

Verbosity level:   v   Verbose mode
                   q   Quiet, most verbiage
                       suppressed
                   Q   Superquiet, all verbiage
                       suppressed
                   *   accept all defaults
                   x   Exit SarCheck
(keyword = VERBOSE, default = v): _

Analysis of ps -elf data will provide you with a closer look at memory bottlenecks and the ability to detect runaway processes and memory leaks. The enhanced sensitivity option increases the probability of generating false alarms. See the section entitled “Setup of ps -elf data collection” for more information. If you wish to exit sarcheck, enter an x.

Analyze ps -elf    n   No, analyze sar data only
data?              y   Yes, analyze sar and ps
                       data
                   e   Enhanced sensitivity of
                       ps data analysis
                   *   accept all defaults
                   x   exit sarcheck
(keyword = PSELFOPT, default = n): _

The disk filtration option is useful for large systems and has no effect on smaller systems. If more than 12 disks are present, sarcheck will only print a paragraph on each disk which has significant activity. SarCheck will try to determine what "significant" means on your system. If you find that SarCheck's filtration is too aggressive, use "z" to show all disks with some activity. We have discovered that on systems with several thousand disk devices, it is common for most of them to be completely idle. The "z" option will filter out only disks with no activity at all.

Disk filtration:    y  Filter disk analysis more than 12
                       disks.
                    n  Analyze all disk activity if more
                       than 12 seen.
                    z  Analyze all disks with some
                       activity.
                    *  Accept all defaults
                    x  Exit sarcheck
(keyword = DISKFLTR, default = y): 

The tabular summary replaces the incorrectly named comma-delimited option. It is used to print a summary of statistics in table form at the end of the report, and an example is included later in the manual. If HTML output has been selected, an HTML table is created. This option is useful for transferring statistics to a spreadsheet or graphics program, for producing output which can be easily parsed by other programs, and for generating an easy to read table at the end of an HTML page.

Tabular Summary?   y  Print a tabular summary at the 
                      end of the report
                   i  Print a tabular summary instead of 
                      the report
                   n  Print the report without a summary
                   *  Accept remaining defaults
                   x  exit sarcheck
(keyword = TABULAR, default = n):

This option is used to decide where to send the analysis. Note that some of these choices will be different based on the pager you use and any modifications made to the defaults.

If you choose to send the output to a file, you'll be prompted for the name of the file. The default file name is /tmp/yyyymmddhhmmss, which is a date/time stamp. You can modify the default by editing the sarcheck script. If you wish to exit sarcheck, enter an x.

Send output to:    1   more (the screen)
                   2   lp -s (a printer)
                   3   A file
                   x   exit sarcheck
(keyword = OUTOPT, default = 1): _

How to run SarCheck (command driven):

To analyze a sar report file called sar12, type:

/opt/sarcheck/bin/analyze9000 sar12

For best results, pipe the output to more so that you can read it, or redirect it to a file if you want to save it. A report will be produced which contains information about your system, a brief summary, a recommendations section (if applicable), a resource analysis section, and a capacity planning section (if not suppressed). For more information, see the section entitled "How to Interpret the Analysis".

For users that prefer to paginate the report with another utility, such as pg, the -p option will suppress page numbers and page breaks. To take advantage of this option, type:

/opt/sarcheck/bin/analyze9000 -p sar12|pg

To reduce typing, you may want to add /opt/sarcheck/bin to root’s PATH.

Other options for the analyze9000 program can be found in the section entitled "Options available when running analyze9000".


How to run SarCheck (crontab entry):

SarCheck can be run automatically by adding an entry to root’s crontab file, ideally using SAM or crontab -e. Determine the time that /usr/lbin/sa/sa2 is run (you'll have to set this up), and use cron to run /opt/sarcheck/bin/analyze9000 after that time. Here are two examples which assume that sa2 is run at 18:00 and that the analysis will be done at 18:05:

In order to print a SarCheck analysis every weeknight, use the following entry:

5 18 * * 1-5 /opt/sarcheck/bin/analyze9000 /var/adm/sa/sar`date +\%d` | lp -s (this should all be on one line)

To keep all of SarCheck’s recommendations in the /usr/ops directory, use the following entry:

5 18 * * 1-5 /opt/sarcheck/bin/analyze9000 /var/adm/sa/sar`date +\%d` > /usr/ops/`date +\%y\%m\%d` (this should all be on one line)

Because the output of the analyze9000 program is stdout, you can pipe or redirect it in lots of ways. It can be printed, mailed, stored... whatever works best in your environment.
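
As an illustration of the mail case, the analysis can be wrapped in a small script that cron runs after sa2. This is a sketch under stated assumptions: the recipient (root), the subject line, and the helper name sar_report_path are arbitrary choices and not part of SarCheck. Note that inside a script the percent signs in the date format do not need the backslashes that a crontab entry requires.

```shell
ANALYZE=/opt/sarcheck/bin/analyze9000

# Path of today's sar report, following the /var/adm/sa/sarnn convention
sar_report_path() {
    echo "/var/adm/sa/sar$(date +%d)"
}

report=$(sar_report_path)

# Only run the analysis when both the program and the report are present
if [ -x "$ANALYZE" ] && [ -r "$report" ]; then
    "$ANALYZE" "$report" | mailx -s "SarCheck analysis for $(date +%Y-%m-%d)" root
else
    echo "nothing to analyze: $ANALYZE or $report is missing" >&2
fi
```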


How to analyze multiple days of sar reports at once:

SarCheck has the ability to analyze multiple days of data at once when the reports are concatenated. The only limitation is that these reports must actually exist and be valid. Here is an example of how to analyze all sar reports from the first seven days of the month.

  1. First concatenate the reports, creating a single report called ‘/tmp/multisar’:

    cat /var/adm/sa/sar0[1-7] > /tmp/multisar

  2. Now analyze the concatenated report and pipe it to more:

    /opt/sarcheck/bin/analyze9000 /tmp/multisar | more

Please note that the analyze9000 program does not work if wildcard characters are used as a filename. Wildcard characters should be used with the cat command in order to produce a single file for the analyze9000 program. Once you've become used to working with concatenated sar reports, you'll probably discover that the find command in /usr/lbin/sa/sa2 removes the sar reports too quickly, and the naming convention used in /usr/lbin/sa/sa1 is too restrictive. Make copies of the sa1 and sa2 scripts, and then modify them to meet your needs.
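
Because every report in the set has to exist, it can be useful to skip missing or empty files while concatenating. The following is a minimal sketch; the function name concat_sar_reports is hypothetical, and the check covers only existence and non-zero size, not whether the contents are a valid sar report.

```shell
# Usage: concat_sar_reports OUTPUT REPORT...
# Concatenates the reports that exist and are non-empty into OUTPUT.
concat_sar_reports() {
    out=$1
    shift
    : > "$out"                      # start with an empty output file
    for f in "$@"; do
        if [ -s "$f" ]; then
            cat "$f" >> "$out"
        else
            echo "skipping missing or empty report: $f" >&2
        fi
    done
}
```

For the first seven days of the month, this could be called as concat_sar_reports /tmp/multisar /var/adm/sa/sar0[1-7] and the result passed to analyze9000 as usual.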


How to analyze real time data:

The script /opt/sarcheck/bin/ondemand can be used to analyze resource utilization and it works much like sar or vmstat. To collect 10 sar and ps -elf samples of 30 seconds each and analyze them, use the following command:

/opt/sarcheck/bin/ondemand 30 10 | more

Because the output of the ondemand script can be lengthy and is stdout, you'll probably want to pipe it to more or redirect it to a file. It's best to use a sampling interval of at least 15 seconds because this script runs sar and ps with enough frequency to create a noticeable load on the system when the sampling interval is short.


How to change the menu defaults:

You have the ability to change SarCheck's menu defaults. In earlier versions this was done by editing the sarcheck script, but those changes would be lost when a new version of SarCheck was installed.

The sarcheck script will now look for the file named /opt/sarcheck/etc/sarcheck_parms and will use any values found there instead of the normal defaults. This file is not included as part of the SarCheck distribution and you'll need to create it if you want to use it. The syntax for the sarcheck_parms file is very simple. Create a line in the file with the keyword and its new default value, separated by a space. The keyword can be found when running the sarcheck script, and the value should be one of the choices on the menu. For example, if you want the output of SarCheck to include a tabular summary at the end of the report, here is the menu selection that you will see:

Tabular Summary?   y  Print a tabular summary at the 
                      end of the report
                   i  Print a tabular summary instead of 
                      the report
                   n  Print the report without a summary
                   *  Accept remaining defaults
                   x  exit sarcheck
(keyword = TABULAR, default = n):

You can see the name of the keyword and the options available. To change the default from 'n' to 'y' for this menu item, add the following line to the sarcheck_parms file:

TABULAR y

Now when you run the sarcheck script, the default behavior will be to print a tabular summary at the end of the report. After those two fields are parsed by the sarcheck script, the rest of the line is ignored and is available as a comment. Any line that starts with something other than a valid keyword is also treated as a comment and is ignored.
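
If you change defaults often, a small helper can keep the file free of duplicate keywords. This is a sketch only: the function name set_parm is hypothetical, and replacing an older line for the same keyword is a design choice made here, not documented SarCheck behavior.

```shell
# Usage: set_parm FILE KEYWORD VALUE
# Drops any existing line whose first field is KEYWORD, then appends the new one.
set_parm() {
    file=$1
    key=$2
    value=$3
    tmp="$file.tmp.$$"
    if [ -f "$file" ]; then
        # Keep every line whose first field is not the keyword being set
        awk -v k="$key" '$1 != k' "$file" > "$tmp"
    else
        : > "$tmp"
    fi
    echo "$key $value" >> "$tmp"
    mv "$tmp" "$file"
}
```

For example, set_parm /opt/sarcheck/etc/sarcheck_parms TABULAR y has the same effect as adding the line by hand.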

Once you have decided to change the defaults for the sarcheck script, create or edit the sarcheck_parms file. Here is an example of a sarcheck_parms file where the starting and ending times used for analysis have been changed, page numbering is suppressed, and a tabular summary is printed at the end of the report. Note that since the sarcheck script only looks at the first two fields on each line, the rest of the line is treated as a comment and lines that don't start with valid keywords are also treated as comments.

# file to customize sarcheck created by
# Jess the sys admin on March 23, 2002
#
ST 06:00 starting time is 6AM
EN 15:00 ending time is 3PM
OPT p suppress page numbering
TABULAR y add a tabular summary
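
The parsing rule described above can be illustrated with a short awk filter. This is a sketch of the rule as this manual describes it, not the code used by the sarcheck script. The function name parse_parms is hypothetical, and the keyword list contains only keywords mentioned in this guide; see Appendix A for the complete list.

```shell
# Usage: parse_parms FILE
# Prints KEYWORD=VALUE for each line that starts with a known keyword.
# Only the first two fields are used; everything else is treated as a comment.
parse_parms() {
    awk '
        $1 ~ /^(ST|EN|DR|OPT|VERBOSE|PSELFOPT|DISKFLTR|TABULAR|OUTOPT|PSELFDIR)$/ && NF >= 2 {
            print $1 "=" $2
        }
    ' "$1"
}
```

A line such as "# ST 05:00" is ignored because its first field is "#" rather than a keyword.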


How to change SarCheck's algorithms:

The sarcheck_parms file can also be used to change the thresholds used by SarCheck's tuning algorithms. Default values of SarCheck's thresholds have been established based on feedback from hundreds or thousands of systems and these values should not be overridden without good reason. A complete list of keywords supported in the sarcheck_parms file can be found in Appendix A.


How to create graphs from sar reports:

There are two ways to make graphs from the sar data used by SarCheck.

The first way is to let SarCheck build graphs with the gnuplot utility. By adding the -png, -jpg, or -jpeg switches, SarCheck can use gnuplot to produce PNG or JPEG graphs and can insert those graphs in the HTML output of SarCheck. This will enable you to post some really interesting SarCheck reports on your corporate intranet. To produce an HTML report with PNG graphs, use the -html and -png switches when running analyze9000. For example, the command

analyze9000 -html -png sar12 > rpt12.html

will produce an HTML report which can be read by your favorite browser. The most important parts of the report will be printed in bold type and headings are used to clarify what you're looking at. Graphs are inserted in appropriate places in the body of the report and some additional text is added to help explain the significance of the graphs. For more information, see the section entitled "Options available when running 'analyze9000'".

If you want SarCheck to produce graphs without the accompanying SarCheck report, use the -gonly switch to produce "graphs only".

The second way to produce graphs is by exporting CSV (Comma Separated Value) formatted data to a graphing program. The -gr switch in the analyze9000 program will turn a sar report into output in CSV format. This format is easily understood by most spreadsheets.

To make the most of the -gr switch, the sar report should be formatted in a way which spreadsheets can understand. Sar reports tend to come in one of two formats, and some versions of HP-UX 10.x seem to have trouble with this.

This is what the wrong format looks like. Use the command sar -A and you'll probably see a mess that looks like this:

HP-UX hippie B.11.00 A 9000/785 07/04/00


09:40:01  %usr    %sys    %wio   %idle
          device   %busy   avque   r+w/s  blks/s  avwait
          runq-sz %runocc swpq-sz %swpocc
          bread/s lread/s %rcache bwrit/s lwrit/s 
          swpin/s bswin/s swpot/s bswot/s pswch/s
          scall/s  sread/s  swrit/s   fork/s   exec/
          iget/s namei/s dirbk/s
          rawch/s canch/s outch/s rcvin/s xmtin/s
          text-sz  ov  proc-sz  ov  inod-sz  ov  file-sz  ov
          msg/s  sema/s
10:00:01       1       0       2      97
           c0t6d0    1.91    1.39       3      62    9.
             1.1      13     0.0       0
               0       9      98       1       4      
            0.00     0.0    0.00     0.0      33
             442       27       14     0.12     0.13    
              0       4       0
               0       0       0       0       0       0
           N/A   N/A  66/276   0  430/476   0  281/920   0
            0.00    0.00
10:20:00       0       0       0     100
           c0t6d0    0.27    0.50       0       6    3.49
             1.0      12     0.0       0
               0       1      92       0       0      32
            0.00     0.0    0.00     0.0      22
             408        9        6     0.02     0.01  
               0       2       0
               0       0       0       0       0       0
           N/A   N/A  70/276   0  428/476   0  285/920   0
            0.00    0.00

The alternative format which allows the report to be easily parsed can be selected using the command:

sar -A -f /var/adm/sa/sann

where nn is the day of the month. If your release of HP-UX supports this more useful format, the sar report will look like this:


HP-UX hippie B.11.00 A 9000/785    07/04/00

09:40:01    %usr    %sys    %wio   %idle
10:00:01       1       0       2      97
10:20:00       0       0       0     100
10:40:01       1       0       2      97

Average        1       0       1      98

09:40:01   device   %busy   avque   r+w/s  blks/s  avwait  avserv
10:00:01   c0t6d0    1.91    1.39       3      62    9.24   14.46
10:20:00   c0t6d0    0.27    0.50       0       6    3.49   10.39
10:40:01   c0t6d0    2.35    2.53       3      64   15.83   16.61

Average    c0t6d0    1.51    1.91       2      44   12.20   15.29

09:40:01 runq-sz %runocc swpq-sz %swpocc
10:00:01     1.1      13     0.0       0
10:20:00     1.0      12     0.0       0
10:40:01     1.1      12     0.0       0

Average      1.1      12     0.0       0

We do not know exactly when this format stopped working or when it was fixed. It worked in HP-UX 9.04, was broken in HP-UX 10.10, and worked again in 11.00. It may depend on patch levels as well as operating system version. In any event, SarCheck can take this output and produce a CSV file which can be used to produce beautiful graphs.

Some examples of how to use the switches:

The number of switches and options available in SarCheck continues to grow, and this section is designed to help you decide how to do what you want. In order to maintain some level of clarity, all examples will analyze the sar report file /var/adm/sa/sar23. The output of the analyze9000 program is stdout, so you'll probably want to pipe it to more or redirect it to a file.

Example 1: Analyzing a sar report. We're going to start with the simplest possible example. The command below will run the analyze9000 program and tell it to analyze the sar report file /var/adm/sa/sar23.

analyze9000 /var/adm/sa/sar23

Example 2: Removing the page breaks. The -p switch removes the page breaks. This is especially useful when piping the report to pg instead of more. A number of other switches, the -html switch for example, will automatically invoke -p where it makes sense.

analyze9000 -p /var/adm/sa/sar23

Example 3: Analyzing ps -elf output in conjunction with the sar report. The /opt/sarcheck/bin/ps1 script is used to collect ps -elf data which can be used in conjunction with the sar report. The -ps switch tells the analyze9000 program to search for the ps -elf data and include it in the report if possible.

analyze9000 -ps /var/adm/sa/sar23

Example 4: Creating an HTML-formatted report. Let's combine a few switches this time. The -html switch makes the -p switch unnecessary, but we want to incorporate ps -elf data into the analysis and we want to see a quick table of some statistics at the end of the report.

analyze9000 -html -ps -t /var/adm/sa/sar23

Example 5: Creating an HTML-formatted report with embedded graphs. This is the same as the previous example, except that graphs are now embedded in the HTML output and they will be visible when the output is viewed with a browser. For this to work properly, you must have a copy of gnuplot installed, SarCheck must be able to find it, and the output must be redirected to a file. The file is then opened with a browser that can display PNG graphs. You can also use the -jpeg or -jpg switches if your version of gnuplot supports jpeg output.

analyze9000 -html -ps -t -png /var/adm/sa/sar23 > sar23.html

Example 6: Creating an HTML-formatted report, but this time on a big system. This time we're analyzing sar data from a SuperDome system, and there are 100 disks on the system. By default, SarCheck will filter out information on 'uninteresting' disks, but it will still produce a paragraph on each disk. This can get a little ridiculous and hard to read, so we'll use the -dtbl switch to format the disk information into an HTML table, and the -dbusy switch to sort the disk information so that the busiest disks are at the top of the table. As with the -t switch, cells will be colored if SarCheck wants to draw your attention to specific data. Note that the -dtbl switch requires the -html switch.

analyze9000 -html -ps -t -dtbl -dbusy /var/adm/sa/sar23 > sar23.html

Example 7: Emailing the output. If you manage a network of 200 systems, you may want to email interesting SarCheck reports to yourself, but you probably don't want to be spammed with 200 messages a day saying that everything's okay. The -r switch prevents SarCheck from producing any output at all if there are no recommendations. The -Q switch reduces verbiage to a minimum.

analyze9000 -ps -r -Q /var/adm/sa/sar23 | mail root@wherever.com
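
Whether an empty pipe still produces a message depends on the mail program in use. The sketch below is not part of SarCheck: it captures the output first and calls mail only when there is something to send. The function name mail_if_output is hypothetical; the analyze9000 switches are the ones shown in Example 7.

```shell
# Hypothetical helper, not shipped with SarCheck.
# Usage: mail_if_output RECIPIENT COMMAND [ARGS...]
# Runs COMMAND and mails its output to RECIPIENT only if the output is not empty.
mail_if_output() {
    recipient=$1
    shift
    output=$("$@")
    if [ -n "$output" ]; then
        printf '%s\n' "$output" | mail "$recipient"
    fi
}

# Example call (commented out so that defining the function sends nothing):
# mail_if_output root@wherever.com analyze9000 -ps -r -Q /var/adm/sa/sar23
```

If your mail program already discards empty messages, the plain pipe shown in Example 7 is sufficient.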

Example 8: Suppressing the detection of memory leaks. False alarms are not uncommon when SarCheck attempts to detect memory leaks. Some programs, such as those found on systems running Oracle, will grow over time. This is apparently a deliberate memory leak, and depending on the behavior of programs running on your system, you may want to suppress the reporting of memory leaks, runaway processes, or unusually large processes. The -pml switch can be used to suppress the reporting of memory leaks as follows:

analyze9000 -ps -pml /var/adm/sa/sar23 | more

Example 9: Specifying a different ps -elf file. If you move the SarCheck files to a directory other than /opt/sarcheck and you want to analyze ps -elf data, you have to tell the analyze9000 program where to find the data. This example looks at the sar report sar23 and the ps -elf data in the file /tmp/pselffile:

analyze9000 -ps -pv /tmp/pselffile /var/adm/sa/sar23 | more

Example 10: Generating graphs from a sar report. A sar report can be turned into a .csv file which can be graphed with a spreadsheet using the -gr switch:

analyze9000 -gr /var/adm/sa/sar23 > graph.csv


How to access the online instructions / help text:

To display the online help text, type:

/opt/sarcheck/bin/analyze9000 -h

This sends a subset of the instructions found in this manual to standard output (stdout), which defaults to the screen. More details are found in this manual.

For other help text, type:

/opt/sarcheck/bin/analyze9000 -hp or
/opt/sarcheck/bin/analyze9000 -hm or
/opt/sarcheck/bin/analyze9000 -hg

A FAQ section can also be found at the end of this manual, and updated information can be found on the SarCheck web site: http://www.sarcheck.com/


How to move SarCheck to another directory:

SarCheck by default is located in the /opt/sarcheck directory. You may move SarCheck to another directory by using the sarcheck_parms file. We will give you an example and will use /tmp/sarcheck as the new location of SarCheck.

Example:

  1. Create the following directories for SarCheck: /tmp/sarcheck/bin, /tmp/sarcheck/etc, /tmp/sarcheck/ps, /tmp/sarcheck/doc
  2. Move the existing files to the new directories
    mv /opt/sarcheck/bin/* /tmp/sarcheck/bin
    mv /opt/sarcheck/etc/* /tmp/sarcheck/etc
    mv /opt/sarcheck/ps/* /tmp/sarcheck/ps
    mv /opt/sarcheck/doc/* /tmp/sarcheck/doc
    
  3. Create the file /opt/sarcheck/etc/sarcheck_parms (yes, you might have just moved a file with this name) and add the following line to the file with your favorite editor:
    SARCHECKDIR  /tmp/sarcheck
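
The three steps above can be sketched as a small shell function. This is only an illustration under the assumptions of the example: the function name move_sarcheck is hypothetical, and the bin, etc, ps, and doc subdirectories are the ones listed in step 1. Test it on a copy before using it on a production system.

```shell
# Sketch of the three steps above as a function; move_sarcheck is a hypothetical name.
# Usage: move_sarcheck OLD_DIR NEW_DIR
move_sarcheck() {
    old=$1
    new=$2
    for sub in bin etc ps doc; do
        # Step 1: create the new directory.
        mkdir -p "$new/$sub" || return 1
        # Step 2: move the existing files, skipping empty directories.
        for f in "$old/$sub"/*; do
            if [ -e "$f" ]; then
                mv "$f" "$new/$sub"/ || return 1
            fi
        done
    done
    # Step 3: leave a sarcheck_parms file in the old etc directory
    # that points to the new location.
    printf 'SARCHECKDIR  %s\n' "$new" > "$old/etc/sarcheck_parms"
}

# Example call for the relocation described above (commented out on purpose):
# move_sarcheck /opt/sarcheck /tmp/sarcheck
```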


How to produce the most accurate analysis from the SarCheck menu:

Important: Understand that you can inadvertently cause SarCheck to produce misleading or incorrect recommendations. SarCheck looks at an individual day of data reported by sar. That data should reflect the most active time of the day when performance is most important and the most active days of the week or month.

Analyze the data or report files that represent the busiest days. Determine the busiest times of the day, and if necessary, modify the menu defaults to analyze data from only that time period. To change any of the defaults, see the section "How to change the menu defaults".


How to produce the most accurate analysis from the command line:

Important: Understand that you can inadvertently cause SarCheck to produce misleading or incorrect recommendations. SarCheck looks at an individual day of data reported by sar, and that data should reflect the most active time of the day when performance is most important, and the most active days of the week or month.

For example, if your system is in use from 8:00AM to 5:00PM, and runs batch jobs and backups at night with plenty of time to spare, performance is probably most important during the day. To help SarCheck produce the most useful analysis, use a command such as:

sar -A -s8 -e17 > reportfile

This command reports all statistics (-A), and only includes data collected between 8:00AM (-s8) and 5:00PM (-e17). If the busiest day in the last few weeks was the 29th of the month, and you want to produce a report of system activity between 8:00AM and 5:00PM on that day, use the following command:

sar -A -s8 -e17 -f /var/adm/sa/sa29 > report29

Of course, this will only work if /var/adm/sa/sa29 actually exists. Sar data should be regularly collected by a crontab entry.
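
If you produce these reports often, the existence check can be scripted. The following is a minimal sketch, not part of SarCheck or sar: the function names and the SA_DIR variable are hypothetical, and the sar switches are the ones used in the command above.

```shell
# Hypothetical helpers, not part of SarCheck or sar.
# SA_DIR is an assumed override; it defaults to /var/adm/sa as used in this manual.

# Print the sar data file name for a day of the month (1-31).
sa_file_for_day() {
    day=${1#0}                      # drop a leading zero so 08 is not read as octal
    printf '%s/sa%02d\n' "${SA_DIR:-/var/adm/sa}" "$day"
}

# Write a report covering 8:00AM to 5:00PM for the given day to report<day>.
business_hours_report() {
    datafile=$(sa_file_for_day "$1")
    if [ ! -r "$datafile" ]; then
        echo "no sar data found: $datafile" >&2
        return 1
    fi
    sar -A -s8 -e17 -f "$datafile" > "report$1"
}

# Example call (commented out so that defining the functions runs nothing):
# business_hours_report 29
```

The function returns a non-zero status instead of writing an empty report when the data file for that day is missing.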

It's important to analyze reports from days when the processing load was greatest. It may be that on those days, SarCheck will find resource bottlenecks which did not exist on days when the system did less work.

If you analyze data from the weekend, SarCheck may tell you how to optimize the system for weekend processing. Whether that makes sense or not in your environment (and it frequently won't) is up to you.

SarCheck does not need (and cannot use) sar reports produced with sar's -M switch. The -M switch is useful for gathering per-processor statistics but SarCheck does not have enough information to make use of this data. Please do not use the -M switch when producing reports for SarCheck to analyze.


How to get the most from SarCheck:

To get the most benefit out of SarCheck, we recommend using it as follows:

  1. Review the recommendations based on several days of sar statistics, especially days when peak processing occurs, and implement the recommendations that occur consistently.
  2. Implement recommendations one at a time. All performance tuning involves trial and error and every system is unique, therefore some recommendations may occasionally hurt performance. This is an uncommon occurrence (in fact, we’ve never heard of it happening with SarCheck), but these unsuccessful attempts to improve performance can only be identified if recommendations are implemented individually.
  3. Continue using SarCheck on a regular basis. While SarCheck will probably make recommendations when it is first run, that's only the beginning. Since many of the changes recommended by SarCheck are small, they will gradually lead you towards a truly optimized system. As new users and applications are added, older programs are modified, and file sizes increase, SarCheck will help to keep you from being surprised by the changing demands of your applications and users.

How to interpret the analysis:

At the beginning of the analysis, the name of the sar report file, the date, time, number of intervals, number of processors seen, amount of memory, and system name are printed for identification purposes.

Important: When data from one system is analyzed on another, it is likely to result in incorrect or misleading recommendations. A warning message will appear if the name of the system on the sar report is different than the name of the system running SarCheck, or if the operating system version recorded by sar is different than the version reported by the uname command. This is because the values of tunable parameters, memory size, etc., are likely to be different on different systems.
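
You can make a similar check yourself before running an analysis. The sketch below is not SarCheck's own test, whose details are not described here. It assumes that the first non-blank line of the report is the usual sar header (for example "HP-UX hippie B.11.00 A 9000/785 07/04/00") with the system name in the second field; the function names are hypothetical.

```shell
# Hypothetical pre-check, not part of SarCheck.
# Assumes the first non-blank line of the report is the sar header,
# with the system name in field 2.
sar_report_host() {
    awk 'NF { print $2; exit }' "$1"
}

# Return 0 if the report names this system, 1 otherwise.
report_is_local() {
    report_host=$(sar_report_host "$1")
    local_host=$(uname -n)
    if [ "$report_host" != "$local_host" ]; then
        echo "warning: report is from '$report_host', this system is '$local_host'" >&2
        return 1
    fi
}

# Example call (commented out on purpose):
# report_is_local /var/adm/sa/sar23 && analyze9000 /var/adm/sa/sar23
```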

Warning messages will appear if impossible data is seen in the sar report. Examples would be CPU utilization of 313% or a swap queue occupancy of -88%. The type of sar data which contains the problem will be identified. SarCheck will still produce a report, but you should realize that the analysis of anomalous data is, as always, likely to follow the rule of 'garbage in, garbage out'.

The Summary section will highlight any bottlenecks that were seen in the areas of CPU, memory, or I/O, and will indicate if any kernel parameters need to be changed. If no bottlenecks are seen, the summary will say so, and point out that no recommendations will be made.

If runaway processes, memory leaks, or suspiciously large processes have been detected, a message will appear at the end of the Summary section.

The Recommendations section is present only if SarCheck has recommendations to make. If SarCheck thinks that everything is fine, no recommendations will be made. This is a normal condition and once the system is properly tuned, you should not be surprised to see a lack of recommendations.

The recommendations are based solely on the data contained in the sar file and the values of various tunable parameters, and should be taken in that context. For example, if batch jobs are run on Saturdays, and SarCheck analyzes statistics from that day, it may decide that an I/O bottleneck existed and spare memory was present, and an increase in buffer size may be appropriate. Following these recommendations may improve performance on Saturdays, but could hurt performance during the week by reducing the amount of memory available to users.

The changes to tunable parameters recommended by SarCheck are designed to cause slow, gradual improvement in order to prevent surprises. If, for example, nfile should be larger and a memory bottleneck is not seen, SarCheck will recommend up to a twenty-five percent increase in nfile. After implementing this change, another increase of up to twenty-five percent may be recommended in a subsequent run of SarCheck. These gradual changes are designed to prevent any unanticipated side effects of a major change in a tunable parameter.
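
As a rough illustration of what "up to twenty-five percent" means for a single run, the sketch below computes the largest value one step could reach. It only shows the upper bound; the way SarCheck chooses the actual value is not described here, and the function name max_step is hypothetical.

```shell
# Illustration only: the largest value a single "up to 25 percent" step could reach.
# SarCheck's actual algorithm is not documented here; max_step is a hypothetical name.
max_step() {
    echo $(( $1 + $1 / 4 ))
}

max_step 920     # prints 1150, the ceiling for one step from nfile=920
```

The sample report later in this manual recommends changing nfile from 920 to 1048, which is within this ceiling.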

Due to the interrelationships between tunable parameters and system resources, SarCheck goes beyond the basic rules of thumb whenever possible.

The Resource Analysis section translates the data contained in the sar report into English. Much of this data is provided for reference, and explanations are given where appropriate. The implications of various statistics regarding CPU utilization, buffer sizing, memory utilization, system table sizes, and disk I/O bandwidth are presented in this section.

The times when key resources are most heavily used appear in this section. If these times correlate well with the times that performance degradation was reported, it can be inferred that exhaustion of these resources may be a cause of performance problems. Peak usage statistics are also used by the capacity planning section.

The Capacity Planning section can be used to approximate the amount of capacity left on the system, based solely on the sar data being analyzed. CPU, memory, disk, and system table statistics are examined in order to determine which resource is likely to become exhausted first.

This section is not meant to perform the same functions as the more expensive tools available for large systems. It is designed to help meet the needs of system administrators, many of whom are managing growing systems and need to know how much room is left before various resources become exhausted.

The exhaustion of resources is defined as any single interval in which CPU usage exceeded 90 percent, a disk was busy more than 75 percent of the time, swapping was detected, or a system table was more than 80 percent full. Instructions for modifying these defaults can be found in the section "How to change SarCheck's algorithms". Because the interval with the greatest resource usage is used, the capacity planning report will be less accurate if peak resource use occurred during an interval of less than 10 minutes.
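
The default thresholds can be summarized in a few lines of shell and awk. This is a sketch of the rule as stated above, not SarCheck's code. The function name is hypothetical, and treating a swap-out rate greater than zero as "swapping was detected" is an assumption made for the example.

```shell
# Sketch of the default exhaustion thresholds described above; not SarCheck's code.
# Usage: exhausted_resources CPU_PCT DISK_BUSY_PCT SWAPOUT_RATE TABLE_USED TABLE_SIZE
# SWAPOUT_RATE > 0 is an assumed stand-in for "swapping was detected".
exhausted_resources() {
    awk -v cpu="$1" -v disk="$2" -v swap="$3" -v used="$4" -v size="$5" 'BEGIN {
        out = ""
        if (cpu + 0 > 90)            out = out " cpu"
        if (disk + 0 > 75)           out = out " disk"
        if (swap + 0 > 0)            out = out " swap"
        if (used * 100 > size * 80)  out = out " table"
        if (out == "")               out = " none"
        print substr(out, 2)
    }'
}

# One lightly loaded interval: 1 percent CPU, a disk 1.91 percent busy,
# no swapping, and a process table holding 66 of 276 entries.
exhausted_resources 1 1.91 0.00 66 276     # prints: none
```

Because each threshold must be exceeded, a value exactly at a limit (for example, CPU usage of exactly 90 percent) is not counted as exhaustion in this sketch.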

The Custom Settings section is where both successful and unsuccessful changes to SarCheck's default thresholds are reported.

Disclaimers, trademark information, etc. At the end of the report is a disclaimer, trademark and copyright information, your software serial number, code version, licensee, and if applicable, the software's expiration date.


Tuning strategies specific to HP-UX:

There are several features in the HP-UX operating system which make the system easier to manage, but this ease of use has a cost. These features can waste memory, CPU, and disk resources, preventing your system from achieving peak performance. By using SarCheck to help manage your system’s tunable kernel parameters, we believe that your system can perform better without losing the ease of use which has made HP-UX so popular.

HP-UX uses formulas to define a number of tunable parameters. The advantage of this is that you can raise the values of many tunable parameters at once, by simply changing the ‘maxusers’ parameter, and you never have to spend time trying to determine the optimum value for each parameter. The disadvantage is that the values of various tunables are never set optimally and your system will not reach its peak performance and throughput potential. If you're using SarCheck, we recommend that you get away from the formulas and set the values of parameters in accordance with SarCheck's recommendations.

HP-UX uses a dynamic buffer cache by default. Again, this is a technique which eliminates the need for the system administrator to manually tune parts of the system, but there is usually an increased paging overhead involved in using a dynamic buffer cache and there are times when the buffer cache should be static. SarCheck identifies when a static (or fixed) buffer cache is likely to offer better performance and helps you to find the optimum size of the buffer cache.

If significant memory pressure is seen, SarCheck will make recommendations that will help you tune the dynamic buffer cache in several small steps. First, it will recommend that you use the bufpages parameter to fix the size of the buffer cache based on the value of dbc_min_pct. If memory pressure remains high, SarCheck will recommend slowly reducing the size of the buffer cache.


How to order SarCheck:

Use the -o option of /opt/sarcheck/bin/analyze9000 to produce an order form and email or fax it to us, or ask your reseller to contact us. SarCheck’s pricing is based on HP’s pricing tiers and is the same throughout the world. The cost of shipping SarCheck via US Mail is included in the price of SarCheck. If you’d like the software shipped via Federal Express, DHL, etc., please provide your account number and we will be happy to accommodate you.

In some parts of the world, local resellers may charge prices which are higher than our list price because they pay for the currency conversions, international shipping, duties, support, etc. We urge our customers to support their resellers.


How to get technical support for SarCheck:

Please read the FAQ section of this manual and visit the FAQ section of our website first. This will always be the fastest way to get the answer to a frequently asked question. If that doesn't do it:

Call us at +1-603-382-4200,
fax us at +1-603-382-4247,
write to us at PO Box 1033, Plaistow NH 03865, USA,
use our email address: support@sarcheck.com,
visit our web site at http://www.sarcheck.com/
or contact the party from whom you purchased SarCheck.


How to get technical support for gnuplot:

See the FAQ at http://www.ucc.ie/gnuplot/gnuplot-faq.html for more information. If you received a precompiled binary from Ready to Run Software, Inc., contact them or visit their website http://www.rtr.com/ for support.


Files included in this release:

This release contains the following files:

/opt/sarcheck/bin/analyze9000: This program performs the analysis.

/opt/sarcheck/bin/freemem: This program collects data about memory and buffer cache utilization from the kernel. It is run by the /opt/sarcheck/bin/ps1 script and the data is stored with the ps -elf data collected by the same script.

/opt/sarcheck/bin/sarcheck: This is the front end for analyze9000. It’s a simple Bourne shell script which allows you to analyze the previous business day’s sar data by pressing the enter key a few times. Feel free to customize this script to meet your needs, but save an unmodified copy in order to be safe.

/opt/sarcheck/etc/analyze_txt: This file contains the text used to produce the analysis. In general, we recommend that you do not modify this file, because it may leave us unable to support the software. Users outside of the United States may modify the spelling of certain words in the file if they wish. For example, the word 'utilization' can be changed to 'utilisation'. If you would like a non-English version of SarCheck, please call us.

/opt/sarcheck/etc/analyze_key: This file contains the activation key. Tampering with this file may permanently disable SarCheck.

/opt/sarcheck/bin/ps1: This is a script that collects ps -elf data, and is roughly analogous to the sa1 script used by sar.

/opt/sarcheck/bin/ps2: This is a script that cleans up ps -elf data, and is roughly analogous to a subset of the sa2 script used by sar.

/opt/sarcheck/bin/ondemand: This is a script which can be used to get recommendations that are almost real-time. If your system is slow and you want to collect and analyze data while the system is slow, this script will enable you to do it. This script is new and we are trying to determine if it meets your needs. Please let us know what you think.

/opt/sarcheck/doc/hpman600.html: This is an HTML copy of the manual.

/opt/sarcheck/bin/vg1: This is a script that runs the program /opt/sarcheck/bin/vgparse.

/opt/sarcheck/bin/vgparse: This program runs the vgdisplay and pvdisplay utilities and puts the output in the ps -elf data used by SarCheck. SarCheck uses this data to report on volume groups and physical volumes.

./hpsar22: A sample sar report.


An example of a SarCheck report:

The following examples were produced with the -w option, which suppresses page breaks and newlines. This option sounds pretty odd, but it’s really useful when exporting SarCheck reports to a word processing program. Please note that the text of the SarCheck report is printed in Courier font, and the explanation immediately follows the text of the report.

SarCheck(TM): Automated Analysis of HP-UX sar and ps data (English text version 6.00.00)

The title line prints the version of the text file which was in use. Different versions can be used for languages other than English. If you are interested in a non-English version of SarCheck, please call us.

This is an analysis of the data contained in the file /tmp/rpt. There were 5 days of data collected from 07/13/2004 to 07/20/2004, from the HP9000/785/C360 system 'hippie'. There were 300 data records used to produce this analysis. The operating system used to produce the sar report was HP-UX Release B.11.00. 1 processor is present. 64 megabytes of memory are present.

This introductory paragraph prints the name of the sar report which was analyzed, when the data was collected, the number of records contained in the sar report, and other information about the system environment.

Data collected by the ps -elf command during 5 days between 07/13/2004 and 07/20/2004 will also be analyzed. This program will attempt to match the starting and ending times of the ps -elf data with those of the sar report file named /tmp/rpt.

This paragraph prints the name of the file containing the ps -elf data. This data is used primarily to find runaway processes or memory leaks, and is also useful in quantifying memory bottlenecks.

If the operating system version or system name reported by sar does not match the system that you're using to do the analysis, a warning will be printed.

SUMMARY

Because no resource bottlenecks were seen in this data, no recommendations can be made. SarCheck will not recommend changes unless it is trying to fix a specific problem. It typically does not find problems every day and is most effective when it is run regularly. Limits to future growth have been noted in the Capacity Planning section.

At least one possible memory leak has been detected. See the Resource Analysis section for details.

The summary lists any bottlenecks detected, any problems which may impact the accuracy of the analysis, and whether or not the thresholds used by SarCheck's algorithms have been changed. If SarCheck found problems and had recommendations to make, that fact would be mentioned here. If anything unusual is seen in the ps -elf data, it will be summarized here too. The following recommendations section is from a different report and is here to explain recommendations.

RECOMMENDATIONS SECTION

All recommendations contained in this report are based solely on the conditions which were present when the performance data was collected. It is possible that conditions which were not present at that time may cause some of these recommendations to result in worse performance. To minimize this risk, analyze data from several different days, implement only regularly occurring recommendations, and implement them one at a time.

The first paragraph of the recommendations section explains how to implement the recommendations. More information on this topic can be found in the "How to produce the most accurate analysis..." and "How to get the most from SarCheck" sections of this manual.

Change the value of 'dbc_max_pct' from 40 to 48. This recommendation has been made because the buffer cache statistics indicate that a larger buffer cache might improve performance.

Change the value of 'dbc_min_pct' from 20 to 24. This recommendation has been made because the buffer cache statistics indicate that a larger buffer cache might improve performance.

Change the value of 'nfile' from 920 to 1048. The parameter 'nfile' sets the size of the file descriptor table, which determines the total number of files which can be simultaneously open on the system.

These are examples of parameter tuning recommendations. As a rule, SarCheck will recommend small, incremental changes to the system's tunables in order to produce gradual change.

Please note that the formulas typically used to set many parameters can cause problems when manual adjustments are being made. For more information on this topic, please see the section entitled "Tuning strategies specific to HP-UX".

No disk recommendations have been made because no bottleneck was seen.

SarCheck will help you to use your existing hardware whenever possible. Disk balancing, faster disks, or additional disks will be recommended here if they might help.

Use the System Administration Manager (SAM) to change the values of tunable parameters. More information on the SAM utility and relinking the kernel is available in the System Administration Tasks manual.

This paragraph will appear if changes to tunable parameters have been recommended. You should read all of the information on relinking and rebooting if you've never done this before.

RESOURCE ANALYSIS SECTION

The resource analysis section is the place where various aspects of resource utilization are discussed regardless of whether a problem was seen.

Average CPU utilization was only 0.1 percent. This indicates that spare CPU capacity exists. If any performance problems were seen during the entire monitoring period, they were not caused by a lack of CPU power. User CPU as measured by the %usr column in the sar -u data averaged 0.06 percent and system CPU (%sys) averaged 0.01 percent. The sys/usr ratio averaged 0.12 : 1.

CPU graph

CPU utilization statistics from the sar -u report are analyzed here. In addition to average CPU utilization, occasionally heavy utilization and peak utilization are noted. The times of peak resource utilization are noted throughout this section and are provided to help you detect any correlation between peak resource utilization and peak performance degradation.

The CPU was waiting for I/O an average of 0.1 percent of the time. This statistic does not indicate the presence of an I/O bottleneck. The time that the system was waiting for I/O peaked at 9 percent from 10:30:01 to 10:40:01, on 07/15/2004.

The amount of time that the system spent waiting for I/O is analyzed in order to confirm the presence of a disk I/O bottleneck or the 'false alarm' generated by tape activity.

The CPU was idle (neither busy nor waiting for I/O) and had nothing to do an average of 99.8 percent of the time. If overall performance was good, this means that on average, the CPU was lightly loaded. If performance was generally unacceptable, the bottleneck may have been caused by remote file I/O which cannot be directly measured with sar and therefore cannot be considered by SarCheck.

In cases where the system is frequently idle, the percentage of idle time is analyzed. This is an indication of the average amount of time the CPU was neither busy nor waiting for I/O.

The run queue had an average depth of 1.0 which indicates that processes were generally not bound by latent demand for CPU resources.

Run queue graph

The run queue size indicates the average number of 'ready to run' processes. The average length of this queue and the percent of time it was occupied are analyzed, and are used to confirm the presence of a CPU bottleneck.

The syncer daemon used 0.006 percent of the CPU from 08:00:01 to 18:00:00. The syncer is responsible for writing data from the buffer cache to disk. Its activity indicates that it is not so active as to cause a problem.

This system's buffer cache is dynamic, meaning that its size is determined by the amount of free memory on the system. Some buffer cache statistics were poor but there was insufficient disk activity to justify further investigation. Based on the current values of dbc_min_pct and dbc_max_pct, the buffer cache can range in size from 9.6 to 25.6 megabytes of memory. The actual size of the dynamic buffer cache ranged from 9.6 to 9.8 megabytes of memory.

SarCheck is aware of the dynamic buffer cache used in modern HP-UX releases and adjusts its recommendations accordingly. It will determine whether a fixed buffer cache size makes more sense based on the latest information available from experts at HP. We continue to research tuning techniques and will feed the results of our work into SarCheck's knowledge base as they become available.

No evidence of an overall memory shortage was seen in the following statistics: The swap queue was occupied an average of 0 percent of the time. Note that on HP-UX systems, swap queue occupancy does not necessarily infer a memory poor condition. The average swap out rate was 0.25 per second.

At least 149 pages of memory were always free. The value of lotsfree was 586 pages and the value of gpgslim peaked at 256 pages. Since the number of free pages dropped to a value less than the peak seen for gpgslim, there was at least some memory pressure which may have occasionally impacted performance.

Free memory graph

Some swap out activity was seen in 28.0 percent of the samples, indicating that the system is not extremely memory-rich.

The swap out rate peaked at 1.04 per second during multiple time intervals.

Swap out graph

Data collected with ps -elf shows that the sched daemon used no CPU time during the monitoring period. Data collected with ps -elf shows that the vhand daemon used -13 seconds of CPU time. This indicates a possible memory shortage, which is not confirmed by other statistics related to memory utilization.

This is where SarCheck explains whether it thinks the system is memory poor or not.

The fs_async flag is not set. This may result in reduced disk performance, but keeps filesystem data structures consistent in the event of a system crash. This option is currently in the state recommended for production systems. Since no disk I/O bottleneck was seen on this system, setting the fs_async flag would be unlikely to provide enough of an improvement to justify the additional risk.

The average context switching rate was 15.7 per second. This works out to an average of one context switch every 63.76 milliseconds. No recommendations have been made to the timeslice parameter because no problems were seen with the context switching rate.
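The interval quoted above is the reciprocal of the context switching rate. The following is a minimal sketch in Python; the function name is ours and is not part of SarCheck. Note that the rounded rate of 15.7 per second gives about 63.69 milliseconds, so the 63.76 milliseconds in the report was presumably computed from the unrounded rate.

```python
def ms_per_context_switch(switches_per_second):
    """Average time between context switches, in milliseconds."""
    if switches_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / switches_per_second


# Rate as printed (rounded) in the sample report.
print(round(ms_per_context_switch(15.7), 2))
```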

No unusual configurable parameter values were seen in those parameters which relate to the process accounting system. The current values of acctsuspend and acctresume are unlikely to have an impact on system performance.

The values of various tunable parameters are checked in the resource analysis section, and informative messages are printed where appropriate.

The inode cache did not overflow, but was completely full in 1.3 percent of the samples collected during the monitoring period. With UNIX operating systems such as HP-UX which use the inode table as a cache, this indicates that the inode cache may actually be somewhat larger than necessary. Since this system did not seem to have a memory bottleneck, this possibly oversized inode cache should be worth the extra memory.

Inode cache graph

The graph above shows the utilization of the inode cache to help you understand when it is reported as being full.

The process and open file tables were less than 80.0 percent full. Peak table usage statistics (max used/table size) as reported by sar: Process table: 72/276. Open file table: 306/920.

The file table, controlled by the nfile parameter, was much larger than necessary. There is nothing to gain by reducing the size of this table, so no change to the parameter 'nfile' is recommended.

Open file graph

These messages indicate whether the tables monitored by sar -v were in danger of overflowing. When a table is at risk, SarCheck explains the problem and recommends an appropriate action. The fact that HP-UX uses the inode table as a cache is understood and explained by SarCheck.

If system tables are grossly oversized, SarCheck will point this out. Please note that this is not a recommendation. SarCheck will only recommend reducing the size of the process table if it is oversized and a memory-poor condition was detected. SarCheck will never recommend reducing the file table because its entries use a trivial amount of memory.
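The table usage percentages can be reproduced from the "max used/table size" pairs that sar -v reports. This is a rough sketch only; the function name and the dictionary layout are ours, and the 80 percent figure is simply the one quoted in the sample report.

```python
def table_usage(max_used, table_size):
    """Return peak table usage as a percentage of the table size."""
    return 100.0 * max_used / table_size


# Peak figures quoted in the sample report (max used, table size).
tables = {"process": (72, 276), "open file": (306, 920)}

for name, (used, size) in tables.items():
    pct = table_usage(used, size)
    status = "under" if pct < 80.0 else "at or over"
    print(f"{name} table: {pct:.1f} percent full ({status} 80 percent)")
```

The results, 26.1 and 33.3 percent, agree with the "Percent of process tbl used" and "Percent of file table used" lines of the tabular summary.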

No System V semaphore activity was seen. No problems have been seen, and no changes have been recommended for System V semaphore parameters. Note that SarCheck only checks these parameter's relationships to each other since semaphore usage data is not available. Algorithms used by SarCheck to check these relationships are available in the help text of SAM.

No System V message activity was seen. No problems have been seen, and no changes have been recommended for System V message parameters. Note that SarCheck only checks these parameter's relationships to each other since message usage data is not available. Algorithms used by SarCheck to check these relationships are available in the help text of SAM, and in the file /usr/include/sys/msg.h.

System V semaphore and message activity is monitored here, and the time of peak activity is reported. If this time period coincides with noticeable slowdowns, you may want to look at these system resources more closely.

Semaphore and message parameters are among the most confusing of all kernel parameters. The complex relationships between parameters are checked automatically by SarCheck, and problems are reported.

The ratio of exec to fork system calls was 0.80. This indicates that PATH variables are efficient.

Inefficient PATH variables are frequently responsible for increasing overhead and degrading system performance. Using the ratio of execs to forks is an old trick which still works well.

One volume group was seen and the maxvgs parameter was set to 10. This leaves plenty of room for growth and no changes to maxvgs have been recommended.

The volume group /dev/vg00 contained 1 physical volume and 9 logical volumes. All of the logical volumes were open. The size of the group was 4.00 gigabytes, of which 50.24 percent was allocated and 49.76 percent was free.

The disk device c0t6d0 was busy an average of 0.33 percent of the time and had an average queue depth of 2.1 (when occupied). This indicates that the device is not a performance bottleneck. The average service time reported for this device and its' accompanying disk subsystem was 12.8 milliseconds. This service time is acceptable. Service time is the delay between the time a request was sent to a device and the time that the device signaled completion of the request. The disk device c0t6d0 was reported by pvdisplay as being a 4.00 gigabyte disk. 2036 megabytes of space was reported as being free and 2056 megabytes have been allocated. This disk device was a part of volume group /dev/vg00 and contained 9 logical volumes. At least one logical volume occupied noncontiguous physical extents on the disk. The following paragraph will provide more details.

The logical volume /dev/vg00/lvol5 was located in more than one place on disk c0t6d0. If this logical volume is busy and it is not mirrored, performance will suffer because the disk's read/write heads are likely to travel back and forth in an inefficient manner. The gap between two places where the logical volume was located was 386 blocks in size. This was more than one third of the disk's total size and is a large gap. If /dev/vg00/lvol5 was an active logical volume and was not mirrored, large gaps are likely to have increased the average service time seen on disk volume c0t6d0.

Disk %busy graph

Disk activity is analyzed in depth. Peak and average busy time, queue depth, and service time are used to identify problems in disk load, load balancing, buffer sizing, and hardware recommendations. The most common SarCheck support question is "Why does SarCheck say that my fast new disk drives are slow?". This is usually due to the physical location of data on the drives or performance problems inherent in some implementations of RAID arrays. If the problem can be helped by increasing the size of the buffer cache, SarCheck will recommend it.
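The allocation percentages reported for /dev/vg00 follow directly from the free and allocated space reported by pvdisplay for its only physical volume. A small sketch, with a function name of our own choosing:

```python
def allocation_split(free_mb, allocated_mb):
    """Return (percent allocated, percent free) for a disk or volume group."""
    total_mb = free_mb + allocated_mb
    if total_mb <= 0:
        raise ValueError("total size must be positive")
    return 100.0 * allocated_mb / total_mb, 100.0 * free_mb / total_mb


# Figures from the sample report: 2036 MB free, 2056 MB allocated.
allocated_pct, free_pct = allocation_split(free_mb=2036, allocated_mb=2056)
print(f"{allocated_pct:.2f} percent allocated, {free_pct:.2f} percent free")
```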

At multiple peak times on 07/15/2004 ps -elf data indicated that there were 72 processes present. This was the largest number of processes seen with ps -elf but it is not likely to be the absolute peak because the operating system does not store the true "high-water mark" for this statistic. There were an average of 67.4 processes present.

Process count graph

A possible memory leak was seen in /usr/bin/X11/X, owned by root, pid 1563. Between 08:10:01 and 09:00:01, this process grew from 734 to 982 pages. Memory usage grew at an average rate of 297.6 pages/hr during that interval.

Details of unusual process activity will be reported here, at the end of the resource analysis section. SarCheck will tell you about any process which used more than 20 percent of the CPU, grew at more than 200 pages (800 kb) per hour, had a size of at least 25 percent of physical memory, or had a size of at least 16 megabytes. For information about changing these thresholds, see the section entitled "How to change SarCheck's algorithms".
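The growth rate in the memory leak message can be checked by hand: the process grew by 248 pages in 50 minutes. The sketch below repeats that arithmetic; the function name is ours, and the 200 pages per hour threshold is the default mentioned above.

```python
from datetime import datetime


def growth_pages_per_hour(start, end, start_pages, end_pages, fmt="%H:%M:%S"):
    """Average growth of a process, in pages per hour, between two samples
    taken on the same day."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    hours = elapsed.total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("end time must be later than start time")
    return (end_pages - start_pages) / hours


# Figures from the sample report for /usr/bin/X11/X, pid 1563.
rate = growth_pages_per_hour("08:10:01", "09:00:01", 734, 982)
print(f"{rate:.1f} pages/hr, above default threshold: {rate > 200}")
```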

CAPACITY PLANNING SECTION

The section is designed to provide the user with a rudimentary linear capacity planning model and should be used for rough approximations only. These estimates assume that an increase in workload will affect the usage of all resources equally. These estimates should be used on days when the load is heaviest to determine approximately how much spare capacity remains at peak times.

The Capacity Planning section can help you to understand how much additional load your system can support. This feature is not designed to replace the features found in mainframe-type capacity planning tools, but rather to give you an approximation of how much room for growth remains. Please note the disclaimer in the paragraph above.

Based on the limited data available in this single sar report, the system can support moderate increase in workload at peak times, and memory is likely to be the first resource bottleneck. Implementation of some of the suggestions in the recommendations section may help to increase the system's capacity.

This paragraph summarizes the amount of capacity remaining in your system during peak times and identifies the first likely resource bottleneck. In this case, the report says that the system can support a moderate increase in peak workload and that memory is likely to be the first bottleneck. If all system resources monitored could support an increase in workload of at least 100 percent, the summary will say that no impending capacity limits were seen. If the first bottleneck is likely to occur in memory, the amount of capacity remaining will not be quantified. This is because the data required for that kind of complex memory modeling cannot be found in the sar report.

The CPU can support an increase in workload of at least 100 percent at peak times. The busiest disk can support a workload increase of at least 100 percent at peak times. For more information on peak CPU and disk utilization, refer to the Resource Analysis section of this report.

All system tables measured by sar -v can hold at least twice as many entries as were seen.

Capacity planning graph

The two paragraphs and graph above give a more detailed breakdown of remaining capacity. Again, please note that these numbers are approximate and will vary from day to day. After analyzing a number of sar reports, you will have a pretty good idea of how much capacity remains in your system.
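This manual does not give the formula behind these estimates. One plausible reading of a linear model is sketched below, assuming that the CAPCPU, CAPDSK, and CAPTBL thresholds from Appendix A act as the utilization ceiling for each resource. That assumption, and both function names, are ours and may not match SarCheck's actual algorithm.

```python
def headroom_percent(peak_utilization, ceiling):
    """Percentage by which the workload could grow before peak utilization
    reaches the ceiling, assuming utilization scales linearly with load."""
    if peak_utilization <= 0:
        return float("inf")
    return max(0.0, (ceiling / peak_utilization - 1.0) * 100.0)


def describe(headroom):
    """Format headroom the way the sample report does, capping at 100%+."""
    return "100%+" if headroom >= 100.0 else f"{headroom:.0f}%"


# Peak values from the sample tabular summary, paired with the default
# CAPCPU (90), CAPDSK (75), and CAPTBL (80) thresholds.
resources = {
    "CPU": (3.0, 90.0),
    "busiest disk": (9.28, 75.0),
    "process table": (26.1, 80.0),
    "file table": (33.3, 80.0),
}
for name, (peak, ceiling) in resources.items():
    print(name, describe(headroom_percent(peak, ceiling)))
```

Under these assumptions every resource in the sample report comes out at 100%+, which is consistent with the report text above.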

CUSTOM SETTINGS SECTION

The default SYSUSR threshold was changed in the sarcheck_parms file from 2.5 to 1.3.

The default HSIZE value was changed in the sarcheck_parms file from 0.75 to 1.20 times the default gnuplot width.

You can change SarCheck's algorithms and output format by creating a file named /opt/sarcheck/etc/sarcheck_parms. For more information about what can be changed with sarcheck_parms, see Appendix A of this manual.

Please note: In no event can Aptitune Corporation be held responsible for any damages, including incidental or consequent damages, in connection with or arising out of the use or inability to use this software. All trademarks belong to their respective owners. This software licensed for the exclusive use of: Your Company. SC9000 Code version: 6.00.00. Serial number: 11111222.

This software is updated frequently. For information on the latest version, contact the party from whom SarCheck was originally purchased, or visit our web site.

This is the message that shows up in beta software. Licensed versions will display a different message.

(c) copyright 1995-2004 by Aptitune Corporation, Plaistow NH, USA, All Rights Reserved. http://www.sarcheck.com

The disclaimers, copyright notices, and expiration date (if any) are all important and you should read them.

    Statistics for system, hippie 
    System model number is, 9000/785/C360 
    Statistics collected from, 07/13/2004 
    Statistics collected until, 07/20/2004 
    Average CPU utilization, 0.1% 
    Peak CPU utilization, 3% 
    Average user CPU utilization, 0.1% 
    Average sys CPU utilization, 0.0% 
    Average waiting for I/O, 0.1% 
    Average run queue depth, 1.0 
    Peak run queue depth, 1.2 
    Average swap queue occupancy, 0.0% 
    Average swap out rate, 0.25/sec 
    Average cache read hit ratio, 98.4% 
    Average cache write hit ratio, 58.1% 
    Disk device w/highest peak, c0t6d0 
    Avg pct busy for that disk, 0.33% 
    Peak pct busy for that disk, 9.28% 
    Avg number of processes seen by ps, 67.4 
    Max number of processes seen by ps, 72 
    Percent of process tbl used, 26.1% 
    Process table overflows, No 
    Percent of file table used, 33.3% 
    File table overflows, No 
    Inode cache pct of time full, 1.3% 
    Inode cache overflows, No 
    Approx CPU capacity remaining, 100%+ 
    Approx I/O bandwidth remaining, 100%+ 
    Remaining process tbl capacity, 100%+ 
    Remaining file table capacity, 100%+ 
    Can memory support add'l load, Moderate 

This is the output that you'll see if you're using the -t or -tonly switches. This table is much easier to parse than the standard text-based SarCheck report. When used in conjunction with the -html switch, this information is formatted into a table and unusual values are flagged by coloring the cells which contain those values. If you have a browser handy, you'll want to try using the -html and -t switches together.
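As an illustration of how easily the tabular summary can be parsed, here is a minimal sketch that turns it into a dictionary. It assumes, based on the sample above, that each line is a label and a value separated by the first comma. The function name is ours.

```python
def parse_tabular_summary(text):
    """Map each label in a tabular summary to its value (both as strings)."""
    summary = {}
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank lines and anything that is not label, value
        label, value = line.split(",", 1)
        summary[label.strip()] = value.strip()
    return summary


# A few lines copied from the sample tabular summary.
sample = """\
    Statistics for system, hippie
    Peak CPU utilization, 3%
    Average swap out rate, 0.25/sec
    Process table overflows, No
"""
summary = parse_tabular_summary(sample)
print(summary["Peak CPU utilization"])
```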

Thanks for your interest and support!


Frequently asked questions (FAQ):

Q. Why does SarCheck tell me that my fast new disks are slow?

A. The speed of disks (as reported by the manufacturer) is usually better than the speed reported by sar. Soft I/O errors, poor locality of reference, and problems with the disk controllers are frequently responsible. Unfortunately, sar doesn't give us enough information to identify the true cause or recommend a solution. If a change to the buffer cache can be useful in circumventing the problem, SarCheck will recommend it.

SarCheck uses the following thresholds when commenting on disk service time:
> 100 ms      Not likely to be a magnetic disk
50 - 100 ms   May not be a magnetic disk
25 - 50 ms    Very slow
16 - 25 ms    Somewhat slow
11 - 16 ms    Acceptable
6 - 11 ms     Relatively fast
0.1 - 6 ms    Very fast, possibly cached
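The thresholds above can be expressed as a simple lookup. This is a sketch only: the table does not say which band a boundary value such as exactly 16 ms belongs to, so the code assumes it falls in the faster band, and the names used are ours.

```python
# (lower bound in ms, comment), slowest band first.
SERVICE_TIME_BANDS = [
    (100.0, "Not likely to be a magnetic disk"),
    (50.0, "May not be a magnetic disk"),
    (25.0, "Very slow"),
    (16.0, "Somewhat slow"),
    (11.0, "Acceptable"),
    (6.0, "Relatively fast"),
]


def classify_service_time(ms):
    """Return the comment for an average service time in milliseconds,
    or None if the value is below the lowest band (0.1 ms)."""
    for lower_bound, comment in SERVICE_TIME_BANDS:
        if ms > lower_bound:
            return comment
    return "Very fast, possibly cached" if ms >= 0.1 else None


# The sample report gives 12.8 ms for c0t6d0.
print(classify_service_time(12.8))
```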

Another problem is that the drivers for aftermarket disk drives may not work correctly with sar. One beta site reported that sar was showing 15 - 30 millisecond service times for an EMC RAID array that is supposed to be faster than a solid state disk.

Q. Should I implement recommendations which only show up occasionally?

A. Feel free to try, but first implement the regularly occurring recommendations, since those will address the most frequently occurring problems. If SarCheck occasionally recommends increasing the amount of memory, you should certainly try it. On systems with some extra memory, SarCheck will be able to make additional recommendations that could not be made on systems where memory pressure is high.

Q. Every time I make changes based on SarCheck's recommendations, it makes more recommendations. Why doesn't it just figure out the correct values for all the parameters?

A. That's not how real performance tuning works. There are no correct values because tuning is a series of compromises between various system resources. Performance tuning involves a certain degree of trial and error, and gradual change is the only way to do it.

Q. When I try to run sarcheck, I get the message “sarcheck: not found”. What's wrong?

A. Check the following:
Does the sarcheck script really exist?
(look for /opt/sarcheck/bin/sarcheck).
Is /opt/sarcheck/bin in your PATH variable?
(echo $PATH)
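The two checks above can also be scripted. The following sketch assumes the default installation path shown above; the function name is ours.

```python
import os


def diagnose_not_found(script="/opt/sarcheck/bin/sarcheck", path=None):
    """Report whether the script exists and whether its directory is in PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    directory = os.path.dirname(script)
    return {
        "script_exists": os.path.isfile(script),
        # PATH entries are separated by colons on HP-UX and other UNIX systems.
        "dir_in_path": directory in path.split(":"),
    }


print(diagnose_not_found())
```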

Q. Does SarCheck work on the new 64-bit systems?

A. Yes, SarCheck works with 32 or 64 bit kernels and can be run whether or not you're root. If you are not root, you will need permission to read the sar reports.

Q. Why do I get the odd message "sar: Starting time must be more than ending time" with HP-UX 11.00?

A. This is caused by a bug in early versions of sar in HP-UX 11. That version of sar has trouble processing starting times given with the -s option. Since SarCheck uses sar to produce a sar report, you may encounter this bug. To fix it, make a backup copy of the script /opt/sarcheck/bin/sarcheck and then search for and remove the -s$ST text string. If you need assistance with this, let us know.

Q. Why did SarCheck stop producing reports?

A. There are two common causes of this:

The software has expired. Run '/opt/sarcheck/bin/analyze9000' and look for the expiration date at the bottom of the usage text. If you've licensed SarCheck and the expiration date doesn't make sense to you, run 'analyze9000 -s' and send us the output.

The kernel was rebuilt, but the system was not rebooted. After the kernel is rebuilt (frequently to implement a change recommended by SarCheck), sar may have trouble until the system is rebooted. SarCheck can't produce reports if sar isn't working.

Q. Why does the analyze9000 program produce the error message “crt0: ERROR couldn't open dld.sl errno:000000002”?

A. This error message is caused by the analyze9000 program looking for dld.sl in /lib. HP changed the location of dld.sl and this error message occasionally occurs when the operating system is upgraded. We're told that the fix is in the release notes, and more information can be found by searching for dld.sl in DejaNews.

Q. How do I collect data over a 24 hour period?

A. The crontab entries should look like this:

0,20,40 * * * 0-6 /usr/lib/sa/sa1
45 23 * * 1-5 /usr/lib/sa/sa2 -i 1200 -A

Q. How do I collect data every 10 minutes from 08:00 to 18:00?

A. The crontab entries should look like this:

0 * * * 0-6 /usr/lib/sa/sa1
10,20,30,40,50 8-17 * * 1-5 /usr/lib/sa/sa1

5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 600 -A (this should be all on one line)
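To see why two sa1 entries are used, it can help to list the times at which they fire on a weekday. The sketch below expands only the minute and hour fields of a crontab entry and ignores the day and month fields. It is an illustration with names of our own choosing, not a full cron parser.

```python
def expand(field, low, high):
    """Expand a simple cron field ('*', 'a-b', or 'a,b,c') into a set of ints."""
    if field == "*":
        return set(range(low, high + 1))
    values = set()
    for part in field.split(","):
        if "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def sample_times(entries):
    """Return sorted (hour, minute) pairs at which any entry fires in one day."""
    times = set()
    for minute_field, hour_field in entries:
        for hour in expand(hour_field, 0, 23):
            for minute in expand(minute_field, 0, 59):
                times.add((hour, minute))
    return sorted(times)


# Minute and hour fields of the two sa1 entries for 10 minute sampling.
entries = [("0", "*"), ("10,20,30,40,50", "8-17")]
window = [t for t in sample_times(entries) if (8, 0) <= t <= (18, 0)]
gaps = {(h2 * 60 + m2) - (h1 * 60 + m1)
        for (h1, m1), (h2, m2) in zip(window, window[1:])}
print(len(window), gaps)
```

Between 08:00 and 18:00 the samples are evenly spaced 10 minutes apart, which corresponds to the -i 600 (seconds) passed to sa2. The same helper applied to the 0,20,40 entry from the previous answer gives 72 samples per day, 20 minutes apart, matching -i 1200.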


Bibliography:

UNIX System V Performance Management. 1994. Englewood Cliffs, NJ.: PTR Prentice Hall. ISBN 0-13-106429-1.

Alomari, A. Oracle & UNIX Performance Tuning. 1997. Upper Saddle River, NJ.: PTR Prentice Hall. ISBN 0-13-849167-4.

Loukides, M. System Performance Tuning. 1991. Sebastopol, CA.: O'Reilly & Associates, Inc. ISBN 0-937175-60-9.

Majidimehr, A. Optimizing UNIX for Performance. 1996. Englewood Cliffs, NJ.: PTR Prentice Hall. ISBN 0-13-111551-0.

Poniatowski, M. The HP-UX System Administrators “How To” Book. 1994. Englewood Cliffs, NJ.: PTR Prentice Hall. ISBN 0-13-099821-4.

Poniatowski, M. HP-UX 11.x System Administrators “How To” Book. 1999. Upper Saddle River, NJ.: PTR Prentice Hall. ISBN 0-13-012515-6.

Sauers & Weygant. HP-UX Tuning and Performance. 2000. Upper Saddle River, NJ.: PTR Prentice Hall. ISBN 0-13-102716-6.


Special thanks:

We'd like to thank the following people for their suggestions, ideas, support, and yes, even a few bug reports:

D. J. Blackwood, Calvin Breyley, William Drescher, Robert P. Fries, Steve Gardiner, Jeff Hyman, Berni Jubb, Peter Kettle, Bob Long, Nancy Lorenz, Bela Lubkin, Rich Marotta, Gene Martin, Tom Melvin, Lee Penn, Tom Podnar, Jean-Pierre Radley, Charlie Russel, David Simons, Dave Venus, and a number of others.


Appendix A:
SarCheck parms file keywords

Many of the keywords and the defaults can be found by looking at the questions that the sarcheck script asks. Here is a complete list:

PAGER: The pager to be used to display the analysis on the screen. The default is more, but pg or less are common alternatives.
LPS: The command for printing the analysis. The default is lp -s.
PSELFDIR: The directory where SarCheck will look for the ps -elf data. The analyze9000 program and the ps1 and ps2 scripts will use this new directory. WARNING! Please pick a directory that contains nothing but ps -elf data! The ps2 script will use the find command to remove any file in the specified directory which is more than 14 days old. We have tried to limit the potential damage by adding the -name switch to the find command but you should still be very careful with this.
SCDIR: The directory where the analyze9000 program resides. This can be changed from the default of /opt/sarcheck/bin.
ETCDIR: The location of the file analyze.dlr. This file will be used if we ever use resellers who want to offer their own support. There is no point in changing this parameter at this time.
ST: The starting time for the analysis. The default is 08:00 and this should be entered in 24 hour format.
EN: The ending time for the analysis. The default is 17:00 and this should be entered in 24 hour format.
DR: Whether to analyze sar data or a sar report. The default is 'd'. For a list of options, run the sarcheck script and see what options are on the screen when the keyword is DR.
OPT: How to format the report. The default is 'n'.
VERBOSE: Whether the output should be verbose or quiet. The default is 'v'.
PSELFOPT: How verbose the ps -elf output should be. This option is used primarily to increase SarCheck's sensitivity to problems in the ps -elf data. The default is 'n'.
DISKFLTR: Whether or not to filter the disk analysis. Filtering sar's disk data is useful on large systems with many disks when you're not using the HTML disk table option. The default is 'y'.
TABULAR: Whether or not to print a tabular summary at the end of the report or print a tabular summary instead of the report. The default is 'n'.
OUTOPT: This option controls where the output of the sarcheck script should go. The default is '1'.
GNUPLOT: The version of gnuplot present on your system. The default value is 3.7.
GNUPLOTDIR: The directory in which you've installed gnuplot.
GRAPHDIR: The directory in which the graphs will be stored.
HSIZE: Change the default width of the graphs generated by gnuplot. If you want to see graphs that are wider than the ones produced by the default width of 0.7, this keyword can be used to produce wider graphs.
HTMLGRAPHDIR: The directory referenced in the HTML image tag.
SARCHECKDIR: The directory where the SarCheck programs reside. This can be changed from the default directory of /opt/sarcheck/bin. If this is used, all other entries in the file named /opt/sarcheck/etc/sarcheck_parms will be ignored and the sarcheck_parms file in the specified directory will be used instead. Refer to the section 'How to move SarCheck to another directory'.
DMY: Change the default date format to dd/mm/yyyy.
YMD: Change the default date format to yyyy/mm/dd.

You can also change the thresholds used by SarCheck's tuning algorithms. These changes are made in the /opt/sarcheck/etc/sarcheck_parms file.

Please note that the default values of SarCheck's thresholds have been established based on feedback from thousands of systems. These values should not be overridden without good reason. Here is a list of thresholds which can be overridden, and the meaning of each is described below. Values outside the allowed range will be ignored and values outside the expected range will generate warnings.

Keyword   Allowed range   Expected range   Default
AVGCPU    50 - 100        60 - 100         80
MAXCPU    50 - 100        60 - 100         95
AVGWIO    1 - 50          5 - 25           7
AVGRQ     1 - 50          2.5 - 50         3.5
MAXRQ     1 - 500         2.5 - 500        5.0
AVGRC1    10 - 99         60 - 99          90
AVGRC2    10 - 99         65 - 99          96
AVGWC1    10 - 99         50 - 90          70
AVGWC2    10 - 99         50 - 90          80
CAPCPU    25 - 100        60 - 100         90
CAPDSK    10 - 100        25 - 90          75
CAPTBL    25 - 100        40 - 90          80
AVSWPT    0.01 - 100      0.3 - 100        1.5
AVSWOC    1 - 100         1 - 50           5
CPULIM    0.05 - 100      10 - 100         20
MLRATE    1+              100+             200
LGPROC    32+ pages       formula          formula
DCLP      any             any              10
DCML      any             any              10
DCRP      any             any              10
SYSUSR    0 - 999         0 - 999          2.5

AVGCPU: When average CPU utilization exceeds this value, SarCheck considers the system to be busy enough to cause concern.

MAXCPU: When peak CPU utilization exceeds this value, SarCheck assumes that performance degradation is likely.

AVGWIO: When the average value of the sar -u %wio column exceeds this value, SarCheck looks for evidence to corroborate an I/O bottleneck.

AVGRQ: When the average length of the run queue exceeds this value, SarCheck considers it to be an indication of a CPU bottleneck.

MAXRQ: When the maximum length of the run queue exceeds this value, SarCheck assumes that performance degradation is likely.

AVGRC1: The lowest acceptable value for the buffer cache read hit ratio. Note that several other factors are used to evaluate buffer cache effectiveness.

AVGRC2: The lowest acceptable value for the buffer cache read hit ratio when I/O is heavy. Note that several other factors are used to evaluate buffer cache effectiveness.

AVGWC1: The lowest acceptable value for the buffer cache write hit ratio. Note that several other factors are used to evaluate buffer cache effectiveness.

AVGWC2: The lowest acceptable value for the buffer cache write hit ratio when I/O is heavy. Note that several other factors are used to evaluate buffer cache effectiveness.

CAPCPU: The value used to calculate the increase in CPU load that the system can support at peak times.

CAPDSK: The value used to calculate the increase in I/O load on the busiest disk that the system can support at peak times.

CAPTBL: The value used to calculate how much additional load can be supported before the process and open file tables become full.

AVSWPT: When the number of swap-outs per second reported by sar exceeds this value, SarCheck considers memory pressure to be excessive.

AVSWOC: When the percentage of time the swap queue is occupied exceeds this value, SarCheck considers memory pressure to be excessive.

CPULIM: The threshold in computed CPU utilization SarCheck uses to decide if a runaway process has been detected in ps -elf data.

MLRATE: The threshold in pages of memory per hour used by SarCheck to decide if a memory leak has been detected in ps -elf data.

LGPROC: The minimum size in pages of a process which SarCheck will report as being suspiciously large. The formula used to calculate the default threshold is 256 megabytes or one quarter the size of memory, whichever is smaller.

DCLP: Disable the feature which limits the number of suspiciously large processes that are reported or change the number being reported. Using the keyword DCLP without a second field will disable the limit. Using a second field (for example: DCLP 25) will change the limit to the value in the second field.

DCML: Disable the feature which limits the number of processes with memory leaks that are reported or change the number being reported. Using the keyword DCML without a second field will disable the limit. Using a second field (for example: DCML 25) will change the limit to the value in the second field.

DCRP: Disable the feature which limits the number of runaway processes that are reported or change the number being reported. Using the keyword DCRP without a second field will disable the limit. Using a second field (for example: DCRP 25) will change the limit to the value in the second field.

SYSUSR: The threshold used to decide if it's worth mentioning if there is an unusual amount of %sys activity relative to %usr activity. The default of 2.5 means that %sys activity needs to be at least 2.5 times greater than %usr activity for this to be reported.

It is possible to set these parameters to values which can make SarCheck's recommendations meaningless or incorrect. Please override the default values with care.
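Before editing sarcheck_parms, it may help to check proposed overrides against the allowed and expected ranges. The sketch below covers only a few keywords and makes two assumptions of ours: that each line of the file is a keyword followed by a value (as in the example "DCLP 25"), and that the range limits are inclusive. SarCheck's own parsing may differ.

```python
# (allowed low, allowed high, expected low, expected high), copied from the
# threshold table in this appendix for a few keywords only.
RANGES = {
    "AVGCPU": (50, 100, 60, 100),
    "AVGWIO": (1, 50, 5, 25),
    "SYSUSR": (0, 999, 0, 999),
}


def check_override(keyword, value):
    """Return 'ignored', 'warning', or 'ok' for a proposed threshold value."""
    allowed_low, allowed_high, expected_low, expected_high = RANGES[keyword]
    if not allowed_low <= value <= allowed_high:
        return "ignored"   # outside the allowed range
    if not expected_low <= value <= expected_high:
        return "warning"   # allowed, but outside the expected range
    return "ok"


def check_parms(text):
    """Check every 'KEYWORD value' line whose keyword is listed in RANGES."""
    results = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or fields[0] not in RANGES:
            continue
        try:
            value = float(fields[1])
        except ValueError:
            continue  # not a numeric value, leave it to SarCheck
        results[fields[0]] = check_override(fields[0], value)
    return results


print(check_parms("SYSUSR 1.3\nAVGCPU 55\nAVGWIO 60\n"))
```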

The sarcheck_parms file can also be used to change the defaults used to generate HTML output.

Keyword     Allowed range     Default
BGCOLOR     Any valid color   #FFEE88
TEXTCOLOR   Any valid color   black
REDCOLOR    Any valid color   #FF9999
PINKCOLOR   Any valid color   #FFCC99

BGCOLOR: The background color specified in the bgcolor attribute of the HTML <body> tag.

TEXTCOLOR: The text color specified in the text attribute of the HTML <body> tag.

REDCOLOR: The background color specified in the bgcolor attribute of certain tags. The color used to highlight the cells of an HTML table when the values exceed certain thresholds. The default color is a shade of red and this keyword exists to give you an option if you want to use the color red as the text or background color.

PINKCOLOR: The background color specified in the bgcolor attribute of certain tags. The color used to highlight the cells of an HTML table when the values exceed certain thresholds. The default color is a shade of pink and this keyword exists to give you an option if you want to use the color pink as the text or background color.


Appendix B:
Options available when running 'analyze9000'

-c: Turn off the capacity planning section.
-csv: Produce output in comma separated value (CSV) format. If the -html switch is used in conjunction with the -csv switch, these statistics will be printed as HTML tables. If the -html switch is not used, the -csv switch will cause a SarCheck report to be generated with CSV output of statistics only. Disk statistics will be generated if the -dtbl or -dtoo switches were used, a volume group table will be generated if the -vgtbl or -vgtoo switches were used and volume group data was found, and a tabular summary will be generated if the -t switch was used.

Please note that the -csv switch puts parts of the SarCheck analysis into CSV format. The -gr switch is used to put the sar report and the tabular summary (see the -t switch) into CSV format. Volume group statistics will only be produced if volume group data was found in the ps -elf data file.

-d: Print info on all disks if more than 12 disk drives were seen in sar. Because the SarCheck report will produce a paragraph on each disk, reports may get too verbose on systems with 30 or more disk devices. Without this option, SarCheck will filter out information on disks which are lightly used.
-dblp: Suppress warnings about suspiciously large database processes.
-dbml: Suppress warnings about possible memory leaks in database processes.
-dbrp: Suppress warnings about possible runaway database processes.
-dclp: Disable limiting the number of warnings about suspiciously large processes.
-dcml: Disable limiting the number of warnings about possible memory leaks in processes.
-dcrp: Disable limiting the number of warnings about possible runaway processes.
-dnz: Suppress the reporting of disks with no activity. This option is most likely to be useful when SarCheck is used on systems with thousands of disk devices. In one case where data on all 2,505 disks were reported in an HTML report using both tables and text, the size of the report approached one megabyte. The size of the report was reduced by 90 percent with this switch.
-dbusy: If the -dtbl switch is used, -dbusy will sort the disk information by average percent busy.
-dserv: If the -dtbl switch is used, -dserv will sort the disk information by average service time.
-dtbl: If the -html switch is used, -dtbl will produce a table of disk statistics instead of generating a paragraph on each disk. Cells in the table will be color coded to highlight interesting disk statistics. This option is recommended for large systems where 50 or more individual paragraphs on disk activity would be hard to comprehend.

If the -html switch is not used, -dtbl will cause disk statistics to be output in a comma separated value (CSV) format. CSV output should generally be produced with the -csv switch, but it can be done by using -dtbl too.

-dtoo: If the -html switch is used, -dtoo will produce a table of disk statistics in addition to generating a paragraph on each disk. Cells in the table will be color coded to highlight interesting disk statistics and will link to the appropriate paragraph.

If the -html switch is not used, -dtoo will cause disk statistics to be output in a comma separated value (CSV) format in addition to generating a paragraph on each disk. CSV output should generally be produced with the -csv switch, but it can also be done by using -dtoo.

-diagThis option will add a paragraph to the report showing how full SarCheck's internal tables have become. If a table comes too close to becoming full, a message should appear in the SarCheck report asking you to send a copy of the report to support@sarcheck.com
-dmyThis switch causes the date format used in the SarCheck report to appear in the format dd/mm/yyyy. This change does not affect the order form or any output except for the SarCheck report.
-enSpecify the ending time for data to be analyzed in a 24 hour format. Specifying 17 will cause data through 17:00:00 to be analyzed, and specifying 17:30 will cause analysis to stop with any data after 17:30:00. This switch will work on single day or multiple days of data and is usually used in conjunction with the -st switch. The default for this value is controlled by the sarcheck_parms keyword EN.
-g24This switch will change the appearance of multiday graphs. It changes the graph to be displayed with an X-axis of up to 24 hours and data from different days will be superimposed. This can help to spot activity that occurs at the same time each day.
-gonlyProduce graphs only. This switch should be used together with the -jpeg, -jpg, or -png switches. The names of the graphs produced will be sent to stdout and no report will be produced.
-gd  Change the directory in which SarCheck puts graphs created with gnuplot. SarCheck will still determine the filenames of the graphs; the purpose of this switch is to allow you to store them wherever you want. The graphs can take up a considerable amount of space, especially JPEG graphs.
-gr  Produce output which can be used by graphing tools. While the output is in comma separated value (CSV) format, this option is different from the -csv switch because it reformats the sar report instead of reformatting SarCheck's analysis.
-h  Displays brief instructions and shows all of the possible switches.
-hg  How to produce graphs using the -jpg, -jpeg, and -png switches.
-hgd  Change the directory where the graphs appear to be in the HTML output's image tags.
-hm  How to analyze multiple days of sar data.
-hp  How to analyze supplemental ps -elf data.
-html  Insert HTML tags in text for use by a browser.
-jpeg or -jpg  These switches will cause SarCheck to look for gnuplot and use it to produce graphs in JPEG format. The naming convention used by SarCheck will append either ".jpeg" or ".jpg" to the file name of the graph, depending on the switch you use. JPEG formatted graphs are larger and do not look as crisp as PNG graphs, but they are much more likely to display correctly with older browsers.
-k  Allows you to change the activation key and software expiration date.
-mdy  Force the default mm/dd/yyyy date format to be used if it's overridden by the use of a non-English text file. Non-English text files are under development and were not available at the time of this printing.
-o  Prints an order/registration form for those wishing to purchase a software license or register their licensed software.
-p  Suppress page numbering and page breaks. This is especially useful when the output is piped to pg.
-ps  Incorporate the analysis of a single ps -elf file called /opt/sarcheck/ps/yyyymmdd where the date is extracted from the sar data.
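
The file naming convention used by -ps can be sketched in shell as follows. This is only an illustration: the helper name ps_file_for_date and the sample date are assumptions made for the example and are not part of SarCheck, which extracts the date from the sar data itself.

```shell
# Hypothetical helper: print the ps -elf file name that -ps is documented
# to look for, given a date. Not part of SarCheck.
# $1 = year, $2 = month, $3 = day (plain integers without leading zeros)
ps_file_for_date() {
    printf '/opt/sarcheck/ps/%04d%02d%02d\n' "$1" "$2" "$3"
}

# Arbitrary sample date: 15 March 2004
ps_file_for_date 2004 3 15    # prints /opt/sarcheck/ps/20040315
```
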
-pf  Include analysis of a specified file containing ps -elf data.
-pd  Change the directory in which SarCheck expects to find ps -elf data. SarCheck will still determine the name of the ps -elf data file; the purpose of this switch is to allow you to store ps -elf data wherever you want. This data can take up a considerable amount of space.
-pv  Produce a verbose analysis of ps -elf data. This switch is overridden by -Q and -q.
-plp  Suppress warnings about suspiciously large processes.
-pml  Suppress warnings about possible memory leaks.
-prp  Suppress warnings about possible runaway processes.
-png  This switch will cause SarCheck to look for gnuplot and use it to produce graphs in PNG format. The naming convention used by SarCheck will append ".png" to the file name of the graph. PNG formatted graphs are smaller and look cleaner than JPEG graphs, but may not display correctly with older browsers.
-ptbl  If the -html switch is used, -ptbl will produce a table of ps -elf statistics instead of generating a paragraph on each process whose resource utilization exceeds the threshold. Cells in the table will be color coded to highlight the interesting statistics. This option is recommended for systems where a large number of individual paragraphs would be hard to comprehend.

If the -html switch is not used, -ptbl will cause ps -elf statistics to be output in a comma separated value (CSV) format. CSV output should generally be produced with the -csv switch, but it can be done by using -ptbl too.

-ptoo  If the -html switch is used, -ptoo will produce a table of ps -elf statistics in addition to generating a paragraph on each process whose resource utilization exceeds the threshold. Cells in the table will be color coded to highlight interesting statistics.

If the -html switch is not used, -ptoo will cause ps -elf statistics to be output in a comma separated value (CSV) format in addition to generating a paragraph on each process whose resource utilization exceeds the threshold. CSV output should generally be produced with the -csv switch, but it can also be done by using -ptoo.

-Q  Print a non-verbose (super-Quiet) analysis. This option automatically sets the -p option.
-q  Print a less verbose analysis.
-r  Print an analysis only if recommendations are made.
-ret0  Force a return code of zero. The analyze9000 program normally returns zero if no recommendations are made and one if it makes recommendations. This option exists because some scheduling tools report non-zero return codes as errors or exceptional conditions.
-s  Display all the information needed to activate SarCheck.
-st  Specify the starting time for data to be analyzed in a 24 hour format. Specifying 09 (or just 9) will cause data starting at 09:00:00 to be analyzed, and specifying 9:30 will cause analysis to start with any data collected at or after 09:30:00. This switch will work on a single day or multiple days of data and is usually used in conjunction with the -en switch. The default for this value is controlled by the sarcheck_parms keyword ST.
-summ  This option causes the report to stop printing after the summary has been produced.
-t  This option will produce a summary of interesting statistics in a tabular format. This output can be parsed with relative ease. If the -html switch is used, the statistics will be presented in an HTML table, and cells in the table will be color coded to highlight noteworthy statistics. This option works well with -dtbl.
-tonly  This option will produce nothing but a summary of interesting statistics in a tabular format. All recommendations, analysis, and other hopefully interesting text will vanish. If the -html switch is used, the statistics will be presented in an HTML table, and cells in the table will be color coded to highlight noteworthy statistics.
-vgtbl  If the -html switch is used and ps -elf analysis has been requested, -vgtbl will produce a table of volume group statistics instead of generating a paragraph on each volume group. Cells in the table will be color coded to highlight interesting volume group statistics. This option is recommended for large systems where a large number of individual paragraphs on volume group activity would be hard to comprehend.
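
A calling script can act on the normal return codes described for -ret0. The shell sketch below is an illustration only: describe_rc and fake_analyze are hypothetical names, and fake_analyze stands in for a real analyze9000 run so that the example is self-contained. Only the meanings of zero and one come from the description above; any other value is treated here as unexpected.

```shell
# Hypothetical helper: translate a return code into a short message.
# The meanings of 0 and 1 are taken from the -ret0 description.
describe_rc() {
    case "$1" in
        0) echo "no recommendations" ;;
        1) echo "recommendations made" ;;
        *) echo "unexpected return code $1" ;;
    esac
}

# Stand-in for a real analyze9000 run; it always returns 1 here.
fake_analyze() {
    return 1
}

# Capture the return code without letting a non-zero value abort the script.
rc=0
fake_analyze || rc=$?
describe_rc "$rc"    # prints: recommendations made
```

When -ret0 is used, a script like this would always take the zero branch, so the switch is best reserved for schedulers that cannot be told to accept a return code of one.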
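
The way -st and -en values are read can be illustrated with a short shell sketch that expands a value such as 9 or 9:30 into the hh:mm:ss time it stands for. The function name normalize_time is hypothetical and is not part of SarCheck; the sketch only mirrors the examples given in the switch descriptions.

```shell
# Hypothetical helper: expand an -st / -en style value to hh:mm:ss.
# Not part of SarCheck; it mirrors the documented examples only.
normalize_time() {
    case "$1" in
        *:*) hh=${1%%:*}; mm=${1#*:} ;;   # value such as 9:30
        *)   hh=$1; mm=0 ;;               # value such as 9 or 17
    esac
    # Strip one leading zero so printf does not read "09" as octal.
    hh=${hh#0}
    mm=${mm#0}
    printf '%02d:%02d:00\n' "${hh:-0}" "${mm:-0}"
}

normalize_time 9       # prints 09:00:00
normalize_time 09      # prints 09:00:00
normalize_time 9:30    # prints 09:30:00
normalize_time 17      # prints 17:00:00
```
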

If the -html switch is not used, -vgtbl will cause volume group statistics to be output in a comma separated value (CSV) format. CSV output should generally be produced with the -csv switch, but it can be done by using -vgtbl too.

-vgtoo  If the -html switch is used and ps -elf analysis has been requested, -vgtoo will produce a table of volume group statistics in addition to generating a paragraph on each volume group. Cells in the table will be color coded to highlight interesting volume group statistics and will link to the appropriate paragraph.

If the -html switch is not used, -vgtoo will cause volume group statistics to be output in a comma separated value (CSV) format in addition to generating a paragraph on each volume group. CSV output should generally be produced with the -csv switch, but it can also be done by using -vgtoo.

-w  Suppress newline characters, primarily for export to PC-based word processing programs.
-wide  Change the width of the graphs generated by gnuplot. Use this switch to produce graphs that are wider than the ones produced by the default width of 0.7. For more flexibility, use the sarcheck_parms keyword HSIZE.
-ymd  This switch causes the date format used in the SarCheck report to appear in the format yyyy/mm/dd. This change does not affect the order form or any output except for the SarCheck report.
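
The three date formats selected by -dmy, -mdy, and -ymd can be compared with the shell sketch below. It is an illustration only: format_report_date is a hypothetical helper, not SarCheck code, and the sample date is arbitrary.

```shell
# Hypothetical helper: show one date in the format named by a switch.
# $1 = dmy, mdy or ymd; $2 = year, $3 = month, $4 = day
# (plain integers without leading zeros)
format_report_date() {
    case "$1" in
        dmy) printf '%02d/%02d/%04d\n' "$4" "$3" "$2" ;;
        mdy) printf '%02d/%02d/%04d\n' "$3" "$4" "$2" ;;
        ymd) printf '%04d/%02d/%02d\n' "$2" "$3" "$4" ;;
        *)   echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Arbitrary sample date: 15 March 2004
format_report_date dmy 2004 3 15    # prints 15/03/2004
format_report_date mdy 2004 3 15    # prints 03/15/2004
format_report_date ymd 2004 3 15    # prints 2004/03/15
```
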