SarCheck(TM) version 6.01.03 reference guide for Linux systems


Table of Contents:

Introduction
Features
Restrictions
Known Limitations
How to Install SarCheck (diskette)
How to Install SarCheck (electronic copies)
Setup of ps -elf data collection
How to set up gnuplot
How to activate the software
How to deinstall SarCheck
How to run SarCheck, menu driven
How to run SarCheck, command driven
How to run SarCheck, crontab entry
How to analyze multiple days of data
How to change the menu defaults
How to change SarCheck's algorithms
How to create graphs in SarCheck reports
Examples of how to use the switches
How to produce the most accurate analysis from the menu
How to produce the most accurate analysis from the command line
How to access the online instructions/help text
How to get the most from SarCheck
How to interpret the analysis
   The summary section
   The recommendations section
   The resource analysis section
   The capacity planning section
   The custom settings section
   Disclaimers, trademarks, etc.
How to order SarCheck
How to get technical support for SarCheck
How to get technical support for gnuplot
Files included in this release
An example of a SarCheck report
   The summary section
   The recommendations section
   The resource analysis section
   The capacity planning section
   The custom settings section
   Table of statistics
Frequently Asked Questions (The FAQ)
Bibliography
Appendix A: SarCheck parms file keywords
Appendix B: Options available when running 'analyze'

Introduction:

The SarCheck utility analyzes your system for possible performance bottlenecks such as memory shortages and 'leaks', CPU bottlenecks, runaway processes, and improperly set tunable parameters. It also tells you approximately how much of an additional workload the system can support at peak times of the day. It does this by analyzing data from the /proc filesystem and the output of ps -elf to produce a plain English report. The report explains the resource bottlenecks seen and makes recommendations which can improve system performance.

It will also produce graphs if you have a version of gnuplot installed that supports PNG or JPEG output. If you ask for HTML output and the production of graphs, SarCheck will insert the graphs into the HTML document. The graphs will be inserted by using <img> tags complete with a description in the tag's alt attribute in order to meet accessibility requirements.

SarCheck's recommendations are designed to produce incremental improvements, so SarCheck should be run regularly. No attempt is made to guess the ultimately correct value for any parameter based on a single day's /proc data. Instead, SarCheck will recommend that you increase or decrease values based on the data available, and will continue to recommend changes until there is no more room for improvement. Performance tuning is, by definition, a process of trial and error. SarCheck will not only help you to make those changes, but will also explain the reasons for each recommendation.

SarCheck is different from other performance tools because it does not monitor system activity but uses the /proc filesystem and the output of ps -elf. Since the /proc filesystem is a part of the operating system, we didn't see a need to create yet another monitor for you to buy.

SarCheck can be run from the command line, as a cron job, or from a menu-driven front end script. For reasons of safety and security, SarCheck will not attempt to change tunable parameters or anything else in the kernel.


Features:

SarCheck Version 6.01 provides the following detection and reporting features:

  1. CPU bottleneck detection.
  2. Memory bottleneck detection.
  3. Disk bottleneck detection.
  4. Inefficient sizing of many other tunable parameters.
  5. Limited capacity for an increase in workload or users.
  6. "Impossible" data, such as negative CPU utilization.
  7. Active vs. inactive disks.
  8. Runaway process detection.
  9. Memory leak detection.
  10. Large amounts of unused memory.

Based on its analysis of the resources and statistics described above, SarCheck may recommend a variety of steps which can be taken to improve system performance.


Restrictions:

This version of SarCheck(TM) is designed to work with Linux x86 kernels 2.2 through 2.6. Other versions are available for most AIX, Solaris, and HP-UX operating systems.


Known limitations:

  1. An analysis of the load from an unusually quiet day, such as a holiday, will produce recommendations that may be inappropriate for days when the system is busy. For this reason, we recommend analyzing activity only from times and days when the system is busy. Please refer to the section entitled "How to produce the most accurate analysis" for details.
  2. SarCheck's recommendations are not listed in order of significance or potential for performance improvement.
  3. The /proc filesystem must be mounted.


How to install SarCheck (diskette):

Note: This is a PC-formatted diskette and the software is a compressed tar archive. If your system has a diskette drive, the mcopy utility can be used to copy SarCheck to your system. If not, you should be able to easily FTP the software from another device that can read PC-formatted diskettes.

SarCheck is generally run as root on most UNIX systems, but on Linux systems you may be able to run it without being root. This depends on the permissions of various files, which you can check once SarCheck is installed.

The install procedure will put SarCheck into the /opt/sarcheck/bin directory. To install the software, put the diskette in the A: drive and install it. This only takes a few seconds.

  1. Log in as root.
  2. Now copy, uncompress, and detar:

    mcopy a:sclin.taz - | zcat | tar xvfP -

To see if SarCheck can collect resource utilization data without being root, try running the program /opt/sarcheck/bin/sarcheckagent when you are not root. If you see any error messages like this:

Error: can't read /proc/sys/vm/bdflush file

the file either doesn't exist or cannot be read due to a permissions problem. If the file doesn't exist, it won't help to be root. If it's a permissions problem, either run sarcheckagent as root or make the file in question readable by a non-root user. Once you've decided whether SarCheck needs to collect data as root, finish the installation. Then su to sys (or root if necessary) and add the following lines to that user's crontab file:


0,10,20,30,40,50 9-16 * * * /opt/sarcheck/bin/prst1
0 17 * * * /opt/sarcheck/bin/prst1
30 18 * * * /opt/sarcheck/bin/prst2

These entries will collect data every 10 minutes between 9:00AM and 5:00PM, and will delete old files at 6:30PM. Resource utilization statistics are collected with the script /opt/sarcheck/bin/prst1 and one file will be created for each day. Old files are deleted with the /opt/sarcheck/bin/prst2 script. We recommend running the prst1 script to collect data every 5 to 30 minutes. We also recommend that the prst2 script should run no more than once per day.
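If you want a collection interval other than 10 minutes, the minute field of the prst1 entry has to change. The following is only a sketch of one way to generate that field; the cron_minutes function is a hypothetical helper and is not part of SarCheck.

```shell
#!/bin/sh
# Hypothetical helper (not part of SarCheck): print a comma-separated
# minute list for a crontab entry that fires every N minutes from minute 0.
cron_minutes() {
    interval=$1
    if [ "$interval" -lt 1 ] || [ "$interval" -gt 59 ]; then
        echo "interval must be between 1 and 59" >&2
        return 1
    fi
    m=0
    list=""
    while [ "$m" -lt 60 ]; do
        # Append a comma only when the list already has an entry.
        list="${list:+$list,}$m"
        m=$((m + interval))
    done
    echo "$list"
}

# Every 10 minutes from 9:00 through 16:50, matching the entry above.
echo "$(cron_minutes 10) 9-16 * * * /opt/sarcheck/bin/prst1"
# Every 30 minutes, the slow end of the recommended range.
echo "$(cron_minutes 30) 9-16 * * * /opt/sarcheck/bin/prst1"
```

The printed lines can be pasted into the crontab file with crontab -e. The separate 17:00 entry is still needed if you want a final sample at 5:00PM.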

To test SarCheck, type:

/opt/sarcheck/bin/analyze /opt/sarcheck/etc/20050221 |more

SarCheck is now installed on your system. See the section entitled "Files included in this release" for details.

To reduce typing, you may want to add /opt/sarcheck/bin to your PATH.
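As a minimal sketch, the lines below add the directory to PATH only if it is not already there. They assume a Bourne-style shell and would typically go in your shell startup file; the variable name sarcheck_bin is ours.

```shell
#!/bin/sh
# Append /opt/sarcheck/bin to PATH unless it is already present.
sarcheck_bin=/opt/sarcheck/bin
case ":$PATH:" in
    *":$sarcheck_bin:"*)
        ;;                              # already on the PATH, nothing to do
    *)
        PATH="$PATH:$sarcheck_bin"
        export PATH
        ;;
esac
```

After this, the commands in this manual can be typed as analyze or sarcheck without the leading directory.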


How to install SarCheck (electronic copies):

In some cases, usually for those with a software subscription, we will email SarCheck to customers. Emailed software is a compressed tar archive.

SarCheck is generally run as root on most UNIX systems, but on Linux systems you may be able to run it without being root. This depends on the permissions of various files, which you can check once SarCheck is installed.

The install procedure will put SarCheck into the /opt/sarcheck/bin directory. This only takes a few seconds.

  1. Log in as root.
  2. Now copy, uncompress, and detar:

    zcat < /tmp/sclin.taz | tar xvfP -

To see if SarCheck can collect resource utilization data without being root, try running the program /opt/sarcheck/bin/sarcheckagent when you are not root. If you see any error messages like this:

Error: can't read /proc/sys/vm/bdflush file

the file either doesn't exist or cannot be read due to a permissions problem. If the file doesn't exist, it won't help to be root. If it's a permissions problem, either run sarcheckagent as root or make the file in question readable by a non-root user. Once you've decided whether SarCheck needs to collect data as root, finish the installation. Then su to sys (or root if necessary) and add the following lines to that user's crontab file:


0,10,20,30,40,50 9-16 * * * /opt/sarcheck/bin/prst1
0 17 * * * /opt/sarcheck/bin/prst1
30 18 * * * /opt/sarcheck/bin/prst2

These entries will collect data every 10 minutes between 9:00AM and 5:00PM, and will delete old files at 6:30PM. Resource utilization statistics are collected with the script /opt/sarcheck/bin/prst1 and one file will be created for each day. Old files are deleted with the /opt/sarcheck/bin/prst2 script. We recommend running the prst1 script to collect data every 5 to 30 minutes. We also recommend that the prst2 script should run no more than once per day.

To test SarCheck, type:

/opt/sarcheck/bin/analyze /opt/sarcheck/etc/20050221 |more

SarCheck is now installed on your system. See the section entitled "Files included in this release" for details.

To reduce typing, you may want to add /opt/sarcheck/bin to your PATH.
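When sarcheckagent reports a file it cannot read, you need to know whether the file is missing or merely unreadable. The sketch below makes that distinction for any list of files. It is an illustration only: check_proc_files is a hypothetical helper, and the complete list of files that sarcheckagent reads is not known to this example, so /proc/meminfo is included purely as a second sample.

```shell
#!/bin/sh
# Hypothetical helper (not part of SarCheck): classify each file as
# readable, unreadable (a permissions problem), or missing.
check_proc_files() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f (being root will not help)"
        elif [ ! -r "$f" ]; then
            echo "unreadable: $f (permissions problem)"
        else
            echo "readable: $f"
        fi
    done
}

# Run this as the non-root user that will collect the data.
# /proc/sys/vm/bdflush is the file named in the error message above.
check_proc_files /proc/sys/vm/bdflush /proc/meminfo
```

Run the check as the user whose crontab will hold the prst1 entries, because the result depends on that user's permissions.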


Setup of ps -elf data collection:

SarCheck is capable of analyzing more than just /proc data. Additional information from the ps utility and from various kernel structures can be saved and analyzed to improve the accuracy of SarCheck's reporting and recommendations. A few simple steps are required to take advantage of this powerful feature:
  1. Make sure the directory /opt/sarcheck/ps exists.
  2. Add the following entries to root's crontab file for typical 8:00AM to 5:00PM, Monday through Friday monitoring:

    0,20,40 8-17 * * 1-5 /opt/sarcheck/bin/ps1
    5 17 * * 1-5 /opt/sarcheck/bin/ps2

    As an alternative, the following cron entries are oriented towards the 24x7 monitoring that many administrators prefer:

    0,20,40 * * * * /opt/sarcheck/bin/ps1
    45 23 * * * /opt/sarcheck/bin/ps2

We recommend using crontab -e to modify the crontab file.

The ps -elf data collected on large systems can take up a considerable amount of space. If you want to store this data somewhere other than /opt/sarcheck/ps, you can specify a different directory with the PSELFDIR keyword in the sarcheck_parms file.

WARNING: If you choose to specify a different directory, be sure to pick a directory that is not used for anything else. The purpose of the ps2 script is to remove any file in the ps -elf directory that is more than 14 days old, and you don't want to accidentally remove files which contain something other than ps -elf data.
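Before pointing PSELFDIR at a new directory, it may help to see which files an age-based cleanup would touch there. The sketch below only lists candidates and deletes nothing. It is not the ps2 script, whose exact logic is not shown in this manual; list_old_files and ps_dir are names made up for this example, and find's -mtime +14 test is used as an approximation of "more than 14 days old".

```shell
#!/bin/sh
# Hypothetical dry run (not the ps2 script): list regular files in a
# directory that find considers more than 14 days old. Nothing is removed.
list_old_files() {
    dir=$1
    find "$dir" -type f -mtime +14 -print
}

# Assumed location; substitute the directory you plan to use.
ps_dir=/opt/sarcheck/ps
if [ -d "$ps_dir" ]; then
    list_old_files "$ps_dir"
fi
```

If the listing includes anything other than ps -elf data, choose a different directory.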


How to set up gnuplot:

If you ask SarCheck to produce PNG or JPEG graphs, it will look for gnuplot 3.7 and will try to use it to generate graphs. Most Linux distributions include gnuplot.
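A quick way to see whether gnuplot is available is sketched below. It assumes gnuplot is located through the PATH, which may not match how SarCheck itself searches for it, and it does not confirm that your build supports PNG or JPEG output. The find_gnuplot function is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper (not part of SarCheck): report whether a gnuplot
# executable can be found on the PATH.
find_gnuplot() {
    if gp=$(command -v gnuplot 2>/dev/null); then
        echo "gnuplot found: $gp"
    else
        echo "gnuplot not found on PATH"
        return 1
    fi
}

find_gnuplot || true
```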


How to activate the software:

Evaluation and licensed software may require an activation key. If you need an activation key, run analyze s and forward the output to sales@sarcheck.com. Feel free to install eval software on as many Linux systems as you want and use it until it expires.

SarCheck was not designed to be moved regularly from one system to another in an effort to provide "quick fixes" to a number of systems. Quick fixes will not allow you to take advantage of the long term iterative tuning that SarCheck makes possible.


How to deinstall SarCheck:

Remove the files mentioned in the section entitled "Files included in this release". By default, they're all located in /opt/sarcheck.


How to run SarCheck (menu driven):

First, log in as root if necessary. See the sections entitled "How to install SarCheck..." to see if this is necessary. To analyze /proc statistics from a menu, type:

/opt/sarcheck/bin/sarcheck

To reduce typing, you may want to add /opt/sarcheck/bin to your PATH.

A series of choices will appear on the screen. If you accept all the defaults by pressing the Enter key, the current day's data will be analyzed, and this is the easiest way to get started. For security reasons, your account must have permission to access the data files that you wish to analyze.

You can control the start and end times by using the -st and -en switches on the command line or by using the sarcheck_parms keywords ST and EN. Information on switches can be found in the section entitled "Options available when running 'analyze'".



Analyze what?      d   A /proc data file
                   c   Concatenate all /proc data files
                   *   Accept all defaults
                   x   exit sarcheck
(keyword = DR, default = d): _

After you pick the data files, you will be prompted to enter the name of the file. The c option will concatenate all of the data present and will not ask you for the name of a file. The sarcheck script will change your working directory, so you do not have to use the absolute path of the file. To accept all defaults, enter an asterisk; to exit sarcheck, enter an x. To change any of the defaults, see the section "How to change the menu defaults".


Enter the name of the /proc data file you wish to analyze
Available data files in /opt/sarcheck/procstat:
20040315 20040316 20040317 20040318 20040319
(default = 20040319):

The next option allows you to pick formatting. The default will produce a report with page numbers and page breaks (ctrl-L) included. For users that prefer to paginate the report with another tool, such as pg, the p option will suppress these page breaks. You can also choose to produce an HTML document at this point. HTML documents are best viewed with a web browser. If you wish to exit sarcheck, enter an x.


Pick formatting:   n   Normal, with page breaks
                   p   Page breaks suppressed
                   h   Create HTML document
                   *   Accept remaining defaults
                   x   exit sarcheck
(keyword = OPT, default = n): _

The verbosity option controls how verbose the SarCheck report is. Please note that instructions for implementing recommendations, explanations, and alternate tuning strategies may be suppressed by the quiet modes. When you're first using SarCheck, we recommend using the verbose mode so that you don't miss anything. The superquiet mode will automatically suppress page breaks.


Verbosity level:   v   Verbose mode
                   q   Quiet, most verbiage
                       suppressed
                   Q   Superquiet, all verbiage
                       suppressed
                   *   accept all defaults
                   x   Exit SarCheck
(keyword = VERBOSE, default = v): _

Analysis of ps -elf data will provide you with a closer look at memory bottlenecks and the ability to detect runaway processes and memory leaks. The enhanced sensitivity option increases the probability of generating "false alarms". See the section entitled "Setup of ps -elf data collection" for more information. If you wish to exit sarcheck, enter an x.


Analyze ps -elf    n   No, analyze /proc data only
data?              y   Yes, analyze /proc and ps
                       data
                   e   Enhanced sensitivity of
                       ps data analysis
                   *   accept all defaults
                   x   exit sarcheck
(keyword = PSELFOPT, default = y): _

The tabular summary is used to print a summary of statistics in table form at the end of the report, and an example is included later in the manual. If HTML output has been selected, an HTML table is created. This option is useful for transferring statistics to a spreadsheet or graphics program, for producing output which can be easily parsed by other programs, and for generating an easy to read table at the end of an HTML page.



Tabular Summary?   y  Print a tabular summary at the
                      end of the report
                   i  Print a tabular summary instead of
                      the report
                   n  Print the report without a summary
                   *  Accept remaining defaults
                   x  exit sarcheck
(keyword = TABULAR, default = y):

This option is used to decide where to send the analysis. Note that some of these choices will be different based on the pager you use and any modifications made to the defaults.

If you choose to send the output to a file, you'll be prompted for the name of the file. The default file name is /tmp/yyyymmddhhmmss, which is a date/time stamp. You can modify the default by editing the sarcheck script. If you wish to exit sarcheck, enter an x.



Send output to:    1   more (the screen)
                   2   lp -s (a printer)
                   3   A file
                   x   exit sarcheck
(keyword = OUTOPT, default = 1): _
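The default output file name described above is a date/time stamp in the form yyyymmddhhmmss. As a rough sketch, a name of that form can be built with the date command as follows. How the sarcheck script actually builds its default is not shown here, and the variable names are ours.

```shell
#!/bin/sh
# Build a /tmp/yyyymmddhhmmss style file name from the current date and time.
stamp=$(date +%Y%m%d%H%M%S)
outfile="/tmp/$stamp"
echo "$outfile"
```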


How to run SarCheck (command driven):

To analyze a /proc data file called 20050221, type:

/opt/sarcheck/bin/analyze 20050221

To reduce typing, you may want to add /opt/sarcheck/bin to your PATH.

For best results, pipe the output to more so that you can read it, or redirect it to a file if you want to save it. A report will be produced which contains information about your system, a brief summary, a recommendations section (if applicable), and a resource analysis section. For more information, see the section entitled "How to Interpret the Analysis".

For users that prefer to paginate the report with another utility, such as pg, the -p option will suppress page numbers and page breaks. To take advantage of this option, type:

/opt/sarcheck/bin/analyze -p 20050221 | pg


How to run SarCheck (crontab entry):

SarCheck can be run automatically by adding an entry to the user's crontab file, ideally using crontab -e. Here is an example which assumes the analysis will be done at 18:05:

In order to print a SarCheck analysis every weeknight, use the following entry:

5 18 * * 1-5 /opt/sarcheck/bin/analyze /opt/sarcheck/procstat/`date +\%Y\%m\%d` | lp -s

To keep all of SarCheck's recommendations in the /usr/ops directory, use the following entry:

5 18 * * 1-5 /opt/sarcheck/bin/analyze /opt/sarcheck/procstat/`date +\%Y\%m\%d` > /usr/ops/`date +\%y\%m\%d`

Because the output of the analyze program is stdout, you can pipe or redirect it in lots of ways. It can be printed, mailed, stored... whatever works best in your environment.
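The backslashes in the crontab entries above are needed because cron treats an unescaped % in the command field as a newline. One way to avoid the escaping is to put the command in a small wrapper script and call that script from cron. The following is a sketch of such a wrapper under that assumption; the variable names are ours, and the script is not part of SarCheck.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of SarCheck): build today's data file
# path and report path, then run analyze only if both the program and
# the data file are available.
datadir=/opt/sarcheck/procstat
today=$(date +%Y%m%d)               # no backslashes needed outside crontab
datafile="$datadir/$today"
report="/usr/ops/$(date +%y%m%d)"

if [ -x /opt/sarcheck/bin/analyze ] && [ -r "$datafile" ]; then
    /opt/sarcheck/bin/analyze "$datafile" > "$report"
else
    echo "skipping: analyze or $datafile is not available" >&2
fi
```

The crontab entry then needs only the schedule fields and the path to wherever you saved the wrapper.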


How to analyze multiple days of data:

SarCheck has the ability to analyze multiple days of data at once when the data files are concatenated. The only limitation is that these files must actually exist and be valid. Here is an example of how to analyze data from the first seven days of March.

  1. First concatenate the data files, creating a single file called /tmp/multi:

    cat /opt/sarcheck/procstat/2004030[1-7] > /tmp/multi

  2. Now analyze the concatenated file and pipe the output to more:

    /opt/sarcheck/bin/analyze /tmp/multi | more

Please note that the analyze program does not work if wildcard characters are used as a filename. Wildcard characters should be used with the cat command in order to produce a single file for the analyze program.
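Because the files being concatenated must exist, it can be worth checking them before running cat. The sketch below refuses to build the combined file if any input is missing or empty. It cannot tell whether a file holds valid data, and concat_data_files is a hypothetical helper rather than part of SarCheck.

```shell
#!/bin/sh
# Hypothetical helper (not part of SarCheck): concatenate data files into
# one output file, stopping if any input file is missing or empty.
concat_data_files() {
    out=$1
    shift
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "error: $f is missing or empty" >&2
            return 1
        fi
    done
    cat "$@" > "$out"
}

# Example use; the shell expands the wildcard before the function runs:
# concat_data_files /tmp/multi /opt/sarcheck/procstat/2004030[1-7]
# /opt/sarcheck/bin/analyze /tmp/multi | more
```

If the wildcard matches nothing, the shell passes the pattern through unchanged and the check reports it as a missing file.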


How to change the menu defaults:

The sarcheck script will now look for the file named /opt/sarcheck/etc/sarcheck_parms and will use any values found there instead of the normal defaults. This file is not included as part of the SarCheck distribution and you'll need to create it if you want to use it. Create a line in the file with the keyword and its new default value, separated by a space.

The keywords for SarCheck's menu options can be found when running the sarcheck script, and the value should be one of the choices on the menu. For example, if you want the output of SarCheck to include a tabular summary at the end of the report, here is the menu selection that you will see:


Tabular Summary?   y  Print a tabular summary at the
                      end of the report
                   i  Print a tabular summary instead of
                      the report
                   n  Print the report without a summary
                   *  Accept remaining defaults
                   x  exit sarcheck
(keyword = TABULAR, default = n):

You can see the name of the keyword and the options available. To change the default from 'n' to 'y' for this menu item, add the following line to the sarcheck_parms file:

TABULAR y

Now when you run the sarcheck script, the default behavior will be to print a tabular summary at the end of the report.

After those two fields are parsed by the sarcheck script, the rest of the line is ignored and is available as a comment. Any line that starts with something other than a valid keyword is also treated as a comment and is ignored.

Once you have decided to change the defaults, create or edit the sarcheck_parms file. Here is an example of a sarcheck_parms file where the starting and ending times used for analysis have been changed, page numbering is suppressed, and a tabular summary is printed at the end of the report. The text after the first two fields on each line, and the lines that don't start with a valid keyword, are comments:


    # file to customize sarcheck created by
    # Jess the sys admin on March 23, 2005
    #
    ST 06:00   starting time is 6AM
    EN 15:00   ending time is 3PM
    OPT p      suppress page numbering
    TABULAR y  add a tabular summary
    GRAPHDIR /diskfarm/sarcheck/images

A complete list of keywords supported in the sarcheck_parms file can be found in Appendix A.
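To check how a sarcheck_parms file will be read, it can help to strip it down to the keyword and value pairs. The sketch below imitates the rule described in this section: the first two fields are used, and any line that does not start with a known keyword is ignored. It is not the sarcheck script's own parser, show_parms is a hypothetical helper, and the keyword list contains only the five keywords used in the example above rather than the full list from Appendix A.

```shell
#!/bin/sh
# Hypothetical helper (not SarCheck's parser): print "KEYWORD VALUE" for
# each line whose first field is a known keyword. All other lines, and
# any text after the second field, are treated as comments.
show_parms() {
    awk '
        BEGIN {
            # Only the keywords from the example; see Appendix A for the rest.
            n = split("ST EN OPT TABULAR GRAPHDIR", kw, " ")
            for (i = 1; i <= n; i++) known[kw[i]] = 1
        }
        ($1 in known) && NF >= 2 { print $1, $2 }
    ' "$1"
}

parms=/opt/sarcheck/etc/sarcheck_parms
if [ -r "$parms" ]; then
    show_parms "$parms"
fi
```

For the example file above, this prints the five keyword lines without their trailing comments and skips the three lines that begin with #.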


How to change SarCheck's algorithms:

The sarcheck_parms file can also be used to change the thresholds used by SarCheck's tuning algorithms. Default values for SarCheck's thresholds have been established based on feedback from hundreds of systems. We continue to refine our models. A complete list of keywords supported in the sarcheck_parms file can be found in Appendix A.


How to create graphs in SarCheck reports:

SarCheck builds graphs with the gnuplot utility. By adding the -png, -jpg, or -jpeg switches, SarCheck can use gnuplot to produce PNG or JPEG graphs and can insert those graphs in its HTML output. This will enable you to post some really interesting SarCheck reports on your corporate intranet. To produce an HTML report with PNG graphs, use the -html and -png switches when running analyze. For example, the command

analyze -html -png 20040312 > rpt12.html

will produce an HTML report which can be read by your favorite browser. The most important parts of the report will be printed in bold type and headings are used to clarify what you're looking at. Graphs are inserted in appropriate places in the body of the report and some additional text is added to help explain the significance of the graphs. For more information, see Appendix B, entitled "Options available when running 'analyze'".


Examples of how to use the switches:

This section is designed to help you decide how to do what you want. In order to maintain some level of clarity, all examples will analyze the data file 20040317. The output of the analyze program is stdout, so you'll probably want to pipe it to more or redirect it to a file.

Example 1: Analyzing the /proc data file. We're going to start with the simplest possible example. The command below will run the analyze program and tell it to analyze the data file 20040317.

analyze 20040317

Example 2: Removing the page breaks. The -p switch removes the page breaks. This is especially useful when piping the report to pg instead of more. A number of other switches, the -html switch for example, will automatically invoke -p where it makes sense.

analyze -p 20040317

Example 3: Analyzing ps -elf output in conjunction with the /proc data. The /opt/sarcheck/bin/ps1 script is used to collect ps -elf data which can be used in conjunction with the /proc data file. The -ps switch tells the analyze program to search for the ps -elf data and include it in the report if possible.

analyze -ps 20040317

Example 4: Creating an HTML-formatted report. Let's combine a few switches this time. The -html switch makes the -p switch unnecessary, but we want to see a quick table of some statistics at the end of the report.

analyze -ps -html -t 20040317

Example 5: Creating an HTML-formatted report with embedded graphs. This is the same as the previous example, except that graphs are now embedded in the HTML output and they will be visible when the output is viewed with a browser. For this to work properly, you must have a copy of gnuplot installed, SarCheck must be able to find it, and the output must be redirected to a file. The file is then opened with a browser that can display PNG graphs. You can also use the -jpeg or -jpg switches if your version of gnuplot supports jpeg output.

analyze -ps -html -t -png 20040317 > rpt17.html

Example 6: Emailing the output. If you manage a network of 200 systems, you may want to email interesting SarCheck reports to yourself, but you probably don't want to be spammed with 200 messages a day saying that everything's okay. The -r switch prevents SarCheck from producing any output at all if there are no recommendations. The -Q switch reduces verbiage to a minimum.

analyze -r -Q 20040317 | mail root@wherever.com
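Depending on the mail program, piping empty input to it may still send a message with an empty body. If that happens on your system, one option is to check for output before mailing. The sketch below is written under that assumption; send_if_nonempty is a hypothetical helper, not a SarCheck feature, and the mail command line in the comment uses a placeholder subject and address.

```shell
#!/bin/sh
# Hypothetical helper (not part of SarCheck): read a report on stdin and
# run the given command with that report as its stdin, but only if the
# report is not empty.
send_if_nonempty() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    status=0
    if [ -s "$tmp" ]; then
        "$@" < "$tmp" || status=$?
    fi
    rm -f "$tmp"
    return $status
}

# Example use:
# analyze -r -Q 20040317 | send_if_nonempty mail -s "SarCheck report" root@wherever.com
```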

Example 7: Suppressing the detection of memory leaks. False alarms are not uncommon when SarCheck attempts to detect memory leaks. Some programs, such as those found on systems running Oracle, will grow over time. This is apparently a deliberate memory leak, and depending on the behavior of programs running on your system, you may want to suppress the reporting of memory leaks, runaway processes, or unusually large processes. The -pml switch can be used to suppress memory leaks as follows:

analyze -ps -pml 20040317

Example 8: Specifying a different ps -elf file. If you move the SarCheck files to a directory other than /opt/sarcheck and you want to analyze ps -elf data, you have to tell the analyze program where to find the data. The example looks at the /proc file named 20040317 and the ps -elf file /tmp/pselffile:

analyze -pf /tmp/pselffile 20040317


How to produce the most accurate analysis from the SarCheck menu:

Important: Understand that you can inadvertently cause SarCheck to produce misleading or incorrect recommendations. SarCheck looks at all of the data collected by the prst1 script in a file. That data should reflect the times of the day when performance is most important, ideally on the most active days of the week or month.

Analyze the data files that represent the busiest days. Determine the busiest times of the day, and if necessary, modify the crontab settings to collect data from the times when performance is important.


How to produce the most accurate analysis from the command line:

Important: Understand that you can inadvertently cause SarCheck to produce misleading or incorrect recommendations. SarCheck looks at all of the data collected by the prst1 script. That data should reflect the times of the day when performance is most important, ideally on the most active days of the week or month.

It's important to analyze reports from days when the processing load was greatest. It may be that on those days, SarCheck will find resource bottlenecks which did not exist on days when the system did less work.

If you analyze data from the weekend, SarCheck may tell you how to optimize the system for weekend processing. Whether that makes sense or not in your environment (and it frequently won't) is up to you.


How to access the online instructions / help text:

To display the online help text, type:

/opt/sarcheck/bin/analyze -h

This sends a subset of the instructions found in this manual to standard output (stdout), which defaults to the screen.

For other help text, type one of the following:

/opt/sarcheck/bin/analyze -hp
/opt/sarcheck/bin/analyze -hm
/opt/sarcheck/bin/analyze -hg

A FAQ section can also be found at the end of this manual and updated information can be found on the SarCheck web site.


How to get the most from SarCheck:

To get the most benefit out of SarCheck, we recommend using it as follows:

  1. Review the recommendations based on several days of statistics, especially days when peak processing occurs, and implement the recommendations that occur consistently.
  2. Implement recommendations one at a time. All performance tuning involves trial and error and every system is unique, therefore some recommendations may occasionally hurt performance. This is an uncommon occurrence (in fact, we've never heard of it happening with SarCheck), but these unsuccessful attempts to improve performance can only be identified if recommendations are implemented individually. In some cases, groups of parameters may need to be changed together. The bdflush parameters are an example of this.
  3. Continue using SarCheck on a regular basis. While SarCheck will probably make recommendations when it is first run, that's only the beginning. Since many of the changes recommended by SarCheck are small, they will gradually lead you towards a truly optimized system. As new users and applications are added, older programs are modified, and file sizes increase, SarCheck will help to keep you from being surprised by the changing demands of your applications and users.


How to interpret the analysis:

At the beginning of the analysis, the name of the data file, the date, time, number of intervals, number of processors seen, amount of memory, and system name are printed for identification purposes.

Warning messages will appear if impossible data is seen. Examples would be CPU utilization of 313% or -88 disk writes per second. SarCheck will still produce a report, but you should realize that the analysis of anomalous data is, as always, likely to follow the rule of 'garbage in, garbage out'.

The Summary section will highlight any bottlenecks that were seen in the areas of CPU, memory, or I/O, and will indicate if any kernel parameters need to be changed. If no bottlenecks are seen, the summary will say so, and point out that no recommendations will be made.

If runaway processes, memory leaks, or suspiciously large processes have been detected, a message will appear at the end of the Summary section.

The Recommendations section is present only if SarCheck has recommendations to make. If SarCheck thinks that everything is fine, no recommendations will be made. This is a normal condition and once the system is properly tuned, you should not be surprised to see a lack of recommendations.

The recommendations are based solely on the data contained in the data file and the values of various tunable parameters, and should be taken in that context. For example, if batch jobs are run on Saturdays, and SarCheck analyzes statistics from that day, it may decide that an I/O bottleneck existed and spare memory was present, and therefore, an increase in buffer size may be appropriate. Following these recommendations may improve performance on Saturdays, but could hurt performance during the week by reducing the amount of memory available to users.

The changes to tunable parameters recommended by SarCheck are designed to cause slow, gradual improvement in order to prevent surprises. These gradual changes are designed to prevent any unanticipated side effects of a major change in a tunable parameter.

Due to the interrelationships between tunable parameters and system resources, SarCheck goes beyond the basic rules of thumb whenever possible.

The Resource Analysis section translates the prst1 data file into English. Much of this data is provided for reference, and explanations are given where appropriate. The implications of various statistics regarding CPU utilization, buffer sizing, memory utilization, system table sizes, and disk I/O bandwidth are presented in this section.

The times when key resources are most heavily used appear in this section. If these times correlate well with the times that performance degradation was reported, it can be inferred that exhaustion of these resources may be a cause of performance problems. Peak usage statistics are also used by the capacity planning section.

The Capacity Planning section can be used to approximate the amount of capacity left on the system, based solely on the /proc data being analyzed. CPU and memory use statistics are examined in order to determine which resource is likely to become exhausted first.

This section is not meant to perform the same functions as the more expensive tools available for large systems. It is designed to help meet the needs of system administrators, many of whom are managing growing systems and need to know how much "room" is left before various resources become exhausted.

The exhaustion of CPU resources is defined as any single interval in which CPU usage exceeded 90 percent. The need for more memory is determined by the relationship between the number of free pages and the values of the freepages parameters. Because the interval with the greatest resource usage is used, the capacity planning report will be less accurate if peak resource use occurred during an interval of less than 10 minutes.
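
To make the arithmetic behind a linear estimate concrete, here is a rough sketch. It is an illustration only and not necessarily the formula SarCheck uses: cpu_headroom is a hypothetical helper that assumes workload and CPU use grow in proportion, and it takes the 90 percent exhaustion threshold described above as its default.

```shell
# Hypothetical helper, not part of SarCheck.
# $1 = peak CPU utilization in percent
# $2 = exhaustion threshold in percent (defaults to 90, as described above)
# Prints the approximate percentage increase in workload that fits
# under the threshold, assuming CPU use scales linearly with workload.
cpu_headroom() {
    awk -v peak="$1" -v cap="${2:-90}" 'BEGIN {
        if (peak <= 0) { print "undefined"; exit }
        printf "%.0f\n", (cap / peak - 1) * 100
    }'
}

cpu_headroom 45    # a peak of 45 percent leaves room for roughly a 100 percent increase
```

A negative result means that the peak interval already exceeded the threshold.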

The Custom Settings section is where both successful and unsuccessful changes to SarCheck's default thresholds are reported. See the sections "How to change the menu defaults" and "How to change SarCheck's algorithms" for more information.

Disclaimers, trademark information, etc. At the end of the report is a disclaimer, trademark and copyright information, your software serial number, code version, licensee, and if applicable, the software's expiration date.


How to order SarCheck:

Use the -o option of /opt/sarcheck/bin/analyze to produce an order form and call us, or ask your reseller to call us. The cost of shipping SarCheck via US Mail is included in the price of SarCheck. If you'd like the software shipped via Federal Express, DHL, etc., please provide your account number and we will be happy to accommodate you.

In some parts of the world, local resellers may charge prices which are higher than our list price because they pay for the currency conversions, international shipping, duties, support, etc. We urge our customers to support their resellers.


How to get technical support for SarCheck:

Use our email address: support@sarcheck.com or:

Call us at +1-603-382-4200,
fax us at +1-603-382-4247,
write to us at PO Box 1033, Plaistow NH 03865, USA,
or visit our web site at http://www.sarcheck.com/


How to get technical support for gnuplot:

See the FAQ at http://www.ucc.ie/gnuplot/gnuplot-faq.html. Please note that SarCheck works with gnuplot 3.7. Newer versions such as 4.0 may be available but because they were written without much regard for backward compatibility, we recommend using 3.7 with SarCheck.


Files included in this release:

This release contains the following files:
/opt/sarcheck/bin/analyze: This program performs the analysis.
/opt/sarcheck/bin/sarcheck: This is the front end for analyze. It's a simple Bourne shell script which allows you to analyze the current day's data by pressing the enter key a few times. Create a sarcheck_parms file if you want to customize this script. See the section "How to change the menu defaults" for more information.
/opt/sarcheck/etc/analyze.txt: This file contains the text used to produce the analysis. In general, we recommend that you do not modify this file, because it may leave us unable to support the software. Users outside of the United States may modify the spelling of certain words in the file if they wish. For example, the word 'utilization' can be changed to 'utilisation'. If you would like a non-English version of SarCheck, please call us.
/opt/sarcheck/etc/analyze.key: This file contains the activation key. This file is not meant to be edited directly and tampering with it may permanently disable SarCheck.
/opt/sarcheck/bin/sarcheckagent: This is the program that collects data from the /proc filesystem and other locations.
/opt/sarcheck/bin/prst1: This is a script that runs the sarcheckagent program.
/opt/sarcheck/bin/prst2: This is a script that deletes old data created by prst1.
/opt/sarcheck/bin/ps1: This is a script that collects ps -elf data.
/opt/sarcheck/bin/ps2: This is a script that cleans up ps -elf data.
/opt/sarcheck/etc/20050221: A sample file containing data from /proc/stat and other locations.
/opt/sarcheck/etc/sarcheck_parms: This file is not actually included with the SarCheck distribution but you might want to create it in order to modify the SarCheck menu defaults or the thresholds used by SarCheck's algorithms.
/opt/sarcheck/doc/linuxman60100.html: This manual.
/opt/sarcheck/procstat/readme: This is a short description of the purpose of the procstat directory.
/opt/sarcheck/ps/readme: This is a short description of the purpose of the ps directory.


An example of a SarCheck report:

The following examples were produced with the -w option, used to suppress page breaks and newlines. This option sounds pretty odd, but it's really useful when exporting SarCheck reports to a word processing program. Please note that the text of the SarCheck report is printed in Courier font, and the explanation immediately follows the text of the report.

SarCheck(TM): Automated Analysis of Linux data (English text version 6.01.00)

This is an analysis of the data contained in the file 2005q1. The data was collected from 2005/01/21 to 2005/03/21, from system 'localhost'. There were 4853 data records collected over 51 days used to produce this analysis. Operating system was 2.2.16-22. The number of processors present could not be determined, therefore the amount of processors are estimated from the cpu statistics. 1 processor is assumed to be present. 125 megabytes of memory are present.

    This introductory paragraph prints the name of the file collected by prst1, when the data was collected and other information about the system environment.

The date format used in this report is yyyy/mm/dd. The date format was set in the sarcheck_parms file.

    The date can be formatted in a number of ways. This is reported in the beginning because some dates can be ambiguous. For example, if the date reported was 07/06/2005, it would be nice to know if the format was dd/mm/yyyy or mm/dd/yyyy.

Data collected by the ps -elf command during 51 days between 2005/01/21 and 2005/03/21 will also be analyzed. This program will attempt to match the starting and ending times of the ps -elf data with those of the report file named 2005q1,

    This paragraph prints information about the ps -elf data that was collected. This data is used primarily to find runaway processes or memory leaks.

DIAGNOSTIC MESSAGE: The number of disks in SarCheck's disk table is 2 and the table is 0.044 percent full.

Command line used to produce this report: analyze -w -t -ps -diag 2005q1

    If you choose to use the -diag switch, SarCheck will print statistics about the usage of various internal tables. It will also show the arguments used to run the analyze program to help us if we're having trouble duplicating any SarCheck output that you may have questions about.

SUMMARY

When the data was collected, no CPU bottleneck could be detected. No memory bottleneck was seen and the system has sufficient memory. A change has been recommended to at least one tunable parameter. Recommendations can be found in the Recommendations Section of this report.

Some of the defaults used by SarCheck's rules have been overridden using the sarcheck_parms file. See the Custom Settings section of the report for more information.

    The summary lists any bottlenecks detected, changes specified in the sarcheck_parms file, and any problems which may impact the accuracy of the analysis. If SarCheck found no problems and was unable to make any recommendations, that fact would be mentioned here. If anything unusual is seen in the ps -elf data, it will be summarized here too.

RECOMMENDATIONS SECTION

All recommendations contained in this report are based solely on the conditions which were present when the performance data was collected. It is possible that conditions which were not present at that time may cause some of these recommendations to result in worse performance. To minimize this risk, analyze data from several different days, implement only regularly occurring recommendations.

    The first paragraphs of the recommendations section explain how to implement the recommendations. More information on this topic can be found in the "How to produce the most accurate analysis..." and "How to get the most from SarCheck" sections of this manual.

Change the bdflush parameter 'nfract' from 40 to 52. This is the percentage of dirty buffers allowed in the buffer cache before the kernel flushes some of them.

Change the bdflush parameter 'ndirty' from 500 to 250. This is the number of dirty blocks written to disk at one time when the bdflush daemon wakes up.

Change the bdflush parameter 'nrefill' from 64 to 128. This will allow the operating system to obtain more clean buffers when refill_freelist() is called.

Change the bdflush parameter 'nref_dirt' from 256 to 512. This is recommended in order to keep its value 4 times higher than nrefill.

To change the value of the bdflush parameters immediately as described in the above recommendations, use the following command:

echo "50 250 128 512 500 3000 500 1884 2" > /proc/sys/vm/bdflush

If this change improves performance, you can make it permanent by adding the command to the /etc/rc.d/rc.local file.

    These are examples of parameter tuning recommendations. As a rule, SarCheck will recommend small, incremental changes to the system's tunables in order to produce gradual change. In some cases, larger changes will be recommended because a small change is not possible. For example, a recommendation is made to double nrefill and nref_dirt because the value should always be a power of 2.

    Please note that the formulas typically used to set many parameters can cause problems when manual adjustments are being made.

    In addition to parameter tuning recommendations, SarCheck will suggest hardware upgrades when they are likely to help.
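
    When trying a bdflush recommendation like the one in the sample report above, it helps to save the old values first so that the change can be rolled back. The following is a sketch under stated assumptions: set_bdflush is a hypothetical helper that is not shipped with SarCheck, /tmp/bdflush.before is an arbitrary backup location, and /proc/sys/vm/bdflush exists only on kernels that provide the bdflush interface.

```shell
# Hypothetical helper, not part of SarCheck. Run as root.
# $1 = new values, exactly as they would be echoed into the file
# $2 = file to write (defaults to /proc/sys/vm/bdflush)
# $3 = where to save the previous values (defaults to /tmp/bdflush.before)
set_bdflush() {
    target="${2:-/proc/sys/vm/bdflush}"
    backup="${3:-/tmp/bdflush.before}"
    if [ ! -w "$target" ]; then
        echo "set_bdflush: cannot write $target" >&2
        return 1
    fi
    cat "$target" > "$backup" || return 1   # keep the old values for rollback
    echo "$1" > "$target"
}

# Example, using the values from the sample report:
#   set_bdflush "50 250 128 512 500 3000 500 1884 2"
# To roll back:
#   cat /tmp/bdflush.before > /proc/sys/vm/bdflush
```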

RESOURCE ANALYSIS SECTION

    The resource analysis section is the place where various aspects of resource utilization are discussed regardless of whether a problem was seen.

Average CPU utilization was only 0.4 percent. This indicates that spare capacity exists within the CPU. If any performance problems were seen during the monitoring period, they were not caused by a lack of CPU power. CPU utilization peaked at 18.44 percent from 11:30:00 to 11:40:00 on 2005/02/01. A CPU upgrade is not recommended because the current CPU had significant unused capacity. Reporting on individual CPUs has been suppressed with the NOMP keyword in the sarcheck_parms file.

CPU graph, .png format not readable on very old browsers

    CPU utilization statistics from /proc/stat are analyzed here. In addition to average CPU utilization, occasionally heavy utilization and peak utilization is noted. The times of peak resource utilization are noted throughout this section and are provided to help you detect any correlation between peak resource utilization and peak performance degradation.

The average amount of free memory was 8156.4 pages or 31.9 megabytes. The minimum amount of free memory was 568 pages or 2.22 megabytes at 09:40:01 on 2005/02/21.

.png format not readable on very old browsers

    The above graph has been zoomed in to show the relationship between the size of the free list and the values of freepages parameters.

The freepages.min value was 255 pages or 1.0 megabytes. The freepages.low value was 510 pages or 2.0 megabytes. The freepages.high value was 765 pages or 3.0 megabytes. If the system's free list drops below freepages.high the kernel will start gently swapping. No significant memory bottleneck was seen. The number of pages of free memory occasionally dipped below the value of freepages.high but was never less than freepages.low.

    The paragraphs above explain whether SarCheck thinks the system is memory poor, and analyzes the usage of resources related to memory utilization.
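
    The page counts in these paragraphs can be converted to megabytes by hand. The sketch below assumes a page size of 4 kilobytes, which is consistent with the figures in the sample report; pages_to_mb is a hypothetical helper and not part of SarCheck.

```shell
# Hypothetical helper, not part of SarCheck.
# $1 = number of pages
# $2 = page size in kilobytes (defaults to 4, an assumption that matches
#      the figures in the sample report)
pages_to_mb() {
    awk -v pages="$1" -v kb="${2:-4}" 'BEGIN { printf "%.2f\n", pages * kb / 1024 }'
}

pages_to_mb 568    # the minimum free memory in the sample report, about 2.22 megabytes
```

    On a live system, getconf PAGESIZE reports the actual page size in bytes.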

The value of nfract, the dirty buffer threshold used to wake up bdflush, was set to 40 percent. The goal of tuning nfract is to keep it low enough that the number of dirty buffers in the cache is not enough to degrade performance, but high enough to allow as many dirty buffers in the cache as possible. In this case a recommendation was made to increase the value to 52 percent.

The value of ndirty was set to allow bdflush to write 250 buffers to the disk at one time. The recommended decrease should make I/O less bursty and will save a small amount of memory.

The nrefill parameter was set to 64 buffers. This parameter controls the number of buffers to be added to the free list whenever bdflush calls refill_freelist(). A recommendation to increase this value to 128 will result in fewer calls to refill_freelist(). The nref_dirt parameter was set to allow refill_freelist() to wake up bdflush whenever it found more than 256 dirty buffers. A recommendation to increase this value to 512 will keep it properly aligned with nrefill.

The interval parameter in /proc/sys/vm/bdflush controls how frequently the kernel update daemon runs, and it was set to 500 jiffies. A jiffie is a clock tick and on x86 systems, there are 100 jiffies per second.

The age_super parameter was set to write dirty metadata buffers to disk when they were 500 jiffies old.

    Other paragraphs analyze various parameters including these bdflush parameters.
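
    The bdflush intervals above are expressed in jiffies. As a quick worked example, the conversion to seconds can be sketched as follows; jiffies_to_seconds is a hypothetical helper, and the default of 100 jiffies per second is the x86 figure quoted in the report text.

```shell
# Hypothetical helper, not part of SarCheck.
# $1 = number of jiffies
# $2 = jiffies per second (defaults to 100, the x86 value quoted in the report)
jiffies_to_seconds() {
    awk -v j="$1" -v hz="${2:-100}" 'BEGIN { printf "%.2f\n", j / hz }'
}

jiffies_to_seconds 500    # the interval and age_super values in the sample report
```

    With these assumptions, 500 jiffies corresponds to 5 seconds.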

The value of the page_cluster parameter was 4. This means that 16 pages are read at once. Values of 4 or 5 are better for large systems that perform non-interactive jobs using sequential I/O. There may be an advantage in lowering this value if response times need to be improved during heavy I/O, but I/O-bound jobs may suffer as a result.

The kswapd parameter tries_base was set to 512. This controls the number of pages that kswapd will try to free each time it runs. The kswapd parameter tries_min was set to 32. This controls the number of times that kswapd tries to free a page of memory when it's called. The kswapd parameter swap_cluster was set to 32. This controls the number of pages that kswapd will try to write when it is called.

The average page in rate was 0.201 per second. Page ins peaked at 27.34 per second from 09:30:00 to 09:40:01 on 2005/02/21. The average page out rate was 0.531 per second. Page outs peaked at 19.00 per second from 11:30:00 to 11:40:00 on 2005/02/01.

.png format not readable on very old browsers

The average swap in rate was greater than zero but less than .01 per second. Swap ins peaked at 1.46 per second from 15:10:00 to 15:20:00 on 2005/03/18. The average swap out rate was .02 per second. Swap outs peaked at 4.03 per second from 12:00:00 to 12:10:00 on 2005/02/21.

.png format not readable on very old browsers

The amount of swap space in use peaked at 37.32 megabytes from 14:20:00 to 14:30:00 on 2005/03/18. The average amount of swap space in use was 28.36 megabytes. The size of swap space was 70.56 megabytes. The peak amount of swap space in use was 52.89 percent of the total.

There was one swap partition seen in /proc/swaps. The rate of swap operations peaked at 4.45 per second from 09:30:00 to 09:40:01 on 2005/02/21.

There were 5 superblocks in use and a maximum of 256 superblocks were available. There is plenty of room for growth here.

According to data collected from /proc/partitions, the system-wide disk I/O rate averaged 0.44 per second and peaked at 26.15 per second from 09:30:00 to 09:40:01 on 2005/02/21. The read rate averaged 0.11 per second and peaked at 21.70 per second from 09:30:00 to 09:40:00 on 2005/02/21. The write rate averaged 0.33 per second and peaked at 4.88 per second from 11:20:00 to 11:30:00 on 2005/02/02.

.png format not readable on very old browsers

The noatime option was specified on at least one of the mounted filesystems. Because non-trivial levels of disk activity were seen, you may want to decide whether it would be helpful to mount some filesystems with this option. The value of the ctrl-alt-del parameter was 0. The value of 0 is better in almost all cases because it prevents an immediate reboot if the ctrl, alt, and delete keys are pressed simultaneously.

There were an average of 135.90 interrupts per second and the peak interrupt rate seen was 440.06 per second from 09:30:00 to 09:40:00 on 2005/02/21.

    We finish the Resource Analysis section by looking at the output of ps -elf. SarCheck can look at the number of processes and will report any runaway processes or memory leaks that were seen.

At 16:00:00 on 2005/03/16 ps -elf data indicated that there were a peak of 90 processes present. This was the largest number of processes seen with ps -elf but it is not likely to be the absolute peak because the operating system does not store the true "high-water mark" for this statistic. There were an average of 82.4 processes present.

.png format not readable on very old browsers

No runaway processes, memory leaks, or suspiciously large processes were detected in the data contained in the ps data files. No table was generated because no unusual resource utilization was seen in the ps data.

    Other resource utilization, including paging, swap space usage, and disk I/O rates, is analyzed here. If SarCheck had recommendations to make with regard to these resources, they would have been made in the Recommendations Section.

CAPACITY PLANNING SECTION

The section is designed to provide the user with a rudimentary linear capacity planning model and should be used for rough approximations only. These estimates assume that an increase in workload will affect the usage of all resources equally. These estimates should be used on days when the load is heaviest to determine approximately how much spare capacity remains at peak times.

    The Capacity Planning section can help you to understand how much additional load your system can support. This feature is not designed to replace the features found in mainframe-type capacity planning tools, but rather to give you an approximation of how much room for growth remains in key system resources.

Based on this single day of data, the system should be able to support a substantial increase in workload before impending CPU or memory bottlenecks are seen. Run SarCheck regularly to detect bottlenecks before they impact performance.

.png format not readable on very old browsers

    This paragraph summarizes the amount of capacity remaining in your system during peak times and identifies the first likely resource bottleneck. If all system resources monitored could support an increase in workload of at least 100 percent (as in this case), the summary will say that no impending capacity limits were seen. If the first bottleneck is likely to occur in memory, the amount of capacity remaining will not be quantified. This is because the data required for that kind of complex memory modeling cannot be found in the /proc/meminfo data.

CUSTOM SETTINGS SECTION

The default SYSUSR threshold was changed in the sarcheck_parms file from 2.5 to 2.8.

The default HSIZE was changed in the sarcheck_parms file from 0.75 to 1.20.

The date format of yyyy/mm/dd was set in the sarcheck_parms file.

The NOMP keyword was found in the sarcheck_parms file and is unneeded because the system being analyzed is not a multiprocessor system.

    The Custom Settings section is where both successful and unsuccessful changes to SarCheck's default thresholds are reported. See the sections "How to change the menu defaults" and "How to change SarCheck's algorithms" for more information.

Please note: In no event can Aptitune Corporation be held responsible for any damages, including incidental or consequent damages, in connection with or arising out of the use or inability to use this software. All trademarks belong to their respective owners. This software is provided for the exclusive use of: Your Company. This software expires on 2005/05/06 (yyyy/mm/dd). Code version: SarCheck for Linux 6.01.00. Serial number: 00099999.

    This is the message that shows up in licensed software. Evaluation versions will display a different message.

(c) Copyright 2003-2005 by Aptitune Corporation, Plaistow NH 03865, USA, All Rights Reserved. http://www.sarcheck.com/

Statistics for system: localhost
Statistics collected on: 06/16/2004
MAC Address: 00:02:B3:3A:41:58
Average combined CPU utilization: 0.69%
Average user CPU utilization: 0.61%
Average sys CPU utilization: 0.08%
Average 'nice' CPU utilization: 0.00%
Peak combined CPU utilization: 7.76%
Average page out rate: 0.76/sec
Peak page out rate: 6.07/sec
Average swap out rate: 0.01/sec
Peak swap out rate: 0.31/sec
Average swap space in use: 26.97 megabytes
Peak swap space in use: 27.48 megabytes
Average amount of free memory: 5679 pages or 22.2 mb
Minimum amount of free memory: 1929 pages or 7.54 mb
Average I/O rate: 0.44/sec
Peak I/O rate: 2.07/sec
Average read rate: 0.03/sec
Peak read rate: 0.51/sec
Average write rate: 0.40/sec
Peak write rate: 1.77/sec
Average Interrupt rate: 143.46/sec
Peak Interrupt rate: 218.88/sec
Avg number of processes seen by ps: 90.8
Max number of processes seen by ps: 98
Approx CPU capacity remaining: 100%+
Can memory support add'l load: Yes

    This is the output that you'll see if you're using the -t or -tonly switches. The -t switch prints this table at the end of the report, and -tonly prints it instead of the report. This table is much easier to parse than the standard text-based SarCheck report.
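
    Because each line of the table has the form "label: value", a single statistic can be pulled out with standard tools. The following is a minimal sketch; stat_value is a hypothetical helper, not part of SarCheck, and the labels are assumed to match the sample table above exactly.

```shell
# Hypothetical helper, not part of SarCheck.
# Reads a table of statistics on standard input and prints the value
# that follows the label given as $1.
stat_value() {
    awk -v label="$1" '
        index($0, label ": ") == 1 { print substr($0, length(label) + 3); exit }
    '
}

# Demonstration using a few lines from the sample table
stat_value "Peak combined CPU utilization" <<'EOF'
Statistics for system: localhost
MAC Address: 00:02:B3:3A:41:58
Peak combined CPU utilization: 7.76%
Approx CPU capacity remaining: 100%+
EOF
```

    Matching on the whole label rather than splitting on colons keeps values that contain colons, such as the MAC address, intact.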

Thanks for your interest and support!


Frequently asked questions (FAQ):

Q. What versions of Linux does SarCheck support?

A. We haven't tried all kernels or distributions, but we support most systems running Linux kernels 2.2 through 2.6.

Q. If I have other kinds of UNIX systems, can I try SarCheck on those too?

A. Sure! Fill out our order form and we'll send you eval copies of our released products. SarCheck is also available for Solaris SPARC 2.5 and up, HP-UX versions 10 and 11, and AIX 4.3 to 5.2.

Q. Should I implement recommendations that only show up occasionally?

A. Feel free to try, but first implement the regularly occurring recommendations, since those will address the most frequently occurring problems. If SarCheck occasionally recommends increasing the amount of memory, you should certainly try it. On systems with some extra memory, SarCheck will be able to make additional recommendations that could not be made on systems where memory is "tight".

Q. Every time I make changes based on SarCheck's recommendations, it makes more recommendations. Why doesn't it just figure out the correct values for all the parameters?

A. That's not how real performance tuning works. There are no "correct" values because tuning is a series of compromises between various system resources. Performance tuning involves a certain degree of trial and error, and gradual change is the only way to do it.

Q. When I try to run sarcheck, I get the message "sarcheck: not found". What's wrong?

A. Check the following:

Q. Why did SarCheck stop producing reports?

A. Usually this is because the software has expired. Run '/opt/sarcheck/bin/analyze' and look for the expiration date at the bottom of the usage text. If you've licensed SarCheck and the expiration date doesn't make sense to you, run 'analyze -s' and send us the output.

Q. How do I collect data at 10 minute intervals over a 24 hour period?

A. The crontab entries should look like this:

0,10,20,30,40,50 * * * * /opt/sarcheck/bin/prst1
55 23 * * * /opt/sarcheck/bin/prst2

Q. How do I collect data every 20 minutes from 08:00 to 18:00?

A. The crontab entries should look like this:

0,20,40 8-17 * * * /opt/sarcheck/bin/prst1
0 18 * * * /opt/sarcheck/bin/prst1
5 18 * * * /opt/sarcheck/bin/prst2
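
For other collection intervals, the minute field of the prst1 entry can be generated rather than typed. This is a small sketch that only prints a crontab line and does not install it; minute_list is a hypothetical helper and not part of SarCheck.

```shell
# Hypothetical helper, not part of SarCheck.
# $1 = interval in minutes (1 to 59)
# Prints a comma-separated minute list suitable for the first crontab field.
minute_list() {
    awk -v step="$1" 'BEGIN {
        if (step < 1 || step > 59) exit 1
        for (m = 0; m < 60; m += step) printf "%s%d", (m ? "," : ""), m
        print ""
    }'
}

# Print (not install) an entry that collects data every 10 minutes, all day
echo "$(minute_list 10) * * * * /opt/sarcheck/bin/prst1"
```

Intervals that do not divide evenly into 60 will leave a shorter gap at the top of each hour.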

Q. Why are there messages like "Error: can't read /proc/sys/vm/bdflush file" in the procstat data file?

A. The file either doesn't exist or cannot be read due to a permissions problem. If the file doesn't exist, it probably isn't part of your Linux distribution and you shouldn't be concerned. If it's a permissions problem, the prst1 script needs to be run as root or you should allow the file in question to be read by a non-root user.
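
To tell the two cases apart, a quick check along the following lines can be run as the user whose crontab runs prst1. This is a sketch: check_readable is a hypothetical helper, and the files listed are simply ones mentioned elsewhere in this manual, not a complete list of what the agent reads.

```shell
# Hypothetical helper, not part of SarCheck.
# Reports whether each named file is missing, unreadable, or readable.
check_readable() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "$f: missing"
        elif [ ! -r "$f" ]; then
            echo "$f: not readable by $(id -un)"
        else
            echo "$f: ok"
        fi
    done
}

check_readable /proc/stat /proc/meminfo /proc/swaps /proc/sys/vm/bdflush
```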


Bibliography:

UNIX System V Performance Management. 1994. Englewood Cliffs, NJ: PTR Prentice Hall. ISBN 0-13-106429-1.

Majidimehr, A. Optimizing UNIX for Performance. 1996. Englewood Cliffs, NJ: PTR Prentice Hall. ISBN 0-13-111551-0.

Fink, Jason R. and Sherer, Matthew D. Linux Performance Tuning and Capacity Planning. 2002. Indianapolis, IN: SAMS. ISBN 0-672-32081-9.

Nemeth, E., Snyder, G., and Hein, T.R. Linux Administration Handbook. 2002. Upper Saddle River, NJ: PTR Prentice Hall. ISBN 0-13-008466-2.


Appendix A: SarCheck parms file keywords

Many of the keywords and the defaults can be found by looking at the questions that the sarcheck script asks. Here is a complete list:

PAGER: The pager to be used to display the analysis on the screen. The default is more, but pg or less are common alternatives.
LPS: The command for printing the analysis. The default is lp -s.
PSELFDIR: The directory where SarCheck will look for the ps -elf data. The analyze program and the ps1 and ps2 scripts will use this new directory. WARNING! Please pick a directory that contains nothing but ps -elf data! The ps2 script will use the find command to remove any file in the specified directory which is more than 14 days old. We have tried to limit the potential damage by adding the -name switch to the find command, but you should still be very careful with this.
DR: Whether to analyze a single prst1 data file or all of the files in /opt/sarcheck/procstat. The default is 'd'. For a list of options, run the sarcheck script and see what options are on the screen when the keyword is DR.
OPT: How to format the report. The default is 'n'.
VERBOSE: Whether the output should be verbose or quiet. The default is 'v'.
PSELFOPT: How verbose the ps -elf output should be. This option is used primarily to increase SarCheck's sensitivity to problems in the ps -elf data. The default is 'n'.
TABULAR: Whether or not to print a tabular summary at the end of the report or print a tabular summary instead of the report. The default is 'n'.
OUTOPT: This option controls where the output of the sarcheck script should go. The default is '1'.
FILE: If you want to save the output of the sarcheck script as a file with a predefined name, this is where the name goes.
SCDIR: The directory where SarCheck's scripts and executables can be found.
ETCDIR: The directory where various files normally in /opt/sarcheck/etc can be found.
GNUPLOT: The version of gnuplot present on your system. The default value is 3.7.
GNUPLOTDIR: The directory in which you've installed gnuplot.
GRAPHDIR: The directory in which the graphs will be stored.
HTMLGRAPHDIR: The directory used for the src attribute of the HTML img tags. This is the same as the -hgd switch.
DMY: Change the default date format to dd/mm/yyyy.
YMD: Change the default date format to yyyy/mm/dd.
ST: The starting time for the analysis. This should be entered in 24 hour format.
EN: The ending time for the analysis. This should be entered in 24 hour format.
NOMP: Suppress the reporting of individual CPU statistics on multiprocessor systems. This is the same as the -nomp switch.
DTBL: Show disks as an HTML table or as CSV data instead of text. This is the same as the -dtbl switch.
DTOO: Show disks as an HTML table or as CSV data in addition to text. This is the same as the -dtoo switch.
PS: Include ps -elf data in SarCheck's analysis of data collected by the sarcheckagent program.
PTBL: Show processes that exceed certain thresholds in an HTML table or as CSV data instead of text. This is the same as the -ptbl switch.
PTOO: Show processes that exceed certain thresholds in an HTML table or as CSV data in addition to text. This is the same as the -ptoo switch.
HSIZE: Change the default width of the graphs generated by gnuplot. If you want to see graphs that are wider than the ones produced by the default width of 0.7, this keyword can be used to produce wider graphs.

The sarcheck_parms file can also be used to change the defaults used to generate HTML output.

Keyword     Allowed range     Default
BGCOLOR     Any valid color   #FFEE88
TEXTCOLOR   Any valid color   black
REDCOLOR    Any valid color   #FF9999
PINKCOLOR   Any valid color   #FFCC99

BGCOLOR: The background color specified in the bgcolor attribute of the HTML <body> tag.

TEXTCOLOR: The text color specified in the text attribute of the HTML <body> tag.

REDCOLOR: The background color specified in the bgcolor attribute of certain tags. The color used to highlight the cells of an HTML table when the values exceed certain thresholds. The default color is a shade of red and this keyword exists to give you an option if you want to use the color red as the text or background color.

PINKCOLOR: The background color specified in the bgcolor attribute of certain tags. The color used to highlight the cells of an HTML table when the values exceed certain thresholds. The default color is a shade of pink and this keyword exists to give you an option if you want to use the color pink as the text or background color.

These changes can be implemented using the /opt/sarcheck/etc/sarcheck_parms file. Please note that the default values of SarCheck's thresholds have been established based on feedback from hundreds of systems and these values should not be overridden without good reason. Here is a list of thresholds which can be currently overridden, and the meaning of each is described below:

Keyword   Allowed range   Expected range   Default
AVGCPU    50 - 100        60 - 100         70
MAXCPU    50 - 100        60 - 100         70
CAPCPU    25 - 100        60 - 100         90
CPULIM    0.05 - 100      10 - 100         20
MLRATE    1+              100+             2000
MLTIME    1+              100+             7195
LGPROC    256+ pages      4096+ pages      formula
DCALL     any             any              10
DCLP      any             any              10
DCML      any             any              10
DCRP      any             any              10
SYSUSR    0 - 999         0 - 999          2.5

AVGCPU: When average CPU utilization exceeds this value, SarCheck considers the system to be busy enough to cause concern.

MAXCPU: When Peak CPU Utilization exceeds this value, SarCheck assumes that performance degradation is likely.

CAPCPU: The value used to calculate the increase in CPU load that the system can support at peak times.

CPULIM: The threshold in computed CPU utilization SarCheck uses to decide if a runaway process has been detected in ps -elf data.

MLRATE: The threshold in pages of memory per hour used by SarCheck to decide if a memory leak has been detected in ps -elf data.

MLTIME: The amount of time in seconds used by SarCheck to decide if a memory leak has been detected in ps -elf data.

LGPROC: The minimum size in pages of a process which SarCheck will report as being suspiciously large. The formula used to calculate the default threshold is 48 megabytes or one half the size of memory, whichever is larger.

DCALL: Disable the feature which limits the number of suspiciously large processes, memory leaks, and runaway processes that are reported.

DCLP: Disable the feature which limits the number of suspiciously large processes that are reported or change the number being reported. Using the keyword DCLP without a second field will disable the limit. Using a second field (for example: DCLP 25) will change the limit to the value in the second field.

DCML: Disable the feature which limits the number of processes with memory leaks that are reported or change the number being reported. Using the keyword DCML without a second field will disable the limit. Using a second field (for example: DCML 25) will change the limit to the value in the second field.

DCRP: Disable the feature which limits the number of runaway processes that are reported or change the number being reported. Using the keyword DCRP without a second field will disable the limit. Using a second field (for example: DCRP 25) will change the limit to the value in the second field.

SYSUSR: The threshold used to decide whether an unusual amount of %sys activity relative to %usr activity is worth reporting. The default of 2.5 means that %sys activity needs to be at least 2.5 times greater than %usr activity for this to be reported.
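
The SYSUSR comparison amounts to a simple ratio test, sketched below. The function name is hypothetical, and treating a system with no %sys activity as not worth reporting is an assumption of this example rather than documented behavior.

```python
# Hypothetical sketch of the SYSUSR ratio test described above.

def sys_usr_unusual(pct_sys, pct_usr, sysusr=2.5):
    """True if %sys is at least `sysusr` times %usr."""
    if pct_sys <= 0:
        return False           # assumption: nothing to report without %sys
    return pct_sys >= sysusr * pct_usr


if __name__ == "__main__":
    print(sys_usr_unusual(50.0, 20.0))   # 50 is exactly 2.5 times 20
    print(sys_usr_unusual(40.0, 20.0))   # 40 is only 2 times 20
```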

It is possible to set these parameters to values which can make SarCheck's recommendations meaningless or incorrect. Please override the default values with care.


Appendix B: Options available when running 'analyze':

-dblp: Suppress warnings about suspiciously large database processes.
-dbml: Suppress warnings about possible memory leaks in database processes.
-dbrp: Suppress warnings about possible runaway database processes.
-dcall: Disable limiting the number of warnings about suspiciously large processes, possible memory leaks, and possible runaway processes.
-dclp: Disable limiting the number of warnings about suspiciously large processes.
-dcml: Disable limiting the number of warnings about possible memory leaks in processes.
-dcrp: Disable limiting the number of warnings about possible runaway processes.
-diag: This option will add a paragraph to the report showing how full SarCheck's internal tables have become. If a table comes too close to becoming full, a message should appear in the SarCheck report asking you to send a copy of the report to support@sarcheck.com. This switch will also print the exact command used to produce the report.
-dmy: This switch causes the date format used in the SarCheck report to appear in the format dd/mm/yyyy.
-dnz: This switch prevents information about idle disks from printing and can shorten the analysis of systems with many disks.
-dtbl: If the -html switch is used, -dtbl will produce a table of statistics instead of generating a paragraph on each disk. Cells in the table will be color coded to highlight the largest valid value in each column. This option is recommended for systems where a large number of individual paragraphs would be hard to comprehend. If the -html switch is not used, -dtbl will cause disk statistics to be output in a comma separated value (CSV) format.
-dtoo: If the -html switch is used, -dtoo will produce a table of statistics in addition to generating a paragraph on each disk. Cells in the table will be color coded to highlight the largest valid value in each column. If the -html switch is not used, -dtoo will cause disk statistics to be output in a comma separated value (CSV) format. In addition, it will generate a paragraph on each disk.
-g24: This switch will change the appearance of multiday graphs. It changes the graph to be displayed with an X-axis of up to 24 hours and data from different days will be superimposed. This can help to spot activity that occurs at the same time each day.
-gd: Change the directory in which SarCheck puts the graphs generated by gnuplot.
-en: Specify the ending time for data to be analyzed in a 24 hour format. Specifying 17 will cause data through 17:00:00 to be analyzed, and specifying 17:30 will cause analysis to stop at 17:30:00, excluding any later data. This switch will work on a single day or multiple days of data and is usually used in conjunction with the -st switch.
-gonly: Produce graphs only. This switch should be used together with the -jpeg, -jpg, or -png switches. The names of the graphs produced will be sent to stdout and no report will be produced.
-h: Displays brief instructions and shows all of the possible switches.
-hg: How to produce graphs using the -jpg, -jpeg, and -png switches.
-hm: How to analyze multiple days of data.
-hp: How to analyze supplemental ps -elf data.
-html: Insert HTML tags in text for use by a browser. The -dtbl, -dtoo, -dserv, -dbusy, -ptbl, -ptoo, -t, -png and -jpg switches are likely to be of interest to you if you are using -html.
-jpeg or -jpg: These switches will cause SarCheck to look for gnuplot and use it to produce graphs in JPEG format. The naming convention used by SarCheck will append either ".jpeg" or ".jpg" to the file name of the graph, depending on the switch you use. The creation of JPEG formatted graphs uses less CPU time than the creation of PNG formatted graphs. JPEG formatted graphs are also larger and do not look as crisp as PNG graphs, but they are much more likely to display correctly with older browsers.
-k: Allows you to change the activation key and software expiration date.
-mdy: Force the default mm/dd/yyyy date format to be used if it's overridden by the use of a non-English text file or entries of DMY or YMD in the sarcheck_parms file.
-nomp: On multiprocessor systems, this switch will prevent the reporting of statistics for each individual CPU.
-noparms: Ignore the contents of the sarcheck_parms file when generating the report.
-o: Prints an order/registration form for those wishing to purchase a software license, or register their licensed software.
-p: Suppress page numbering & page breaks. This is especially useful when the output is piped to pg.
-png: This switch will cause SarCheck to look for gnuplot and use it to produce graphs in PNG format. The naming convention used by SarCheck will append ".png" to the file name of the graph. The creation of PNG formatted graphs takes more CPU time than the creation of JPEG formatted graphs. PNG formatted graphs are also smaller and look cleaner than JPEG graphs, but may not display correctly with older browsers.
-ps: Incorporate the analysis of a single ps -elf file called /opt/sarcheck/ps/yyyymmdd where the date is extracted from the sar data.
-pd: Change the directory in which SarCheck expects to find ps -elf data. SarCheck will still determine the name of the ps -elf data file and the purpose of this switch is to allow you to store ps -elf data wherever you want. This data can take up a considerable amount of space.
-pf: Include analysis of a specified file containing ps -elf data.
-pv: Verbose analysis of ps -elf data, overridden by the -Q and -q switches.
-plp: Suppress warnings about suspiciously large processes.
-pml: Suppress warnings about possible memory leaks.
-prp: Suppress warnings about possible runaway processes.
-ptbl: If the -html switch is used, -ptbl will produce a table of ps -elf statistics instead of generating a paragraph on each process whose resource utilization exceeds the threshold. Cells in the table will be color coded to highlight the interesting statistics. This option is recommended for systems where a large number of individual paragraphs would be hard to comprehend. If the -html switch is not used, -ptbl will cause ps -elf statistics to be output in a comma separated value (CSV) format.
-ptoo: If the -html switch is used, -ptoo will produce a table of ps -elf statistics in addition to generating a paragraph on each process whose resource utilization exceeds the threshold. Cells in the table will be color coded to highlight interesting statistics. If the -html switch is not used, -ptoo will cause ps -elf statistics to be output in a comma separated value (CSV) format. In addition, it will generate a paragraph on each process whose resource utilization exceeds the threshold.
-Q: Print a non-verbose (super-Quiet) analysis. This option automatically sets the -p option.
-q: Print a less verbose (quiet) analysis.
-r: Print an analysis only if recommendations are made.
-ret0: Force a return code of zero. The analyze program normally returns zero if no recommendations are made and one if it makes recommendations. This option exists because some scheduling tools report non-zero return codes as errors or exceptional conditions.
-s: Display all the information needed to activate SarCheck.
-st: Specify the starting time for data to be analyzed in a 24 hour format. Specifying 09 (or just 9) will cause data starting at 09:00:00 to be analyzed, and specifying 9:30 will cause analysis to start with any data collected at or after 09:30:00. This switch will work on a single day or multiple days of data and is usually used in conjunction with the -en switch.
-summ: Display only the text summary at the beginning of the SarCheck report.
-t: This option will produce a summary of interesting statistics in a tabular format. This output can be parsed with relative ease. If the -html switch is used, the statistics will be presented in an HTML table, and cells in the table will be color coded to highlight noteworthy statistics. This option works well with -dtbl.
-tonly: This option will produce nothing but a summary of interesting statistics in a tabular format. All recommendations, analysis, and other hopefully interesting text will vanish. If the -html switch is used, the statistics will be presented in an HTML table, and cells in the table will be color coded to highlight noteworthy statistics.
-w: Suppress page breaks and newline characters, primarily for export to PC-based word processing programs.
-wide: Change the width of graphs generated by gnuplot from 0.7 to 1.3. To pick a different value or change the default, use the HSIZE keyword in the sarcheck_parms file.
-ymd: This switch causes the date format used in the SarCheck report to appear in the format yyyy/mm/dd.
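
The time values accepted by the -st and -en switches listed above can be illustrated with a short sketch. It is not SarCheck's implementation: the function names are hypothetical, and the only behavior modeled is what the switch descriptions state, namely that -st keeps data collected at or after the starting time and -en keeps data through the ending time.

```python
# Hypothetical sketch of the -st / -en time window described above.
from datetime import time


def parse_clock(value):
    """Parse '9', '09', '9:30', or '17:30' into a time of day."""
    hour_text, _, minute_text = value.partition(":")
    return time(int(hour_text), int(minute_text or 0))


def in_window(sample, start=None, end=None):
    """True if a sample's time of day falls inside the requested window."""
    if start is not None and sample < parse_clock(start):
        return False           # collected before the -st time
    if end is not None and sample > parse_clock(end):
        return False           # collected after the -en time
    return True


if __name__ == "__main__":
    print(in_window(time(9, 30, 0), start="9:30", end="17"))
    print(in_window(time(17, 0, 1), start="9:30", end="17"))
```

Because only the time of day is compared, the same window applies to each day when multiple days of data are analyzed.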