ZoneHound™ 1.01 manual


Table of Contents:

Introduction
Features
Restrictions
Known limitations
How to install ZoneHound
How to set up ZoneHound in cron
How to activate the software
How to deinstall ZoneHound
How to change the menu defaults
Some examples of how to use the switches
How to access the online instructions/help text
How to order ZoneHound
How to get technical support for ZoneHound
Files included in this release
An example of ZoneHound's output
Frequently asked questions
Appendix A: zonehound_parms file keywords
Appendix B: Options available when running ZoneHound

Introduction:

ZoneHound™ is a utility which summarizes data collected by the ps utility. The purpose of ZoneHound is to tell you the resource utilization of each Solaris Zone on a system with multiple zones.

We use the ps utility's -Z option to display the zone name. This field is not wide enough to show a very long zone name, so we suggest that you use zone names which are short enough to display clearly and unambiguously.

We also use the ps utility's -y option to monitor memory usage. This option displays the resident set size in the RSS column of ps output. The total RSS size of a zone may exceed the amount of memory allocated to that zone, probably due to shared memory being reported multiple times.

ZoneHound's output is easily parsed and will be automatically analyzed by our SarCheck performance analysis tool.


Features:

The following resources are monitored by ZoneHound:
  1. CPU usage
  2. Memory usage according to the RSS field in ps
  3. The number of processes running
  4. The number of processes waiting to run
  5. The number of zombie processes
  6. Usage of ZoneHound's internal table


Restrictions:

ZoneHound™ for Solaris SPARC is designed to work with Solaris 10 on SPARC. Older versions of Solaris do not support zones; ZoneHound should still run on them, but the results won't be very interesting.


Known limitations:

  1. Zones must use a naming convention that enables them to be seen individually with ps -elfyZ. This is really a limitation of the ps utility, which truncates long zone names, so make sure that the first 8 characters of each zone name are unique.
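
A quick way to check a set of zone names against this limitation is sketched below. It is only an illustration: the zone names are hypothetical, and the function name duplicate_prefixes is ours, not part of ZoneHound. On a live system the names could instead be piped in from a command such as zoneadm list -c.

```shell
#!/bin/sh
# Print any 8-character prefix shared by more than one zone name.
# Reads zone names from standard input, one per line.
duplicate_prefixes() {
    cut -c1-8 | sort | uniq -d
}

# Hypothetical zone names, used for illustration only.
printf '%s\n' webserver-prod webserver-test database | duplicate_prefixes
```

Any prefix printed by this check belongs to two or more zones that ps would show under the same truncated name. No output means the names are distinct in their first 8 characters.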


How to install ZoneHound:

The software is a compressed tar archive and there are different names for different versions. The SPARC version is called zh3264.taz, the 32-bit x86 version is called zhx86.taz, and the 64-bit x86 version is called zhx64.taz. The taz file extension is an old abbreviation of .tar.Z. The following example uses the SPARC version:

To install the software, log in to the global zone as root, uncompress, and detar the software. This only takes a few seconds.

  1. Log in to the global zone as root.
  2. Now extract the compressed tar archive:
    zcat < zh3264.taz | tar xvf -
This will install the files described in the section "Files included in this release". Installing ZoneHound does not require rebuilding the kernel, so ZoneHound will not increase the size of your kernel and you won't have to reboot your system.
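
If you install on several platforms, the choice of archive can be scripted. The following is a minimal sketch, not part of the distribution: the function name pick_archive is hypothetical, and it assumes that the processor type comes from uname -p (sparc or i386) and the word size from isainfo -b. It also assumes a POSIX shell such as ksh or /usr/xpg4/bin/sh.

```shell
#!/bin/sh
# Map a processor type and word size to the archive names listed above.
# $1: processor type as printed by `uname -p` (for example sparc or i386)
# $2: word size in bits (32 or 64), for example from `isainfo -b`
pick_archive() {
    case "$1:$2" in
        sparc:*) echo zh3264.taz ;;
        i386:32) echo zhx86.taz ;;
        i386:64) echo zhx64.taz ;;
        *)       echo "unknown platform: $1 ($2-bit)" >&2; return 1 ;;
    esac
}

# Show the extraction command for a SPARC system without running it.
archive=$(pick_archive sparc 64)
echo "zcat < $archive | tar xvf -"
```

On a real system the two arguments would be replaced with the output of uname -p and isainfo -b.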

To try ZoneHound, first make sure that it's running by displaying the help text:

/opt/sarcheck/bin/zonehound -h

Now run a quick test of ZoneHound:

/opt/sarcheck/bin/zonehound -quick &

This will cause ZoneHound to run for about two minutes. When the test finishes (allow 2-3 minutes), the output will be in the /opt/sarcheck/zoneout directory, in a file named with a datestamp in yyyymmdd format.
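
To look at today's output without typing the datestamp, something like the following sketch may help. It assumes that the file name is the bare yyyymmdd datestamp with no extension; the function name zoneout_file is hypothetical.

```shell
#!/bin/sh
# Build the output path for a given yyyymmdd datestamp.
# The date is passed in so the function can be checked without
# running ZoneHound.
zoneout_file() {
    echo "/opt/sarcheck/zoneout/$1"
}

today=$(date +%Y%m%d)
out=$(zoneout_file "$today")
if [ -f "$out" ]; then
    tail -n 5 "$out"      # show the most recent lines
else
    echo "no output yet at $out"
fi
```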

For a longer test of ZoneHound, type:

/opt/sarcheck/bin/zonehound &

This will cause ZoneHound to run for the default length of time, currently four hours. At the end of four hours, the output will be in the /opt/sarcheck/zoneout directory, in a file named with a datestamp in yyyymmdd format.

To reduce typing, you may want to add /opt/sarcheck/bin to root's PATH.


How to set up ZoneHound in cron:

One of the most powerful features in ZoneHound is its ability to analyze data collected by ps -elf and add this information to its analysis of sar data. A few simple steps are required to take advantage of this powerful feature:
  1. Log in to the global zone as root
  2. If it doesn't exist, make the directory /opt/sarcheck/zoneout:
    mkdir /opt/sarcheck/zoneout
  3. Set the permissions for the directory:
    chmod g+w /opt/sarcheck/zoneout
    chgrp sys /opt/sarcheck/zoneout
  4. Add the following entry to the crontab file for root for typical 8:00 AM to 5:00 PM, Monday through Friday monitoring (32400 seconds is 9 hours):
    0 8 * * 1-5 /opt/sarcheck/bin/zonehound -daylen 32400
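
For a different monitoring window, the -daylen value is the number of hours in the window multiplied by 3600. The sketch below works through that arithmetic and prints the matching crontab entry. The function names daylen_secs and cron_entry are hypothetical, and the sketch assumes whole hours on the same day and a POSIX shell.

```shell
#!/bin/sh
# Seconds between two whole hours on the same day (24-hour clock).
# $1: start hour, $2: end hour
daylen_secs() {
    echo $(( ($2 - $1) * 3600 ))
}

# Build a Monday-to-Friday crontab entry from a start and an end hour.
cron_entry() {
    echo "0 $1 * * 1-5 /opt/sarcheck/bin/zonehound -daylen $(daylen_secs "$1" "$2")"
}

# Reproduces the 8:00 AM to 5:00 PM entry shown above.
cron_entry 8 17
```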


How to activate the software:

Licensed software will require an activation key. Run zonehound -s and forward the output to sales@sarcheck.com. Feel free to install evaluation software on as many Solaris systems as you want and use it until it expires.


How to deinstall ZoneHound:

Remove the files which are described in the section entitled "Files included in this release". If you are loading a new version of ZoneHound over an old one, deinstallation is not necessary.


How to change the menu defaults:

The ZoneHound script looks for the file named zonehound_parms (by default in /opt/sarcheck/etc) and will use any values found there instead of the normal defaults. This file is not included as part of the ZoneHound distribution and you'll need to create it if you want to use it. The syntax for the zonehound_parms file is very simple. Create a line in the file with the keyword and its new default value, separated by a space.
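
As an illustration, the sketch below creates a zonehound_parms file using the three keywords described in Appendix A. The values are examples only, and the file is written to a scratch directory (the PARMS_DIR variable and the /tmp/zonehound_demo path are hypothetical) so that it can be tried safely. On a real system the file would go in /opt/sarcheck/etc.

```shell
#!/bin/sh
# Write to a scratch directory here; on a real system the file would be
# /opt/sarcheck/etc/zonehound_parms.
parms_dir=${PARMS_DIR:-/tmp/zonehound_demo}
mkdir -p "$parms_dir"

# One keyword and its new default value per line, separated by a space.
cat > "$parms_dir/zonehound_parms" <<'EOF'
RDFREQ 30
WRFREQ 300
DAYLEN 32400
EOF

cat "$parms_dir/zonehound_parms"
```

These example values keep WRFREQ at 10 times RDFREQ, as Appendix A recommends.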


Some examples of how to use the switches:

The number of switches and options available in ZoneHound is growing and this section is designed to help you decide how to do what you want. This is beta software and some of the switches may not be working yet.

Example 1: A quick test of ZoneHound. Once you've installed ZoneHound, there is a quick way to test it in order to be sure that it's working correctly.

/opt/sarcheck/bin/zonehound -quick

The -quick switch will cause ZoneHound to run for just a few minutes. The output will be stored in the /opt/sarcheck/zoneout directory and the file name is a date stamp using the yyyymmdd naming convention.

Example 2: A longer test of ZoneHound. When it is run without any switches, ZoneHound collects data for the default length of time.

/opt/sarcheck/bin/zonehound &

This will run for the default period of four hours, so you'll start this and then check it later. The output will be stored in the /opt/sarcheck/zoneout directory and the file name is a date stamp using the yyyymmdd naming convention.

Example 3: Changing the recording interval. The -wrfreq switch changes how often ZoneHound writes its output. If you're monitoring a system with a workload that has frequent intermittent spikes in resource utilization, you may want to pick a shorter interval. If you're looking at overall activity over an entire day, you might want to pick a longer interval. If you're running our SarCheck analysis tool and want it to look at ZoneHound output, try to use the same intervals for sar, ps, and ZoneHound data collection. The following example assumes that you want the recording interval to be 20 minutes (1200 seconds).

/opt/sarcheck/bin/zonehound -wrfreq 1200 &

Again, this will run for the default period of four hours, so you'll start this and then check it later. The output will be stored in the /opt/sarcheck/zoneout directory and the file name is a date stamp using the yyyymmdd naming convention.

Example 4: Changing the length of the data collection. The -daylen switch tells ZoneHound how long to run in seconds. If you want ZoneHound to run for 10 hours, tell it to run for 36000 seconds. There is no "correct" length of time to run ZoneHound. Run it during the times that you care about, don't run it during times that you don't care about, and if you're running our SarCheck analysis tool and want it to look at ZoneHound output, try to coordinate the starting and ending times with those used for sar and ps data collection. The following example assumes that you want ZoneHound to collect data for 10 hours (36000 seconds).

/opt/sarcheck/bin/zonehound -daylen 36000 &

This will run for ten hours, so you'll start this and check it the next day. The output will be stored in the /opt/sarcheck/zoneout directory and the file name is a date stamp using the yyyymmdd naming convention.
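
The conversion from hours to seconds, and a rough idea of how much output to expect, can be worked out as in the sketch below. The function names are hypothetical, and the interval count assumes that ZoneHound writes once per -wrfreq interval for the whole run, which we have inferred from the example output rather than confirmed.

```shell
#!/bin/sh
# Convert whole hours to the seconds expected by -daylen.
hours_to_secs() {
    echo $(( $1 * 3600 ))
}

# Number of write intervals in a run.
# $1: run length in seconds, $2: write interval in seconds
write_intervals() {
    echo $(( $1 / $2 ))
}

daylen=$(hours_to_secs 10)
echo "zonehound -daylen $daylen"
echo "write intervals at the default 600 seconds: $(write_intervals "$daylen" 600)"
```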


How to access the online instructions / help text:

To display the online help text, type:

/opt/sarcheck/bin/zonehound -h

This sends a subset of the instructions found in this manual to standard output (stdout), which defaults to the screen. More details are found in this manual.

Here is the online FAQ section.


How to order ZoneHound:

Use the -o option of /opt/sarcheck/bin/zonehound to produce an order form and call us, or ask your reseller to call us. The cost of shipping ZoneHound via US Mail is included in the price of ZoneHound. If you'd like the software shipped via Federal Express, DHL, etc., please provide your account number and we will be happy to accommodate you.

In some parts of the world, local resellers may charge prices which are higher than our list price because they pay for the currency conversions, international shipping, duties, support, etc. We urge our customers to support their resellers.


How to get technical support for ZoneHound:

Please read the FAQ section of this manual and visit the FAQ section of our website first. This will always be the fastest way to get the answer to a frequently asked question. If that doesn't do it:

Call us at +1-603-430-8300,
fax us at +1-603-430-8303,
write to us at PO Box 7104, Portsmouth NH 03801, USA,
use our email address: support@sarcheck.com,
or contact the party from whom you purchased ZoneHound.


Files included in this release:

This release contains the following files:

/opt/sarcheck/bin/zonehound: This program collects the data, formats it, and stores it in the /opt/sarcheck/zoneout directory.

/opt/sarcheck/etc/zonehound.key: This file contains the activation key. Tampering with this file may permanently disable ZoneHound.

/opt/sarcheck/bin/zh2: This is a script that cleans up zonehound output, and is roughly analogous to a subset of the sa2 script used by sar.

/opt/sarcheck/zoneout/README: This is a readme file that explains the purpose of the zoneout directory.

/opt/sarcheck/etc/zhman101.htm: A copy of this manual. The exact name of the manual may vary from one release to another.


An example of ZoneHound's output:

Here is an example of ZoneHound's output. It is easy to parse and an explanation of each column follows this example.

15:00:00  Zone         CPU-sec  RSS-mb  PR-avg  O-avg  R-avg  Z-avg  table
15:10:00  global           116  214.24    64.0    1.0    0.0    0.0   8.7%
15:10:00  development      232  201.56    63.5    0.3    0.0    0.0   8.7%
15:10:00  production        40   83.25    43.0    0.0    0.0    0.0   8.7%
15:20:00  global           131  214.25    64.0    1.1    0.0    0.0   9.6%
15:20:00  development      221  211.01    65.7    1.0    0.0    0.0   9.6%
15:20:00  production        40   83.25    43.0    0.0    0.0    0.0   9.6%
15:30:00  global           204  214.24    64.0    1.1    0.0    0.0  10.3%
15:30:00  development      232  203.95    63.5    0.9    0.0    0.0  10.3%
15:30:00  production        42   83.25    43.0    0.0    0.0    0.0  10.3%
15:40:00  global           187  214.24    64.0    1.2    0.0    0.0  11.2%
15:40:00  development      211  199.85    63.2    0.1    0.0    0.0  11.2%
15:40:00  production        41   83.28    43.0    0.0    0.0    0.0  11.2%
15:50:00  global           177  191.26    62.8    1.1    0.0    0.0  11.9%
15:50:00  development      210  196.60    62.0    0.1    0.0    0.0  11.9%
15:50:00  production        39   83.28    43.0    0.0    0.0    0.0  11.9%

Zone: The name of each zone. Running the command zonename should give you the same thing.

CPU-sec: The total number of CPU seconds that can be accounted for with ps -elfyZ since the last time ps was run. This is only available with a resolution of one second and it may exceed the number of seconds of real time if more than one processor or core was in use.

RSS-mb: The total in megabytes of all of the RSS values for a given zone. This amount may exceed the amount of physical memory, apparently because shared memory is counted more than once.

PR-avg: The average number of processes seen by ps. The ps utility was run a number of times and this is the average.

O-avg: The average number of processes seen by ps with an "O" in the "S" column. This is the number of processes running.

R-avg: The average number of processes seen by ps with an "R" in the "S" column. This is the number of processes waiting to run.

Z-avg: The average number of processes seen by ps with a "Z" in the "S" column. This is the number of zombie processes seen.

table: The usage of ZoneHound's internal table or the number of times the table overflowed. This column is here to help us properly size the table.
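
Because the output is whitespace-delimited, it can be summarized with standard tools. The sketch below totals CPU-sec for each zone and expresses it as a percentage of one CPU. It is an illustration rather than a supported tool: the function name summarize_cpu is hypothetical, it assumes that every line covers exactly one write interval, and it needs an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris 10). The sample data is the first three intervals from the example above.

```shell
#!/bin/sh
# Sum CPU-sec per zone and express it as a percentage of one CPU.
# Reads ZoneHound output on standard input.
# $1: write interval in seconds (600 is the documented default)
summarize_cpu() {
    awk -v interval="$1" '
        $2 == "Zone" { next }        # skip the header line
        NF >= 3 {
            cpu[$2] += $3            # column 3 is CPU-sec
            lines[$2]++              # output lines seen for this zone
        }
        END {
            for (z in cpu)
                printf "%s %d %.1f\n", z, cpu[z], 100 * cpu[z] / (lines[z] * interval)
        }
    ' | sort
}

summarize_cpu 600 <<'EOF'
15:00:00  Zone         CPU-sec  RSS-mb  PR-avg  O-avg  R-avg  Z-avg  table
15:10:00  global           116  214.24    64.0    1.0    0.0    0.0   8.7%
15:10:00  development      232  201.56    63.5    0.3    0.0    0.0   8.7%
15:10:00  production        40   83.25    43.0    0.0    0.0    0.0   8.7%
15:20:00  global           131  214.25    64.0    1.1    0.0    0.0   9.6%
15:20:00  development      221  211.01    65.7    1.0    0.0    0.0   9.6%
15:20:00  production        40   83.25    43.0    0.0    0.0    0.0   9.6%
15:30:00  global           204  214.24    64.0    1.1    0.0    0.0  10.3%
15:30:00  development      232  203.95    63.5    0.9    0.0    0.0  10.3%
15:30:00  production        42   83.25    43.0    0.0    0.0    0.0  10.3%
EOF
```

Each line of the summary shows the zone name, its total CPU seconds, and the percentage of one CPU. As noted for the CPU-sec column, the percentage can exceed 100 when more than one processor or core is in use. The same function could read live data if ZoneHound is run with the -stdout switch.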


Appendix A: zonehound_parms file keywords

ZoneHound will allow you to change how frequently it performs various functions. These changes can be implemented using the file /opt/sarcheck/etc/zonehound_parms.

Please note that the default values are really just educated guesses because this is a new program that has only been tested at a few beta sites. We think the guesses are a reasonably good place to start because we're capturing a lot of activity despite the low overhead. Here is a list of values which can be overridden, and the meaning of each.

RDFREQ: The frequency in seconds used to collect resource utilization data by running the ps command. The default is 30 seconds. Increasing the frequency (using a smaller value) will reduce the amount of data that gets lost but it will increase the overhead of running ZoneHound.

WRFREQ: The frequency in seconds used to write resource utilization data to output. The default is 600 seconds, or 10 minutes. Increasing the frequency (using a smaller value) will enable you to see any peaks in activity more clearly but will also make the data more "noisy" because random peaks will not be averaged into the data. Decreasing the frequency will make it easier to see the big picture, but some detail will be lost. The WRFREQ value should be at least 10 times greater than the RDFREQ value. This will ensure that there are more than just a few runs of the ps command summarized in each line of output.

DAYLEN: The length of time in seconds that you want ZoneHound to run. Run this for as long as you want, but don't collect data from time periods that you don't care about.
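
The relationship between RDFREQ and WRFREQ can be checked before a run with a few lines of shell. This is a minimal sketch: the function names freqs_ok and samples_per_line are hypothetical, and the count of ps runs per output line is only an approximation based on dividing one value by the other.

```shell
#!/bin/sh
# Succeeds when WRFREQ is at least 10 times RDFREQ.
# $1: RDFREQ in seconds, $2: WRFREQ in seconds
freqs_ok() {
    [ "$2" -ge $(( $1 * 10 )) ]
}

# Approximate number of ps runs summarized in each line of output.
samples_per_line() {
    echo $(( $2 / $1 ))
}

if freqs_ok 30 600; then
    echo "defaults OK: about $(samples_per_line 30 600) ps runs per line"
fi
if ! freqs_ok 60 300; then
    echo "RDFREQ 60 with WRFREQ 300 gives only $(samples_per_line 60 300) ps runs per line"
fi
```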


Appendix B: Options available when running ZoneHound

-daylen: The length of time in seconds that you want ZoneHound to run. Run this for as long as you want, but don't collect data from time periods that you don't care about.

-h: Displays brief instructions and shows all of the possible switches.

-k: Allows you to change the activation key and software expiration date.

-o: Prints an order/registration form for those wishing to purchase a software license, or register their licensed software.

-quick: Produces a quick run of ZoneHound for test purposes.

-rdfreq: The frequency in seconds used to collect resource utilization data by running the ps command. The default is 30 seconds. Increasing the frequency (using a smaller value) will reduce the amount of data that gets lost but it will increase the overhead of running ZoneHound. We suggest setting this to a value between 5 and 60 seconds.

-s: Displays all the information needed to activate ZoneHound.

-stdout: Sends ZoneHound output to stdout so that it can be piped, redirected, etc. The default is to put the output in the /opt/sarcheck/zoneout directory.

-wrfreq: The frequency in seconds used to write resource utilization data to output. The default is 600 seconds, or 10 minutes. Increasing the frequency (using a smaller value) will enable you to see any peaks in activity more clearly but will also make the data more "noisy" because random peaks will not be averaged into the data. Decreasing the frequency will make it easier to see the big picture, but some detail will be lost. The -wrfreq value should be at least 10 times greater than the -rdfreq value. This will ensure that there are more than just a few runs of the ps command summarized in each line of output. We suggest setting this to 300, 600, 900, or 1200 seconds.