SarCheck(TM): Automated Analysis of HP-UX sar and ps data
(English text version 6.00.03)
NOTE: This software is scheduled to expire on 01/11/2005 and has not yet
been tied to your system's Machine ID. To permanently activate
SarCheck, please run /usr/local/bin/analyze9000 -o and send the output
to us so that we can generate an activation key for you.
This is an analysis of the data contained in the file /tmp/rpt. There
were 2 days of data collected from 11/29/2004 to 11/30/2004, from the
HP9000/785/C360 system 'hippie'. There were 200 data records used to
produce this analysis. The operating system used to produce the sar
report was HP-UX Release B.11.00. 1 processor is present. 64 megabytes
of memory are present.
Data collected by the ps -elf command during 2 days between 11/29/2004
and 11/30/2004 will also be analyzed. This program will attempt to
match the starting and ending times of the ps -elf data with those of
the sar report file named /tmp/rpt.
SUMMARY
When the data was collected, no CPU bottleneck could be detected. A
memory bottleneck was seen. No significant I/O bottleneck was seen. A
change to at least one tunable parameter has been recommended. Limits
to future growth have been noted in the Capacity Planning section.
At least one possible runaway process has been detected. See the
Resource Analysis section for details.
RECOMMENDATIONS SECTION
All recommendations contained in this report are based solely on the
conditions which were present when the performance data was collected.
It is possible that conditions which were not present at that time may
cause some of these recommendations to result in worse performance. To
minimize this risk, analyze data from several different days, implement
only regularly occurring recommendations, and implement them one at a
time.
Additional memory may improve performance. If possible, borrow some
memory for test purposes, and monitor system performance and resource
utilization before and after its installation.
Change 'bufpages' from 0 to 2457. SarCheck has determined that a fixed
value should be a more efficient way of managing the buffer cache, whose
size has been effectively pinned at one value by the shortage of memory.
No disk recommendations have been made because no bottleneck was seen.
It may be possible to reduce memory utilization by reducing the
parameter 'maxdsiz'. This parameter defines the maximum data segment
size of a process, and a smaller value limits how much memory any one
process can take up. The optimum value of this parameter is
application-dependent and must be found by experimentation.
Use the System Administration Manager (SAM) to change the values of
tunable parameters. More information on the SAM utility and relinking
the kernel is available in the System Administration Tasks manual.
RESOURCE ANALYSIS SECTION
Average CPU utilization was only 19.2 percent. This indicates that
spare CPU capacity exists. If any performance problems were seen during
the entire monitoring period, they were not caused by a lack of CPU
power. User CPU as measured by the %usr column in the sar -u data
averaged 18.62 percent and system CPU (%sys) averaged 0.56 percent. The
sys/usr ratio averaged 0.03 : 1. CPU utilization peaked at 29 percent
during multiple time intervals.
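As a sanity check, the averages quoted above can be recombined with a
few lines of Python. This is an illustrative sketch only, not SarCheck's
own code; the variable names are hypothetical and the %wio value is the
0.5 percent average reported for time spent waiting for I/O.

```python
# Averages quoted in this section (sar -u columns, in percent).
usr_pct = 18.62   # %usr
sys_pct = 0.56    # %sys
wio_pct = 0.5     # %wio, average time waiting for I/O

busy_pct = usr_pct + sys_pct            # average CPU utilization
sys_usr_ratio = sys_pct / usr_pct       # reported as "0.03 : 1"
idle_pct = 100.0 - busy_pct - wio_pct   # neither busy nor waiting for I/O

print(f"busy {busy_pct:.1f}%, sys/usr {sys_usr_ratio:.2f} : 1, "
      f"idle {idle_pct:.1f}%")
```

The rounded results agree with the 19.2 percent utilization, the
0.03 : 1 ratio, and the 80.3 percent idle figure given in this section.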
The CPU was waiting for I/O an average of 0.5 percent of the time. This
statistic does not indicate the presence of an I/O bottleneck. The time
that the system was waiting for I/O peaked at 20 percent from 09:00:00
to 09:10:01, on 11/30/2004.
The CPU was idle (neither busy nor waiting for I/O) and had nothing to
do an average of 80.3 percent of the time. If overall performance was
good, this means that on average, the CPU was lightly loaded. If
performance was generally unacceptable, the bottleneck may have been
caused by remote file I/O which cannot be directly measured with sar and
therefore cannot be considered by SarCheck.
The run queue had an average depth of 1.3 which indicates that processes
were generally not bound by latent demand for CPU resources. The run
queue was usually occupied, despite the lack of a significant run queue
depth. This condition is usually seen when the number of CPU-intensive
processes is low. It is likely that the performance of these processes
is closely related to CPU speed.
The syncer daemon used 0.03 percent of the CPU from 07:30:01 to
17:00:01. The syncer is responsible for writing data from the buffer
cache to disk. Its level of activity is not high enough to cause a
problem.
This system's buffer cache is dynamic, meaning that its size is
determined by the amount of free memory on the system. Buffer cache
data indicates that increasing the size of dbc_max_pct would probably
not be effective because memory pressure would prevent the buffer cache
from growing much beyond the value specified by dbc_min_pct. Based on
the current values of dbc_min_pct and dbc_max_pct, the buffer cache can
range in size from 9.6 to 25.6 megabytes of memory. The actual size of
the dynamic buffer cache ranged from 9.6 to 9.9 megabytes of memory.
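The relationship between these sizes, the 64 megabytes of installed
memory, and the recommended 'bufpages' value of 2457 can be sketched as
follows. The 4096-byte page size is an assumption about this platform,
and the percentages computed here are only what the quoted sizes imply;
the actual dbc_min_pct and dbc_max_pct settings are not printed in this
report.

```python
PAGE_SIZE = 4096   # bytes per page (assumed for this HP-UX system)
MEM_MB = 64        # installed memory, from the report header


def pages_to_mb(pages, page_size=PAGE_SIZE):
    """Convert a page count to megabytes."""
    return pages * page_size / (1024 * 1024)


def mb_as_pct_of_memory(mb, mem_mb=MEM_MB):
    """Express a size in megabytes as a percentage of installed memory."""
    return 100.0 * mb / mem_mb


def pct_of_memory_to_pages(pct, mem_mb=MEM_MB, page_size=PAGE_SIZE):
    """Whole pages contained in the given percentage of installed memory."""
    return int(pct / 100.0 * mem_mb * 1024 * 1024 / page_size)


implied_min_pct = mb_as_pct_of_memory(9.6)    # lower bound of the cache
implied_max_pct = mb_as_pct_of_memory(25.6)   # upper bound of the cache
bufpages_mb = pages_to_mb(2457)               # the recommended bufpages

print(f"implied range: {implied_min_pct:.0f}% to {implied_max_pct:.0f}%")
print(f"2457 pages = {bufpages_mb:.1f} MB")
```

Under these assumptions, 2457 pages is the 9.6 megabyte floor at which
the dynamic cache was already sitting, which is consistent with the
recommendation to fix the cache at that size.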
At least one indication of a memory shortage was seen in the following
statistics: Data collected with ps -elf shows that the sched daemon used
67 seconds of CPU time. This indicates a memory shortage. Data
collected with ps -elf shows that the vhand daemon used 54 seconds of
CPU time. This indicates a possible memory shortage, which is confirmed
by other statistics related to memory utilization. The swap out rate
indicates that an intermittent memory bottleneck may have existed. This
may result in inconsistent performance. The swap out rate peaked at
1.03 per second from 11:20:01 to 11:30:00, on 11/29/2004. Peak resource
utilization statistics can be used to help understand performance
problems. If performance was worst during the period of peak swap out
activity, then the bottleneck may be a memory shortage.
The minimum number of free pages of memory seen was 35. The value of
lotsfree was 586 pages and the maximum value of gpgslim seen was 338
pages. The value of desfree was 146. If the minimum number of free
pages drops below the value of desfree, the system should benefit from
additional memory.
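The comparison described above can be written out directly. This is a
minimal sketch using the page counts from this paragraph; the helper
name is hypothetical and the 4 KB page size is an assumption.

```python
PAGE_KB = 4          # assumed page size in kilobytes

min_free = 35        # lowest free-page count seen
desfree = 146
lotsfree = 586
gpgslim_max = 338    # highest gpgslim value seen


def pages_short_of(free_pages, threshold_pages):
    """Pages by which free memory fell below a threshold (0 if it did not)."""
    return max(0, threshold_pages - free_pages)


shortfall = pages_short_of(min_free, desfree)
print(f"minimum free memory: {min_free} pages ({min_free * PAGE_KB} KB)")
print(f"below desfree by {shortfall} pages ({shortfall * PAGE_KB} KB)")
```

Because the minimum fell below desfree, the condition stated above for
benefiting from additional memory is met.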
The fs_async flag is not set. This may result in reduced disk
performance, but keeps filesystem data structures consistent in the
event of a system crash. This option is currently in the state
recommended for production systems. Since no disk I/O bottleneck was
seen on this system, setting the fs_async flag would be unlikely to
provide enough of an improvement to justify the additional risk.
The average context switching rate was 248.2 per second. This works out
to an average of one context switch every 4.03 milliseconds. No
changes to the timeslice parameter have been recommended because no
problems were seen with the context switching rate.
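The interval between context switches is simply the reciprocal of the
rate. A short illustration, with hypothetical variable names:

```python
switches_per_sec = 248.2                  # average rate from sar
ms_per_switch = 1000.0 / switches_per_sec # milliseconds between switches

print(f"one context switch every {ms_per_switch:.2f} ms")
```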
No unusual configurable parameter values were seen in those parameters
which relate to the process accounting system. The current values of
acctsuspend and acctresume are unlikely to have an impact on system
performance.
The inode cache did not overflow, but was completely full in 79.5
percent of the samples collected during the monitoring period. No inode
table recommendation has been made because a change to the size of the
table would not be helpful.
The process and open file tables were less than 80.0 percent full. Peak
table usage statistics (max used/table size) as reported by sar: Process
table: 90/276. Open file table: 431/920.
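The peak usage figures above translate into percentages as shown in
this short sketch. The dictionary layout and function name are
illustrative; the 80.0 percent threshold is the one quoted in this
paragraph.

```python
# Peak table usage as reported by sar: (max used, table size).
tables = {
    "process": (90, 276),
    "open file": (431, 920),
}


def pct_used(used, size):
    """Peak usage as a percentage of the table size."""
    return 100.0 * used / size


for name, (used, size) in tables.items():
    pct = pct_used(used, size)
    status = "under" if pct < 80.0 else "at or over"
    print(f"{name} table: {pct:.1f}% full ({status} the 80.0% threshold)")
```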
The file table, controlled by the nfile parameter, was much larger than
necessary. There is nothing to gain by reducing the size of this table,
so no change to the parameter 'nfile' is recommended.
The average rate of System V semaphore calls was 0.001 per second. No
problems have been seen, and no changes have been recommended for System
V semaphore parameters. Note that SarCheck only checks these
parameters' relationships to each other since semaphore usage data is
not available. Algorithms used by SarCheck to check these relationships
are available in the help text of SAM.
No System V message activity was seen. No problems have been seen, and
no changes have been recommended for System V message parameters. Note
that SarCheck only checks these parameters' relationships to each other
since message usage data is not available. Algorithms used by SarCheck
to check these relationships are available in the help text of SAM, and
in the file /usr/include/sys/msg.h.
The ratio of exec to fork system calls was 0.82. This indicates that
PATH variables are efficient.
One volume group was seen and the maxvgs parameter was set to 10. This
leaves plenty of room for growth and no changes to maxvgs have been
recommended.
The volume group /dev/vg00 contained 1 physical volume and 9 logical
volumes. All of the logical volumes were open. The size of the group
was 4.00 gigabytes, of which 50.24 percent was allocated and 49.76
percent was free.
The disk device c0t6d0 was busy an average of 0.70 percent of the time
and had an average queue depth of 0.8 (when occupied). This indicates
that the device is not a performance bottleneck. The average service
time reported for this device and its accompanying disk subsystem was
17.0 milliseconds. This is somewhat slow for a modern disk drive, and
the disappointing performance may be due to the disk or its controller.
Service time is the delay between the time a request was sent to a
device and the time that the device signaled completion of the request.
The disk device c0t6d0 was reported by pvdisplay as being a 4.00
gigabyte disk. 2036 megabytes of space was reported as being free and
2056 megabytes have been allocated. This disk device was a part of
volume group /dev/vg00 and contained 9 logical volumes. At least one
logical volume occupied noncontiguous physical extents on the disk. The
following paragraph will provide more details.
The logical volume /dev/vg00/lvol5 was located in more than one place on
disk c0t6d0. If this logical volume is busy and it is not mirrored,
performance will suffer because the disk's read/write heads are likely
to travel back and forth in an inefficient manner. The gap between two
places where the logical volume was located was 386 blocks in size.
This was more than one third of the disk's total size and is a large
gap. If /dev/vg00/lvol5 was an active logical volume, large gaps are
likely to have been a contributing factor in the slow service time seen
on disk volume c0t6d0.
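The pvdisplay figures for c0t6d0 can be reconciled with the volume
group percentages, and the size of the gap can be put in proportion,
with the sketch below. Treating the 386 "blocks" as physical extents of
4 megabytes each is an assumption made here for illustration; the
report does not state the extent size.

```python
free_mb = 2036        # reported free by pvdisplay
allocated_mb = 2056   # reported allocated by pvdisplay
total_mb = free_mb + allocated_mb

allocated_pct = 100.0 * allocated_mb / total_mb
free_pct = 100.0 * free_mb / total_mb
total_gb = total_mb / 1024

EXTENT_MB = 4         # assumed physical extent size, not stated in the report
gap_extents = 386     # gap between the two pieces of /dev/vg00/lvol5
gap_fraction = gap_extents * EXTENT_MB / total_mb

print(f"{total_gb:.2f} GB total, {allocated_pct:.2f}% allocated, "
      f"{free_pct:.2f}% free")
print(f"gap spans {gap_fraction:.1%} of the disk")
```

Under that assumption the gap covers a little more than one third of
the disk, which matches the description above.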
The disk device c1t2d0 was busy an average of 0.00 percent of the time
and had an average queue depth of 0.5 (when occupied). This indicates
that the device is not a performance bottleneck. The average service
time reported for this device and its accompanying disk subsystem was
1220.1 milliseconds. This is so slow that the sar statistics may be
unreliable or the device may be something other than a conventional hard
disk. Floppy disk drives on model 800 systems will cause this message
to be printed. Due to the suspiciously slow average service time,
statistics from this device will not be used in capacity planning or in
the comma-separated statistics.
At 12:00:00 on 11/30/2004 ps -elf data indicated that there were 91
processes present. This was the largest number of processes seen with
ps -elf but it is not likely to be the absolute peak because the
operating system does not store the true "high-water mark" for this
statistic. An average of 74.3 processes were present.
CPU usage seen in /usr/bin/X11/X, owned by root, pid 1569. Between
09:20:00 and 17:00:01, 8458 seconds of CPU time were used. CPU
utilization by this process averaged 30.64 percent during that interval.
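The average utilization quoted for this process is its CPU time divided
by the elapsed time of the interval. A minimal sketch, assuming both
timestamps fall on the same day (the function name is hypothetical):

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"


def cpu_share_pct(start, end, cpu_seconds):
    """CPU seconds used as a percentage of the elapsed wall-clock interval."""
    elapsed = (datetime.strptime(end, TIME_FORMAT)
               - datetime.strptime(start, TIME_FORMAT)).total_seconds()
    return 100.0 * cpu_seconds / elapsed


share = cpu_share_pct("09:20:00", "17:00:01", 8458)
print(f"/usr/bin/X11/X averaged {share:.2f}% of one CPU")
```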
CAPACITY PLANNING SECTION
This section is designed to provide the user with a rudimentary linear
capacity planning model and should be used for rough approximations
only. These estimates assume that an increase in workload will affect
the usage of all resources equally. These estimates should be used on
days when the load is heaviest to determine approximately how much spare
capacity remains at peak times.
Based on the limited data available in these sar reports, the system
should be able to support a very limited increase in workload at peak
times before the first resource bottleneck affects performance. See the
following paragraphs for additional information.
The CPU can support an increase in workload of at least 100 percent at
peak times. Since page outs and/or swapping were detected, an increase
in workload should be accompanied by an increase in memory. The busiest
disk can support a workload increase of at least 100 percent at peak
times. For more information on peak CPU and disk utilization, refer to
the Resource Analysis section of this report.
The process table, controlled by the parameter 'nproc', can support at
least a 100 percent increase in the number of entries. The file table,
controlled by the parameter 'nfile', can support approximately a 71
percent increase in the number of entries.
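This report does not show how the table headroom figures are derived.
One linear model that reproduces the 71 percent figure for 'nfile'
assumes a table is treated as exhausted when it is 80 percent full, the
same threshold mentioned in the Resource Analysis section. The sketch
below is built on that assumption and should not be read as SarCheck's
actual algorithm.

```python
CEILING = 0.80   # assumed usable fraction of a table before it counts as full


def headroom_pct(peak_used, table_size, ceiling=CEILING):
    """Percent growth in peak usage before the assumed ceiling is reached."""
    return 100.0 * (ceiling * table_size / peak_used - 1.0)


nfile_headroom = headroom_pct(431, 920)   # open file table
nproc_headroom = headroom_pct(90, 276)    # process table

print(f"nfile: about {nfile_headroom:.0f}% more entries")
print(f"nproc: about {nproc_headroom:.0f}% more entries")
```

With these inputs the model gives roughly 71 percent for the file table
and well over 100 percent for the process table, in line with the
figures above.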
CUSTOM SETTINGS SECTION
The default TEXTCOLOR value was changed in the sarcheck_parms file from
black to blue.
The gnuplot graph directory specified in the sarcheck_parms file with
the GRAPHDIR keyword was /tmp.
Please note: In no event can Aptitune Corporation be held responsible
for any damages, including incidental or consequential damages, in
connection with or arising out of the use or inability to use this
software. All trademarks belong to their respective owners. This
software is licensed for the exclusive use of: test. This software must be
activated by 01/11/2005 (mm/dd/yyyy). SC9000 Code version: 6.00.03.
Serial number: 58483828.
This software is updated frequently. For information on the latest
version, contact the party from whom SarCheck was originally purchased,
or visit our web site.
NOTE: This software appears to be unregistered. Please register with us
by printing the registration form using 'analyze9000 -o', filling it
out, and sending it to us via snail mail, fax, or email.
(c) copyright 1995-2004 by Aptitune Corporation, Plaistow NH, USA, All
Rights Reserved. http://www.sarcheck.com
Statistics for system, hippie
System model number is, 9000/785/C360
Statistics collected from, 11/29/2004
Statistics collected until, 11/30/2004
Average CPU utilization, 19.2%
Peak CPU utilization, 29%
Average user CPU utilization, 18.6%
Average sys CPU utilization, 0.6%
Average waiting for I/O, 0.5%
Average run queue depth, 1.3
Peak run queue depth, 1.9
Average swap queue occupancy, 0.0%
Average swap out rate, 0.87/sec
Average cache read hit ratio, 98.8%
Average cache write hit ratio, 59.0%
Disk device w/highest peak, c0t6d0
Avg pct busy for that disk, 0.70%
Peak pct busy for that disk, 22.28%
Avg number of processes seen by ps, 74.3
Max number of processes seen by ps, 91
Percent of process tbl used, 32.6%
Process table overflows, No
Percent of file table used, 46.8%
File table overflows, No
Inode cache pct of time full, 79.5%
Inode cache overflows, No
Approx CPU capacity remaining, 100%+
Approx I/O bandwidth remaining, 100%+
Remaining process tbl capacity, 100%+
Remaining file table capacity, 70.8%
Can memory support add'l load, No
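The comma-separated statistics above are convenient to load into
another program. The following is a minimal sketch of one way to do
that; the function name is hypothetical, and it assumes that each line
is a label followed by a single comma and a value, as in this report.

```python
def parse_stats(lines):
    """Turn 'label, value' lines into a dict, splitting at the first comma."""
    stats = {}
    for line in lines:
        label, sep, value = line.partition(",")
        if sep:  # skip any line that has no comma
            stats[label.strip()] = value.strip()
    return stats


sample = [
    "Statistics for system, hippie",
    "Peak CPU utilization, 29%",
    "Average run queue depth, 1.3",
    "Remaining file table capacity, 70.8%",
    "Can memory support add'l load, No",
]

stats = parse_stats(sample)
print(stats["Peak CPU utilization"])
```

Values are kept as strings so that entries such as "100%+" and "No"
survive unchanged; numeric conversion is left to the caller.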