SarCheck(TM): Automated Analysis of HP-UX sar and ps data

(English text version 4.00)


This is an analysis of the data contained in the file sar29t. The data was collected on 08/29/2000, from 12:30:00 to 17:00:00, from the HP9000/816/E35 system 'aurora'. There were 18 data records used to produce this analysis. The operating system used to produce the sar report was HP-UX Release B.10.10. 1 processor is present. 32 megabytes of memory are present.

Data collected by the ps -elf command on 08/29/2000 from 12:40:01 to 17:00:00, and stored in the file /usr/local/ps/20000829, will also be analyzed.

SUMMARY

When the data was collected, no CPU bottleneck could be detected. No significant I/O bottleneck was seen. A change to at least one tunable parameter has been recommended. Limits to future growth have been noted in the Capacity Planning section.

RECOMMENDATIONS SECTION

All recommendations contained in this report are based solely on the conditions which were present when the performance data was collected. It is possible that conditions which were not present at that time may cause some of these recommendations to result in worse performance. To minimize this risk, analyze data from several different days, implement only regularly occurring recommendations, and implement them one at a time.

A CPU upgrade is not recommended because the current CPU had significant unused capacity.

Change the value of 'nproc' from 148 to 198. The parameter 'nproc' sets the maximum number of processes which may run on the system simultaneously. This change will use roughly 0.03 additional megabytes of memory. This approximation does not take into account the memory impact of changes to any other parameters whose values are dependent on this value. The accuracy of this approximation is also limited by the fact that the actual size of the kernel changes in 4 KB increments.

No disk recommendations have been made because no bottleneck was seen.

Use the System Administration Manager (SAM) to change the values of tunable parameters. More information on the SAM utility and relinking the kernel is available in the System Administration Tasks manual.

RESOURCE ANALYSIS SECTION

Average CPU utilization was only 17.2 percent. This indicates that spare CPU capacity exists. If any performance problems were seen during the entire monitoring period, they were not caused by a lack of CPU power. CPU utilization peaked at 24 percent from 14:00:00 to 14:15:00.

The run queue had an average depth of 1.2, which indicates that processes were generally not bound by latent demand for CPU resources.

The CPU was idle (neither busy nor waiting for I/O) and had nothing to do an average of 73.7 percent of the time. If overall performance was good, this means that on average, the CPU was lightly loaded. If performance was generally unacceptable, the bottleneck may have been caused by remote file I/O which cannot be directly measured with sar and therefore cannot be considered by SarCheck.

The CPU was waiting for I/O an average of 9.1 percent of the time. This implies that the system may have been somewhat I/O bound. The time that the system was waiting for I/O peaked at 12 percent from 14:30:00 to 14:45:01. Disk statistics do not confirm the presence of an I/O bottleneck, and tape I/O may have been responsible.
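
The CPU figures in this section can be cross-checked against one another. The short Python sketch below is an illustration only and is not part of SarCheck; the variable names are hypothetical and the values are copied from this report. It assumes that the user, system, waiting-for-I/O, and idle percentages together account for all of the CPU's time.

```python
# Values copied from this report (percent of time).
user_pct = 8.5   # average user CPU utilization
sys_pct = 8.7    # average sys CPU utilization
wio_pct = 9.1    # average waiting for I/O
idle_pct = 73.7  # average idle

# "CPU utilization" in this report is user time plus system time.
busy_pct = user_pct + sys_pct
total_pct = busy_pct + wio_pct + idle_pct

print(f"busy:  {busy_pct:.1f}%")
print(f"total: {total_pct:.1f}%")
```

The busy figure agrees with the 17.2 percent average utilization reported above, and the four categories sum to 100 percent, so the averages are internally consistent.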

The syncer daemon used 0.0 percent of the CPU from 12:40:01 to 17:00:00. The syncer is responsible for writing data from the buffer cache to disk. This level of activity is too low to cause a problem.

This system's buffer cache is dynamic, meaning that its size is determined by the amount of free memory on the system. Some buffer cache statistics were poor but there was insufficient disk activity to justify further investigation. Based on the current values of dbc_min_pct and dbc_max_pct, the buffer cache can range in size from 1.6 to 25.6 megabytes of memory.
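
The range quoted above is a pair of percentages of physical memory. The sketch below is illustrative only: the function name is hypothetical, and the values of 5 and 80 percent are inferred from this report's own numbers (1.6 and 25.6 megabytes out of 32), since the report does not print dbc_min_pct and dbc_max_pct directly.

```python
def dbc_range_mb(memory_mb, min_pct, max_pct):
    """Return the (minimum, maximum) dynamic buffer cache size in megabytes."""
    return (memory_mb * min_pct / 100.0, memory_mb * max_pct / 100.0)

memory_mb = 32  # physical memory reported for this system

# Assumed settings, inferred from the sizes quoted in this report.
low, high = dbc_range_mb(memory_mb, min_pct=5, max_pct=80)
print(f"{low:.1f} to {high:.1f} megabytes")
```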

No evidence of an overall memory shortage was seen in the following statistics: The swap queue was occupied an average of 0 percent of the time. The average swap out rate was 0.00 per second.

The fs_async flag is not set. This may result in reduced disk performance, but keeps filesystem data structures consistent in the event of a system crash. This option is currently in the state recommended for production systems. Since no disk I/O bottleneck was seen on this system, setting the fs_async flag would be unlikely to provide enough of an improvement to justify the additional risk.

No unusual configurable parameter values were seen in those parameters which relate to the process accounting system. The current values of acctsuspend and acctresume are unlikely to have an impact on system performance.

The inode cache did not overflow, but was completely full in 16.7 percent of the samples collected during the monitoring period. This is not unusual with UNIX operating systems such as HP-UX which use the inode table as a cache.

The process table was almost full during part of the monitoring period. Specific recommendations for increasing the size of this table have been made in the recommendations section. Peak table usage statistics (max used/table size) as reported by sar: Process table: 135/148. Open file table: 82/1480.
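
The peak usage figures above convert directly into the percentages that appear in the comma-separated statistics at the end of this report. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def pct_used(max_used, table_size):
    """Peak table usage as a percentage of the configured table size."""
    return 100.0 * max_used / table_size

# (max used, table size) as reported by sar in this report.
tables = {
    "Process table": (135, 148),
    "Open file table": (82, 1480),
}

for name, (used, size) in tables.items():
    print(f"{name}: {pct_used(used, size):.1f}% used at peak")
```

These work out to 91.2 percent for the process table and 5.5 percent for the file table.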

The file table, controlled by the nfile parameter, was much larger than necessary. There is nothing to gain by reducing the size of this table, so no change to the parameter 'nfile' is recommended.

No System V semaphore activity was seen. No problems have been seen, and no changes have been recommended for System V semaphore parameters. Note that SarCheck only checks these parameters' relationships to each other since semaphore usage data is not available. Algorithms used by SarCheck to check these relationships are available in the help text of SAM.

The average rate of System V message calls was 4.6 per second. System V message activity peaked at a rate of 5.17 per second during multiple time intervals. No problems have been seen, and no changes have been recommended for System V message parameters. Note that SarCheck only checks these parameters' relationships to each other since message usage data is not available. Algorithms used by SarCheck to check these relationships are available in the help text of SAM, and in the file /usr/include/sys/msg.h.

The ratio of exec to fork system calls was 1.00. This indicates that PATH variables are efficient.

The -dtoo switch has been used to format disk statistics into the following table.

Disk Device Statistics
Results sorted by Average Percent Busy

Disk Device   Average        Peak           Queue Depth      Average Service
              Percent Busy   Percent Busy   When Occupied    Time (ms)
disc3-0       3.9            20.0           3.7              7.1
disc3-2       0.7            12.0           1.0              206.1

The -dbusy switch has been used to sort the disk analysis by the average percent of time the disk was busy.

The disk device disc3-0 was busy an average of 3.9 percent of the time and had an average queue depth of 3.7 (when occupied). This usage pattern is typical of that generated by sync activity. Sync activity refers to efforts made by the sync process to transfer data from the system buffer cache to disk. The average service time reported for this device and its accompanying disk subsystem was 7.1 milliseconds. This is relatively fast. Service time is the delay between the time a request was sent to a device and the time that the device signaled completion of the request.

The disk device disc3-2 was busy an average of 0.7 percent of the time and had an average queue depth of 1.0 (when occupied). This indicates that the device is not a performance bottleneck. The average service time reported for this device and its accompanying disk subsystem was 206.1 milliseconds. This is so slow that the sar statistics may be unreliable or the device may be something other than a conventional hard disk. Floppy disk drives on model 800 systems will cause this message to be printed. Due to the suspiciously slow average service time, statistics from this device will not be used in capacity planning or in the comma-separated statistics.

No runaway processes, memory leaks, or suspiciously large processes were detected in the data contained in file /usr/local/ps/20000829.

CAPACITY PLANNING SECTION

This section is designed to provide the user with a rudimentary linear capacity planning model and should be used for rough approximations only. These estimates assume that an increase in workload will affect the usage of all resources equally. These estimates should be used on days when the load is heaviest to determine approximately how much spare capacity remains at peak times.

Based on the limited data available in this single sar report, the system cannot support an increase in workload at peak times without some loss of performance or reliability, and the bottleneck is likely to be the size of the process table (nproc). Implementation of some of the suggestions in the recommendations section may help to increase the system's capacity.

The CPU can support an increase in workload of at least 100 percent at peak times. Due to the lack of page outs or swapping activity, the amount of memory present should be able to support a greater load. The busiest disk can support a workload increase of at least 100 percent at peak times. For more information on peak CPU and disk utilization, refer to the Resource Analysis section of this report.

The process table, controlled by the parameter 'nproc', can support approximately a 0 percent increase in the number of entries. The file table, controlled by the parameter 'nfile', can support at least a 100 percent increase in the number of entries.
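
A linear model of the kind described in this section can be sketched as follows. This is an illustration only and not SarCheck's actual algorithm: the function name and the safe_fraction parameter are hypothetical, and the 0.9 safety fraction is an assumed value, since the report does not state what threshold it applies before declaring a table to have no remaining capacity.

```python
def headroom_pct(peak_used, capacity, safe_fraction=1.0):
    """Percent growth possible before peak usage reaches
    safe_fraction * capacity, assuming usage scales linearly
    with workload. Never returns a negative number."""
    usable = capacity * safe_fraction
    return max(0.0, (usable / peak_used - 1.0) * 100.0)

# Process table at the current nproc of 148, until completely full.
print(f"{headroom_pct(135, 148):.1f}")

# Same table with a hypothetical 90 percent safety threshold.
print(f"{headroom_pct(135, 148, safe_fraction=0.9):.1f}")

# After the recommended change of nproc to 198, same assumed threshold.
print(f"{headroom_pct(135, 198, safe_fraction=0.9):.1f}")

# File table (nfile), until completely full.
print(f"{headroom_pct(82, 1480):.1f}")
```

With no safety margin, the process table would fill after roughly 9.6 percent growth. With the assumed margin, the model reports no headroom at nproc 148, which is consistent with the 0 percent figure above, and the file table remains far beyond the 100 percent mark.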

Please note: In no event can Aurora Software Inc. be held responsible for any damages, including incidental or consequent damages, in connection with or arising out of the use or inability to use this software. All trademarks belong to their respective owners. Evaluation copy for: Your Company. This software expires on 10/31/2000 (mm/dd/yyyy). SC9000 Code version: 4.00. Serial number: 00012345.

Thank you for trying this evaluation copy of SarCheck. To order a licensed version of this software, just type 'analyze9000 -o' at the prompt to produce the order form, and follow the instructions.

(c) copyright 1995-2000 by Aurora Software Inc., Plaistow NH, USA, All Rights Reserved. http://www.sarcheck.com

Statistics for system, aurora
System model number is, 9000/816/E35
Statistics collected on, 08/29/2000
Average CPU utilization, 17.2%
Peak CPU utilization, 24%
Average user CPU utilization, 8.5%
Average sys CPU utilization, 8.7%
Average waiting for I/O, 9.1%
Average run queue depth, 1.2
Peak run queue depth, 2.0
Average swap queue occupancy, 0.0%
Average swap out rate, 0.00/sec
Average cache read hit ratio, 91.5%
Average cache write hit ratio, 73.0%
Disk device w/highest peak, disc3-0
Avg pct busy for that disk, 3.9%
Peak pct busy for that disk, 20.0%
Percent of process tbl used, 91.2%
Process table overflows, No
Percent of file table used, 5.5%
File table overflows, No
Inode cache pct of time full, 16.7%
Inode cache overflows, No
Approx CPU capacity remaining, 100%+
Approx I/O bandwidth remaining, 100%+
Remaining process tbl capacity, 0.0%
Remaining file table capacity, 100%+
Can memory support add'l load, Yes
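
The comma-separated statistics above are meant for loading into a spreadsheet or script. A minimal Python sketch of reading them follows; the function name is hypothetical, and it assumes each line holds one label and one value separated by the first comma, as in the lines above.

```python
def parse_stats(text):
    """Parse 'label, value' lines into a dict, splitting on the first comma."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.partition(",")
        if sep:  # skip blank lines and lines without a comma
            stats[label.strip()] = value.strip()
    return stats

# A few lines copied from the statistics above.
sample = """\
Average CPU utilization, 17.2%
Peak CPU utilization, 24%
Percent of process tbl used, 91.2%
Process table overflows, No
"""

stats = parse_stats(sample)

# Values are kept as strings; strip the percent sign before converting.
cpu = float(stats["Average CPU utilization"].rstrip("%"))
print(cpu)
print(stats["Process table overflows"])
```

Values such as "100%+" and "Yes" are not plain numbers, so a script that trends these statistics across several days would need to treat them as text.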