Article text copied from: http://www.ibm.com/developerworks/aix/library/au-unix-perfmonsar.html
 
 Users seem to remember performance problems some time after they occur. Ignoring the "If
 it wasn't important then, why is it important now?" question that you long to ask, the question
 then becomes, "What was the condition of the system at the time of the alleged problem?" By
 periodically taking performance snapshots and reviewing the data, you're one step closer to
 pinpointing the cause of the problem and creating a solution. 
 Collecting data 
 The SAR suite of utilities is bundled with your system (in fact, it
 is installed on most flavors of UNIX®), but probably not enabled. To 
enable SAR, you must run some utilities at periodic intervals through 
the cron facility. Use the crontab -e command while running as the root user, and then provide the configuration shown in Listing 1. 
 Listing 1. Run crontab for the root user to enable the SAR collection
 # Collect measurements at 10-minute intervals
0,10,20,30,40,50 * * * * /usr/lib/sa/sa1
# Create daily reports and purge old files
0 0 * * * /usr/lib/sa/sa2 -A
  
 The first command, sa1, is a shell script that calls sadc to collect the performance data in a binary log file. The sa1 command also ensures that each day has its own file, which I explain in the Timing is everything section. Run this command every ten minutes, which is a good tradeoff between granularity and system impact. 
 
 The second command, sa2, is another shell script that 
dumps all the data from the current day's binary log file into a text 
file, and then purges any log files older than seven days. The -A
 argument specifies what is extracted from the binary file into the text
 file. Although you can read the text file to see the status of the 
system for the day, I show you how to query the binary log files to be 
more precise. 
 Extracting useful information 
 Data is being collected, but it must be queried to be useful. Running the sar command without options generates basic statistics about CPU usage for the current day. Listing 2 shows the output of sar without any parameters. The examples here are from Sun Solaris 10. Output on other platforms is similar, but the column names might differ slightly, and in some UNIX flavors sadc collects more or less data based on what's available. 
 Listing 2. Default output of sar (showing CPU usage)
 -bash-3.00$ sar
SunOS unknown 5.10 Generic_118822-23 sun4u 01/20/2006
00:00:01 %usr %sys %wio %idle
00:10:00 0 0 0 100
... cut ...
09:30:00 4 47 0 49
Average 0 1 0 98
  
 Each line in the output of sar is a single 
measurement, with the timestamp in the left-most column. The other 
columns hold the data. (These columns vary depending on the command-line
 arguments you use.) In Listing 2, the CPU usage is broken into four categories: 
 - %usr: The percentage of time the CPU is spending on user processes, such
 as applications, shell scripts, or interacting with the user.
 - %sys: The percentage of time the CPU is spending executing kernel tasks. In
 this example, the number is high, because I was pulling data from the kernel's random
 number generator.
 - %wio: The percentage of time the CPU is waiting for input or output from a
 block device, such as a disk.
 - %idle: The percentage of time the CPU isn't doing anything useful.
  
 The last line is an average of all the data points. However, because most systems experience
 busy periods followed by idle periods, the average doesn't tell the entire story. 
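 Because the average hides the peaks, it can help to pull out the single busiest measurement. The following is a minimal sketch, not part of the SAR suite: busiest_interval is a hypothetical helper name, and it assumes the Solaris column order from Listing 2, where %idle is the fifth field. On a platform with different columns, the field number needs adjusting.

```shell
# Hypothetical helper: print the measurement with the lowest %idle
# from `sar` CPU output supplied on standard input.
# Assumes the Solaris layout in Listing 2: time %usr %sys %wio %idle.
busiest_interval() {
    awk '
        # Data rows start with an HH:MM:SS timestamp and have a numeric
        # fifth field; this skips the banner, the header, and Average.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $5 ~ /^[0-9]+(\.[0-9]+)?$/ {
            if (min == "" || $5 + 0 < min) { min = $5 + 0; when = $1 }
        }
        END { if (when != "") print when, min }
    '
}

# Usage: sar | busiest_interval
```

 The output is the timestamp of the busiest measurement followed by its %idle value, which gives you a starting point for the more detailed queries that follow.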
 Watching disk activity 
 Disk activity is also monitored. High disk usage means that there 
will be a greater chance that an application requesting data from disk 
will block (pause) until the disk is ready for that process. The 
solution typically involves splitting file systems across disks or 
arrays; however, the first step is to know that you have a problem. 
 The output of sar -d shows various disk-related statistics for
 one measurement period. For the sake of brevity, Listing 3 shows only hard disk drive activity. 
 Listing 3. Output of sar -d (showing disk activity)
 $ sar -d
SunOS unknown 5.10 Generic_118822-23 sun4u 01/22/2006
00:00:01 device %busy avque r+w/s blks/s avwait avserv
... cut ...
14:00:02 dad0 31 0.6 78 16102 1.9 5.3
 dad0,c 0 0.0 0 0 0.0 0.0
 dad0,h 31 0.6 78 16102 1.9 5.3
 dad1 0 0.0 0 1 1.6 1.3
 dad1,a 0 0.0 0 1 1.6 1.3
 dad1,b 0 0.0 0 0 0.0 0.0
 dad1,c 0 0.0 0 0 0.0 0.0
  
 As in the previous example, the time is along the left. The other columns are as follows: 
 - device: This is the disk, or disk partition, being 
measured. In Sun Solaris, you must translate this disk into a physical 
disk by looking up the reported name in /etc/path_to_inst, and then 
cross-reference that information to the entries in /dev/dsk. In Linux®, 
the major and minor numbers of the disk device are used.
 - %busy: This is the percentage of time the device is being read from or written to.
 - avque: This is the average depth of the queue that is 
used to serialize disk activity. The higher the avque value, the more 
blocking is occurring.
 - r+w/s, blks/s: This is disk activity per second in terms of read or write operations and
 disk blocks, respectively.
 - avwait: This is the average time (in milliseconds) that a disk read or write operation
 waits before it is performed.
 - avserv: This is the average time (in milliseconds) that a disk read or write operation
 takes to execute.
  
 Some of these numbers, such as the avwait and avserv values, translate directly
 into user experience. High wait times on the disk likely point to several people contending for the disk,
 which should be confirmed by high avque numbers. High avserv values
 point to slow disks. 
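 On a system with many disks and partitions, the output of sar -d gets long. As a rough sketch, the filter below keeps only the rows whose %busy value reaches a threshold. The name busy_disks and the default threshold of 20 are assumptions made for illustration, and the field positions follow the Solaris layout in Listing 3, where only the first row of each measurement carries a timestamp.

```shell
# Hypothetical helper: list devices at or above a %busy threshold
# from `sar -d` output supplied on standard input.
# Usage: sar -d | busy_disks 20
busy_disks() {
    awk -v limit="${1:-20}" '
        # A row that starts with a timestamp opens a new measurement;
        # remember the time and drop it so the device is always field 1.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            when = $1
            sub(/^[ \t]*[0-9:]+[ \t]+/, "")
        }
        # Remaining fields: device %busy avque r+w/s blks/s avwait avserv.
        NF == 7 && $2 ~ /^[0-9]+(\.[0-9]+)?$/ && $2 + 0 >= limit {
            print when, $1, $2, $3
        }
    '
}
```

 Each output line shows the time, the device, %busy, and avque, so a busy device and its queue depth can be read side by side.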
 Other metrics 
 Many other items are collected, with corresponding arguments to view them: 
 - The -b argument shows information on buffers and the efficiency of using a buffer versus having to go to disk.
 - The -c argument shows system calls broken down into some of the popular calls, such as fork(), exec(), read(), and write(). High process creation can lead to poor performance and is a sign that you might need to move some applications to another computer.
 - The -g, -p, and -w arguments show paging (swapping) activity. High paging is a sign of memory starvation. In particular, the -w argument shows the number of process switches: A high number can mean too many things are running on the computer, which is spending more time switching than working.
 - The -q argument shows the size of the run queue, which is the same as the load average for the time.
 - The -r argument shows free memory and swap space over time.
 Each UNIX flavor implements its own set of measurements and command-line arguments for sar. Those I've shown are common and represent the elements that I find more useful. 
 Timing is everything 
 The examples thus far have shown the current day's data, which has its uses, but it also has two
 problems: 
 - You're interested in an hour of data, but you get the whole day.
 - You need to go back to a different day.
  
 As you saw earlier, sa1 saves the data in a different file for each day. Looking at the sa1
 script itself tells you which directory is used; in the case of Sun 
Solaris 10, it is in /var/adm/sa. Several files reside in this 
directory, starting with either "sa" or "sar" followed by a number. The 
number represents the day of the month, with the files beginning with 
"sar" being text dumps of the data for that day (created by the nightly 
run of sa2) and the files beginning with "sa" holding the 
binary version. The file for the current date is the one that sar reads when you launch it without naming a file. 
 Specifying -f to the sar command selects 
the file to read from. If today were the 23rd day of the month, I could 
look at yesterday's data by reading from sa22 with the command sar -f /var/adm/sa/sa22. You can also pass the other arguments I showed you to access different types of data. 
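 Building the file name by hand is easy to get wrong for single-digit days, because the day number in the file name is zero-padded. The sketch below wraps that in a small function. The name sa_file and the SA_DIR variable are hypothetical, and the default directory is the Solaris 10 location mentioned above, so set SA_DIR if your platform stores its logs elsewhere.

```shell
# Hypothetical helper: print the binary log path for a day of the month.
# SA_DIR is an assumed override; the default is the Solaris 10 directory.
sa_file() {
    day=${1#0}    # strip a leading zero so printf does not read 08 as octal
    printf '%s/sa%02d\n' "${SA_DIR:-/var/adm/sa}" "$day"
}

# Usage: sar -q -f "$(sa_file 22)"
```

 Any of the other arguments, such as -d or -q, can be combined with the -f argument in the same command.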
 The second thing you can do to narrow the scope of the query is to specify the time by using the -s and -e arguments (think start and end). Note that -s is not inclusive, so you must subtract an extra ten minutes from the chosen start time. Continuing with the previous example, Listing 4 shows swap file usage and the run queue for the 22nd from 2:30 p.m. to 3:00 p.m. 
 Listing 4. A complex sar query specifying date, time, and multiple data sets
 # sar -f /var/adm/sa/sa22 -s 14:20 -e 15:00 -w -q -i 4
SunOS unknown 5.10 Generic_118822-23 sun4u 01/22/2006
14:20:00 swpin/s bswin/s swpot/s bswot/s pswch/s
14:30:00 0.00 0.0 0.00 0.0 140
14:40:01 0.00 0.0 0.00 0.0 144
14:50:01 0.00 0.0 0.00 0.0 140
15:00:00 0.00 0.0 0.00 0.0 139
Average 0.00 0.0 0.00 0.0 140
14:20:00 runq-sz %runocc swpq-sz %swpocc
14:30:00 10.5 100 0.0 0
14:40:01 10.5 100 0.0 0
14:50:01 10.4 100 0.0 0
15:00:00 10.5 100 0.0 0
Average 10.5 100 0.0 0
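 The ten-minute adjustment to the -s argument is simple arithmetic, but it is easy to forget. As a sketch, the function below shifts a start time back by one collection interval. The name sar_start is hypothetical, the interval defaults to the ten minutes used in Listing 1, and a result earlier than midnight is clamped to 00:00 because earlier data lives in the previous day's file.

```shell
# Hypothetical helper: shift an HH:MM time back by one collection
# interval (default 10 minutes) for use with the exclusive -s argument.
sar_start() {
    h=${1%%:*}
    m=${1#*:}
    h=${h#0}      # strip leading zeros so 08 and 09 are not read as octal
    m=${m#0}
    total=$(( ${h:-0} * 60 + ${m:-0} - ${2:-10} ))
    if [ "$total" -lt 0 ]; then
        total=0   # clamp at midnight
    fi
    printf '%02d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# Usage: sar -f /var/adm/sa/sa22 -s "$(sar_start 14:30)" -e 15:00 -w -q
```

 With a start time of 14:30, the function prints 14:20, which matches the value passed to -s in Listing 4.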
  
 Making sense of it all 
 A brief look at Listing 4
 shows that swap activity was nil, approximately 140 process switches 
per second occurred, and the load average was slightly more than ten. 
Assuming that you were investigating a claim of poor performance at the 
time, what does this tell you? 
 - Whatever process is running isn't memory intensive, because you don't see swapping.
 - Chances are that this problem is caused by a long-running set of
 processes, because the run queue and process switches are relatively 
consistent. Had they not been, you could suspect application-level 
problems, such as a busy Web server.
 - Knowing that the output of Listing 3 shows part of the same time period, you can see that one of the disks was being used heavily (31 percent busy according to sar -d, and about 16,000 blocks per second). This disk is the home directory partition; depending on what the user was trying to do, he or she might have experienced slow responses.
 A quick look at the CPU usage for the time period shows that the 
system took up approximately 80 percent of the CPU; the rest was 
consumed by user tasks. As the systems administrator, you can use this 
information in three ways: 
 - Go back over previous days' logs. In this case, I found that the problem started at 1:00 p.m. and
 ended the next morning.
 - Try to correlate the activity to any cron jobs that might have been started that day.
 - Try to find a trend. Looking at data from a couple of other days, I saw that the performance was normal, which isn't indicative of a system that has reached its limits.
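 Comparing days is easier when each day is reduced to a single number. As a rough sketch, the function below extracts the average run-queue size from the output of sar -q. The name avg_runq is hypothetical, and it assumes the Solaris layout in Listing 4, where runq-sz is the first data column. The days and times in the usage comment are illustrative only.

```shell
# Hypothetical helper: print the average run-queue size from
# `sar -q` output supplied on standard input.
avg_runq() {
    # Take the second field of the first Average line only.
    awk '!seen && $1 == "Average" { print $2; seen = 1 }'
}

# Usage sketch (illustrative days and time window):
# for day in 20 21 22; do
#     printf '%s ' "$day"
#     sar -q -f "/var/adm/sa/sa$day" -s 13:50 -e 15:00 | avg_runq
# done
```

 A day whose value stands well apart from the others is the one to investigate further.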
  
 In this case, the problem seemed to be isolated, and for good reason -- I was intentionally loading the
 disks with shell scripts to create some interesting sar reports!
 However, had a trend appeared, such as busy home drives during working hours, it would have been a
 call to do something about the problem. Possible solutions range from splitting home directories off to
 other disks or installing faster disks to moving to something like Network Attached Storage (NAS). 
 Conclusion 
 Obtaining quantitative data about your system at periodic intervals is an effective way of finding
 performance bottlenecks and determining whether further action is needed. SAR and related utilities do
 just this -- snapshots are taken every ten minutes and a front end allows you to access this data.
 Though tactical in nature, the data provides a wealth of information that enables systems administrators to
 discover just what aspect of the system is suffering and whether it requires further investigation. 
 
 