Memory and swapping
Two indicators of a RAM shortage are the page scan rate
and swap device activity.
In both cases, high activity may be due to a process that
does not have a consistent, large impact on performance.
The processes running on the system have to be examined to see how
frequently they are run and what their impact is. It may
be possible to re-work the program or run the process differently to
reduce the amount of new data being read into memory. See
"Process Memory Usage" below.
Whether or not to provide additional RAM for infrequent processes
is a classic money/performance tradeoff. If the cost is more important
than the performance, additional virtual memory space must be provided
to allow enough space for the application to run. The cheapest way
to do this is to provide additional swap space.
If adequate total virtual memory space is not provided, new processes
will not be able to start. (The system may report "Not enough space"
or "WARNING: /tmp: File system full, swap space limit exceeded.")
If inadequate physical memory is provided,
the system will be so busy paging to swap that it will be unable to
keep up with demand. (This state is known as "thrashing" and is
characterized by heavy I/O on the swap device and horrendous performance.
In this state, the scanner can use up to 80% of CPU.)
(For a more thorough discussion of paging, see
"Paging" below.)
The page scanning rate is the main tipoff that a system does not have
enough physical memory. Use
sar -g or
vmstat to look at the scan rate.
With vmstat, use vmstat 30 to check memory
usage every 30 seconds. Ignore the summary statistics on the first
line of output. If the sr column (under page) exceeds 200 pages per second for an
extended time, your system may be running short of physical memory.
(Shorter sampling periods may be used to get a feel for what is
happening on a smaller time scale.)
A very low scan rate is a sure indicator that the system is not running
short of physical memory. On the other hand, a high scan rate can be
caused by transient issues, such as a process reading large amounts of
uncached data. The processes on the system should be examined to see
how much of a long-term impact they have on performance.
A nonzero scan rate is not necessarily an indication of a problem.
Over time, memory is allocated for caching and other activities, and
eventually free memory falls to the lotsfree
threshold, at which point the pageout scanner is invoked. For a
more thorough discussion of the paging algorithm, see
"Paging" below.
The amount of disk activity on the swap device can be measured using
iostat. For Solaris 2.6 and higher,
iostat -xPnce provides information on disk activity on
a partition-by-partition basis. For Solaris 2.5.1, iostat -xc
provides information on a disk-by-disk basis, which may be of limited use
unless swap has its own physical disk.
sar -d provides similar information, and
vmstat provides some usage
information as well.
If there are I/Os queued for the swap device, application paging is occurring.
If there is significant, heavy I/O to the swap device, a RAM upgrade may
be in order.
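As a sketch of this check (the swap slice name will be whatever swap -l reports on the system in question):

    # identify the swap partitions or files in use
    swap -l

    # per-partition activity every 30 seconds (Solaris 2.6 and higher);
    # sustained nonzero wait/actv and a high %b on the swap slice
    # indicate heavy swap traffic
    iostat -xPnce 30

    # per-disk view for Solaris 2.5.1
    iostat -xc 30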
Process Memory Usage
The /usr/proc/bin/pmap command is available in Solaris 2.6 and above. It
can help pin down which process is the memory hog.
/usr/proc/bin/pmap -x PID prints out details of memory use by a process.
Summary statistics regarding process size can be found in the RSS
column of ps -ly or top.
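For instance, assuming a suspect process with PID 1234 (a placeholder):

    # per-mapping memory details for one process (Solaris 2.6 and above)
    /usr/proc/bin/pmap -x 1234

    # rank all processes by resident set size (RSS, reported in KB)
    ps -eo pid,rss,args | sort -rn -k 2 | head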
dbx, the debugging utility in the SunPro package, has extensive
memory leak detection built in. The source code will need to be compiled
with the -g flag by the appropriate SunPro compiler.
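A minimal sketch of such a session, assuming a program built from myprog.c (the names here are placeholders, and the exact dbx commands may vary by WorkShop release):

    # compile with debugging information using the SunPro compiler
    cc -g -o myprog myprog.c

    # enable runtime leak checking in dbx, then run to completion
    dbx myprog
    (dbx) check -leaks
    (dbx) run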
ipcs -mb shows memory statistics for shared memory. This
may be useful when attempting to size memory to fit expected traffic.
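For example, the SEGSZ column in the following output lists the size of each active shared memory segment:

    # show shared memory identifiers and their segment sizes (SEGSZ)
    ipcs -mb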
A "segmentation violation fault" results when a process overflows its
stack. The kernel recognizes the violation and can extend the stack
size, up to a configurable limit.
In a multithreaded environment, the kernel does not keep track of
each user thread's stack, so it cannot perform this function. The
thread itself is responsible for stack SIGSEGV (stack overflow signal)
handling. (The SIGSEGV signal is sent by the threads library when
an attempt is made to write to a write-protected page just beyond the
end of the stack. This page is allocated as part of the stack creation
request.)
The Solaris virtual memory system combines physical memory with available
swap space via swapfs. If insufficient total virtual memory space is
provided, new processes will be unable to start.
Swap space can be added, deleted or examined with the
swap command. swap -l reports total and
free space for each of the swap partitions or files that are
available to the system. Note that this number does not reflect
total available virtual memory space, since physical memory
is not reflected in the output. swap -s reports the
total available amount of virtual memory, as does sar -r.
If swap is mounted on /tmp via tmpfs,
df -k /tmp will report on total available virtual memory
space, both swap and physical. As large memory allocations are made,
the amount of space available to tmpfs will decrease,
meaning that the utilization percentages reported by df
will be of limited use.
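A sketch of the three views described above:

    # per-device swap space: block counts are in 512-byte blocks
    swap -l

    # system-wide virtual memory: allocated, reserved, used and available
    swap -s

    # total virtual memory (swap plus free physical memory) seen by tmpfs,
    # if /tmp is a tmpfs mount
    df -k /tmp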
Paging
Solaris uses both common types of paging in its virtual memory system:
swapping (moving all memory associated with a user process out to the
swap device) and demand paging (paging out only those pages that have
not been used recently).
Which method is used is determined by comparing the amount of available
memory with several key parameters:
- physmem: The total page count of physical memory.
- lotsfree: The page scanner is woken up when available memory
falls below lotsfree. The default value for this is
physmem/64; it can be tuned in the /etc/system file
if necessary (see the sketch after this list).
The page scanner runs in demand paging mode by default.
The initial scan rate is set by the kernel parameter slowscan,
which is fastscan/10 by default.
- minfree: Between lotsfree and minfree, the scan rate increases
linearly between slowscan and fastscan.
(minfree is set to desfree/2 and
fastscan is set to physmem/4 by default.)
If free memory falls below desfree
(lotsfree/2 by default), the page scanner is started
100 times per second. Each invocation of the page scanner will scan
up to desscan pages; desscan is set dynamically
based on the scan rate.
- maxpgio: maxpgio (default 40 or 60)
limits the rate at which I/O is queued to the swap devices.
It is set to 40 for sun4c, sun4m and sun4u architectures and
60 for sun4d architectures. If the disks are faster than 7200 rpm,
maxpgio can safely be set to 100 times the number
of swap disks.
- throttlefree: When free memory falls below
throttlefree (default minfree),
the page_create routines force the calling process
to wait until free pages are available.
- cachefree: If the kernel parameter priority_paging
is set to 1 on a Solaris 7 system (or current patch levels of
2.5.1 or 2.6), only file data pages are targeted by the page daemon
until lotsfree is reached. By default,
cachefree is set to 2 x lotsfree. (Solaris 8
uses a different algorithm to determine which pages are targeted
by the page daemon; priority_paging should not be
set on a Solaris 8 machine.)
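As noted in the lotsfree entry above, these parameters are tuned in /etc/system. The following is a sketch only, with example values rather than recommendations; changes take effect after a reboot, and priority_paging must not be set on Solaris 8 or later:

    * /etc/system sketch -- example values only
    * enable priority paging (Solaris 2.5.1/2.6/7 only)
    set priority_paging=1
    * raise the page-out I/O limit, e.g. for multiple fast swap disks
    set maxpgio=100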
The page scanner operates by first clearing a usage flag on each page,
at a rate reported as the "scan rate" in
vmstat
and
sar -g. After
handspreadpages additional pages have been scanned, the page
scanner checks to see whether the usage flag has been set again. If it
has not, the page has not been referenced in the interim and is paged
out. (The default for handspreadpages
is physmem/4 up through Solaris 9. It is set dynamically
in Solaris 10.)
Solaris 8 uses a different algorithm for removing pages from memory.
This new architecture is known as the cyclical page cache.
It is designed to remove most of the file system cache-induced problems
with virtual memory. The new system fills the same need as priority
paging does for Solaris 2.5.1-7.
The cyclical page cache uses a file system free list to cache
filesystem data only. Other memory objects are managed on a separate
free list. (This second list would include application binaries,
shared libraries, and application data, including uninitialized data.)
With the new algorithm, filesystem cache only competes with itself
for memory. It does not force applications out of primary memory
as sometimes happened with the earlier OS versions.
As a result of these changes,
vmstat under Solaris 8 will report
different statistics than would be expected under an earlier version
of Solaris:
- Higher Page Reclaim rates.
- Higher reported Free Memory: A large component of the filesystem
cache is reported as free memory.
- Low Scan Rates: Scan rates will be near zero unless there is a
systemwide shortage of available memory.
vmstat -p reports paging activity broken down into executable,
anonymous (application data) and filesystem pages.
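For example:

    # page-in/page-out/page-free counts broken out as executable
    # (epi/epo/epf), anonymous (api/apo/apf) and filesystem (fpi/fpo/fpf)
    vmstat -p 30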
Swapping
If the system is consistently below desfree of
free memory (over a 30 second average), the memory scheduler
will start to swap out processes. (That is, if both avefree
and avefree30 are less than desfree,
the swapper begins to look at processes.)
Initially, the scheduler will
look for processes that have been idle for maxslp
seconds. (maxslp defaults to 20 seconds and can be
tuned in /etc/system.) This swapping mode is known
as soft swapping.
Swapping priorities are calculated for an LWP
by the following formula:
epri = swapin_time - rss/(maxpgio/2) - pri
where swapin_time is the time since the thread was
last swapped, rss is the amount of memory used by
the LWP's process, and pri is the thread's priority.
If, in addition to being below desfree of free memory,
there are two processes in the run queue and paging activity exceeds
maxpgio, the system will commence hard swapping.
In this state, the kernel unloads all modules and cache memory that
is not currently active and starts swapping out processes sequentially
until desfree of free memory is available.
Processes are not eligible for swapping if they are:
- In the SYS or RT
scheduling class.
- Being executed or stopped by a signal.
- Exiting.
- Zombie.
- A system thread.
- Blocking a higher priority thread.
Large sequential I/O can cause performance problems due to excessive use
of the memory page cache. One way to avoid this problem is to use
direct I/O on filesystems where
large sequential I/Os are common.
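On UFS, for example, direct I/O can be requested with the forcedirectio mount option (Solaris 2.6 and later); the device and mount point below are placeholders, and the option can also be placed in the options field of /etc/vfstab:

    # bypass the page cache for all I/O on this filesystem
    mount -F ufs -o forcedirectio /dev/dsk/c0t1d0s6 /data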
Source of this article: http://www.princeton.edu/~unix/Solaris/troubleshoot/ram.html