Post by Brandon High
Post by Jürgen Keil
And another case was an attempt to use "xsltproc(1)" on a
big xml file, this time on an amd64 x2 machine with 4GB of
memory, using zfs, and the xsltproc process had grown to
use > 2GB of memory. Again heavy disk thrashing, and I
didn't have the impression that the arc cache did
shrink enough to prevent that thrashing.
prstat can give more accurate info on Solaris, fyi.
Yep, top doesn't work well with 64-bit processes.
But top's summary information includes quite a few
things that are missing in prstat, like swap space
usage, memory totals, total cpu usage, ...
Post by Brandon High
load averages: 0.01, 0.02, 0.05 21:11:47
75 processes: 74 sleeping, 1 on cpu
CPU states: 99.4% idle, 0.1% user, 0.5% kernel, 0.0% iowait, 0.0% swap
Memory: 4031M real, 108M free, 2693M swap in use, 1310M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
7587 jk 1 60 0 2430M 2306M sleep 0:25 0.24% xsltproc
Only 124M of the xsltproc process is paged - The rest is resident, but
there is 2693M of swap in use. It appears that something other than
your xsltproc is paged out.
Nope, most of the 2693M of "swap in use" is reserved swap
space for the heap data that is allocated by the big xsltproc
process.
I also had a few old / big files in /tmp, but not more than 50 MB.
Post by Brandon High
AFAIK, the ARC should not page,
Correct, that's in the kernel's heap.
Post by Brandon High
so what else is using that memory?
The big xsltproc process, and a few files in /tmp.
Plus the X11 server & dtlogin, but no user
logged in to JDS.
Post by Brandon High
Your ARC + xsltproc are only equal to
2754M. There's something else on your system consuming 3862M.
That's not what I see...
I'd say the ARC consumes arc.c 1356 MB
+ arc.arc_meta_used 448 MB (?),
and there is a 2306 MB RSS for the xsltproc process.
That's a total of 4110 MB.
And in case the arc meta data is already included
in arc.c:
1356 MB + 2306 MB = 3662 MB used.
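The two totals above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the constant names and the helper accounted_mb are made up for illustration, and the figures are the ones quoted in this thread (arc.c, arc.arc_meta_used, the xsltproc RES value and the "4031M real" from top).

```python
# All figures are in MB and are copied from this thread.
REAL_MEMORY = 4031    # "Memory: 4031M real" in the top summary
ARC_C = 1356          # arc.c as quoted above
ARC_META_USED = 448   # arc.arc_meta_used as quoted above
XSLTPROC_RSS = 2306   # RES column for xsltproc in the top output


def accounted_mb(meta_counted_separately: bool) -> int:
    """Sum ARC and xsltproc RSS (hypothetical helper, not a Solaris API)."""
    total = ARC_C + XSLTPROC_RSS
    if meta_counted_separately:
        total += ARC_META_USED
    return total


for separate in (True, False):
    total = accounted_mb(separate)
    print(f"meta counted separately={separate}: "
          f"{total} MB accounted, {REAL_MEMORY - total} MB of real memory left")
```

With the metadata added on top of arc.c the sum is 79 MB larger than the real memory reported by top, so the second reading (metadata already included in arc.c) looks more plausible, though that is only an inference from these numbers.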
Post by Brandon High
Before you ran the process, what did things look like, memory wise?
Before running xsltproc:
% swap -s
total: 105256k bytes allocated + 16804k reserved = 122060k used, 4133532k available
And while it is running:
% swap -s
total: 2590132k bytes allocated + 16604k reserved = 2606736k used, 1668740k available
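As a rough cross-check of the two "swap -s" samples above, here is a minimal Python sketch that parses both lines and computes how much the "used" figure grew. The regular expression and the function name parse_swap_s are illustrative assumptions, written against the exact output format pasted here, not against any documented format guarantee.

```python
import re

# Matches the "swap -s" summary line in the form pasted in this thread.
SWAP_RE = re.compile(
    r"total:\s*(\d+)k bytes allocated \+ (\d+)k reserved = "
    r"(\d+)k used, (\d+)k available"
)


def parse_swap_s(line: str) -> dict:
    """Return the four kilobyte counters of one 'swap -s' line (hypothetical helper)."""
    m = SWAP_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized swap -s line: {line!r}")
    allocated, reserved, used, available = map(int, m.groups())
    return {"allocated": allocated, "reserved": reserved,
            "used": used, "available": available}


before = parse_swap_s(
    "total: 105256k bytes allocated + 16804k reserved = 122060k used, 4133532k available")
during = parse_swap_s(
    "total: 2590132k bytes allocated + 16604k reserved = 2606736k used, 1668740k available")

delta_used_kb = during["used"] - before["used"]
delta_used_mb = delta_used_kb / 1024
print(f"swap used grew by {delta_used_kb}k (about {delta_used_mb:.0f} MB)")
```

The growth comes out at roughly 2426 MB, which is close to the 2430M SIZE of the xsltproc process and fits the explanation that most of the "swap in use" is space reserved for that process.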
Post by Brandon High
Also keep in mind that /tmp is an in-memory filesystem. Things written
there will use physical memory, or page out if required.
Also, could you sort the output by memory usage?
"prstat -s rss" and
"prstat -s size" will sort by resident size or total
image size.
% prstat -s size
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
22642 jk 2430M 2029M sleep 60 0 0:00:22 19% xsltproc/1
22498 root 40M 15M sleep 59 0 0:00:01 0.0% Xorg/1
9 root 20M 1784K sleep 59 0 0:01:23 0.0% svc.configd/15
15634 postgres 19M 1116K sleep 59 0 0:01:06 0.0% postgres/1
15632 postgres 19M 1380K sleep 59 0 0:00:00 0.0% postgres/1
3036 root 13M 748K sleep 59 0 0:00:27 0.0% fmd/17
22535 root 13M 4516K sleep 59 0 0:00:00 0.0% dtgreet/1
7 root 11M 540K sleep 59 0 0:00:29 0.0% svc.startd/12
3108 root 10M 0K sleep 59 0 0:00:00 0.0% smbd/1
4192 root 10M 0K sleep 59 0 0:00:00 0.0% smbd/1
15635 postgres 9636K 952K sleep 59 0 0:00:01 0.0% postgres/1
15636 postgres 8900K 0K sleep 59 0 0:00:00 0.0% postgres/1
22500 root 8504K 1652K sleep 59 0 0:00:00 0.0% dtlogin/1
4207 smmsp 8036K 552K sleep 59 0 0:00:03 0.0% sendmail/1
4238 root 8028K 1140K sleep 59 0 0:00:48 0.0% sendmail/1
3699 root 7776K 1004K sleep 59 0 0:00:00 0.0% dtlogin/1
3307 root 7080K 1572K sleep 59 0 0:03:22 0.0% intrd/1
365 root 6784K 1976K sleep 59 0 0:00:06 0.0% hald/4
146 daemon 6568K 900K sleep 59 0 0:00:00 0.0% kcfd/3
27392 root 6300K 1796K sleep 59 0 0:03:38 0.0% ypserv/1
26803 root 5576K 2608K sleep 59 0 0:00:56 0.0% nscd/35
137 root 5572K 1052K sleep 59 0 0:00:00 0.0% syseventd/15
2859 root 5068K 1544K sleep 59 0 0:00:27 0.0% inetd/4
430 root 4712K 0K sleep 59 0 0:00:00 0.0% hald-addon-netw/1
22645 root 4508K 2856K cpu1 59 0 0:00:00 0.0% prstat/1
3038 root 4420K 0K sleep 59 0 0:00:00 0.0% sshd/1
6764 root 4108K 1176K sleep 59 0 0:00:11 0.0% syslogd/13
541 root 4032K 732K sleep 59 0 0:00:00 0.0% rmvolmgr/1
% prstat -s rss
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
22642 jk 2430M 2088M sleep 60 0 0:00:22 0.6% xsltproc/1
22498 root 40M 15M sleep 59 0 0:00:01 0.0% Xorg/1
22535 root 13M 4520K sleep 59 0 0:00:00 0.0% dtgreet/1
26803 root 5576K 2964K sleep 59 0 0:00:56 0.0% nscd/35
22651 root 4508K 2784K cpu1 59 0 0:00:00 0.0% prstat/1
3307 root 7080K 2624K sleep 59 0 0:03:22 0.0% intrd/1
9 root 20M 2216K sleep 59 0 0:01:23 0.0% svc.configd/15
22638 root 3336K 2096K sleep 49 0 0:00:00 0.0% tcsh/1
27393 root 3112K 1980K sleep 59 0 0:00:20 0.0% rpc.nisd_resolv/1
365 root 6784K 1976K sleep 59 0 0:00:06 0.0% hald/4
22600 jk 3524K 1920K sleep 59 0 0:00:00 0.0% tcsh/1
22608 jk 3512K 1900K sleep 59 0 0:00:00 0.0% tcsh/1
22634 jk 3400K 1864K sleep 59 0 0:00:00 0.0% tcsh/1
27392 root 6300K 1820K sleep 59 0 0:03:38 0.0% ypserv/1
22500 root 8504K 1652K sleep 59 0 0:00:00 0.0% dtlogin/1
2876 root 3568K 1612K sleep 59 0 0:01:23 0.0% automountd/5
2859 root 5068K 1544K sleep 59 0 0:00:27 0.0% inetd/4
305 root 3996K 1524K sleep 59 0 0:00:44 0.0% devfsadm/7
22607 root 2632K 1476K sleep 59 0 0:00:00 0.0% in.rlogind/1
22599 root 2632K 1464K sleep 59 0 0:00:00 0.0% in.rlogind/1
15632 postgres 19M 1412K sleep 59 0 0:00:00 0.0% postgres/1
7137 root 2716K 1364K sleep 100 - 0:01:44 0.0% xntpd/1
4274 root 3844K 1324K sleep 59 0 0:00:00 0.0% mountd/5
18588 root 2536K 1312K sleep 59 0 0:00:00 0.0% hald-addon-stor/1
579 daemon 3588K 1292K sleep 59 0 0:01:27 0.0% rpcbind/1
6764 root 4108K 1176K sleep 59 0 0:00:11 0.0% syslogd/13
4238 root 8028K 1156K sleep 59 0 0:00:48 0.0% sendmail/1
Btw, when the xsltproc process starts while memory usage on the
machine looks like this, it completes in less than a minute
(this is after several xsltproc runs, when the zfs arc has released
enough memory for the big process to fit into free memory):
load averages: 0.00, 0.05, 0.06 16:14:03
71 processes: 70 sleeping, 1 on cpu
CPU states: 99.8% idle, 0.0% user, 0.2% kernel, 0.0% iowait, 0.0% swap
Memory: 4031M real, 2589M free, 123M swap in use, 4057M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
22993 jk 1 59 0 3788K 1556K cpu/0 0:00 0.02% top
27392 root 1 59 0 6300K 1860K sleep 3:38 0.00% ypserv
22498 root 1 59 0 0K 0K sleep 0:01 0.00% Xorg
22535 root 1 59 0 13M 4520K sleep 0:00 0.00% dtgreet
4276 daemon 2 60 -20 2756K 720K sleep 5:35 0.00% nfsd
3307 root 1 59 0 7080K 2732K sleep 3:22 0.00% intrd
7137 root 1 100 -20 2716K 1364K sleep 1:44 0.00% xntpd
579 daemon 1 59 0 3588K 1296K sleep 1:27 0.00% rpcbind
9 root 15 59 0 20M 2340K sleep 1:23 0.00% svc.configd
2876 root 5 59 0 3568K 1836K sleep 1:23 0.00% automountd
15634 postgres 1 59 0 19M 1152K sleep 1:06 0.00% postgres
26803 root 35 59 0 5576K 3072K sleep 0:56 0.00% nscd
4238 root 1 59 0 8028K 1192K sleep 0:48 0.00% sendmail
305 root 7 59 0 3996K 1524K sleep 0:44 0.00% devfsadm
7 root 12 59 0 11M 592K sleep 0:29 0.00% svc.startd
Now all I have to do is fill the arc cache with
"find /path/to/a/zfs/filesystem -type f -exec grep does_not_exist {} +",
after that find, memory usage changes like this:
load averages: 0.12, 0.20, 0.14 16:19:44
71 processes: 70 sleeping, 1 on cpu
CPU states: 99.7% idle, 0.1% user, 0.2% kernel, 0.0% iowait, 0.0% swap
Memory: 4031M real, 669M free, 123M swap in use, 2137M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
22993 jk 1 59 0 3788K 1556K cpu/1 0:00 0.02% top
22498 root 1 59 0 0K 0K sleep 0:01 0.00% Xorg
22535 root 1 59 0 13M 4520K sleep 0:00 0.00% dtgreet
4276 daemon 2 60 -20 2756K 720K sleep 5:35 0.00% nfsd
27392 root 1 59 0 6300K 1860K sleep 3:38 0.00% ypserv
3307 root 1 59 0 7080K 2732K sleep 3:22 0.00% intrd
7137 root 1 100 -20 2716K 1364K sleep 1:44 0.00% xntpd
579 daemon 1 59 0 3588K 1296K sleep 1:27 0.00% rpcbind
9 root 15 59 0 20M 2340K sleep 1:23 0.00% svc.configd
2876 root 5 59 0 3568K 1836K sleep 1:23 0.00% automountd
15634 postgres 1 59 0 19M 1152K sleep 1:06 0.00% postgres
26803 root 35 59 0 5576K 3072K sleep 0:56 0.00% nscd
4238 root 1 59 0 8028K 1192K sleep 0:48 0.00% sendmail
305 root 7 59 0 3996K 1524K sleep 0:44 0.00% devfsadm
7 root 12 59 0 11M 592K sleep 0:29 0.00% svc.startd
... and in this environment xsltproc needs > 15 minutes,
because it doesn't get enough RSS:
load averages: 0.30, 0.23, 0.15 16:20:31
72 processes: 71 sleeping, 1 on cpu
CPU states: % idle, % user, % kernel, % iowait, % swap
Memory: 4031M real, 32M free, 2013M swap in use, 1151M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
23076 jk 1 60 0 1894M 1556M sleep 0:17 19.38% xsltproc
23078 jk 1 58 0 3724K 1456K cpu/1 0:00 0.09% top
22498 root 1 60 0 0K 0K sleep 0:01 0.01% Xorg
27392 root 1 60 0 6300K 1600K sleep 3:38 0.01% ypserv
15634 postgres 1 59 0 19M 1116K sleep 1:06 0.00% postgres
4238 root 1 59 0 8028K 1044K sleep 0:48 0.00% sendmail
22535 root 1 59 0 13M 1504K sleep 0:00 0.00% dtgreet
4276 daemon 2 60 -20 2756K 564K sleep 5:35 0.00% nfsd
3307 root 1 59 0 7080K 1512K sleep 3:22 0.00% intrd
7137 root 1 100 -20 2716K 1364K sleep 1:44 0.00% xntpd
579 daemon 1 59 0 3588K 1048K sleep 1:27 0.00% rpcbind
2876 root 5 59 0 3568K 1592K sleep 1:23 0.00% automountd
9 root 15 59 0 20M 624K sleep 1:23 0.00% svc.configd
26803 root 35 59 0 5576K 2096K sleep 0:56 0.00% nscd
305 root 7 59 0 3996K 1000K sleep 0:44 0.00% devfsadm
....
load averages: 0.32, 0.24, 0.16 16:21:00
72 processes: 71 sleeping, 1 on cpu
CPU states: 99.3% idle, 0.1% user, 0.6% kernel, 0.0% iowait, 0.0% swap
Memory: 4031M real, 118M free, 2549M swap in use, 952M swap free
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
23076 jk 1 60 0 2430M 1821M sleep 0:22 8.97% xsltproc
23078 jk 1 59 0 3788K 1408K cpu/1 0:00 0.03% top
22498 root 1 59 0 0K 0K sleep 0:01 0.01% Xorg
27392 root 1 59 0 6300K 1656K sleep 3:38 0.00% ypserv
3307 root 1 59 0 7080K 2356K sleep 3:22 0.00% intrd
9 root 15 59 0 20M 1056K sleep 1:23 0.00% svc.configd
15634 postgres 1 59 0 19M 1152K sleep 1:06 0.00% postgres
22535 root 1 59 0 13M 1504K sleep 0:00 0.00% dtgreet
4276 daemon 2 60 -20 2756K 564K sleep 5:35 0.00% nfsd
7137 root 1 100 -20 2716K 1364K sleep 1:44 0.00% xntpd
579 daemon 1 59 0 3588K 1104K sleep 1:27 0.00% rpcbind
2876 root 5 59 0 3568K 1512K sleep 1:23 0.00% automountd
26803 root 35 59 0 5576K 2404K sleep 0:56 0.00% nscd
4238 root 1 59 0 8028K 1144K sleep 0:48 0.00% sendmail
305 root 7 59 0 3996K 1000K sleep 0:44 0.00% devfsadm
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
23076 jk 2430M 1831M sleep 60 0 0:00:22 4.6% xsltproc/1
22498 root 40M 8948K sleep 59 0 0:00:01 0.0% Xorg/1
23080 jk 4556K 2956K cpu0 59 0 0:00:00 0.0% prstat/1
26803 root 5576K 2404K sleep 59 0 0:00:56 0.0% nscd/35
3307 root 7080K 2356K sleep 59 0 0:03:22 0.0% intrd/1
22634 jk 3400K 2064K sleep 49 0 0:00:00 0.0% tcsh/1
27393 root 3112K 1732K sleep 59 0 0:00:20 0.0% rpc.nisd_resolv/1
27392 root 6300K 1656K sleep 59 0 0:03:38 0.0% ypserv/1
2876 root 3568K 1512K sleep 59 0 0:01:23 0.0% automountd/5
22535 root 13M 1508K sleep 59 0 0:00:00 0.0% dtgreet/1
15632 postgres 19M 1416K sleep 59 0 0:00:00 0.0% postgres/1
7137 root 2716K 1364K sleep 100 - 0:01:44 0.0% xntpd/1
22500 root 8504K 1332K sleep 59 0 0:00:00 0.0% dtlogin/1
365 root 6784K 1332K sleep 59 0 0:00:06 0.0% hald/4
2859 root 5068K 1308K sleep 59 0 0:00:27 0.0% inetd/4
4238 root 8028K 1156K sleep 59 0 0:00:48 0.0% sendmail/1
15634 postgres 19M 1152K sleep 59 0 0:01:06 0.0% postgres/1
18588 root 2536K 1140K sleep 59 0 0:00:00 0.0% hald-addon-stor/1
3449 root 2536K 1124K sleep 59 0 0:00:00 0.0% hald-addon-stor/1
579 daemon 3588K 1104K sleep 59 0 0:01:27 0.0% rpcbind/1
4274 root 3844K 1100K sleep 59 0 0:00:00 0.0% mountd/5
6764 root 4108K 1072K sleep 59 0 0:00:11 0.0% syslogd/13
9 root 20M 1056K sleep 59 0 0:01:23 0.0% svc.configd/15
305 root 3996K 1000K sleep 59 0 0:00:44 0.0% devfsadm/7
22632 jk 1512K 980K sleep 59 0 0:00:00 0.0% script/1
15635 postgres 9636K 960K sleep 59 0 0:00:01 0.0% postgres/1
146 daemon 6568K 900K sleep 59 0 0:00:00 0.0% kcfd/3
4531 daemon 2664K 896K sleep 60 -20 0:00:00 0.0% nfs4cbd/2
22633 jk 1528K 892K sleep 59 0 0:00:00 0.0% script/1
601 root 2840K 748K sleep 59 0 0:00:00 0.0% keyserv/3
137 root 5572K 748K sleep 59 0 0:00:00 0.0% syseventd/15
2832 root 3040K 612K sleep 59 0 0:00:00 0.0% cron/1
7 root 11M 588K sleep 59 0 0:00:29 0.0% svc.startd/12
Total: 72 processes, 212 lwps, load averages: 0.25, 0.23, 0.15
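To make the four top snapshots from 16:14:03 onward easier to compare, the following Python sketch extracts the "Memory:" summary lines and prints the change between consecutive samples. The regular expression and the helper parse_memory_line are assumptions for illustration, matched to the top output format pasted in this message.

```python
import re

# Matches top's "Memory:" summary line in the form pasted in this thread.
MEM_RE = re.compile(
    r"Memory: (\d+)M real, (\d+)M free, (\d+)M swap in use, (\d+)M swap free"
)

# (timestamp, summary line) pairs copied from the top output above.
SNAPSHOTS = [
    ("16:14:03", "Memory: 4031M real, 2589M free, 123M swap in use, 4057M swap free"),
    ("16:19:44", "Memory: 4031M real, 669M free, 123M swap in use, 2137M swap free"),
    ("16:20:31", "Memory: 4031M real, 32M free, 2013M swap in use, 1151M swap free"),
    ("16:21:00", "Memory: 4031M real, 118M free, 2549M swap in use, 952M swap free"),
]


def parse_memory_line(line: str) -> dict:
    """Return the four megabyte counters of one summary line (hypothetical helper)."""
    m = MEM_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized Memory line: {line!r}")
    real, free, swap_used, swap_free = map(int, m.groups())
    return {"real": real, "free": free,
            "swap_used": swap_used, "swap_free": swap_free}


parsed = [(stamp, parse_memory_line(line)) for stamp, line in SNAPSHOTS]

previous = None
for stamp, current in parsed:
    if previous is None:
        print(f"{stamp}: free={current['free']}M swap_used={current['swap_used']}M "
              f"swap_free={current['swap_free']}M")
    else:
        print(f"{stamp}: free {current['free'] - previous['free']:+d}M, "
              f"swap in use {current['swap_used'] - previous['swap_used']:+d}M, "
              f"swap free {current['swap_free'] - previous['swap_free']:+d}M")
    previous = current
```

Between 16:14:03 and 16:19:44 free memory and free swap both fall by 1920M while swap in use stays at 123M, which is consistent with the find run filling the arc cache rather than any process growing.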
This message posted from opensolaris.org