Discussion:
NFSv3 versus NFSv4 performance
Thomas Nau
2011-11-24 16:15:22 UTC
Permalink
Dear all


We recently ran a few tests using filebench with different workloads
such as varmail and others. In some cases, especially the varmail test
run, v3 performs much better than v4 with repect to the total number
of OPs handled per interval. We destroy the pool and reboot the server
after each test so there should be no dependency left. Also the
workload is big enough to fit into main memory for the client as well as
for the server. Given that we tend to use v3 for a few hundred clients
but honetsly, I expected v4 to be superior. Both client and server run
Solaris 11 with a few tweaks in /etc/system such as the number of connections, ...

For example pre-allocating 80.000 files takes about 270 seconds in v3 versus 357
in NFSv4. The total IOs for a mix of create/delete/append/... ops are

v3:
IO Summary: 6894774 ops, 11490.8 ops/s, (1768/1768 r/w) 53.4mb/s, 662us cpu/op, 4.5ms latency


v4:
IO Summary: 3084085 ops, 5139.9 ops/s, (791/791 r/w) 30.1mb/s, 921us cpu/op, 10.1ms latency


Any hints or recommendations?

Thomas
o***@public.gmane.org
2011-11-28 18:56:42 UTC
Permalink
Hi

We use ZFS Storage Appliances to store mailboxes. Our application manages them in maildir format (at least
almost...)

For our workload (many thousand POP3/IMAP4 concurrent sessions and lots of SMTP traffic) NFSv4 works just
fine. IIRC correctly, NFSv4 issues many more synchronous writes than NFSv3, so be sure to have your ZFS intend log
devices on fast SSDs.

Cheers
Mika
----Ursprüngliche Nachricht----
Datum: 24.11.2011
17:15
Betreff: [osol-discuss] NFSv3 versus NFSv4 performance
Dear all
We recently ran a few tests using filebench with different workloads
such as varmail and others. In some cases,
especially the varmail test
run, v3 performs much better than v4 with repect to the total number
of OPs handled per
interval. We destroy the pool and reboot the server
after each test so there should be no dependency left. Also the
workload is big enough to fit into main memory for the client as well as
for the server. Given that we tend to use v3
for a few hundred clients
but honetsly, I expected v4 to be superior. Both client and server run
Solaris 11 with a
few tweaks in /etc/system such as the number of connections, ...
For example pre-allocating 80.000 files takes about
270 seconds in v3 versus 357
in NFSv4. The total IOs for a mix of create/delete/append/... ops are
IO
Summary: 6894774 ops, 11490.8 ops/s, (1768/1768 r/w) 53.4mb/s, 662us cpu/op, 4.5ms latency
IO
Summary: 3084085 ops, 5139.9 ops/s, (791/791 r/w) 30.1mb/s, 921us cpu/op, 10.1ms latency
Any hints or
recommendations?
Thomas
_______________________________________________
opensolaris-discuss mailing list
_______________________________________________
opensolaris-discuss mailing li
Thomas Nau
2011-11-28 22:18:20 UTC
Permalink
Thanks for the information Pavel
We have a support contract but the bug is filled already so
I assume Oracle is already working on it ;)

Thomas
Hi Thomas,
6826477 NFSv4 can be much slower and more expensive than NFSv3 when creating
files at high concurrency
If you have a support contract you can contact Oracle for fixing this issue.
Regards,
Pavel
Post by Thomas Nau
Dear all
We recently ran a few tests using filebench with different workloads
such as varmail and others. In some cases, especially the varmail test
run, v3 performs much better than v4 with repect to the total number
of OPs handled per interval. We destroy the pool and reboot the server
after each test so there should be no dependency left. Also the
workload is big enough to fit into main memory for the client as well as
for the server. Given that we tend to use v3 for a few hundred clients
but honetsly, I expected v4 to be superior. Both client and server run
Solaris 11 with a few tweaks in /etc/system such as the number of connections, ...
For example pre-allocating 80.000 files takes about 270 seconds in v3 versus 357
in NFSv4. The total IOs for a mix of create/delete/append/... ops are
IO Summary: 6894774 ops, 11490.8 ops/s, (1768/1768 r/w) 53.4mb/s,
662us cpu/op, 4.5ms latency
IO Summary: 3084085 ops, 5139.9 ops/s, (791/791 r/w) 30.1mb/s,
921us cpu/op, 10.1ms latency
Any hints or recommendations?
Thomas
_______________________________________________
opensolaris-discuss mailing list
michael masterson
2011-11-28 23:58:34 UTC
Permalink
Hi, I'm hoping that someone here can help me...

it was only thanks to this list and a couple of threads that I was able
to get my opensolaris > sol11express machine upgraded to solaris 11...
that really was somewhat painful.

the zone, I eventually had to just blow away and start over with,
because it kept wanting to look at opensolaris.org for the ips packages,
and refused to upgrade. :( I never did figure out how to get it to stop
looking there, I deleted everything referring to opensolaris from
/var/pkg on zpools the zone referred to.

but, now, I've got another problem, amd motherboard with dual bge nics,
bge0 won't plumb up after the upgrade to sol11,

I get these exact symptoms, as reported in this bug:

https://defect.opensolaris.org/bz/show_bug.cgi?format=multiple&id=6343

worked just fine in express.

my output from dmesg:
Nov 27 16:24:52 woden mac: [ID 469746 kern.info] NOTICE: bge1 registered
Nov 27 16:24:52 woden pci_pci: [ID 370704 kern.notice] PCI-device:
pci14e4,***@9,1, bge1
Nov 27 16:24:52 woden genunix: [ID 936769 kern.notice] bge1 is
/***@0,0/pci1022,***@a/pci14e4,***@9,1
Nov 27 16:24:52 woden ip: [ID 205306 kern.error] bge0: DL_ATTACH_REQ
failed: DL_SYSERR (errno 22)
Nov 27 16:24:52 woden ip: [ID 670008 kern.error] bge0: DL_BIND_REQ
failed: DL_OUTSTATE
Nov 27 16:24:52 woden ip: [ID 328654 kern.error] bge0: DL_PHYS_ADDR_REQ
failed: DL_OUTSTATE
Nov 27 16:24:52 woden ip: [ID 954675 kern.error] bge0: DL_UNBIND_REQ
failed: DL_OUTSTATE
Nov 27 16:24:53 woden smbd[2075]: [ID 702911 daemon.notice] dyndns:
failed to get domainname
Nov 27 16:24:53 woden last message repeated 1 time
Nov 27 16:25:00 woden mac: [ID 435574 kern.info] NOTICE: bge1 link up,
1000 Mbps, full duplex

as you can see, bge1 works fine, (thankfully)

Loading...