Solaris 11 Express NAT/Router IP Fragments


 
# 1  
Old 04-08-2011

Upon replacing my Linux router/server with a Solaris one, I've noticed very poor network performance. The server itself has no issues connecting to the net, but clients using the server as a router are seeing a lot of IP fragments, according to some packet sniffing I did.

Here was my old setup.
<DSL_Modem>-<Linux Router>-<switch>-<wifi>-<macbook>
- this setup works fine, with no fragmentation or performance issues

Setup 1
<DSL_Modem>-<Sol 11 Router>-<switch>-<wifi>-<macbook>
- this setup has major packet fragmentation

Setup 2 (taking wifi out of the flow)
<DSL_Modem>-<Sol 11 Router>-<switch>-<macbook>
- this setup has major packet fragmentation

I played with various MTU settings on the Solaris server's internal NIC, but it made no difference, so I tried a couple of things on the client box.

I determined that the largest ping size (the -s value) I could send from my MacBook without fragmentation was 1464, using:
ping -D -s 1464 <any internet ip>
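
A side note on the arithmetic, as a rough sketch rather than anything authoritative: the -s value given to ping is the ICMP data size, not the full packet size. Assuming a plain IPv4 header with no options (20 bytes) and the 8-byte ICMP header, a 1464-byte payload works out to a 1492-byte packet on the wire. The function names below are made up for illustration.

```python
# Header sizes for an IPv4 ICMP echo request (assumes no IP options).
IPV4_HEADER = 20
ICMP_HEADER = 8

def packet_size_for_ping_payload(payload_bytes):
    """Total IPv4 packet size produced by `ping -s payload_bytes`."""
    return payload_bytes + ICMP_HEADER + IPV4_HEADER

def max_ping_payload_for_mtu(mtu):
    """Largest `ping -s` value that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER

print(packet_size_for_ping_payload(1464))  # 1492
print(max_ping_payload_for_mtu(1500))      # 1472
```

So a 1464-byte ping that passes with the DF bit set suggests a path MTU of about 1492, not 1464.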

Once I manually lowered the MTU on my MacBook from the default 1500 to 1464, web pages started loading normally. So here's the problem: why do I have to manually set the MTU on the client MacBook when my Solaris server is set up as the router? Is there some network tuning I can perform on the server that will address this?
# 2  
Old 04-08-2011
Maybe some firewall or setting is not allowing Path MTU Discovery, the process in which routing tables record, for specific hosts on normal routes, the discovered maximum MTU of the path to that host. This is done by sending packets with the DF (Don't Fragment) flag set and getting ICMP too-big messages back, or no response at all (a Path MTU black hole, where a firewall or setting blocks the ICMP too-big message).
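
To make the mechanism concrete, here is a toy simulation, not how any real stack is coded. It assumes a hypothetical path described only by a list of per-hop MTUs, and it models a black hole simply as "the ICMP reply never arrives". Real stacks use timeouts and probing that this sketch ignores.

```python
def discover_path_mtu(link_mtus, start_mtu=1500, icmp_blocked=False):
    """Simulate a sender probing with DF-flagged packets.

    link_mtus: MTU of each hop along a hypothetical path.
    Returns the discovered path MTU, or None for a black hole
    (the packet is dropped and the ICMP too-big reply never arrives).
    """
    size = start_mtu
    while True:
        # The first hop whose MTU is smaller than the packet rejects it.
        bottleneck = next((m for m in link_mtus if m < size), None)
        if bottleneck is None:
            return size        # the packet fit through every hop
        if icmp_blocked:
            return None        # sender never learns why packets vanish
        size = bottleneck      # ICMP too-big reports the next-hop MTU

print(discover_path_mtu([1500, 1492, 1500]))                     # 1492
print(discover_path_mtu([1500, 1492, 1500], icmp_blocked=True))  # None
```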

Packet fragmentation is not uncommon with VPN, for instance, as the VPN wrapper expands the packet size. NAT just rewrites packets in place and does not expand them, unless NAT features have been added since I was playing with it.

Normally, MTU is 1500 on Ethernet. The 802.3 MTU is 1492. I wonder what is trimming the MTU to 1464? Is VPN in play? http://en.wikipedia.org/wiki/Maximum_transmission_unit

Packet fragmentation should not be the end of the world, speed-wise, just a bit less than optimal, with all the additional small fragments. Could someone have turned off reassembly to avoid a related denial of service?

Extremely low MSS or RWIN (window size) settings can lower packet size. Low RWIN means the recipient does not have the buffer to hold the data in the packet, which seems very silly, but here we are. A "nice" TCP stack could ack the part it could digest (once it has some space) and discard the part it has no buffer for, but who knows?

At one time, for Internet traffic, servers wanted an RWIN of about 4 * MSS, where MSS = MTU - IP header (20 for IPv4) - TCP header (20, plus any option bytes padded to a multiple of 4, one of which can send the MSS), so 4 packets are sent and then an ack is waited for, but you can go much higher to boost performance at the cost of more potential retransmission in case of error. Originally, RWIN maxed out at 65535, but later (RFC 1323) it was enabled to go higher. http://en.wikipedia.org/wiki/Transmi...Window_scaling

RWIN represents the size of an end's TCP socket stream buffer (ret = setsockopt( sock, SOL_SOCKET, SO_RCVBUF, (char *)&sock_buf_size, sizeof(sock_buf_size) );), and RAM has gotten cheaper and more ample. RWIN needs to accommodate all the data you can normally send before the ack of the first packet returns, to not choke throughput. Big transmit socket buffers (SO_SNDBUF) are nice but not that critical to net throughput; they ensure that the sending app can hand all the data for one turn to the API and move on without blocking.

Of course, both ends have an MSS, but MSS is only important at the end receiving the bulk of the data, so the sourcing system can keep sending at max rate without delays. Welcome to the full duplex world of TCP, simulated if not real. Be careful to tune both ends! http://en.wikipedia.org/wiki/Maximum_segment_size http://en.wikipedia.org/wiki/Transmi...ment_structure
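
A rough worked example of the numbers above, offered as a sketch under assumptions: IPv4 with no TCP options, and an invented link of 8 Mbit/s with a 100 ms round trip. The helper names are hypothetical. The socket call is the Python counterpart of the setsockopt() shown above; the OS may round or cap whatever is requested.

```python
import socket

IPV4_HEADER = 20
TCP_HEADER = 20  # without options

def mss_for_mtu(mtu):
    """MSS for a given MTU, assuming IPv4 and no TCP options."""
    return mtu - IPV4_HEADER - TCP_HEADER

def bandwidth_delay_product(bits_per_second, rtt_ms):
    """Bytes that can be in flight before the first ack returns."""
    return bits_per_second * rtt_ms // 8000

def request_receive_buffer(sock, size_bytes):
    """Ask for an SO_RCVBUF of size_bytes; return what the OS reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

mss = mss_for_mtu(1500)
print(mss, 4 * mss)                             # 1460 5840
print(bandwidth_delay_product(8_000_000, 100))  # 100000
```

On those assumed numbers, the old 4 * MSS rule gives under 6 KB, while roughly 100 KB would need to fit in the window to keep the pipe full.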

So, once you find choke points in the MTU, you need to tune the RWIN and MSS so TCP will use it, tune any apps for big buffers, and ensure Path MTU Discovery and Black Hole Detection are properly configured. Then you can get close to the throughput you paid for, at least in the more popular direction.

Last edited by DGPickett; 04-08-2011 at 12:19 PM..
# 3  
Old 04-09-2011
Some good detail in there. I also found some useful information here: "MSS Problems with Sun PPPoE". Additionally, I reviewed my Linux router config to see what might have been "working" and found that the following firewall rule was likely addressing the issue I'm now experiencing with Solaris.

Code:
iptables -I FORWARD 1 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
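
As a rough illustration of what that rule does (a sketch, not the netfilter implementation): it rewrites the MSS option in forwarded SYN packets so that it never exceeds the path MTU minus 40 bytes of IPv4 and TCP headers. The function name is made up, and 1460 is the MSS a client with a 1500 MTU would typically advertise.

```python
IPV4_HEADER = 20
TCP_HEADER = 20

def clamp_mss(advertised_mss, path_mtu):
    """MSS value a clamping router would leave in a forwarded SYN."""
    ceiling = path_mtu - IPV4_HEADER - TCP_HEADER
    return min(advertised_mss, ceiling)

print(clamp_mss(1460, 1492))  # 1452
print(clamp_mss(1400, 1492))  # 1400
```

With the clamp in place, neither end sends segments too large for the PPPoE link, so clients can keep their default 1500 MTU.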

Too bad it wasn't that simple with the Solaris setup. I'll attempt to tune the Solaris setup and see how I make out.

---------- Post updated 04-09-11 at 08:04 PM ---------- Previous update was 04-08-11 at 08:16 PM ----------

It's working now and appears to be performing, but is it optimal? I'm not sure yet. For those who wish to tackle using Solaris as a firewall/router against a PPPoE connection, I'll put my details here.

By default, the negotiated MTU over PPPoE is going to be 1492.
Starting from the 1492 MTU, I've subtracted 40 (20 bytes of IPv4 header plus 20 bytes of TCP header) for a max MSS of 1452 for the TCP stack to use.
Code:
ndd -set /dev/tcp tcp_mss_max_ipv4 1452

In addition to this I'll want to turn off Path MTU Discovery.
Code:
ndd -set /dev/ip ip_path_mtu_discovery 0

# 4  
Old 04-11-2011
Path MTU Discovery is nice on a varied intranet, but not so good for the Internet, where the short relationships might make it not worth the effort.

Use a sniffer to see what sort of options are in your standard TCP packets (not SYN or FIN). Add their length to the 40 before subtracting. Sometimes, the RWIN is called max MSS. Try various options with a long stream between two local hosts.
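
A small sketch of that adjustment, assuming IPv4 and a hypothetical helper name. TCP options are padded to a multiple of 4 bytes, so a 10-byte timestamp option, for example, costs 12 bytes per segment.

```python
IPV4_HEADER = 20
TCP_HEADER = 20

def padded_options_length(option_bytes):
    """TCP options are padded to a multiple of 4 bytes."""
    return (option_bytes + 3) // 4 * 4

def payload_per_segment(mtu, option_bytes=0):
    """Data bytes that fit in one segment after headers and options."""
    return mtu - IPV4_HEADER - TCP_HEADER - padded_options_length(option_bytes)

print(payload_per_segment(1492))      # 1452
print(payload_per_segment(1492, 10))  # 1440
```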

Normally, frags are for big UDP packets and normal-net-max packets on VPN.