Thursday, May 31, 2012

A Brief History of SDN

Software Defined Networking is a new and exciting development in networking.  However, it's been a long time coming:

1985    IGRP  planned to take into account metrics beyond hop count, including reliability and load

1991    OSPF  specified calculating paths to deliver better throughput, lower latency, etc.  John Moy's excellent book remains one of my very favorites.

1996    Cisco Policy Based Routing was introduced with IOS 11.0 and "provides a flexible mechanism for network administrators to customize the operation of the routing table and the flow of traffic within their networks."

2006    Cisco OER,  renamed PfR sometime around 2010, allows the traffic forwarding decision to be based on delay, packet loss, and link load.  Arguably the closest precursor to SDN (at least from Cisco) in that it has a 'separation of control plane', 'a centralized controller' and possibly 'programmability by external applications'.

Please send me other examples!

Thursday, May 24, 2012

Slow TCP but no packet loss

Many times I've heard the claim that "out-of-order packets" wreak havoc with your application.  Many interpretations conclude that this must be due to poor application coding.  Blame can flow both ways, it seems.  My favorite comment in all the threads about Duplicate ACKs and Fast Retransmit concludes with "how the network performance is improved by a server reboot."

Back to our main concern here.  When a receiver gets TCP segments 1, 2, 3 and 5, it will immediately send an ACK that still asks for segment 4, exactly the ACK it already sent to confirm receipt of segment 3.  Ding.  First 'duplicate ACK'.
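Here is a minimal sketch of that receiver behavior. It is a toy model: whole segment numbers stand in for byte sequence numbers, delayed ACKs are ignored, and the function name `receiver_acks` is just for illustration.

```python
def receiver_acks(arrivals):
    """Return the cumulative ACK (next segment expected) sent after each arrival."""
    next_expected = 1
    buffered = set()  # out-of-order segments held until the gap is filled
    acks = []
    for seg in arrivals:
        if seg == next_expected:
            next_expected += 1
            # Drain any buffered segments that are now in order.
            while next_expected in buffered:
                buffered.remove(next_expected)
                next_expected += 1
        elif seg > next_expected:
            buffered.add(seg)
        # Every arrival is ACKed with the next segment still missing.
        acks.append(next_expected)
    return acks


acks = receiver_acks([1, 2, 3, 5])
duplicates = sum(1 for prev, cur in zip(acks, acks[1:]) if prev == cur)
print(acks, "duplicate ACKs:", duplicates)
```

Segment 5 arriving ahead of 4 produces a second ACK asking for 4, even though nothing has been lost yet.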

Now, listen closely as I don't plan to get in this habit.  The RFC for TCP Congestion Control is a moderately pain-free read and gives the authoritative fundamentals on how TCP should behave.  Give it a look!

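The key to dig out is somewhere around page 6.  Duplicate ACKs cause the sender to reduce its congestion window: once the third duplicate arrives, it retransmits the segment it believes is missing and cuts its sending rate.  The whole mechanism and the scenarios for SLOWING down the transmission rate of a TCP flow are detailed there.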
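As a rough sketch of the sender side, assuming the RFC in question is RFC 5681 and using its fast retransmit/fast recovery arithmetic: on the third duplicate ACK, ssthresh is set to max(FlightSize / 2, 2 * SMSS) and cwnd to ssthresh + 3 * SMSS. The 1460-byte SMSS and the 20-segment flight are my assumptions, and the function name is hypothetical.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def on_third_duplicate_ack(flight_size, smss=SMSS):
    """Return (ssthresh, cwnd) in bytes after the third duplicate ACK."""
    ssthresh = max(flight_size // 2, 2 * smss)
    # The window is temporarily inflated by the three segments that
    # the duplicate ACKs show have left the network.
    cwnd = ssthresh + 3 * smss
    return ssthresh, cwnd


flight_size = 20 * SMSS  # assume 20 full segments were in flight
ssthresh, cwnd = on_third_duplicate_ack(flight_size)
print("before:", flight_size, "ssthresh:", ssthresh, "cwnd during recovery:", cwnd)
# When the retransmission is finally acknowledged, cwnd deflates to ssthresh,
# i.e. roughly half of what was in flight before the duplicates arrived.
```

So a single reordering event that generates three duplicate ACKs can halve the window without any packet actually being dropped.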

MORAL:  nothing has to be lost for TCP to decide that congestion has been encountered and slow down.  Sometimes significantly.  Also came across these great course slides.

Thursday, May 17, 2012

Two missing but vital Cisco Nexus vPC commands

So you come across a pair of Nexus 5548 switches and dive right into setting up a Virtual PortChannel following the configuration guides.  You'll wind up with something like this:

Things go swimmingly!  That is, until you want to upgrade or have to power cycle one of the switches.  Your server can't get anywhere, nor can it be reached, for quite a few seconds.  Odd, you think.  Surely frames can happily flow along the remaining active link.  Isn't that what vPC is for?  As the switch comes back online, you then get ANOTHER period of lost connectivity to your server.

Tuesday, May 8, 2012

TCP Throughput across the Internet

After reading Brad Hedlund's excellent overview and digging into the great tools at SWITCH, I ran some quick numbers for the throughput that transfers might see across the Internet.

First, taking low packet loss of 0.01% and checking across various delays from 5 to 60ms, I get:

Then, using 60 ms as a rather normal round-trip time (RTT) across the Internet, I tried values for packet loss increasing to a very normal 0.1%:

Looks like it's pretty hard to fill that 25 Mbps DSL line.  In the future, I will look at tuning options for improving that.  Comments welcome!
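For anyone who wants to reproduce numbers of this kind, here is a small sketch using the well-known Mathis et al. approximation, throughput <= (MSS / RTT) * (C / sqrt(p)). This is an assumption on my part about the underlying model, not necessarily what the tools above compute; the 1460-byte MSS, the constant C = sqrt(3/2), and the intermediate RTT and loss values are also assumed.

```python
from math import sqrt


def mathis_throughput_bps(mss_bytes, rtt_s, loss, c=sqrt(1.5)):
    """Approximate upper bound on steady-state TCP throughput, in bits/s."""
    return (mss_bytes * 8 / rtt_s) * (c / sqrt(loss))


MSS = 1460  # assumed maximum segment size, in bytes

print("Loss fixed at 0.01%, varying RTT:")
for rtt_ms in (5, 10, 20, 40, 60):
    mbps = mathis_throughput_bps(MSS, rtt_ms / 1000, 0.0001) / 1e6
    print(f"  RTT {rtt_ms:>2} ms -> {mbps:7.1f} Mbit/s")

print("RTT fixed at 60 ms, varying loss:")
for loss_pct in (0.01, 0.02, 0.05, 0.1):
    mbps = mathis_throughput_bps(MSS, 0.060, loss_pct / 100) / 1e6
    print(f"  loss {loss_pct:>4}% -> {mbps:7.1f} Mbit/s")
```

Under these assumptions, 60 ms and 0.01% loss gives about 24 Mbit/s, and raising the loss to 0.1% drops that to about 7.5 Mbit/s for a single flow.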

Friday, May 4, 2012

Response Time Composition

Waaaay back in 2006, I used a network monitoring appliance from Network Physics, a company I'd since lost touch with.  It delivered this graph that quickly became my best friend:

One glance at the Response Time Composition and I could see what had changed from a baseline.  I could also see where to dig in further for the cause of reported 'slowness'.  Of course, it was never the network; quite often it was the server or the sheer volume of application chattiness.

I'm pleased to say my old friend lives on and has been seriously enhanced by the folks at OPNET in their AppResponse Xpert tool.  Even more super-cool is the SaaS edition that fires up right in your #cloud-running application via a sweet bit of JavaScript.