Nagle’s Algorithm and performance problems

In the past, I’ve seen issues on latency-sensitive applications that were quite difficult to root-cause. The symptoms tended to be unexpected delays and performance problems in applications that send frequent but very small updates. This is easy to spot if your app has some sort of timestamping mechanism to determine whether delays are occurring (e.g. you can see a delay that is too large to be network-induced); otherwise you’re into packet analysis territory.

Wireshark traces tended to show a delay in ACKs being sent back and/or fast retransmits where the other end was still waiting for an ACK to a packet. The delays were far too high to be a network issue (or so you’d think) and everything else checked out, so blame was levelled at the server end, but it took a while to find out exactly where the delay was being introduced.

From Microsoft KB2020559:

The Nagle Algorithm applied as part of RFC 1122 means that small payload network transmissions (such as read SCSI OpCode commands) may not be sent until either a full TCP segment is reached on that NIC or the delayed acknowledge time trigger is reached (200ms).

The full procedure is in the KB article, which also highlights some potential pitfalls you should be aware of, but for reference:

  • Start Registry Editor (Regedit.exe).
  • Locate and then click the following registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces. The interfaces will be listed underneath by automatically generated GUIDs such as {064A622F-850B-4C97-96B3-0F0E99162E56}. Click each of the interface GUIDs and perform the following steps:
  • Check the IPAddress or DhcpIPAddress parameters to determine whether the interface is used for iSCSI traffic. If not, skip to the next interface.
  • On the Edit menu, point to New, and then click DWORD value.
  • Name the new value TcpAckFrequency, and assign it a value of 1.
  • Exit the Registry Editor.
  • Restart Windows for this change to take effect.
  • On Linux there is no per-NIC equivalent that I can find; the setting has to be coded into the application itself by enabling the TCP_NODELAY socket option, which disables Nagle on that socket. A minimal sketch follows this list.
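
For reference, here is a minimal Python sketch of the Linux side (the address, port and payload are placeholders of my own, not anything from the KB article):

    import socket

    # Disable Nagle's algorithm on this socket so that small writes are
    # sent immediately instead of being coalesced into larger segments.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    sock.connect(("192.0.2.10", 5000))   # placeholder host/port
    sock.sendall(b"small, frequent update")
    sock.close()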

    Why you should try Splunk

    Do you have massive syslog files that take an age to grep or run your awk scripts through?

    Splunk will be your saviour. It’s easy to define field extractions, follow a transaction all the way through and make pretty reports. I had a large script that hammered the CPU every morning to report on system outages and the environment, but with Splunk you can run huge queries in seconds. It also allows you to pair or correlate syslog events by a common key. Once your network syslogs hit around 100MB per day or more, it doesn’t make much sense to use the usual UNIX tools anymore.

    Unfortunately, the enterprise licences for Splunk are very expensive (it’d put me off buying if I were a small company, to be honest), but it’s easy enough to build usable, functional dashboards with the free version. You can’t send emails or SNMP traps from the free one, but in most cases it’s easy enough to get around that by writing a syslog monitoring script to cover the missing functionality.

    I’ll update this post with examples later, but for anyone who is struggling with huge volumes of syslogs, I seriously recommend that you look into it. Imagine a combination of grep and SQL on steroids, being able to say “show me the interfaces that have gone down in the last 24-hour period and haven’t come back up yet” or “show me the time periods where I’m getting more than 100 hits a minute on this webpage”. You also don’t need to change all your syslog config if you don’t want to; you can run a Splunk forwarder on your existing syslog servers.
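
    As a rough taste of the first of those, a search might look something like this (the index name, the Cisco-style %LINK-3-UPDOWN messages and the field extractions are assumptions about your environment, not a recipe):

    index=network sourcetype=syslog "%LINK-3-UPDOWN" earliest=-24h
    | rex "Interface (?<ifname>\S+), changed state to (?<state>\w+)"
    | stats latest(state) as last_state by host ifname
    | where last_state="down"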

    URL: http://www.splunk.com

    The community is pretty damned decent as well. Lots of keen people out there who are willing to help out.

    Nexus 5K upgrade breaks syslog

    After upgrading our Cisco Nexus 5Ks, we found that syslog to our syslog servers had mysteriously stopped: no more interface down/up alerts, and certainly no configuration alerts either. The “show logging server” command unearthed the problem. Originally there was no numeric severity suffix on the logging server commands.

    It seems that the code upgrade must have changed the default logging level, so we had to set it explicitly:

    logging server 10.0.0.129 7
    logging server 10.0.1.129 7
    

    Note that the logging levels are as follows:

    <0-7> 0-emerg;1-alert;2-crit;3-err;4-warn;5-notif;6-inform;7-debug

    If you want to see logs from system configuration events, you need debug (7); the number on the logging server line must be greater than or equal to the severity level of the messages you want forwarded.
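
    After making the change, you can confirm what the switch is now configured to send with the same command that exposed the problem in the first place:

    show logging server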

    Quick and dirty Cisco DHCP

    Sometimes we just need a simple DHCP server for testing in a lab, or for other basic purposes. Let’s face it, not many people use Cisco routers as DHCP servers in a production environment unless they are strapped for cash.

    Let’s cut through all the documentation and just set something simple up with an exclusion range.

    ! Exclude .250 to .254
    ip dhcp excluded-address 192.168.0.250 192.168.0.254
     
    ! New DHCP pool
    ip dhcp pool LAN
     network 192.168.0.0 /24
     domain-name localdomain.com
     dns-server 4.2.2.1
     default-router 192.168.0.254
     lease infinite
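
    Once a client has picked up an address, the leases and pool usage can be checked with the usual show commands:

    show ip dhcp binding
    show ip dhcp pool LAN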
    

    nnmcluster failure and issues

    NNMi nnmcluster sometimes does not behave as expected. The following gotchas and procedures may be helpful.

    Note that in ALL cases, you should have a recent backup in place in case of unexpected results.

    First scenario: nnmcluster is effectively “disconnected” on the primary, showing no status whatsoever.

    The secondary has become active, yet the primary still functions as normal (ovstatus -c shows everything as fine and it works as usual). The nnmcluster command suggests that only the remote member is active, and you are now unable to shut down cleanly.
    Solution:

    As nnmcluster -shutdown and nnmcluster -halt won’t respond on the primary (the system thinks you are not in cluster mode), run nnmcluster -shutdown on the secondary, then do the following on the primary:

    vi /var/opt/OV/shared/nnm/conf/props/nms-cluster.properties
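
    Inside that file the cluster name is defined by a single property; from memory it looks something like the line below (the property name and value shown here are illustrative, so check your own file rather than copying this):

    com.hp.ov.nms.cluster.name=MyCluster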
    

    Comment out the cluster name with #, then run

    ovstop
    

    BE PATIENT as it may take a few minutes to shut down.

    Again, on the primary, UNcomment the cluster name in the nms-cluster.properties file.

    Move the sentinel file

    mv /var/opt/OV/shared/nnm/node-cluster.sentinel /var/opt/OV/shared/nnm/node-cluster.sentinel.orig
    

    Check whether nnmcluster is still running and holding the required port, 7810 (it’ll essentially be a “detached” process here).

    lsof -i :7810
    

    KILL the PID with kill -9 [PID]
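
    If you prefer a one-liner, something like the following should do the same job (assuming your lsof supports -t for terse, PID-only output):

    kill -9 $(lsof -t -i :7810)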

    Then run

    nnmcluster -daemon
    

    Run nnmcluster, wait for it to reach the active state, then run nnmcluster -daemon on the standby.

    Second scenario: Unable to get both members up in the cluster.

    i) Check that the cluster name is the same in both config files, and ensure that NO whitespace is trailing the cluster name in either file (a quick way to check this is sketched after this list).

    ii) Run nnmcluster -shutdown on both members

    iii) Follow the steps above to find the PID bound to port 7810

    lsof -i :7810
    

    Kill this PID if it exists, then restart both sides. If this still doesn’t work, repeat the process, deleting the /var/opt/OV/shared/nnm/node-cluster.sentinel file on both servers before the restart.
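
    For step i), a quick way to eyeball the cluster name (and spot any trailing whitespace) on both servers is something like this; the grep pattern assumes the name is stored as a cluster.name property in nms-cluster.properties:

    grep cluster.name /var/opt/OV/shared/nnm/conf/props/nms-cluster.properties | cat -A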