From: Steffen Nurpmeso
Subject: Re: Add sysctl to disable Nagle's algorithm (RFC 896 - Congestion Control)
To: Job Snijders
Cc: tech@openbsd.org
Date: Mon, 13 May 2024 21:28:37 +0200

Job Snijders wrote:
|Back in the early 1980s, a suggestion was put forward how to improve TCP
|congestion control, also known as "Nagle's algorithm". See RFC 896.
|
|Nagle's algorithm can cause consecutive small packets from userland
|applications to be coalesced into a single TCP packet. This happens at
|the cost of an increase in latency: the sender is locally queuing up
|data until it either receives an acknowledgement from the remote side or
|sufficient additional data piled up to send a full-sized segment.
|
|This approach might have been advantageous 40 - 50 years ago, when
|multiple users were concurrently working behind 1200 baud lines. Nagle's
|algorithm discourages sending tiny segments when the data to be sent
|increases in small increments. The trade-off being "sacrificing a
|degree of interactivity" in exchange for "increased throughput".
|
|In recent days the applicability and usefulness of Nagle's algorithm in
|our times came into question. Nagle's algorithm negatively interacts
|with Delayed Acks (RFC 813), as per Nagle himself:
|https://news.ycombinator.com/item?id=10608356 and a more complete
|description: https://datatracker.ietf.org/doc/html/draft-minshall-nagle
|
|But some argue "Given the vast amount of work a modern server can do in
|even a few hundred microseconds, delaying sending data for even one RTT
|isn’t clearly a win." https://brooker.co.za/blog/2024/05/09/nagle.html
|
|In base, various applications have taken it upon themselves to disable
|Nagle's algorithm: ssh, httpd, iscsid, relayd, bgpd, and unwind. Bluhm
|and I are not aware of applications that explicitly enable Nagle.
|The standards say in RFC 9293 section 3.7.4: "A TCP implementation
|SHOULD implement the Nagle algorithm to coalesce short segments.
|However, there MUST be a way for an application to disable the Nagle
|algorithm on an individual connection."

TCP_NODELAY/TCP_NOPUSH is what applications already used to adjust
the behaviour as necessary more than a quarter of a century ago.

|So, why not take it a step further and allow for the algorithm to be
|disabled on the whole system? :-)
|
|The below changeset introduces sysctl net.inet.tcp.nodelay, which if set
|to 1 will simply cause TCP_NODELAY to be set on all TCP sockets.
|
|Note that with net.inet.tcp.nodelay set to 1, applications still can
|inspect and disable TCP_NODELAY using getsockopt() and setsockopt().
|
|Perhaps in the future - after more study & contemplation - we'll
|change this sysctl's default from 0 to 1?

I would not.  I have coded this explicitly in the past, for example
doing explicit switches in sendfile() code if any headers or
trailers were set, etc.
My real thought, however, was that the crack of doom for TCP will
soon have come due to the pushing of QUIC, and that kills the nail
(Nagel).  Nagle, sorry.

|Kind regards,

Greetings.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
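
For reference, a minimal sketch of the per-connection knob discussed
above, using the standard setsockopt()/getsockopt() calls with
TCP_NODELAY.  The helper names set_nodelay() and get_nodelay() are
made up for illustration and are not taken from the proposed diff.
It assumes the option value is a plain int, as on the BSDs and Linux.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: set TCP_NODELAY (on != 0, Nagle disabled) or
 * clear it (on == 0, Nagle enabled) on one socket.
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 */
int
set_nodelay(int fd, int on)
{
	int val = on ? 1 : 0;

	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, sizeof(val));
}

/*
 * Hypothetical helper: inspect the current setting.
 * Returns 1 if TCP_NODELAY is set, 0 if it is not, -1 on error.
 */
int
get_nodelay(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) == -1)
		return -1;
	/* Assumption: the kernel reports the option as an int. */
	assert(len == sizeof(val));
	return val != 0;
}
```

With the proposed net.inet.tcp.nodelay set to 1, an application that
wants Nagle back on a given connection would presumably call
something like set_nodelay(fd, 0) after creating the socket.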