
tcp: inherit listener congestion control for passive cnx

Rick Jones reported that a TCP_CONGESTION sockopt set on a listener
was ignored for its child sockets: right after accept(), the new
socket uses the system default congestion control.
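
A minimal user-space sketch of the scenario in the report, assuming Linux and
a loopback interface. The helper names (set_cc, get_cc, accepted_cc) are
hypothetical and not part of the patch; IPPROTO_TCP is used in place of
SOL_TCP, which has the same value on Linux.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>	/* TCP_CONGESTION (Linux) */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: select congestion control `name` on socket `fd`. */
static int set_cc(int fd, const char *name)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, strlen(name));
}

/* Hypothetical helper: read the congestion control name of `fd` into `buf`.
 * One byte is held back so the result is always NUL-terminated. */
static int get_cc(int fd, char *buf, socklen_t len)
{
	socklen_t optlen;

	if (len < 2)
		return -1;
	memset(buf, 0, len);
	optlen = len - 1;
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &optlen);
}

/* Hypothetical demo: set `name` on a loopback listener, accept one
 * connection, and report the accepted socket's congestion control in `out`.
 * Returns 0 on success, -1 on any error. */
static int accepted_cc(const char *name, char *out, socklen_t outlen)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	int lfd = -1, cfd = -1, afd = -1, rc = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		goto out;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (set_cc(lfd, name) < 0 ||
	    bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	afd = accept(lfd, NULL, NULL);
	if (afd < 0)
		goto out;

	rc = get_cc(afd, out, outlen);
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return rc;
}
```

Calling accepted_cc("reno", buf, sizeof(buf)) on a kernel that carries this
change should leave "reno" in buf even when the system default is something
else; before the change the accepted socket would report the system default.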

This seems to be an oversight in the initial design (quoted from Stephen).

Based on prior investigation and patch from Rick.

Reported-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet 13 years ago
parent
commit
d8a6e65f8b

+ 3 - 0
Documentation/networking/ip-sysctl.txt

@@ -175,6 +175,9 @@ tcp_congestion_control - STRING
 	connections. The algorithm "reno" is always available, but
 	additional choices may be available based on kernel configuration.
 	Default is set as part of kernel configuration.
+	For passive connections, the listener congestion control choice
+	is inherited.
+	[see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
 
 tcp_cookie_size - INTEGER
 	Default size of TCP Cookie Transactions (TCPCT) option, that may be

+ 1 - 0
net/ipv4/tcp_ipv4.c

@@ -1511,6 +1511,7 @@ exit:
 	return NULL;
 put_and_exit:
 	tcp_clear_xmit_timers(newsk);
+	tcp_cleanup_congestion_control(newsk);
 	bh_unlock_sock(newsk);
 	sock_put(newsk);
 	goto exit;

+ 3 - 1
net/ipv4/tcp_minisocks.c

@@ -495,7 +495,9 @@ struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
 		newtp->frto_counter = 0;
 		newtp->frto_highmark = 0;
 
-		newicsk->icsk_ca_ops = &tcp_init_congestion_ops;
+		if (newicsk->icsk_ca_ops != &tcp_init_congestion_ops &&
+		    !try_module_get(newicsk->icsk_ca_ops->owner))
+			newicsk->icsk_ca_ops = &tcp_init_congestion_ops;
 
 		tcp_set_ca_state(newsk, TCP_CA_Open);
 		tcp_init_xmit_timers(newsk);
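
The reference handling in the tcp_minisocks.c and tcp_ipv4.c hunks can be
modelled in user space as below. This is only a sketch under stated
assumptions: module_model, ca_ops_model, default_ops, try_get, put, child_ops
and cleanup_ops are hypothetical stand-ins for struct module,
struct tcp_congestion_ops, tcp_init_congestion_ops, try_module_get(),
module_put(), the patched branch and the module-reference part of
tcp_cleanup_congestion_control(); none of this is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module. */
struct module_model {
	int refs;		/* references held by sockets */
	bool unloading;		/* true once removal has started */
};

/* Hypothetical stand-in for struct tcp_congestion_ops. */
struct ca_ops_model {
	const char *name;
	struct module_model *owner;	/* NULL models built-in code */
};

/* Stand-in for tcp_init_congestion_ops, the fallback choice. */
static struct ca_ops_model default_ops = { "default", NULL };

/* Models try_module_get(): always succeeds for built-in code (NULL owner),
 * fails once the module is being unloaded. */
static bool try_get(struct module_model *m)
{
	if (!m)
		return true;
	if (m->unloading)
		return false;
	m->refs++;
	return true;
}

/* Models module_put(): drops one reference, no-op for built-in code. */
static void put(struct module_model *m)
{
	if (m)
		m->refs--;
}

/* Mirrors the patched branch: the child starts with the listener's ops
 * (copied when the socket is cloned) and falls back to the default only
 * if a reference on the owning module cannot be taken. */
static const struct ca_ops_model *child_ops(const struct ca_ops_model *inherited)
{
	if (inherited != &default_ops && !try_get(inherited->owner))
		return &default_ops;
	return inherited;
}

/* Models the module-reference release added to the put_and_exit error path. */
static void cleanup_ops(const struct ca_ops_model *ops)
{
	put(ops->owner);
}
```

The point of the pairing is that every successful child_ops() on a modular
algorithm takes one reference, so the error path has to give it back, which
is what the extra cleanup call in the tcp_ipv4.c hunk appears to be for.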