linux-bl808/include/net
Eric Dumazet 8b27dae5a2 tcp: add one skb cache for rx
Often, recvmsg() system calls and BH handling for a particular
TCP socket are done on different cpus.

This means the incoming skb has to be allocated on one cpu,
but freed on another.

This incurs high spinlock contention in the slab layer for small RPCs,
and also a high number of cache line ping pongs for larger packets.

A full size GRO packet might use 45 page fragments, meaning
that up to 45 put_page() can be involved.
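
One plausible origin for the figure of 45 is sketched below. It assumes an aggregated GRO packet of up to 64 KB built from segments that each carry 1448 bytes of payload (a 1500-byte MTU with TCP timestamps) and that each segment lands in its own page fragment. None of these numbers appear in the commit message, and the helper name is hypothetical.

```c
#include <assert.h>

/* Assumed values, not taken from the commit message. */
enum {
    GRO_MAX_BYTES = 65536, /* assumed upper bound on an aggregated GRO packet */
    MSS_PAYLOAD   = 1448   /* assumed: 1500 MTU - 20 IP - 20 TCP - 12 timestamps */
};

/* Number of full segments that fit in the aggregate. Under the assumption
 * of one page fragment per segment, this is also the number of put_page()
 * calls needed to release the packet. */
static int gro_full_segments(int max_bytes, int payload)
{
    return max_bytes / payload;
}
```

With the assumed values, gro_full_segments(GRO_MAX_BYTES, MSS_PAYLOAD) evaluates to 45.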

Moreover, performing the __kfree_skb() in the recvmsg() context
adds latency for user applications, and increases the probability
of trapping them in backlog processing, since the BH handler
might find the socket owned by the user.

This patch, combined with the prior one, increases RPC
performance by about 10 % on servers with a large number of cores.

(tcp_rr workload with 10,000 flows and 112 threads reaches 9 Mpps
 instead of 8 Mpps)

This also increases single bulk flow performance on 40Gbit+ links,
since in this case there are often two cpus working in tandem :

 - CPU handling the NIC rx interrupts, feeding the receive queue,
  and (after this patch) freeing the skbs that were consumed.

 - CPU in recvmsg() system call, essentially 100 % busy copying out
  data to user space.

Having at most one skb in a per-socket cache has very little risk
of memory exhaustion, and since it is protected by socket lock,
its management is essentially free.
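
The idea can be modelled in user space as follows. This is a minimal single-threaded sketch, not the kernel code: the model_* types and functions are hypothetical stand-ins for the real skb and socket structures, cpus are plain integers, and the socket lock is assumed to be held by the caller of both paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NCPUS 2

/* Hypothetical stand-ins, not the kernel's struct sk_buff / struct sock. */
struct model_skb {
    int alloc_cpu;                  /* cpu that allocated the buffer */
};

struct model_sock {
    struct model_skb *rx_skb_cache; /* at most one consumed skb awaiting free */
};

static int model_frees_on_cpu[MODEL_NCPUS]; /* how many frees each cpu did */

static struct model_skb *model_alloc_skb(int cpu)
{
    struct model_skb *skb = malloc(sizeof(*skb));

    if (skb)
        skb->alloc_cpu = cpu;
    return skb;
}

static void model_free_skb(struct model_skb *skb, int cpu)
{
    if (!skb)
        return;
    model_frees_on_cpu[cpu]++;
    free(skb);
}

/* recvmsg() side, socket lock held: the skb has been fully copied out.
 * Park it in the one-slot cache when the slot is empty and RPS/RFS is
 * not in use; otherwise free it here, on the consuming cpu. */
static void model_eat_skb(struct model_sock *sk, struct model_skb *skb,
                          int cpu, int rps_in_use)
{
    if (!rps_in_use && sk->rx_skb_cache == NULL) {
        sk->rx_skb_cache = skb;
        return;
    }
    model_free_skb(skb, cpu);
}

/* BH side, socket lock held: take whatever is parked and free it on the
 * cpu that runs the rx path. */
static void model_rx_bh(struct model_sock *sk, int cpu)
{
    struct model_skb *to_free = sk->rx_skb_cache;

    sk->rx_skb_cache = NULL;
    model_free_skb(to_free, cpu);
}
```

Because the slot holds a single pointer, a second skb consumed before the next rx event is simply freed in place, which is what bounds the memory held per socket.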

Note that if RPS/RFS is used, we do not enable this feature, because
there is a high chance that the same cpu is handling both the recvmsg()
system call and the TCP rx path, but that another cpu did the skb
allocations in the device driver right before the RPS/RFS logic.

To properly handle this case, it seems we would need to record
on which cpu the skb was allocated, and use a different channel
to give skbs back to this cpu.
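
A rough illustration of what such a return channel could look like, as a single-threaded user-space model. It is not part of this patch: every name here (ret_skb, ret_list, ret_give_back, ret_drain) is hypothetical, and a real implementation would need a concurrency-safe list rather than the plain linked list used below.

```c
#include <assert.h>
#include <stddef.h>

#define RET_NCPUS 4

/* Hypothetical buffer that remembers where it was allocated. */
struct ret_skb {
    int alloc_cpu;         /* recorded at allocation time */
    struct ret_skb *next;  /* link in the per-cpu return list */
};

/* One return list per cpu: buffers waiting to be freed by their allocator. */
static struct ret_skb *ret_list[RET_NCPUS];

/* Consumer side: hand the buffer back to the cpu that allocated it
 * instead of freeing it locally. */
static void ret_give_back(struct ret_skb *skb)
{
    skb->next = ret_list[skb->alloc_cpu];
    ret_list[skb->alloc_cpu] = skb;
}

/* Allocator side: detach every buffer returned to this cpu and report how
 * many there were. A real version would free each one here. */
static int ret_drain(int cpu)
{
    int n = 0;

    while (ret_list[cpu]) {
        struct ret_skb *skb = ret_list[cpu];

        ret_list[cpu] = skb->next;
        skb->next = NULL;
        n++;
    }
    return n;
}
```

The point of the model is only that the free happens on the allocating cpu regardless of which cpu consumed the data.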

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-23 21:57:38 -04:00
9p
bluetooth Bluetooth: Add quirk for reading BD_ADDR from fwnode property 2019-02-26 10:08:26 +01:00
caif
iucv
netfilter netfilter: nf_tables: bogus EBUSY when deleting set after flush 2019-03-11 13:19:24 +01:00
netns ipv6: Add icmp_echo_ignore_anycast for ICMPv6 2019-03-20 16:29:37 -07:00
nfc
phonet
sctp sctp: get sctphdr by offset in sctp_compute_cksum 2019-03-18 18:16:12 -07:00
tc_act
6lowpan.h
act_api.h
addrconf.h
af_ieee802154.h
af_rxrpc.h
af_unix.h net: split out functions related to registering inflight socket files 2019-02-28 08:24:23 -07:00
af_vsock.h
ah.h
arp.h
atmclip.h
ax25.h
ax88796.h
bond_3ad.h
bond_alb.h
bond_options.h
bonding.h
busy_poll.h
calipso.h
cfg80211-wext.h
cfg80211.h
cfg802154.h
checksum.h
cipso_ipv4.h
cls_cgroup.h
codel.h
codel_impl.h
codel_qdisc.h
compat.h
datalink.h
dcbevent.h
dcbnl.h
devlink.h devlink: Add support for direct reporter health state update 2019-03-04 11:00:43 -08:00
dn.h
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h
dsa.h net: dsa: add KSZ9893 switch tagging support 2019-03-03 13:48:49 -08:00
dsfield.h
dst.h net: dst: remove gc leftovers 2019-03-21 13:39:25 -07:00
dst_cache.h
dst_metadata.h
dst_ops.h
erspan.h
esp.h
ethoc.h
failover.h
fib_notifier.h
fib_rules.h
firewire.h
flow.h route: Add multipath_hash in flowi_common to make user-define hash 2019-02-27 12:50:17 -08:00
flow_dissector.h
flow_offload.h
fou.h
fq.h
fq_impl.h
garp.h
gen_stats.h
genetlink.h genetlink: make policy common to family 2019-03-22 10:38:23 -04:00
geneve.h
gre.h
gro_cells.h
gtp.h
gue.h
hwbm.h
icmp.h net: Add __icmp_send helper. 2019-02-25 14:32:35 -08:00
ieee80211_radiotap.h
ieee802154_netdev.h
if_inet6.h
ife.h
ila.h
inet6_connection_sock.h
inet6_hashtables.h
inet_common.h
inet_connection_sock.h
inet_ecn.h
inet_frag.h net: remove unused struct inet_frag_queue.fragments field 2019-02-26 08:27:05 -08:00
inet_hashtables.h
inet_sock.h
inet_timewait_sock.h
inetpeer.h
ip.h ipv4: Allow amount of dirty memory from fib resizing to be controllable 2019-03-21 13:29:53 -07:00
ip6_checksum.h
ip6_fib.h ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create 2019-03-21 10:16:54 -07:00
ip6_route.h
ip6_tunnel.h
ip_fib.h
ip_tunnels.h route: Add multipath_hash in flowi_common to make user-define hash 2019-02-27 12:50:17 -08:00
ip_vs.h
ipcomp.h
ipconfig.h
ipv6.h
ipv6_frag.h
ipx.h
iw_handler.h
kcm.h
l3mdev.h
lag.h
lapb.h
lib80211.h
llc.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
lwtunnel.h
mac80211.h
mac802154.h
mip6.h
mld.h
mpls.h
mpls_iptunnel.h
mrp.h
ncsi.h
ndisc.h
neighbour.h
net_failover.h
net_namespace.h
net_ratelimit.h
netevent.h
netlabel.h
netlink.h
netprio_cgroup.h
netrom.h
nexthop.h
nl802154.h
nsh.h
p8022.h
page_pool.h
ping.h
pkt_cls.h net: sched: set dedicated tcf_walker flag when tp is empty 2019-02-25 10:18:17 -08:00
pkt_sched.h
pptp.h
protocol.h
psample.h
psnap.h
raw.h
rawv6.h
red.h
regulatory.h
request_sock.h tcp: free request sock directly upon TFO or syncookies error 2019-03-19 14:13:01 -07:00
rose.h
route.h
rsi_91x.h
rtnetlink.h
sch_generic.h net: sched: add empty status flag for NOLOCK qdisc 2019-03-23 21:52:36 -04:00
scm.h
secure_seq.h
seg6.h
seg6_hmac.h
seg6_local.h
slhc_vj.h
smc.h
snmp.h
sock.h tcp: add one skb cache for rx 2019-03-23 21:57:38 -04:00
sock_reuseport.h
Space.h
stp.h
strparser.h
switchdev.h switchdev: Remove unused transaction item queue 2019-03-01 21:35:19 -08:00
tcp.h tcp: convert tcp_md5_needed to static_branch API 2019-02-26 13:16:03 -08:00
tcp_states.h
timewait_sock.h
tipc.h
tls.h net/tls: Add support of AES128-CCM based ciphers 2019-03-20 11:02:05 -07:00
transp_v6.h
tso.h
tun_proto.h
udp.h
udp_tunnel.h
udplite.h
vsock_addr.h
vxlan.h vxlan: add extack support for create and changelink 2019-02-26 08:54:37 -08:00
wext.h
wimax.h
x25.h
x25device.h
xdp.h
xdp_sock.h xsk: fix umem memory leak on cleanup 2019-03-16 01:27:51 +01:00
xfrm.h