net: don't hold rtnl mutex during netlink dump callbacks

Four years ago, Patrick made a change to hold rtnl mutex during netlink
dump callbacks.

I believe it was a wrong move. This slows down concurrent dumps, making
good old /proc/net/ files faster than rtnetlink in some situations.

This occurred to me because one "ip link show dev ..." was _very_ slow
on a workload adding/removing network devices in the background.

All dump callbacks are able to use RCU locking now, so this patch is
roughly a revert of these commits:

1c2d670f36 : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
6313c1e099 : [RTNETLINK]: Remove unnecessary locking in dump callbacks

This lets writers fight for the rtnl mutex while readers go full speed.
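
As a rough userspace analogy of that split (not kernel code), the sketch
below serializes writers on a mutex while readers walk a list with atomic
loads only. The names dev_add(), dev_count(), struct node and writer_lock
are hypothetical, and nodes are never freed, so it sidesteps the grace
period handling that real RCU provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device entry; not a kernel structure. */
struct node {
	int ifindex;
	struct node *_Atomic next;
};

/* List head: readers load it without taking any lock. */
static struct node *_Atomic dev_list;

/* Assumption: this mutex plays the role the rtnl mutex plays for writers. */
static pthread_mutex_t writer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: serialized by the mutex, publishes with release semantics. */
static int dev_add(int ifindex)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->ifindex = ifindex;

	pthread_mutex_lock(&writer_lock);
	/* Link the new node in front of the current head ... */
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&dev_list, memory_order_relaxed),
			      memory_order_relaxed);
	/* ... then make it visible to readers in one atomic store. */
	atomic_store_explicit(&dev_list, n, memory_order_release);
	pthread_mutex_unlock(&writer_lock);
	return 0;
}

/* Reader side: never touches writer_lock, so it cannot be delayed by writers. */
static int dev_count(void)
{
	int count = 0;
	struct node *n = atomic_load_explicit(&dev_list, memory_order_acquire);

	while (n) {
		count++;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return count;
}
```

A reader calling dev_count() sees either the old or the new list head,
never a half-linked node, which is the property the dump callbacks rely on
once they run under rcu_read_lock() instead of the rtnl mutex.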

It also takes care of phonet: phonet_route_get() is now called from an
RCU read-side section, so I renamed it to phonet_route_get_rcu().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet 2011-04-27 22:56:07 +00:00 committed by David S. Miller
parent dcfd9cdc12
commit e67f88dd12
8 changed files with 25 additions and 23 deletions


@@ -752,7 +752,8 @@ static int dn_nl_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 	skip_naddr = cb->args[1];
 	idx = 0;
-	for_each_netdev(&init_net, dev) {
+	rcu_read_lock();
+	for_each_netdev_rcu(&init_net, dev) {
 		if (idx < skip_ndevs)
 			goto cont;
 		else if (idx > skip_ndevs) {
@@ -761,11 +762,11 @@ static int dn_nl_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 			skip_naddr = 0;
 		}
-		if ((dn_db = rtnl_dereference(dev->dn_ptr)) == NULL)
+		if ((dn_db = rcu_dereference(dev->dn_ptr)) == NULL)
 			goto cont;
-		for (ifa = rtnl_dereference(dn_db->ifa_list), dn_idx = 0; ifa;
-		     ifa = rtnl_dereference(ifa->ifa_next), dn_idx++) {
+		for (ifa = rcu_dereference(dn_db->ifa_list), dn_idx = 0; ifa;
+		     ifa = rcu_dereference(ifa->ifa_next), dn_idx++) {
 			if (dn_idx < skip_naddr)
 				continue;
@@ -778,6 +779,7 @@ cont:
 		idx++;
 	}
 done:
+	rcu_read_unlock();
 	cb->args[0] = idx;
 	cb->args[1] = dn_idx;
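
The hunks above keep the existing resume logic: the callback stores a
device index and an address index in cb->args[] so the next call can skip
what was already dumped. A minimal userspace sketch of that skip/resume
pattern follows, assuming a fixed in-memory table; struct dump_cb,
dump_addrs() and the addrs table are hypothetical and only mimic the shape
of the kernel loop.

```c
#include <assert.h>
#include <stddef.h>

#define NDEVS     3
#define MAX_ADDRS 3

/* Hypothetical data set: a zero entry ends a device's address list. */
static const int addrs[NDEVS][MAX_ADDRS] = {
	{ 10, 11, 0 },
	{ 0, 0, 0 },	/* device with no addresses */
	{ 30, 31, 32 },
};

/* Mimics the two resume slots used in the diff above. */
struct dump_cb {
	long args[2];	/* args[0]: device index, args[1]: address index */
};

/* Copy at most room entries into out and record where to resume. */
static int dump_addrs(struct dump_cb *cb, int *out, int room)
{
	int skip_ndevs = cb->args[0];
	int skip_naddr = cb->args[1];
	int idx, dn_idx = 0, n = 0;

	for (idx = 0; idx < NDEVS; idx++) {
		if (idx < skip_ndevs)
			continue;		/* already dumped in an earlier call */
		if (idx > skip_ndevs)
			skip_naddr = 0;		/* past the resume device: dump all */
		for (dn_idx = 0; dn_idx < MAX_ADDRS && addrs[idx][dn_idx]; dn_idx++) {
			if (dn_idx < skip_naddr)
				continue;
			if (n == room)
				goto done;	/* output buffer is full */
			out[n++] = addrs[idx][dn_idx];
		}
	}
done:
	cb->args[0] = idx;
	cb->args[1] = dn_idx;
	return n;
}
```

Calling dump_addrs() repeatedly with the same cb until it returns 0 walks
the whole table in chunks. Because the resume state is only a pair of
indices, a list that changes between calls can cause entries to be skipped
or repeated; this sketch does not try to address that.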