net: percpu net_device refcount

We tried very hard to remove all possible dev_hold()/dev_put() pairs in the
network stack, using RCU conversions.

There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers, some
app servers, mmap af_packet).

We can switch to a percpu refcount implementation, now that the dynamic
per_cpu infrastructure is mature. On a 64-cpu machine, this consumes
256 bytes per device (64 * sizeof(int)).

On x86, the dev_hold(dev) code:

before:
        lock    incl 0x280(%ebx)
after:
        movl    0x260(%ebx),%eax
        incl    fs:(%eax)
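
With per-cpu counters there is no longer a single value to read, so the rare
slow paths that need the total (device unregister) must sum the per-cpu
contributions; this is what the new netdev_refcnt_read() helper declared in
the hunks below is for. A minimal sketch of such a summation, assuming the
usual for_each_possible_cpu()/per_cpu_ptr() walk (the actual body goes into
net/core/dev.c, whose hunks are not reproduced here):

        int netdev_refcnt_read(const struct net_device *dev)
        {
                int i, refcnt = 0;

                /* Sum every cpu's local contribution to recover the true count */
                for_each_possible_cpu(i)
                        refcnt += *per_cpu_ptr(dev->pcpu_refcnt, i);
                return refcnt;
        }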

Stress bench:

(Sending 160,000,000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32-bit kernel, FIB_TRIE)

Before:

real    1m1.662s
user    0m14.373s
sys     12m55.960s

After:

real    0m51.179s
user    0m15.329s
sys     10m15.942s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 4 files changed, 41 insertions(+), 14 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1026,7 +1026,7 @@ struct net_device {
 	struct timer_list	watchdog_timer;
 
 	/* Number of references to this device */
-	atomic_t		refcnt ____cacheline_aligned_in_smp;
+	int __percpu		*pcpu_refcnt;
 
 	/* delayed register/unregister */
 	struct list_head	todo_list;
@@ -1330,6 +1330,7 @@ static inline void unregister_netdevice(struct net_device *dev)
 	unregister_netdevice_queue(dev, NULL);
 }
 
+extern int		netdev_refcnt_read(const struct net_device *dev);
 extern void		free_netdev(struct net_device *dev);
 extern void		synchronize_net(void);
 extern int		register_netdevice_notifier(struct notifier_block *nb);
@@ -1798,7 +1799,7 @@ extern void netdev_run_todo(void);
  */
 static inline void dev_put(struct net_device *dev)
 {
-	atomic_dec(&dev->refcnt);
+	irqsafe_cpu_dec(*dev->pcpu_refcnt);
 }
 
 /**
@@ -1809,7 +1810,7 @@ static inline void dev_put(struct net_device *dev)
  */
 static inline void dev_hold(struct net_device *dev)
 {
-	atomic_inc(&dev->refcnt);
+	irqsafe_cpu_inc(*dev->pcpu_refcnt);
 }
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
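
Only the netdevice.h hunks are reproduced above. The per-cpu counter itself is
allocated when the net_device is created and released in free_netdev(), with
the unregister path polling netdev_refcnt_read() until all references are
gone; those changes live in net/core/dev.c. A rough sketch of that side,
assuming alloc_percpu()/free_percpu() and a simple polling wait loop:

        /* At allocation time (alloc_netdev_mq()-like path): one int per cpu */
        dev->pcpu_refcnt = alloc_percpu(int);
        if (!dev->pcpu_refcnt)
                return NULL;            /* allocation failure: abort netdev creation */

        /* At unregister time: wait for every cpu-local reference to be dropped */
        while (netdev_refcnt_read(dev) != 0)
                msleep(250);

        /* In free_netdev(): release the per-cpu storage */
        free_percpu(dev->pcpu_refcnt);
        dev->pcpu_refcnt = NULL;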