Star64_linux/arch/x86
Avi Kivity 0d1622d7f5 x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write
The Intel Architecture Optimization Reference Manual states that a short
load that follows a long store to the same object will suffer a store
forwading penalty, particularly if the two accesses use different addresses.
Trivially, a long load that follows a short store will also suffer a penalty.

__downgrade_write() in rwsem incurs both penalties:  the increment operation
will not be able to reuse a recently-loaded rwsem value, and its result will
not be reused by any recently-following rwsem operation.

A comment in the code states that this is because 64-bit immediates are
special and expensive; but while they are slightly special (only a single
instruction allows them), they aren't expensive: a test shows that two loops,
one loading a 32-bit immediate and one loading a 64-bit immediate, both take
1.5 cycles per iteration.

Fix this by changing __downgrade_write to use the same add instruction on
i386 and on x86_64, so that it uses the same operand size as all the other
rwsem functions.

Signed-off-by: Avi Kivity <avi@redhat.com>
LKML-Reference: <1266049992-17419-1-git-send-email-avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-13 13:37:56 -08:00
..
boot Merge branch 'for-33' of git://repo.or.cz/linux-kbuild 2009-12-17 07:23:42 -08:00
configs
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-12-01 15:16:22 +08:00
ia32 Unify sys_mmap* 2009-12-11 06:44:29 -05:00
include/asm x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write 2010-02-13 13:37:56 -08:00
kernel x86: Merge show_regs() 2010-01-13 09:23:15 -08:00
kvm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2009-12-14 09:58:24 -08:00
lguest
lib x86-64: support native xadd rwsem implementation 2010-01-13 22:39:50 -08:00
math-emu
mm x86: Lift restriction on the location of FIX_BTMAP_* 2009-12-30 11:57:30 +01:00
oprofile perf events, x86/stacktrace: Make stack walking optional 2009-12-17 09:56:19 +01:00
pci x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource 2009-12-04 16:00:17 -08:00
power hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events 2009-11-08 15:34:42 +01:00
tools Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-19 09:48:14 -08:00
vdso sysctl x86: Remove dead binary sysctl support 2009-11-12 02:05:04 -08:00
video
xen locking: Convert raw_spinlock to arch_spinlock 2009-12-14 23:55:32 +01:00
Kbuild
Kconfig hw-breakpoints: Fix hardware breakpoints -> perf events dependency 2009-12-18 13:11:51 +01:00
Kconfig.cpu x86-64: support native xadd rwsem implementation 2010-01-13 22:39:50 -08:00
Kconfig.debug Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-11 20:47:30 -08:00
Makefile Merge branch 'perf/core' into perf/probes 2009-11-17 10:17:47 +01:00
Makefile_32.cpu Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:38:11 -08:00