Star64_linux/kernel
Clement Courbet 1e17fb8edc sched: Optimize __calc_delta()
A significant portion of __calc_delta() time is spent in the loop
shifting a u64 by 32 bits. Use `fls` instead of iterating.

This is ~7x faster on benchmarks.

The generic `fls` implementation (`generic_fls`) is still ~4x faster
than the loop.
Architectures that have a better implementation will make use of it. For
example, on x86 we get an additional factor 2 in speed without dedicated
implementation.

On GCC, the asm versions of `fls` are about the same speed as the
builtin. On Clang, the versions that use fls are more than twice as
slow as the builtin. This is because the way the `fls` function is
written, clang puts the value in memory:
https://godbolt.org/z/EfMbYe. This bug is filed at
https://bugs.llvm.org/show_bug.cgi?idI406.

```
name                                   cpu/op
BM_Calc<__calc_delta_loop>             9.57ms Â=B112%
BM_Calc<__calc_delta_generic_fls>      2.36ms Â=B113%
BM_Calc<__calc_delta_asm_fls>          2.45ms Â=B113%
BM_Calc<__calc_delta_asm_fls_nomem>    1.66ms Â=B112%
BM_Calc<__calc_delta_asm_fls64>        2.46ms Â=B113%
BM_Calc<__calc_delta_asm_fls64_nomem>  1.34ms Â=B115%
BM_Calc<__calc_delta_builtin>          1.32ms Â=B111%
```

Signed-off-by: Clement Courbet <courbet@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210303224653.2579656-1-joshdon@google.com
2021-03-10 09:51:49 +01:00
..
bpf idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
cgroup idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
configs
debug kgdb: fix to kill breakpoints on initmem after boot 2021-02-26 09:41:05 -08:00
dma Merge branch 'stable/for-linus-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2021-02-26 13:59:32 -08:00
entry entry: Explicitly flush pending rcuog wakeup before last rescheduling point 2021-02-17 14:12:43 +01:00
events kernel: delete repeated words in comments 2021-02-26 09:41:03 -08:00
gcov
irq Driver core / debugfs update for 5.12-rc1 2021-02-24 10:13:55 -08:00
kcsan
livepatch
locking kernel: delete repeated words in comments 2021-02-26 09:41:03 -08:00
power
printk Merge branch 'printk-rework' into for-linus 2021-02-22 13:43:55 +01:00
rcu Scheduler updates for v5.12: 2021-02-21 12:35:04 -08:00
sched sched: Optimize __calc_delta() 2021-03-10 09:51:49 +01:00
time These are the latest RCU updates for v5.12: 2021-02-21 12:04:41 -08:00
trace tracing: Skip selftests if tracing is disabled 2021-03-04 09:51:25 -05:00
.gitignore
acct.c
async.c
audit.c
audit.h
audit_fsnotify.c audit_alloc_mark(): don't open-code ERR_CAST() 2021-02-23 10:25:27 -05:00
audit_tree.c
audit_watch.c
auditfilter.c
auditsc.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
backtracetest.c
bounds.c
capability.c
compat.c
configs.c
context_tracking.c
cpu.c cpu/hotplug: Add cpuhp_invoke_callback_range() 2021-03-06 12:40:22 +01:00
cpu_pm.c
crash_core.c
crash_dump.c
cred.c
delayacct.c
dma.c
exec_domain.c
exit.c
extable.c
fail_function.c
fork.c kernel: provide create_io_thread() helper 2021-03-04 15:45:03 -07:00
freezer.c
futex.c
gen_kheaders.sh
groups.c groups: simplify struct group_info allocation 2021-02-26 09:41:03 -08:00
hung_task.c
iomem.c
irq_work.c
jump_label.c
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt preempt: Introduce CONFIG_PREEMPT_DYNAMIC 2021-02-17 14:12:24 +01:00
kcov.c
kexec.c
kexec_core.c Merge branch 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-02-21 09:29:23 -08:00
kexec_elf.c
kexec_file.c
kexec_internal.h kexec: move machine_kexec_post_load() to public interface 2021-02-22 12:33:26 +00:00
kheaders.c
kmod.c
kprobes.c kprobes: Fix to delay the kprobes jump optimization 2021-02-19 14:57:12 -05:00
ksysfs.c
kthread.c
latencytop.c
Makefile
module-internal.h
module.c
module_signature.c
module_signing.c
notifier.c
nsproxy.c
padata.c
panic.c
params.c
pid.c
pid_namespace.c
profile.c
ptrace.c kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals 2021-02-21 17:25:22 -07:00
range.c
reboot.c
regset.c
relay.c
resource.c
resource_kunit.c
rseq.c
scftorture.c
scs.c
seccomp.c
signal.c kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals 2021-02-21 17:25:22 -07:00
smp.c smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-02-17 14:12:42 +01:00
smpboot.c
smpboot.h
softirq.c
stackleak.c
stacktrace.c
static_call.c static_call: Allow module use without exposing static_call_key 2021-02-17 14:12:42 +01:00
stop_machine.c
sys.c Kbuild updates for v5.12 2021-02-25 10:17:31 -08:00
sys_ni.c
sysctl-test.c
sysctl.c sysctl.c: fix underflow value setting risk in vm_table 2021-02-26 09:41:03 -08:00
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c
user_namespace.c
usermode_driver.c
utsname.c
utsname_sysctl.c
watch_queue.c
watchdog.c
watchdog_hld.c
workqueue.c Merge branch 'for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2021-02-22 17:06:54 -08:00
workqueue_internal.h