Commit graph

Edward Cree
03714bbb22 sfc: make mem_bar a function rather than a constant
Support using BAR 0 on SFC9250, even though the driver doesn't bind to such
 devices yet.
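
A minimal sketch of the shape of such a change (names are hypothetical, not
the actual sfc hunk): the fixed BAR constant becomes a per-NIC-type callback,
so SFC9250 can report a different BAR while existing controllers keep BAR 2.

  /* before: one compile-time constant shared by every NIC type */
  #define EFX_MEM_BAR 2

  /* after (sketch): the NIC type supplies the BAR at probe time */
  struct efx_nic_type {
          unsigned int (*mem_bar)(struct efx_nic *efx);
          /* ... */
  };

  static unsigned int ef10_mem_bar(struct efx_nic *efx)
  {
          return 2;               /* pre-SFC9250 controllers use BAR 2 */
  }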

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 13:07:49 -05:00
Linus Torvalds
64a48099b3 Merge branch 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 syscall entry code changes for PTI from Ingo Molnar:
 "The main changes here are Andy Lutomirski's changes to switch the
  x86-64 entry code to use the 'per CPU entry trampoline stack'. This,
  besides helping fix KASLR leaks (the pending Page Table Isolation
  (PTI) work), also robustifies the x86 entry code"

* 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  x86/cpufeatures: Make CPU bugs sticky
  x86/paravirt: Provide a way to check for hypervisors
  x86/paravirt: Dont patch flush_tlb_single
  x86/entry/64: Make cpu_entry_area.tss read-only
  x86/entry: Clean up the SYSENTER_stack code
  x86/entry/64: Remove the SYSENTER stack canary
  x86/entry/64: Move the IST stacks into struct cpu_entry_area
  x86/entry/64: Create a per-CPU SYSCALL entry trampoline
  x86/entry/64: Return to userspace from the trampoline stack
  x86/entry/64: Use a per-CPU trampoline stack for IDT entries
  x86/espfix/64: Stop assuming that pt_regs is on the entry stack
  x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
  x86/entry: Remap the TSS into the CPU entry area
  x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
  x86/dumpstack: Handle stack overflow on all stacks
  x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
  x86/kasan/64: Teach KASAN about the cpu_entry_area
  x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
  x86/entry/gdt: Put per-CPU GDT remaps in ascending order
  x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
  ...
2017-12-18 08:59:15 -08:00
David S. Miller
59436c9ee1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2017-12-18

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Allow arbitrary function calls from one BPF function to another BPF function.
   As of today when writing BPF programs, __always_inline had to be used in
   the BPF C programs for all functions, unnecessarily causing LLVM to inflate
   code size. Handle this more naturally with support for BPF to BPF calls
   such that this __always_inline restriction can be overcome. As a result,
   it allows for better optimized code and finally makes it possible to
   introduce core BPF libraries in the future that can be reused across
   different projects.
   x86 and arm64 JIT support was added as well, from Alexei.

2) Add infrastructure for tagging functions as error injectable and allow for
   BPF to return arbitrary error values when BPF is attached via kprobes on
   those. This way of injecting errors generically eases testing and debugging
   without having to recompile or restart the kernel. Tags for opting-in for
   this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef.

3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper
   call for XDP programs. First part of this work adds handling of BPF
   capabilities included in the firmware, and the later patches add support
   to the nfp verifier part and JIT as well as some small optimizations,
   from Jakub.

4) The bpftool now also gets support for basic cgroup BPF operations such
   as attaching, detaching and listing current BPF programs. As a requirement
   for the attach part, bpftool can now also load object files through
   'bpftool prog load'. This reuses libbpf which we have in the kernel tree
   as well. bpftool-cgroup man page is added along with it, from Roman.

5) Back then, commit e87c6bc385 ("bpf: permit multiple bpf attachments for
   a single perf event") added support for attaching multiple BPF programs
   to a single perf event. Given they are configured through perf's ioctl()
   interface, the interface has been extended in this work with a
   PERF_EVENT_IOC_QUERY_BPF command in order to return an array of one or
   more BPF prog ids that are currently attached, from Yonghong.

6) Various minor fixes and cleanups to bpftool's Makefile as well
   as new 'uninstall' and 'doc-uninstall' targets for removing bpftool
   itself or previously installed documentation related to it, from Quentin.

7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is
   required for the test_dev_cgroup test case to run, from Naresh.

8) Fix reporting of XDP prog_flags for nfp driver, from Jakub.

9) Fix libbpf's exit code from the Makefile when libelf was not found in
   the system, also from Jakub.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 10:51:06 -05:00
David S. Miller
b36025b19a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-17

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a corner case in generic XDP where we have non-linear skbs but
   enough tailroom in the skb, making sure we don't miss linearizing
   there, from Song.

2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when
   BPF context is not skb, from Daniel.

3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper
   call would use the wrong register for the skb, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 10:49:22 -05:00
Paolo Bonzini
43aabca38a KVM/ARM Fixes for v4.15, Round 2
Fixes:
  - A bug in our handling of SPE state for non-vhe systems
  - A bug that causes hyp unmapping to go off limits and crash the system on
    shutdown
  - Three timer fixes that were introduced as part of the timer optimizations
    for v4.15
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaN5C1AAoJEEtpOizt6ddyodkH/jN1lquFVdYJBlEO6NXiumEk
 GBH6x6CmuGyiUL3J0ffx5U51x0NN2jE89TpH5d1dsnQg77CCjTCxHtQ9suHne3n1
 5/r0BzHZhaCbnbY0f7+E4EL0UOTpiAwUIqin1ufLPjs4XywcFyiLa7xiWkQkDmyr
 WXKOdppTc4j/FUyqb1fQBmYY8pENR5jjfgdaeZ6C6o7e6aksXgrPWqXhV/6OSRLd
 MOcxA06QfwTWy+MT1x4yo1hzCTjOEvvQXT2Va09moiNxT7hVWWvO/kwJVQL+YpWW
 di7t4CLCvGYUsxM5t8fHHV7X+dfd2nvpJA46TWggPye7yMYkTYXFQu1LHwPIdDU=
 =c5Kt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-fixes-for-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Fixes for v4.15, Round 2

Fixes:
 - A bug in our handling of SPE state for non-vhe systems
 - A bug that causes hyp unmapping to go off limits and crash the system on
   shutdown
 - Three timer fixes that were introduced as part of the timer optimizations
   for v4.15
2017-12-18 12:57:43 +01:00
Wanpeng Li
e39d200fa5 KVM: Fix stack-out-of-bounds read in write_mmio
Reported by syzkaller:

  BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
  Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298

  CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
  Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
  Call Trace:
   dump_stack+0xab/0xe1
   print_address_description+0x6b/0x290
   kasan_report+0x28a/0x370
   write_mmio+0x11e/0x270 [kvm]
   emulator_read_write_onepage+0x311/0x600 [kvm]
   emulator_read_write+0xef/0x240 [kvm]
   emulator_fix_hypercall+0x105/0x150 [kvm]
   em_hypercall+0x2b/0x80 [kvm]
   x86_emulate_insn+0x2b1/0x1640 [kvm]
   x86_emulate_instruction+0x39a/0xb90 [kvm]
   handle_exception+0x1b4/0x4d0 [kvm_intel]
   vcpu_enter_guest+0x15a0/0x2640 [kvm]
   kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
   kvm_vcpu_ioctl+0x479/0x880 [kvm]
   do_vfs_ioctl+0x142/0x9a0
   SyS_ioctl+0x74/0x80
   entry_SYSCALL_64_fastpath+0x23/0x9a

The patched-vmmcall path writes the 3-byte opcode 0F 01 C1 (vmcall)
to guest memory; however, the write_mmio tracepoint always prints 8 bytes
through *(u64 *)val since kvm splits the mmio access into 8-byte chunks.
This leaks 5 bytes from the kernel stack (CVE-2017-17741).  This patch
fixes it by accessing only the bytes which we actually operate on.

Before patch:

syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f

After patch:

syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
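
A sketch of the effect on the tracepoint assignment (reconstructed from the
description above; the __entry field name is an assumption): instead of
dereferencing the value as a full u64, copy at most the bytes the access
covers.

  /* before: always reads 8 bytes, leaking stack data for short writes */
  __entry->val = *(u64 *)val;

  /* after (sketch): copy only 'len' bytes, zero-fill the rest */
  __entry->val = 0;
  if (val)
          memcpy(&__entry->val, val,
                 min_t(u32, sizeof(__entry->val), len));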

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-18 12:57:01 +01:00
Takashi Iwai
bb82e0b4a7 ACPI: APEI / ERST: Fix missing error handling in erst_reader()
The commit f6f8285132 ("pstore: pass allocated memory region back to
caller") changed the check of the return value from erst_read() in
erst_reader() in the following way:

        if (len == -ENOENT)
                goto skip;
-       else if (len < 0) {
-               rc = -1;
+       else if (len < sizeof(*rcd)) {
+               rc = -EIO;
                goto out;

This introduced another bug: since the comparison with sizeof() is
cast to unsigned, a negative len value no longer matches the check.
As a result, when an error is returned from erst_read(), the code
falls through, and it may eventually lead to something like memory
corruption.

This patch adds an explicit check for negative error values to
address the issue.
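
A sketch of the restored control flow, reconstructed from the diff above:
keep the new sizeof() check, but test for negative errors first so the
signed-to-unsigned cast can no longer swallow them.

  if (len == -ENOENT)
          goto skip;
  else if (len < 0) {                     /* explicit signed check */
          rc = -1;
          goto out;
  } else if (len < sizeof(*rcd)) {
          rc = -EIO;
          goto out;
  }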

Fixes: f6f8285132 (pstore: pass allocated memory region back to caller)
Cc: All applicable <stable@vger.kernel.org>
Tested-by: Jerry Tang <jtang@suse.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-18 12:12:08 +01:00
Colin Ian King
951ef0e19f ACPI: CPPC: remove initial assignment of pcc_ss_data
The initialization of pcc_ss_data from pcc_data[pcc_ss_id] before
pcc_ss_id has been range checked could lead to an out-of-bounds array
read.  The very same initialization is also performed after the range
check on pcc_ss_id, so we can simply remove this problematic and
redundant assignment to fix the issue.

Detected by cppcheck:
warning: Value stored to 'pcc_ss_data' during its initialization is never
read
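
A sketch of the fixed pattern (the surrounding local-variable context is an
assumption): drop the early initializer and index the array only after the
range check.

  int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
  struct cppc_pcc_data *pcc_ss_data;      /* no early initializer */

  if (pcc_ss_id < 0)
          return -EIO;
  pcc_ss_data = pcc_data[pcc_ss_id];      /* safe after the range check */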

Fixes: 85b1407bf6 (ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-18 12:10:37 +01:00
Rafael J. Wysocki
56026645e2 cpufreq: governor: Ensure sufficiently large sampling intervals
After commit aa7519af45 (cpufreq: Use transition_delay_us for legacy
governors as well) the sampling_rate field of struct dbs_data may be
less than the tick period which causes dbs_update() to produce
incorrect results, so make the code ensure that the value of that
field will always be sufficiently large.
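
A sketch of the clamping described above (the exact placement and variable
names are assumptions):

  /* never sample more often than once per tick */
  unsigned int min_rate = jiffies_to_usecs(1);

  dbs_data->sampling_rate = max(requested_rate, min_rate);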

Fixes: aa7519af45 (cpufreq: Use transition_delay_us for legacy governors as well)
Reported-by: Andy Tang <andy.tang@nxp.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Andy Tang <andy.tang@nxp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2017-12-18 12:09:39 +01:00
Lucas Stach
ccc153a6de cpufreq: imx6q: fix speed grading regression on i.MX6 QuadPlus
The commit moving the speed grading check to the cpufreq driver introduced
some additional checks, so the OPP disable is only attempted on SoCs where
those OPPs are present. The compatible checks are missing the QuadPlus
compatible, so invalid OPPs are not correctly disabled there.

Move both checks to a single condition, so we don't need to sprinkle even
more calls to of_machine_is_compatible().
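
A sketch of the combined condition (the QuadPlus compatible string is an
assumption):

  if (of_machine_is_compatible("fsl,imx6q") ||
      of_machine_is_compatible("fsl,imx6qp")) {
          /* disable the OPPs this speed grade does not support */
  }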

Fixes: 2b3d58a3ad (cpufreq: imx6q: Move speed grading check to cpufreq driver)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-18 12:06:37 +01:00
Rafael J. Wysocki
5839ee7389 PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()
It is incorrect to call pci_restore_state() for devices in low-power
states (D1-D3), as that involves the restoration of MSI setup which
requires MMIO to be operational and that is only the case in D0.

However, pci_pm_thaw_noirq() may do that if the driver's "freeze"
callbacks put the device into a low-power state, so fix it by making
it force devices into D0 via pci_set_power_state() instead of trying
to "update" their power state which is pointless.

Fixes: e60514bd44 (PCI/PM: Restore the status of PCI devices across hibernation)
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Maarten Lankhorst <dev@mblankhorst.nl>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Maarten Lankhorst <dev@mblankhorst.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-12-18 12:06:07 +01:00
Kailang Yang
9226665159 ALSA: hda/realtek - Fix Dell AIO LineOut issue
Dell AIO machines have a LineOut jack.
Add the LineOut verb in this patch.

[ Additional notes:
  the ALC274 codec seems to require fixed pin / DAC connections for
  HP / line-out pins for enabling EQ for speakers; i.e. the HP / LO
  pins are expected to be connected with NID 0x03 while keeping the
  speaker with NID 0x02.  However, by adding a new line-out pin, the
  auto-parser assigns the NID 0x02 for HP/LO pins as primary outputs.
  As an easy workaround, we provide the preferred_pairs[] to map
  forcibly for these pins. -- tiwai ]

Fixes: 75ee94b20b ("ALSA: hda - fix headset mic problem for Dell machines with alc274")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-12-18 11:09:05 +01:00
Christoffer Dall
0eb7c33cad KVM: arm/arm64: Fix timer enable flow
When enabling the timer on the first run, we fail to ever restore the
state and mark it as loaded.  That means that on the initial entry to
the VCPU ioctl, unless we exit to userspace for some reason such as a
pending signal, if the guest programs a timer and blocks, we will wait
forever, because we never read back the hardware state (the loaded flag
is not set), and so we think the timer is disabled, and we never
schedule a background soft timer.

The end result?  The VCPU blocks forever, and the only solution is to
kill the thread.

Fixes: 4a2c4da125 ("arm/arm64: KVM: Load the timer state when enabling the timer")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 10:53:24 +01:00
Christoffer Dall
36e5cfd410 KVM: arm/arm64: Properly handle arch-timer IRQs after vtimer_save_state
The recent timer rework was assuming that once the timer was disabled,
we should no longer see any interrupts from the timer.  This assumption
turns out to not be true, and instead we have to handle the case when
the timer ISR runs even after the timer has been disabled.

This requires a couple of changes:

First, we should never overwrite the cached guest state of the timer
control register when the ISR runs, because KVM may have disabled its
timers when doing vcpu_put(), even though the guest still had the timer
enabled.

Second, we shouldn't assume that the timer is actually firing just
because we see an interrupt, but we should check the actual state of the
timer in the timer control register to understand if the hardware timer
is really firing or not.

We also add an ISB to vtimer_save_state() to ensure the timer is
actually disabled once we enable interrupts, which should clarify the
intention of the implementation, and reduce the risk of unwanted
interrupts.
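
A sketch of the "check the hardware, not the interrupt" test (register
access style as used elsewhere in the arch-timer code; the exact placement
is an assumption):

  u32 cnt_ctl = read_sysreg_el0(cntv_ctl);

  /* a real expiry: timer enabled, status set, interrupt not masked */
  if ((cnt_ctl & (ARCH_TIMER_CTRL_ENABLE | ARCH_TIMER_CTRL_IT_STAT |
                  ARCH_TIMER_CTRL_IT_MASK)) ==
      (ARCH_TIMER_CTRL_ENABLE | ARCH_TIMER_CTRL_IT_STAT))
          return true;            /* the hardware timer really fired */
  return false;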

Fixes: b103cc3f10 ("KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Jia He <hejianet@gmail.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 10:53:24 +01:00
Marc Zyngier
f384dcfe4d KVM: arm/arm64: timer: Don't set irq as forwarded if no usable GIC
If we don't have a usable GIC, do not try to set the vcpu affinity
as this is guaranteed to fail.

Reported-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 10:53:23 +01:00
Marc Zyngier
7839c672e5 KVM: arm/arm64: Fix HYP unmapping going off limits
When we unmap the HYP memory, we try to be clever and unmap one
PGD at a time. If we start with a non-PGD aligned address and try
to unmap a whole PGD, things go horribly wrong in unmap_hyp_range
(addr and end can never match, and it all goes really badly as we
keep incrementing pgd and parse random memory as page tables...).

The obvious fix is to let unmap_hyp_range do what it does best,
which is to iterate over a range.

The size of the linear mapping, which begins at PAGE_OFFSET, can be
easily calculated by subtracting PAGE_OFFSET from high_memory, because
high_memory is defined as the linear map address of the last byte of
DRAM, plus one.

The size of the vmalloc region is given trivially by VMALLOC_END -
VMALLOC_START.
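
A sketch of the resulting teardown (the kern_hyp_va() translation is an
assumption):

  unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
                  (unsigned long)high_memory - PAGE_OFFSET);
  unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
                  VMALLOC_END - VMALLOC_START);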

Cc: stable@vger.kernel.org
Reported-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 10:53:23 +01:00
Julien Thierry
bfe766cf65 arm64: kvm: Prevent restoring stale PMSCR_EL1 for vcpu
When VHE is not present, KVM needs to save and restore PMSCR_EL1 when
possible. If SPE is used by the host, the value of PMSCR_EL1 cannot be
saved for the guest.
If the host starts using SPE between two save+restore on the same vcpu,
restore will write the value of PMSCR_EL1 read during the first save.

Make sure __debug_save_spe_nvhe clears the value of the saved PMSCR_EL1
when the guest cannot use SPE.
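
A sketch of the clearing described above (the function signature is an
assumption):

  static void __debug_save_spe_nvhe(u64 *pmscr_el1)
  {
          /* start clean: if the guest cannot use SPE, the caller must
           * not restore a stale value on the next vcpu load */
          *pmscr_el1 = 0;

          /* ... save the live value only when SPE is usable ... */
  }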

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 10:53:22 +01:00
Miquel Raynal
d82c368216 mtd: Fix mtd_check_oob_ops()
The mtd_check_oob_ops() helper verifies if the operation defined by the
user is correct.

Fix the check that verifies if the entire requested area exists. This
check is too restrictive and will fail anytime the last data byte of the
very last page is included in an operation.
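
A sketch of the boundary condition being relaxed (the exact expression is an
assumption): an access ending exactly on the device's last byte must be
allowed, so the comparison has to be strict.

  /* before: rejected an op whose last byte is the device's last byte */
  if (offs + ops->len >= mtd->size)
          return -EINVAL;

  /* after (sketch): reject only when the op runs past the device */
  if (offs + ops->len > mtd->size)
          return -EINVAL;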

Fixes: 5cdd929da5 ("mtd: Add sanity checks in mtd_write/read_oob()")
Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-12-18 09:16:35 +01:00
Linus Torvalds
1291a0d504 Linux 4.15-rc4 2017-12-17 18:59:59 -08:00
Kees Cook
779f4e1c6c Revert "exec: avoid RLIMIT_STACK races with prlimit()"
This reverts commit 04e35f4495.

SELinux runs with secureexec for all non-"noatsecure" domain transitions,
which means lots of processes end up hitting the stack hard-limit change
that was introduced in order to fix a race with prlimit(). That race fix
will need to be redesigned.

Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Tomáš Trnka <trnka@scm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 14:26:25 -08:00
Chunyan Zhang
36b0cb84ee ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch
An additional 'ip' will be pushed to the stack, for restoring the
DACR later, if CONFIG_CPU_SW_DOMAIN_PAN is defined.

However, the fixup still gets the err_ptr by adding #8*4 to sp, which
results in the code area pointed to by the LR being overwritten, or in
a kernel crash if CONFIG_DEBUG_RODATA is enabled.

This patch fixes the stack mismatch.

Fixes: a5e090acbf ("ARM: software-based priviledged-no-access support")
Signed-off-by: Lvqiang Huang <Lvqiang.Huang@spreadtrum.com>
Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-12-17 22:20:39 +00:00
Linus Torvalds
f8940a0f20 Merge branch 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Page Table Isolation (PTI) v4.14 backporting base tree from Ingo Molnar:
 "This tree contains the v4.14 PTI backport preparatory tree, which
  consists of four merges of upstream trees and 7 cherry-picked commits,
  which the upcoming PTI work depends on"

NOTE! The resulting tree is exactly the same as the original base tree
(ie the diff between this commit and its immediate first parent is
empty).

The only reason for this merge is literally to have a common point for
the actual PTI changes so that the commits can be shared in both the
4.15 and 4.14 trees.

* 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
  locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
  locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
  bpf: fix build issues on um due to mising bpf_perf_event.h
  perf/x86: Enable free running PEBS for REGS_USER/INTR
  x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
2017-12-17 13:57:08 -08:00
Linus Torvalds
6ba64feff6 Merge branch 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Page Table Isolation (PTI) preparatory tree from Ingo Molnar:
 "This does a rename to free up linux/pti.h to be used by the upcoming
  page table isolation feature"

* 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  drivers/misc/intel/pti: Rename the header file to free up the namespace
2017-12-17 13:54:31 -08:00
Linus Torvalds
2ffb448ccb Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A single bugfix which prevents arbitrary sigev_notify values in
  posix-timers"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-timer: Properly check sigevent->sigev_notify
2017-12-17 13:48:50 -08:00
Linus Torvalds
c43727908f dmaengine fixes for v4.15-rc4
Here are fixes for this round
 - Fix for disable clk on error path in fsl-edma driver
 - Disable clk fail fix in jz4740 driver
 - Fix long pending bug in dmatest driver for dangling pointer
 - Fix potential NULL pointer dereference in at_hdmac driver
 - Error handling path in ioat driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaNp2PAAoJEHwUBw8lI4NHNfEP/RcERW0adeh4HXIges3FMW9N
 cqhRQpnUWmYIOTsfrWDH6WyR7pmEBOLryBzZXQ6Y8I3AOW4FTwBOG/W8NqFj+aiA
 JjeKRkXdpHD1eZpav72FUUhf4inb7BTtpGN5lBEFFk+gA3rWLscUw33OJLZmxsb5
 Vi82pJARfst42XB/jZUBDq+YNYlgG+NspYgUWvS5EnNXJJmy+47c5TS9EQUovctM
 X0yrKNomSW2qEO/IEXTUvXtArxpmxBrZI1q9RU21JJvF3izElqYO9C3OhGijM27u
 ERHWM1iStOxOl7zrzp8JJmVswdHyW/rT5uSlhl4689S2UZEJrqBGPo67Nh5w0Isi
 Nk+2X8I7Z/tNByggXASHX/xXymQGJF/mxAs1KypnSE0DpCJQi7oJAG6UY7lbIOz8
 CRGRt0gM8Xe/GrZa+7ZXmc7BoCrNPaGmW070XX3BoMcqT9m+SFOD2DhT351AVoOI
 uIPw5gAw0OCYhzkPP1c6AZPdLLFlVGrmPARRto/Af7w4Fnw4UlWGQ/+d6oQlvEyG
 G0kPxKE3e6uHXOizgm3GkbqpQWjTz649JMNw+xVey9e/pSVXORYIIomPmwahbh8x
 7Rfojaccq3w0YYG58z/Q5+WmzWf0jU//G/Cq5LsOm/5ghZTYRna7olA6lxFcn+ab
 6WQBW1rV2y8XUpDl4ScB
 =2egq
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "This time consisting of fixes in a bunch of drivers and the dmatest
  module:

   - Fix for disable clk on error path in fsl-edma driver
   - Disable clk fail fix in jz4740 driver
   - Fix long pending bug in dmatest driver for dangling pointer
   - Fix potential NULL pointer dereference in at_hdmac driver
   - Error handling path in ioat driver"

* tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: fsl-edma: disable clks on all error paths
  dmaengine: jz4740: disable/unprepare clk if probe fails
  dmaengine: dmatest: move callback wait queue to thread context
  dmaengine: at_hdmac: fix potential NULL pointer dereference in atc_prep_dma_interleaved
  dmaengine: ioat: Fix error handling path
2017-12-17 13:28:49 -08:00
Arnd Bergmann
b9f5fb1800 cramfs: fix MTD dependency
With CONFIG_MTD=m and CONFIG_CRAMFS=y, we now get a link failure:

  fs/cramfs/inode.o: In function `cramfs_mount': inode.c:(.text+0x220): undefined reference to `mount_mtd'
  fs/cramfs/inode.o: In function `cramfs_mtd_fill_super':
  inode.c:(.text+0x6d8): undefined reference to `mtd_point'
  inode.c:(.text+0xae4): undefined reference to `mtd_unpoint'

This adds a more specific Kconfig dependency to avoid the broken
configuration.

Alternatively we could make CRAMFS itself depend on "MTD || !MTD" with a
similar result.

Fixes: 99c18ce580 ("cramfs: direct memory access support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 12:20:58 -08:00
Linus Torvalds
73d080d374 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "The alloc_super() one is a regression in this merge window, lazytime
  thing is older..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Handle lazytime in do_mount()
  alloc_super(): do ->s_umount initialization earlier
2017-12-17 12:18:35 -08:00
Linus Torvalds
1c6b942d7d Fix a regression which caused us to fail to interpret symlinks in very
ancient ext3 file system images.  Also fix two xfstests failures, one
 of which could cause an OOPS, plus an additional bug fix caught by fuzz
 testing.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlo1y3EACgkQ8vlZVpUN
 gaNFOQf/bMf6ynai1dGGRwef+UcT874NZ2Hqm+UqI6pxusz0ZeKWm8HWfPfg31Fa
 o+OnUsZ7NXFBIHyfXKFJzdOgutjZ5eY0vMu+NrlyBdd6W+ZcHwn1PvQsLapFYvqK
 Rt+8nWTKqtnksSfh0vyODmUYgItOULOPPepjnIPm/Pd0DinJwo0GY/8MzLkz4SpX
 g6R60ou0ToEYNqBXAKIBnZ4aq8KWMtCMGcD270U5eAm/63Pt4riRwJbjITxZPAH1
 wKzivP4Ce5ce8W2g2/6mFFlBFWvtlB491T+BsgHUEv3OLze+kYS2PcxQthhEmBR8
 zeZ2o2/0tTxejE//cyJ4gCe3fYGRDg==
 =xqLC
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix a regression which caused us to fail to interpret symlinks in very
  ancient ext3 file system images.

  Also fix two xfstests failures, one of which could cause an OOPS, plus
  an additional bug fix caught by fuzz testing"

* tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix crash when a directory's i_size is too small
  ext4: add missing error check in __ext4_new_inode()
  ext4: fix fdatasync(2) after fallocate(2) operation
  ext4: support fast symlinks from ext3 file systems
2017-12-17 12:14:33 -08:00
John David Anglin
da57c5414f parisc: Reduce thread stack to 16 kb
In testing, I found that the thread stack can be 16 kB when using an irq
stack.  Without it, the thread stack needs to be 32 kB. Currently, the irq
stack is 32 kB. While it probably could be 16 kB, I would prefer to leave it
as is for safety.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 21:06:25 +01:00
John David Anglin
9352aeada4 Revert "parisc: Re-enable interrupts early"
This reverts commit 5c38602d83.

Interrupts can't be enabled early because the register saves are done on
the thread stack prior to switching to the IRQ stack.  This caused stack
overflows and the thread stack needed increasing to 32k.  Even then,
stack overflows still occasionally occurred.

Background:
Even with a 32 kB thread stack, I have seen instances where the thread
stack overflowed on the mx3210 buildd.  Detection of stack overflow only
occurs when we have an external interrupt.  When an external interrupt
occurs, we switch to the thread stack if we are not already on a kernel
stack.  Then, registers and specials are saved to the kernel stack.

The bug occurs in intr_return where interrupts are reenabled prior to
returning from the interrupt.  This was done in case we need to schedule
or deliver signals.  However, it introduces the possibility that
multiple external interrupts may occur on the thread stack and cause a
stack overflow.  These might not be detected and cause the kernel to
misbehave in random ways.

This patch changes the code back to only reenable interrupts when we are
going to schedule or deliver signals.  As a result, we generally return
from an interrupt before reenabling interrupts.  This minimizes the
growth of the thread stack.

Fixes: 5c38602d83 ("parisc: Re-enable interrupts early")
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 21:06:25 +01:00
Pravin Shedge
6a16fc3220 parisc: remove duplicate includes
These duplicate includes have been found with scripts/checkincludes.pl
but they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 21:06:25 +01:00
Helge Deller
bcf3f1752a parisc: Hide Diva-built-in serial aux and graphics card
The Diva GSP card has a built-in serial AUX port and an ATI graphics card
which simply don't work and which both lack external connectors.  The User
Guides even mention that those devices shouldn't be used.
So, prevent Linux drivers from trying to enable those devices.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v3.0+
2017-12-17 21:06:25 +01:00
Helge Deller
0ed9d3de5f parisc: Align os_hpmc_size on word boundary
The os_hpmc_size variable sometimes wasn't aligned on a word boundary and
thus triggered the unaligned fault handler at startup.
Fix it by aligning it properly.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.14+
2017-12-17 21:06:25 +01:00
Helge Deller
203c110b39 parisc: Fix indenting in puts()
Static analysis tools complain that we intended to have curly braces
around this indent block. In this case this assumption is wrong, so fix
the indenting.

Fixes: 2f3c7b8137 ("parisc: Add core code for self-extracting kernel")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.14+
2017-12-17 21:06:25 +01:00
Josef Bacik
46df3d209d trace: reenable preemption if we modify the ip
Things got moved around between the original bpf_override_return patches
and the final version, and now the ftrace kprobe dispatcher assumes that
if you modified the ip you also enabled preemption.  Document this with
a comment and enable preemption; this fixes the lockdep splat that
happened when using this feature.
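
A sketch of the re-enable path described above (the surrounding dispatcher
context is an assumption):

  if (orig_ip != instruction_pointer(regs)) {
          /* the BPF program modified the ip; the dispatcher skips the
           * usual exit path, so re-enable preemption here */
          preempt_enable_no_resched();
          return 1;
  }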

Fixes: 9802d86585 ("bpf: add a bpf_override_function helper")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:47:32 +01:00
Jakub Kicinski
4a29c0db69 nfp: set flags in the correct member of netdev_bpf
netdev_bpf.flags is the input member for installing the program.
netdev_bpf.prog_flags is the output member for querying.  Set
the correct one on query.
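
A sketch of the query path after the fix (the nfp-side field names are
assumptions):

  case XDP_QUERY_PROG:
          xdp->prog_attached = !!nn->xdp_prog;
          /* fill the query output member, not the install-time input */
          xdp->prog_flags = nn->xdp_prog ? nn->xdp_flags : 0;
          return 0;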

Fixes: 92f0292b35 ("net: xdp: report flags program was installed with on query")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:41:59 +01:00
Jakub Kicinski
21567eded9 libbpf: fix Makefile exit code if libelf not found
/bin/sh's exit does not recognize -1 as a number, leading to
the following error message:

/bin/sh: 1: exit: Illegal number: -1

Use 1 as the exit code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:40:29 +01:00
Daniel Borkmann
ef9fde06a2 Merge branch 'bpf-to-bpf-function-calls'
Alexei Starovoitov says:

====================
First of all, a huge thank you to Daniel, John, Jakub, Edward and others who
reviewed multiple iterations of this patch set over the last many months,
and to Dave and others who gave critical feedback during netconf/netdev.

The patch is solid enough and we thought through numerous corner cases,
but it's not the end. More followups with code reorg and features to follow.

TLDR: Allow arbitrary function calls from one bpf function to another bpf function.

Since the beginning of bpf, all bpf programs were represented as a single function
and program authors were forced to use always_inline for all functions
in their C code. That was causing llvm to unnecessarily inflate the code size
and forcing developers to move code to header files with little code reuse.

With a bit of additional complexity, teach the verifier to recognize
arbitrary function calls from one bpf function to another as long as
all of the functions are presented to the verifier as a single bpf program.
Extended program layout:
..
r1 = ..    // arg1
r2 = ..    // arg2
call pc+1  // function call pc-relative
exit
.. = r1    // access arg1
.. = r2    // access arg2
..
call pc+20 // second level of function call
...

It allows for better optimized code and finally allows the introduction of
core bpf libraries that can be reused in different projects,
since programs are no longer limited to a single elf file.
With function calls bpf can be compiled into multiple .o files.

This patch is the first step. It detects programs that contain
multiple functions and checks that calls between them are valid.
It splits the sequence of bpf instructions (one program) into a set
of bpf functions that call each other. Only calls to known
functions are allowed. Since all functions are presented to
the verifier at once, conceptually this is 'static linking'.

Future plans:
- introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions
  to be loaded into the kernel that can be later linked to other
  programs with concrete program types. Aka 'dynamic linking'.

- introduce function pointer type and indirect calls to allow
  bpf functions call other dynamically loaded bpf functions while
  the caller bpf function is already executing. Aka 'runtime linking'.
  This will be more generic and more flexible alternative
  to bpf_tail_calls.

FAQ:
Q: Do the interpreter and JIT changes mean that a new instruction is introduced?
A: No. The call instruction technically stays the same. Now it can call
   both kernel helpers and other bpf functions.
   The calling convention stays the same as well.
   From the uapi point of view the call insn got a new 'relocation', BPF_PSEUDO_CALL,
   similar to the BPF_PSEUDO_MAP_FD 'relocation' of the bpf_ldimm64 insn.

Q: What had to change on the LLVM side?
A: A trivial LLVM patch to allow calls was applied to the upcoming 6.0 release:
   https://reviews.llvm.org/rL318614
   with a few bugfixes as well.
   Make sure to build the latest llvm to have bpf_call support.

More details in the patches.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:37 +01:00
Daniel Borkmann
28ab173e96 selftests/bpf: additional bpf_call tests
Add some additional checks for few more corner cases.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
db496944fd bpf: arm64: add JIT support for multi-function programs
Similar to x64, add support for bpf-to-bpf calls.
When a program has calls to in-kernel helpers, the target call offset
is known at JIT time and the arm64 architecture needs 2 passes.
With bpf-to-bpf calls the dynamically allocated function start
is unknown until all functions of the program are JITed.
Therefore (just like x64) the arm64 JIT needs one extra pass over
the program to emit correct call offsets.

Implementation detail:
Avoid being too clever in 64-bit immediate moves and
always use 4 instructions (instead of 3-4 depending on the address)
to make sure only one extra pass is needed.
If some future optimization would make it worthwhile to optimize
'call 64-bit imm' further, the JIT would need to do 4 passes
over the program instead of 3 as in this patch.
For a typical bpf program address the mov needs 3 or 4 insns,
so an unconditional 4 insns to save an extra pass is a worthy trade-off
at this stage of the JIT.
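
A sketch of the stable-size immediate move (the emission helpers are
assumptions): emit movz plus three movk instructions unconditionally, so the
image size cannot change between passes.

  /* always 4 insns for a 64-bit call target, even when fewer would do */
  emit(A64_MOVZ(1, reg, addr & 0xffff, 0), ctx);
  emit(A64_MOVK(1, reg, (addr >> 16) & 0xffff, 16), ctx);
  emit(A64_MOVK(1, reg, (addr >> 32) & 0xffff, 32), ctx);
  emit(A64_MOVK(1, reg, (addr >> 48) & 0xffff, 48), ctx);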

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
1c2a088a66 bpf: x64: add JIT support for multi-function programs
A typical JIT does several passes over bpf instructions to
compute the total size and relative offsets of jumps and calls.
With multiple bpf functions calling each other, all relative calls
will initially have invalid offsets, therefore we need an additional
last pass over the program to emit calls with correct offsets.
For example in case of three bpf functions:
main:
  call foo
  call bpf_map_lookup
  exit
foo:
  call bar
  exit
bar:
  exit

We will call bpf_int_jit_compile() independently for main(), foo() and bar().
The x64 JIT typically does 4-5 passes to converge.
After these initial passes the image for these 3 functions
will be good except for call targets, since the start addresses of
foo() and bar() are unknown while we are JITing main()
(note that the call to bpf_map_lookup will be resolved properly
during the initial passes).
Once the start addresses of all 3 functions are known, we patch
call_insn->imm to point to the right functions and call
bpf_int_jit_compile() again, which needs only one pass.
Additional safety checks are done to make sure this
last pass doesn't produce an image that is larger or smaller
than the previous pass.

When constant blinding is on, it's applied to all functions
at the first pass, since doing it once again at the last
pass can change the size of the JITed code.

Tested on x64 and arm64 hw with JIT on/off, blinding on/off.
x64 jits bpf-to-bpf calls correctly while arm64 falls back to the interpreter.
All other JITs that support normal BPF_CALL will behave the same way
since a bpf-to-bpf call is equivalent to a bpf-to-kernel call from
the JIT's point of view.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
60b58afc96 bpf: fix net.core.bpf_jit_enable race
The global bpf_jit_enable variable is tested multiple times in the JITs,
blinding and verifier core. A malicious root can try to toggle
it while programs are being loaded. This race condition was accounted
for and there should be no issues, but it's safer to avoid
it altogether.
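
A sketch of one way to avoid the race (placement is an assumption): snapshot
the sysctl once per program load and use that value consistently.

  /* a concurrent toggle by root can no longer give the verifier,
   * blinding and JIT different views of the knob */
  int jit_enabled = READ_ONCE(bpf_jit_enable);

  if (jit_enabled)
          prog = bpf_int_jit_compile(prog);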

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
1ea47e01ad bpf: add support for bpf_call to interpreter
Though bpf_call is still the same call instruction and the
calling convention for 'bpf to bpf' and 'bpf to helper' is the same,
the interpreter has to operate on 'struct bpf_insn *'.
To distinguish these two cases, add a kernel-internal opcode and
mark call insns with it.
This opcode is seen by the interpreter only. JITs will never see it.
Also add a tiny bit of debug code to aid interpreter debugging.
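
A sketch of the retagging (the internal opcode name is an assumption):
bpf-to-bpf call instructions, marked with BPF_PSEUDO_CALL by the loader, are
given a kernel-only opcode before the interpreter runs.

  /* seen by the interpreter only; JITs never encounter this opcode */
  if (insn->src_reg == BPF_PSEUDO_CALL)
          insn->code = BPF_JMP | BPF_CALL_ARGS;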

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
b0b04fc49e selftests/bpf: add xdp noinline test
Add a large semi-artificial XDP test with 18 functions to stress-test
the bpf call verification logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
3bc35c63cb selftests/bpf: add bpf_call test
Strip always_inline from test_l4lb.c and compile it with -fno-inline
to let the verifier go through 11 functions with various function arguments
and return values.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:36 +01:00
Alexei Starovoitov
48cca7e44f libbpf: add support for bpf_call
- recognize the relocation emitted by llvm
- since all regular functions will be kept in the .text section and llvm
  takes care of pc-relative offsets in the bpf_call instruction,
  simply copy all of .text to the relevant program section while adjusting
  bpf_call instructions in the program section to point to the newly copied
  body of instructions from .text
- do so for all programs in the elf file
- set all programs types to the one passed to bpf_prog_load()

Note that for elf files with multiple programs that use different
functions in the .text section, we need to do 'linker'-style logic.
This work is still TBD.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
d98588cef0 selftests/bpf: add tests for stack_zero tracking
Adjust two tests, since the verifier got smarter,
and add a new one to test the stack_zero logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
cc2b14d510 bpf: teach verifier to recognize zero initialized stack
Programs with function calls often pass various
pointers via the stack. When all calls are inlined, llvm
flattens stack accesses and optimizes away extra branches.
When functions are not inlined, it becomes the job of
the verifier to recognize zero-initialized stack to avoid
exploring paths that the program will not take.
The following program would fail otherwise:

ptr = &buffer_on_stack;
*ptr = 0;
...
func_call(.., ptr, ...) {
  if (..)
    *ptr = bpf_map_lookup();
}
...
if (*ptr != 0) {
  // Access (*ptr)->field is valid.
  // Without stack_zero tracking such (*ptr)->field access
  // will be rejected
}

Since stack slots are no longer uniformly invalid | spill | misc,
add liveness marking to all slots, but do it in 8-byte chunks.
So if nothing was read or written in [fp-16, fp-9] range
it will be marked as LIVE_NONE.
If any byte in that range was read, it will be marked LIVE_READ
and stacksafe() check will perform byte-by-byte verification.
If all bytes in the range were written the slot will be
marked as LIVE_WRITTEN.
This significantly speeds up state equality comparison
and reduces total number of states processed.

                    before   after
bpf_lb-DLB_L3.o       2051    2003
bpf_lb-DLB_L4.o       3287    3164
bpf_lb-DUNKNOWN.o     1080    1080
bpf_lxc-DDROP_ALL.o   24980   12361
bpf_lxc-DUNKNOWN.o    34308   16605
bpf_netdev.o          15404   10962
bpf_overlay.o         7191    6679

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
a7ff3eca95 selftests/bpf: add verifier tests for bpf_call
Add extensive set of tests for bpf_call verification logic:

calls: basic sanity
calls: using r0 returned by callee
calls: callee is using r1
calls: callee using args1
calls: callee using wrong args2
calls: callee using two args
calls: callee changing pkt pointers
calls: two calls with args
calls: two calls with bad jump
calls: recursive call. test1
calls: recursive call. test2
calls: unreachable code
calls: invalid call
calls: jumping across function bodies. test1
calls: jumping across function bodies. test2
calls: call without exit
calls: call into middle of ld_imm64
calls: call into middle of other call
calls: two calls with bad fallthrough
calls: two calls with stack read
calls: two calls with stack write
calls: spill into caller stack frame
calls: two calls with stack write and void return
calls: ambiguous return value
calls: two calls that return map_value
calls: two calls that return map_value with bool condition
calls: two calls that return map_value with incorrect bool check
calls: two calls that receive map_value via arg=ptr_stack_of_caller. test1
calls: two calls that receive map_value via arg=ptr_stack_of_caller. test2
calls: two jumps that receive map_value via arg=ptr_stack_of_jumper. test3
calls: two calls that receive map_value_ptr_or_null via arg. test1
calls: two calls that receive map_value_ptr_or_null via arg. test2
calls: pkt_ptr spill into caller stack

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00
Alexei Starovoitov
f4d7e40a5b bpf: introduce function calls (verification)
Allow arbitrary function calls from bpf function to another bpf function.

To recognize such set of bpf functions the verifier does:
1. runs control flow analysis to detect function boundaries
2. proceeds with verification of all functions starting from main(root) function
It recognizes that the stack of the caller can be accessed by the callee
(if the caller passed a pointer to its stack to the callee) and the callee
can store map_value and other pointers into the stack of the caller.
3. keeps track of the stack_depth of each function to make sure that total
stack depth is still less than 512 bytes
4. disallows pointers to the callee stack to be stored into the caller stack,
since they will be invalid as soon as the callee returns
5. to reuse all of the existing state_pruning logic, each function call
is considered to be an independent call from the verifier's point of view.
The verifier pretends to inline all function calls it sees being made.
It stores the callsite instruction index as part of the state to make sure
that two calls to the same callee from two different places in the caller
will be different from the state pruning point of view
6. more safety checks are added to liveness analysis

Implementation details:
. struct bpf_verifier_state now consists of all stack frames that
  led to this function
. struct bpf_func_state represents one stack frame. It consists of
  registers in the given frame and its stack
. propagate_liveness() logic had a premature optimization where
  mark_reg_read() and mark_stack_slot_read() were manually inlined
  with loop iterating over parents for each register or stack slot.
  Undo this optimization to reuse more complex mark_*_read() logic
. skip_callee() logic is not necessary from a safety point of view,
  but without it mark_*_read() markings become too conservative,
  since after returning from the function call a read of r6-r9
  will incorrectly propagate the read marks into the callee, causing
  inefficient pruning later
. mark_*_read() logic is now aware of control flow which makes it
  more complex. In the future the plan is to rewrite liveness
  to be hierarchical. So that liveness can be done within
  basic block only and control flow will be responsible for
  propagation of liveness information along cfg and between calls.
. tail_calls and ld_abs insns are not allowed in the programs with
  bpf-to-bpf calls
. returning stack pointers to the caller or storing them into stack
  frame of the caller is not allowed

Testing:
. no difference in cilium processed_insn numbers
. large number of tests follows in next patches

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 20:34:35 +01:00