Commit graph

724073 commits

Author SHA1 Message Date
Eric Dumazet
c3916ad932 tcp: smoother receiver autotuning
Back in linux-3.13 (commit b0983d3c9b ("tcp: fix dynamic right sizing"))
I addressed the pressing issues we had with receiver autotuning.

But DRS suffers from extra latencies caused by rcv_rtt_est.rtt_us
drifts. One common problem happens during slow start, since the
apparent RTT measured by the receiver can be inflated by ~50%,
at the end of one packet train.

Also, a single drop can delay read() calls by one RTT, meaning
tcp_rcv_space_adjust() can be called one RTT too late.

By replacing the tri-modal heuristic with a continuous function,
we can offset the effects of not growing 'at the optimal time'.

The curve of the function matches prior behavior if the space
increased by 25% and 50% exactly.

Cost of added multiply/divide is small, considering a TCP flow
typically would run this part of the code few times in its life.
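
As a rough sketch of the idea (not the kernel code itself), the snippet below compares a step-wise growth rule with a continuous one. The step thresholds and multipliers are assumptions, chosen so that the two rules agree at exactly 25% and 50% growth as the message states; the names `rcvwin_stepwise`, `rcvwin_continuous`, `base`, `copied` and `space` are illustrative only.

```python
def rcvwin_stepwise(base: int, copied: int, space: int) -> int:
    """Assumed tri-modal rule: no boost, 1.5x, or 2x depending on growth."""
    if copied >= space + (space >> 1):      # grew by at least 50%
        return base << 1
    if copied >= space + (space >> 2):      # grew by at least 25%
        return base + (base >> 1)
    return base


def rcvwin_continuous(base: int, copied: int, space: int) -> int:
    """Assumed continuous rule: boost proportional to the relative growth."""
    if copied <= space:
        return base
    grow = base * (copied - space) // space
    return base + 2 * grow
```

With `base = 1000` and `space = 1000`, both rules give 1500 at `copied = 1250` and 2000 at `copied = 1500`, but only the continuous rule reacts to a 10% increase.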

I tested this patch with 100 ms RTT / 1% loss link, 100 runs
of (netperf -l 5), and got an average throughput of 4600 Mbit
instead of 1700 Mbit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-12 10:53:04 -05:00
Eric Dumazet
607065bad9 tcp: avoid integer overflows in tcp_rcv_space_adjust()
When using large tcp_rmem[2] values (I did tests with 500 MB),
I noticed overflows while computing rcvwin.

Let's fix this before the following patch.
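
To illustrate the kind of overflow involved, here is a small model that wraps results to the signed 32-bit range explicitly, since Python integers do not overflow on their own. The formula and the inputs (600 MB copied, an MSS of 1460) are assumptions for illustration and are not taken from the patch.

```python
def to_s32(value: int) -> int:
    """Wrap an integer to the signed 32-bit range (two's-complement wraparound)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def rcvwin_32bit(copied: int, advmss: int) -> int:
    """Hypothetical window estimate computed with 32-bit intermediates."""
    rcvwin = to_s32((copied << 1) + 16 * advmss)
    return to_s32(rcvwin << 1)


def rcvwin_64bit(copied: int, advmss: int) -> int:
    """Same estimate with wide intermediates; Python ints never wrap."""
    return ((copied << 1) + 16 * advmss) << 1
```

For small inputs the two functions agree; with `copied = 600_000_000` the 32-bit variant wraps to a negative number while the wide variant does not.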

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-12 10:53:04 -05:00
Eric Dumazet
02db55718d tcp: do not overshoot window_clamp in tcp_rcv_space_adjust()
While rcvbuf is properly clamped by tcp_rmem[2], rcvwin
is left to a potentially too big value.

It has no serious effect, since:
1) tcp_grow_window() has very strict checks.
2) window_clamp can be mangled by user space to any value anyway.

tcp_init_buffer_space() and companions use tcp_full_space();
we use tcp_win_from_space() to avoid reloading sk->sk_rcvbuf.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-12 10:53:03 -05:00
Jan Beulich
c4f9d9cb2c xen: XEN_ACPI_PROCESSOR is Dom0-only
Add a respective dependency.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-12-12 09:39:43 -05:00
Jan Beulich
0f3922a9b9 x86/Xen: don't report ancient LAPIC version
Unconditionally reporting a value seen on the P4 or older invokes
functionality like io_apic_get_unique_id() on 32-bit builds, resulting
in a panic() with sufficiently many CPUs and/or IO-APICs. Doing what
that function does would be the hypervisor's responsibility anyway, so
it makes no sense to use it when running on Xen. Uniformly report a more
modern version; this shouldn't matter much as both LAPIC and IO-APIC are
being managed entirely / mostly by the hypervisor.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-12-12 09:39:17 -05:00
Mark Rutland
8cb562b1d5 checkpatch: Remove ACCESS_ONCE() warning
Now that ACCESS_ONCE() has been excised from the kernel, any uses will
result in a build error, and we no longer need to whine about it in
checkpatch.

This patch removes the newly redundant warning.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20171127103824.36526-5-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 13:22:10 +01:00
Mark Rutland
b899a85043 compiler.h: Remove ACCESS_ONCE()
There are no longer any kernelspace uses of ACCESS_ONCE(), so we can
remove the definition from <linux/compiler.h>.

This patch removes the ACCESS_ONCE() definition, and updates comments
which referred to it. At the same time, some inconsistent and redundant
whitespace is removed from comments.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: apw@canonical.com
Link: http://lkml.kernel.org/r/20171127103824.36526-4-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 13:22:10 +01:00
Mark Rutland
2a22f692bb tools/include: Remove ACCESS_ONCE()
There are no longer any userspace uses of ACCESS_ONCE(), so we can
remove the definition from our userspace <linux/compiler.h>, which is
only used by tools in the kernel directory (i.e. it isn't a uapi
header).

This patch removes the ACCESS_ONCE() definition, and updates comments
which referred to it. At the same time, some inconsistent and redundant
whitespace is removed from comments.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: apw@canonical.com
Link: http://lkml.kernel.org/r/20171127103824.36526-3-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 13:22:10 +01:00
Mark Rutland
f971e511cb tools/perf: Convert ACCESS_ONCE() to READ_ONCE()
Recently there was a treewide conversion of ACCESS_ONCE() to
{READ,WRITE}_ONCE(), but a new use was introduced concurrently by
commit:

  1695849735 ("perf mmap: Move perf_mmap and methods to separate mmap.[ch] files")

Let's convert this over to READ_ONCE() so that we can remove the
ACCESS_ONCE() definitions in subsequent patches.

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: apw@canonical.com
Link: http://lkml.kernel.org/r/20171127103824.36526-2-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 13:22:09 +01:00
Andrey Konovalov
32fd87b3bb USB: core: only clean up what we allocated
When cleaning up the configurations, make sure we only free the number
of configurations and interfaces that we could have allocated.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-12 13:04:55 +01:00
Will Deacon
0e17cada2a arm64: hw_breakpoint: Use linux/uaccess.h instead of asm/uaccess.h
The only inclusion of asm/uaccess.h should be by linux/uaccess.h. All
other headers should use the latter.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-12 11:53:26 +00:00
Greg Kroah-Hartman
c1ed473554 usb: fixes for v4.15-rc4

Merge tag 'fixes-for-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

usb: fixes for v4.15-rc4

We have a few fixes on dwc3:

- one fix which only happens with some implementations where we need to
  wait longer for some commands to finish.

- Another fix for high-bandwidth isochronous endpoint programming making
  sure that we send the correct DATA tokens in the correct sequence

- A couple PM fixes on dwc3-of-simple

The other synopsys controller driver (dwc2) got a fix for FIFO size
programming.

Other than these, we have a couple Kconfig fixes making sure that
dependencies are properly setup.
2017-12-12 12:51:05 +01:00
Shanker Donthineni
932b50c7c1 arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted
to be accessed as the result of a speculative instruction fetch from
an exception level for which all stages of translation are disabled.
Specifically, the core is permitted to speculatively fetch from the
4KB region containing the current program counter and from the next
4KB region.

When translation is changed from enabled to disabled for the running
exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
Falkor core may errantly speculatively access memory locations outside
of the 4KB region permitted by the architecture. The errant memory
access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due
   to the errant memory access attempting to access a region of memory
   that is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
   memory. This behavior may only occur if the instruction cache is
   disabled prior to or coincident with translation being changed from
   enabled to disabled.

The conditions leading to this erratum will not occur in either of the
following cases:
1) A higher exception level disables translation of a lower exception level
   (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disables its stage-1 translation while its stage-2
   translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
   to 0 when HCR_EL2[VM] has a value of 1).

To avoid the errant behavior, software must execute an ISB immediately
prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-12 11:45:19 +00:00
Shanker Donthineni
c622cc013c arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for Qualcomm Datacenter Technologies
Falkor CPU in cputype.h. Unfortunately, the first revision of the
Falkor CPU used the wrong part number, 0x800; this was fixed in the v2
chip with part number 0xC00, which will also be used for future
revisions.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-12 11:45:19 +00:00
Will Deacon
86c9e8126e arm64: mm: Fix false positives in set_pte_at access/dirty race detection
Jiankang reports that our race detection in set_pte_at is firing when
copying the page tables in dup_mmap as a result of a fork(). In this
situation, the page table isn't actually live and so there is no way
that we can race with a concurrent update from the hardware page table
walker.

This patch reworks the race detection so that we require either the
mm to match the current active_mm (i.e. currently installed in our TTBR0)
or the mm_users count to be greater than 1, implying that the page table
could be live in another CPU. The mm_users check might still be racy,
but we'll avoid false positives and it's not realistic to validate that
all the necessary locks are held as part of this assertion.

Cc: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Jiankang Chen <chenjiankang1@huawei.com>
Tested-by: Jiankang Chen <chenjiankang1@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-12 11:42:24 +00:00
Ingo Molnar
e966eaeeb6 locking/lockdep: Remove the cross-release locking checks
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.

If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.

Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...

This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.

Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.

( Note that it might make sense to maintain it out of tree and go through
  the false positives every now and then and see whether new bugs were
  introduced. )

Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 12:38:51 +01:00
Felipe Balbi
9dbe416b65 Revert "usb: gadget: allow to enable legacy drivers without USB_ETH"
This reverts commit 7a9618a22a.

Romain Izard recently reported that commit 7a9618a22a ended up
allowing every legacy gadget driver to be statically linked into the
kernel. However, that doesn't work, since only one legacy gadget can be
bound to a controller. Because of that, let's revert the original commit
and fix the problem.

Reported-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-12-12 12:48:30 +02:00
Arnd Bergmann
54eed78c5c usb: gadget: webcam: fix V4L2 Kconfig dependency
Configuring the USB_G_WEBCAM driver as built-in leads to a link
error when CONFIG_VIDEO_V4L2 is a loadable module:

drivers/usb/gadget/function/f_uvc.o: In function `uvc_function_setup':
f_uvc.c:(.text+0xfe): undefined reference to `v4l2_event_queue'
drivers/usb/gadget/function/f_uvc.o: In function `uvc_function_ep0_complete':
f_uvc.c:(.text+0x188): undefined reference to `v4l2_event_queue'

This changes the Kconfig dependency to disallow that configuration,
and force it to be a module in that case as well.

This is apparently a rather old bug, but very hard to trigger
even in thousands of randconfig builds.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-12-12 12:44:11 +02:00
Will Deacon
d89c70356a locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y
When CONFIG_GENERIC_LOCKBREAK=y, locking structures grow an extra int ->break_lock
field which is used to implement raw_spin_is_contended() by setting the field
to 1 when waiting on a lock and clearing it to zero when holding a lock.
However, there are a few problems with this approach:

  - There is a write-write race between a CPU successfully taking the lock
    (and subsequently writing break_lock = 0) and a waiter waiting on
    the lock (and subsequently writing break_lock = 1). This could result
    in a contended lock being reported as uncontended and vice-versa.

  - On machines with store buffers, nothing guarantees that the writes
    to break_lock are visible to other CPUs at any particular time.

  - READ_ONCE/WRITE_ONCE are not used, so the field is potentially
    susceptible to harmful compiler optimisations.
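
The first problem in the list above can be illustrated with a deterministic model of the two stores. This is not kernel code: `final_break_lock` is a hypothetical helper that simply applies the two writes in a given order and returns the value left in the field.

```python
def final_break_lock(store_order: list) -> int:
    """Apply the holder's and waiter's stores in the given order; return the final flag."""
    break_lock = 0
    for who in store_order:
        if who == "holder":      # the CPU that just took the lock clears the flag
            break_lock = 0
        elif who == "waiter":    # a CPU spinning on the lock sets the flag
            break_lock = 1
        else:
            raise ValueError(f"unknown writer: {who}")
    return break_lock
```

If the waiter's store lands first and the holder's store second, the field ends up as 0 even though a waiter is still spinning, so a contended lock would be reported as uncontended.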

Consequently, the usefulness of this field is unclear and we'd be better off
removing it and allowing architectures to implement raw_spin_is_contended() by
providing a definition of arch_spin_is_contended(), as they can when
CONFIG_GENERIC_LOCKBREAK=n.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1511894539-7988-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 11:24:01 +01:00
Will Deacon
f87f3a328d locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
Commit:

  a8a217c221 ("locking/core: Remove {read,spin,write}_can_lock()")

removed the definition of raw_spin_can_lock(), causing the GENERIC_LOCKBREAK
spin_lock() routines to poll the ->break_lock field when waiting on a lock.

This has been reported to cause a deadlock during boot on s390, because
the ->break_lock field is also set by the waiters, and can potentially
remain set indefinitely if no other CPUs come in to take the lock after
it has been released.

This patch removes the explicit spinning on ->break_lock from the waiters,
instead relying on the outer trylock() operation to determine when the
lock is available.

Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a8a217c221 ("locking/core: Remove {read,spin,write}_can_lock()")
Link: http://lkml.kernel.org/r/1511894539-7988-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 11:24:01 +01:00
Bart Van Assche
14e3062fb1 scsi: core: Fix a scsi_show_rq() NULL pointer dereference
Avoid that scsi_show_rq() triggers a NULL pointer dereference if called
after sd_uninit_command(). Swap the NULL pointer assignment and the
mempool_free() call in sd_uninit_command() to make it less likely that
scsi_show_rq() triggers a use-after-free. Note: even with these changes
scsi_show_rq() can trigger a use-after-free but that's a lesser evil
than e.g. suppressing debug information for T10 PI Type 2 commands
completely. This patch fixes the following oops:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: scsi_format_opcode_name+0x1a/0x1c0
CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516
Call Trace:
 __scsi_format_command+0x27/0xc0
 scsi_show_rq+0x5c/0xc0
 __blk_mq_debugfs_rq_show+0x116/0x130
 blk_mq_debugfs_rq_show+0xe/0x10
 seq_read+0xfe/0x3b0
 full_proxy_read+0x54/0x90
 __vfs_read+0x37/0x160
 vfs_read+0x96/0x130
 SyS_read+0x55/0xc0
 entry_SYSCALL_64_fastpath+0x1a/0xa5

[mkp: added Type 2]

Fixes: 0eebd005dd ("scsi: Implement blk_mq_ops.show_rq()")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-11 21:56:48 -05:00
Johannes Thumshirn
3e5c63565a scsi: MAINTAINERS: change FCoE list to linux-scsi
fcoe-devel@open-fcoe.org is defunct and all patches are routed via the
SCSI tree anyways.

So update MAINTAINERS accordingly.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-11 21:47:28 -05:00
Jason Yan
621f6401fd scsi: libsas: fix length error in sas_smp_handler()
The return value of smp_execute_task_sg() is the untransferred residual,
but bsg_job_done() requires the length of payload received. This makes
SMP passthrough commands from userland by sg ioctl to libsas get a wrong
response. The userland tools such as smp_utils failed because of these
wrong responses:

~#smp_discover /dev/bsg/expander-2\:13
response too short, len=0
~#smp_discover /dev/bsg/expander-2\:134
response too short, len=0

Fix this by passing the actual received length to bsg_job_done(). And if
smp_execute_task_sg() returns 0, this means received length is exactly
the buffer length.
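
The conversion described above can be sketched as follows. The helper name `received_length` and its error handling are assumptions made for illustration; only the rule "received = buffer length minus residual, and a return of 0 means the whole buffer was filled" comes from the message.

```python
def received_length(payload_len: int, ret: int) -> int:
    """Convert a residual-style return value into the number of bytes received.

    ret > 0 is treated as the untransferred residual and ret == 0 as a full
    transfer; a negative ret is treated as an error code.
    """
    if ret < 0:
        raise OSError(-ret, "SMP request failed")
    return payload_len - ret
```

For a 1024-byte reply buffer, a return value of 0 maps to 1024 received bytes and a residual of 24 maps to 1000.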

[mkp: typo]

Fixes: 651a013649 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reported-by: chenqilin <chenqilin2@huawei.com>
Tested-by: chenqilin <chenqilin2@huawei.com>
CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-11 21:45:34 -05:00
Naresh Kamboju
63060c3916 selftests: bpf: Adding config fragment CONFIG_CGROUP_BPF=y
CONFIG_CGROUP_BPF=y is required for test_dev_cgroup test case.

Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-12 02:32:45 +01:00
Dan Carpenter
532298b950 platform/x86: dell-wmi: check for kmalloc() errors
This allocation won't fail in the current kernel because it's small but
not checking for kmalloc() failures introduces static checker warnings
so let's fix it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11 17:26:03 -08:00
Peter Hutterer
bff5bf9db1 platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changes
Sending the switch state change twice within the same frame is invalid
evdev protocol and only works if the client handles keys immediately as
well. Processing events immediately is incorrect: it forces a fake
order of events that does not exist on the device.

Recent versions of libinput changed to only process the device state at
SYN_REPORT time, so now the key event is lost.
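
A toy model may help show why the event gets lost. The sketch below assumes a client that only compares device state at frame boundaries; the tuple layout and the key name `KEY_RFKILL` are illustrative choices, not taken from the driver or from libinput.

```python
SYN_REPORT = ("EV_SYN", "SYN_REPORT", 0)

# Press and release inside one frame versus one frame per state change.
SAME_FRAME = [("EV_KEY", "KEY_RFKILL", 1), ("EV_KEY", "KEY_RFKILL", 0), SYN_REPORT]
SPLIT_FRAMES = [("EV_KEY", "KEY_RFKILL", 1), SYN_REPORT,
                ("EV_KEY", "KEY_RFKILL", 0), SYN_REPORT]


def presses_seen(events: list) -> int:
    """Count key presses seen by a client that only inspects state at SYN_REPORT."""
    current = 0      # state accumulated within the frame being received
    committed = 0    # state as of the last completed frame
    presses = 0
    for ev_type, _code, value in events:
        if ev_type == "EV_KEY":
            current = value
        elif ev_type == "EV_SYN":
            if current == 1 and committed == 0:
                presses += 1
            committed = current
    return presses
```

In this model `presses_seen(SAME_FRAME)` is 0, because the key is already released when the frame ends, while `presses_seen(SPLIT_FRAMES)` is 1.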

https://bugs.freedesktop.org/show_bug.cgi?id=104041

Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11 17:26:02 -08:00
Pali Rohár
68a213d325 platform/x86: dell-laptop: Fix keyboard max lighting for Dell Latitude E6410
This machine reports the number of keyboard backlight LED levels instead of
the index of the last LED level. Therefore max_brightness needs to be
reduced by 1 to match the LED max_brightness API.
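
The off-by-one can be written out as a tiny helper. The function name and the `reports_level_count` flag are hypothetical; they only stand in for whatever per-machine quirk detection the driver uses.

```python
def led_max_brightness(reported_levels: int, reports_level_count: bool) -> int:
    """Return the highest valid brightness index for a keyboard backlight."""
    if reports_level_count and reported_levels > 0:
        return reported_levels - 1   # N levels are indexed 0 .. N-1
    return reported_levels           # firmware already reported the last index
```

A machine that reports 3 levels as a count gets a maximum index of 2, the same result as a machine that reports 2 as its last index.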

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Reported-by: Gabriel M. Elder <gabriel@tekgnowsys.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196913
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11 17:24:21 -08:00
Linus Torvalds
a638349bf6 Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
 "Just one patch to work around CRIS boot problem caused by a recent
  change which freed a temporary boot data structure. The root cause is
  on CRIS side but it doesn't seem trivial to fix. For now, work around
  by skipping freeing on CRIS"

* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: hack to let the CRIS architecture to boot until they clean up
2017-12-11 17:13:03 -08:00
Linus Torvalds
085bec853a Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:

 - Prateek posted a couple patches to fix a deadlock involving cpuset
   and workqueue. It unfortunately caused a different deadlock and the
   recent workqueue hotplug simplification removed the original
   deadlock, so Prateek's two patches are reverted for now.

 - The new stat code was missing u64_stats initialization. Fixed.

 - Doc and other misc changes

* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: add warning about RT not being supported on cgroup2
  Revert "cgroup/cpuset: remove circular dependency deadlock"
  Revert "cpuset: Make cpuset hotplug synchronous"
  cgroup: properly init u64_stats
  debug cgroup: use task_css_set instead of rcu_dereference
  cpuset: Make cpuset hotplug synchronous
  cgroup/cpuset: remove circular dependency deadlock
2017-12-11 17:10:05 -08:00
Linus Torvalds
72dd379e67 Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:

 - Lai's hotplug simplifications inadvertently fix a possible deadlock
   involving cpuset and workqueue

 - The CPU isolation fix, which was reverted due to the changes in the
   housekeeping code, is resurrected

 - A trivial unused include removal

* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: remove unneeded kallsyms include
  workqueue/hotplug: remove the workaround in rebind_workers()
  workqueue/hotplug: simplify workqueue_offline_cpu()
  workqueue: respect isolated cpus when queueing an unbound work
  main: kernel_start: move housekeeping_init() before workqueue_init_early()
2017-12-11 17:07:26 -08:00
Linus Torvalds
a83cb7e6ad Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Nothing too interesting. David Milburn improved a corner case
  misbehavior during hotplug. Other than that, minor driver-specific
  fixes"

* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: sata_down_spd_limit should return if driver has not recorded sstatus speed
  ahci: mtk: Change driver name to ahci-mtk
  ahci: qoriq: refine port register configuration
  pata_pdc2027x : make pdc2027x_*_timing structures const
  pata_pdc2027x: Remove unnecessary error check
  ata: mediatek: Fix typo in module description
2017-12-11 17:05:33 -08:00
Linus Torvalds
bfb529ee79 The big changes for IPMI that just went in had a few problems. These
have been in for-next for a while, each since about their creation
 date.  I forgot the bugzilla reference on the second one (ipmi_si: Fix
 oops with PCI devices) so I rebased to add that.

Merge tag 'for-linus-4.15-2' of git://github.com/cminyard/linux-ipmi

Pull IPMI fixes from Corey Minyard.

* tag 'for-linus-4.15-2' of git://github.com/cminyard/linux-ipmi:
  ipmi_si: fix crash on parisc
  ipmi_si: Fix oops with PCI devices
  ipmi: Stop timers before cleaning up the module
2017-12-11 17:01:59 -08:00
Linus Torvalds
916b20e02e Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This push fixes the following issues:

   - buffer overread in RSA

   - potential use after free in algif_aead.

   - error path null pointer dereference in af_alg

   - forbid combinations such as hmac(hmac(sha3)) which may crash

   - crash in salsa20 due to incorrect API usage"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: salsa20 - fix blkcipher_walk API usage
  crypto: hmac - require that the underlying hash algorithm is unkeyed
  crypto: af_alg - fix NULL pointer dereference in
  crypto: algif_aead - fix reference counting of null skcipher
  crypto: rsa - fix buffer overread when stripping leading zeroes
2017-12-11 16:32:45 -08:00
Steve Wise
c058ecf6e4 iw_cxgb4: only insert drain cqes if wq is flushed
Only insert our special drain CQEs to support ib_drain_sq/rq() after
the wq is flushed. Otherwise, existing but not yet polled CQEs can be
returned out of order to the user application.  This can happen when the
QP has exited RTS but has not yet been flushed, which can happen during
a normal close (vs abortive close).

In addition, never count the drain CQEs when determining how many CQEs
need to be synthesized during the flush operation.  This latter issue
should never happen if the QP is properly flushed before inserting the
drain CQE, but I wanted to avoid corrupting the CQ state.  So we handle
it and log a warning once.

Fixes: 4fe7c2962e ("iw_cxgb4: refactor sq/rq drain logic")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11 15:33:51 -07:00
Chandan Rajendra
9d5afec6b8 ext4: fix crash when a directory's i_size is too small
On a ppc64 machine, when mounting a fuzzed ext2 image (generated by
fsfuzzer), the following call trace is seen:

VFS: brelse: Trying to free free buffer
WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40
.__brelse.part.6+0x20/0x40 (unreliable)
.ext4_find_entry+0x384/0x4f0
.ext4_lookup+0x84/0x250
.lookup_slow+0xdc/0x230
.walk_component+0x268/0x400
.path_lookupat+0xec/0x2d0
.filename_lookup+0x9c/0x1d0
.vfs_statx+0x98/0x140
.SyS_newfstatat+0x48/0x80
system_call+0x58/0x6c

This happens because the directory that ext4_find_entry() looks up has
inode->i_size that is less than the block size of the filesystem. This
causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not
reading any of the directory file's blocks. This leaves the entries in
the bh_use[] array holding garbage data. buffer_uptodate() on
bh_use[0] can then return a zero value, upon which the brelse() function
is invoked.

This commit fixes the bug by returning -ENOENT when the directory file
has no associated blocks.
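
The arithmetic behind the bug can be sketched as below. The function names are hypothetical and the lookup itself is elided; the only points taken from the message are that the block count is derived from i_size and that a count of zero leads to -ENOENT.

```python
import errno


def dir_block_count(i_size: int, blocksize_bits: int) -> int:
    """Number of whole blocks covered by a directory's i_size."""
    return i_size >> blocksize_bits


def find_entry(i_size: int, blocksize_bits: int) -> int:
    """Return -ENOENT early when there are no blocks to search, else 0."""
    nblocks = dir_block_count(i_size, blocksize_bits)
    if nblocks == 0:
        return -errno.ENOENT
    # A real lookup would read and scan nblocks directory blocks here.
    return 0
```

With a 4096-byte block size (`blocksize_bits = 12`), an i_size of 100 yields zero blocks and an early -ENOENT, while an i_size of 4096 yields one block.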

Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
2017-12-11 15:00:57 -05:00
Xin Long
200809716a fou: fix some member types in guehdr
guehdr struct is used to build or parse gue packets, which
are always in big endian. It's better to define all guehdr
members as __beXX types.

Also, in validate_gue_flags it's not good to use a __be32
variable for both Standard flags (__be16) and Private flags
(__be32), and pass it to other functions.

This patch could fix a bunch of sparse warnings from fou.

Fixes: 5024c33ac3 ("gue: Add infrastructure for flags and options")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 14:10:06 -05:00
Xin Long
2342b8d95b sctp: make sure stream nums can match optlen in sctp_setsockopt_reset_streams
Currently sctp_setsockopt_reset_streams only checks
optlen < sizeof(*params). That is not enough, as
params->srs_number_streams should also match optlen.

If the streams in params->srs_stream_list are less than stream
nums in params->srs_number_streams, later when dereferencing
the stream list, it could cause a slab-out-of-bounds crash, as
reported by syzbot.

This patch is to fix it by also checking the stream numbers in
sctp_setsockopt_reset_streams to make sure at least it's not
greater than the streams in the list.
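
A length check of this kind might look like the following user-space
sketch. The struct is a hypothetical mirror of the option layout (a
fixed header followed by 16-bit stream ids) and reset_streams_len_ok()
is an illustrative name, not the function the patch touches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the option: header plus a list of stream ids. */
struct reset_streams_hdr {
    int32_t  srs_assoc_id;
    uint16_t srs_flags;
    uint16_t srs_number_streams;
    uint16_t srs_stream_list[];
};

/* Returns 1 when optlen covers the header and every claimed stream id. */
static int reset_streams_len_ok(const struct reset_streams_hdr *p,
                                size_t optlen)
{
    if (optlen < sizeof(*p))
        return 0;

    /* Subtract the header first so the comparison cannot wrap. */
    return (size_t)p->srs_number_streams * sizeof(uint16_t)
           <= optlen - sizeof(*p);
}
```

The header-size test runs first, so optlen - sizeof(*p) is never
computed on a too-short buffer.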

Fixes: 7f9d68ac94 ("sctp: implement sender-side procedures for SSN Reset Request Parameter")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 14:08:21 -05:00
Mohamed Ghannam
8f659a03a0 net: ipv4: fix for a race condition in raw_sendmsg
inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.
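
The "read only once" idea can be illustrated outside the kernel. The
sketch below is an assumption-laden stand-in: it uses a C11 atomic load
in place of whatever primitive the patch uses, and fake_sock, send_plan
and plan_send() are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical socket state; another thread may flip hdrincl. */
struct fake_sock {
    atomic_int hdrincl;
};

/* Two decisions that must agree with each other. */
struct send_plan {
    int build_header;          /* stack builds the IP header */
    int user_supplies_header;  /* caller's buffer already has one */
};

static struct send_plan plan_send(struct fake_sock *sk)
{
    /* Snapshot the flag once; every later decision uses the local copy. */
    int hdrincl = atomic_load_explicit(&sk->hdrincl, memory_order_relaxed);
    struct send_plan plan;

    plan.build_header = !hdrincl;
    plan.user_supplies_header = hdrincl;
    return plan;
}
```

Because both fields derive from the same snapshot, a concurrent change
to the flag cannot produce a plan in which the two disagree.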

Fixes: c008ba5bdc ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 14:05:31 -05:00
Zhu Yanjun
c360f2b58e forcedeth: remove unnecessary structure member
Since both tx_ring and first_tx are the head of tx ring, it is not
necessary to use two structure members to statically indicate
the head of tx ring. So first_tx is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 14:03:56 -05:00
Andrey Ryabinin
0a373d4fc2 x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with CONFIG_STACKDEPOT=y
Stackdepot doesn't work well with CONFIG_UNWINDER_GUESS=y.
The 'guess' unwinder generates awfully large and inaccurate stacktraces,
thus stackdepot can't deduplicate stacktraces because they all look
unique. Eventually stackdepot reaches its capacity limit:

  WARNING: CPU: 0 PID: 545 at lib/stackdepot.c:119 depot_save_stack+0x28e/0x550
  Call Trace:
   ? kasan_kmalloc+0x144/0x160
   ? depot_save_stack+0x1f5/0x550
   ? do_raw_spin_unlock+0xda/0xf0
   ? preempt_count_sub+0x13/0xc0

  <...90 lines...>

   ? do_raw_spin_unlock+0xda/0xf0

Add a STACKDEPOT=n dependency to UNWINDER_GUESS to avoid the problem.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171130123554.4330-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11 19:07:07 +01:00
Changbin Du
f79ce87fa4 x86/build: Don't verify mtools configuration file for isoimage
If mtools.conf has not been generated beforehand, 'make isoimage' can fail with:

  Kernel: arch/x86/boot/bzImage is ready  (#597)
    GENIMAGE arch/x86/boot/image.iso
   *** Missing file: arch/x86/boot/mtools.conf
  arch/x86/boot/Makefile:144: recipe for target 'isoimage' failed

mtools.conf is not used for isoimage generation, so do not check it.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 4366d57af1 ("x86/build: Factor out fdimage/isoimage generation commands to standalone script")
Link: http://lkml.kernel.org/r/1512053480-8083-1-git-send-email-changbin.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11 18:55:38 +01:00
David S. Miller
23202e0995 Merge branch 'nfp-dead-code-clean-ups-and-slight-improvements'
Jakub Kicinski says:

====================
nfp: dead code, clean ups and slight improvements

This series contains small clean ups from John and Carl, and brings
no functional changes.

John's improvements target the flower code.  First he makes sure we don't
allocate space in FW request messages for MAC matches if the TC rule does
not contain any.  The remaining two patches remove some dead code and
unused defines.

Carl follows up with a slight optimization to his recent ethtool FW state
dumps, byte swapping input parameters once instead of the data for every
dumped item.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 12:08:23 -05:00
Carl Heymann
92a54f4a47 nfp: debug dump - decrease endian conversions
Convert the requested dump level parameter to big-endian at the start of
nfp_net_dump_calculate_size() and nfp_net_dump_populate_buffer(), then
compare and assign it directly where needed in the traversal and prolog
code. This decreases the total number of conversions used.
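
The pattern of converting the parameter once, instead of converting
each stored value, can be sketched as follows. This is an illustration
under assumptions: dump_item and count_matching() are hypothetical, and
htonl() stands in for the kernel's byte-order helpers.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dump item whose level field is stored big-endian. */
struct dump_item {
    uint32_t level_be;
};

/* Count the items whose level equals the requested (host-order) level. */
static size_t count_matching(const struct dump_item *items, size_t n,
                             uint32_t level)
{
    uint32_t level_be = htonl(level);  /* one conversion up front */
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        /* Compare in big-endian form; no per-item byte swap. */
        if (items[i].level_be == level_be)
            hits++;
    }
    return hits;
}
```

Equality is preserved by the byte swap, so comparing two big-endian
values gives the same answer as comparing their host-order forms.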

Signed-off-by: Carl Heymann <carl.heymann@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 12:08:13 -05:00
John Hurley
197171e5ba nfp: flower: remove unused defines
Delete match field defines that are not supported at this time.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 12:08:04 -05:00
John Hurley
a427673e1f nfp: flower: remove dead code paths
Port matching is selected by default on every rule so remove check for it
and delete 'else' side of the statement. Remove nfp_flower_meta_one as now
it will not feature in the code. Rename nfp_flower_meta_two given that one
has been removed.

'Additional metadata' if statement can never be true so remove it as well.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 12:07:57 -05:00
John Hurley
de7d954984 nfp: flower: do not assume mac/mpls matches
Remove the matching of mac/mpls as a default selection. These are not
necessarily set by a TC rule (unlike the port). Previously a mac/mpls
field would exist in every match and be masked out if not used. This patch
has no impact on functionality but removes unnecessary memory assignment in
the match cmsg.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 12:07:47 -05:00
Kevin Cernekee
93c647643b netlink: Add netns check on taps
Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity.  Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.

Test case:

    vpnns -- bash -c "ip link add nlmon0 type nlmon; \
                      ip link set nlmon0 up; \
                      tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
    sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
        spi 0x1 mode transport \
        auth sha1 0x6162633132330000000000000000000000000000 \
        enc aes 0x00000000000000000000000000000000
    grep --binary abc123 /tmp/nlmon.pcap

Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 11:58:18 -05:00
Thomas Petazzoni
2aab6b40b0 net: sh_eth: do not advertise Gigabit capabilities when not available
Not all variants of the sh_eth hardware have Gigabit
support. Unfortunately, the current driver doesn't tell the PHY about
the limited MAC capabilities. Due to this, if you have a Gigabit
capable PHY, the PHY will advertise its Gigabit capability and
establish a link at 1Gbit/s, even though the MAC doesn't support it.

In order to avoid this, we use the recently introduced
phy_set_max_speed() to tell the PHY to not advertise speed higher than
100 MBit/s.

Tested on a SH7786 platform, with a Gigabit PHY.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 11:53:55 -05:00
Fabio Estevam
24fd319081 dt-bindings: fec: Make the phy-reset-gpio polarity explicit
The GPIO polarity passed to phy-reset-gpio is ignored by the FEC
driver and it is assumed to be active low.

It can be active high only when the 'phy-reset-active-high' property
is present.

The current examples pass active high polarity and work fine, but
in order to improve the documentation make it explicit what the real
polarity is.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 11:26:59 -05:00
David S. Miller
b9622ed42c Merge branch 'sctp-stream-interleave-part-1'
Xin Long says:

====================
sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving

Stream Interleave would be Implemented in two Parts:

   1. The I-DATA Chunk Supporting User Message Interleaving
   2. Interaction with Other SCTP Extensions

Overview in section 1.1 of RFC8260 for Part 1:

   This document describes a new chunk carrying payload data called
   I-DATA.  This chunk incorporates the properties of the current SCTP
   DATA chunk, all the flags and fields except the Stream Sequence
   Number (SSN), and also adds two new fields in its chunk header -- the
   Fragment Sequence Number (FSN) and the Message Identifier (MID).  The
   FSN is only used for reassembling all fragments that have the same
   MID and the same ordering property.  The TSN is only used for the
   reliable transfer in combination with Selective Acknowledgment (SACK)
   chunks.

   In addition, the MID is also used for ensuring ordered delivery
   instead of using the stream sequence number (the I-DATA chunk omits
   an SSN).

As the 1st part of Stream Interleave Implementation, this patchset adds
an ops framework named sctp_stream_interleave with a bunch of stuff that
does lots of things needed somewhere.

Then it defines sctp_stream_interleave_0 to work for normal DATA chunks
and sctp_stream_interleave_1 for I-DATA chunks.

With these functions, hundreds of if-else checks for the different
processing of I-DATA chunks are avoided. Besides, very little code could
be shared between these two function sets.

This patchset first adds some basic variables, structures and socket
options, then implements these functions one by one to gradually add the
procedures for ordered idata, and finally adjusts some code to make it
work for unordered idata.

To keep the implementation safe and avoid breaking normal data chunk
processing, this feature cannot be enabled until all of the stream
interleave code is complete.

v1 -> v2:
  - fixed a checkpatch warning that a blank line was missed.
  - avoided a kbuild warning reported from gcc-4.9.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 11:23:06 -05:00