Commit graph

740184 commits

Author SHA1 Message Date
John Fastabend
4c4c3c276c bpf: sockmap sample, add option to attach SK_MSG program
Add sockmap option to use SK_MSG program types.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:40 +01:00
John Fastabend
1acc60b6a4 bpf: add verifier tests for BPF_PROG_TYPE_SK_MSG
Test read and writes for BPF_PROG_TYPE_SK_MSG.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
John Fastabend
82a8616889 bpf: add map tests for BPF_PROG_TYPE_SK_MSG
Add map tests to attach BPF_PROG_TYPE_SK_MSG types to a sockmap.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
John Fastabend
015632bb30 bpf: sk_msg program helper bpf_sk_msg_pull_data
Currently, if a bpf sk msg program is run the program
can only parse data that the (start,end) pointers already
consumed. For sendmsg hooks this is likely the first
scatterlist element. For sendpage this will be the range
(0,0) because the data is shared with userspace and by
default we want to avoid allowing userspace to modify
data while (or after) BPF verdict is being decided.

To support pulling in additional bytes for parsing use
a new helper bpf_sk_msg_pull(start, end, flags) which
works similar to cls tc logic. This helper will attempt
to point the data start pointer at 'start' bytes offest
into msg and data end pointer at 'end' bytes offset into
message.

After basic sanity checks to ensure 'start' <= 'end' and
'end' <= msg_length there are a few cases we need to
handle.

First the sendmsg hook has already copied the data from
userspace and has exclusive access to it. Therefor, it
is not necessesary to copy the data. However, it may
be required. After finding the scatterlist element with
'start' offset byte in it there are two cases. One the
range (start,end) is entirely contained in the sg element
and is already linear. All that is needed is to update the
data pointers, no allocate/copy is needed. The other case
is (start, end) crosses sg element boundaries. In this
case we allocate a block of size 'end - start' and copy
the data to linearize it.

Next sendpage hook has not copied any data in initial
state so that data pointers are (0,0). In this case we
handle it similar to the above sendmsg case except the
allocation/copy must always happen. Then when sending
the data we have possibly three memory regions that
need to be sent, (0, start - 1), (start, end), and
(end + 1, msg_length). This is required to ensure any
writes by the BPF program are correctly transmitted.

Lastly this operation will invalidate any previous
data checks so BPF programs will have to revalidate
pointers after making this BPF call.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
John Fastabend
91843d540a bpf: sockmap, add msg_cork_bytes() helper
In the case where we need a specific number of bytes before a
verdict can be assigned, even if the data spans multiple sendmsg
or sendfile calls. The BPF program may use msg_cork_bytes().

The extreme case is a user can call sendmsg repeatedly with
1-byte msg segments. Obviously, this is bad for performance but
is still valid. If the BPF program needs N bytes to validate
a header it can use msg_cork_bytes to specify N bytes and the
BPF program will not be called again until N bytes have been
accumulated. The infrastructure will attempt to coalesce data
if possible so in many cases (most my use cases at least) the
data will be in a single scatterlist element with data pointers
pointing to start/end of the element. However, this is dependent
on available memory so is not guaranteed. So BPF programs must
validate data pointer ranges, but this is the case anyways to
convince the verifier the accesses are valid.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
John Fastabend
2a100317c9 bpf: sockmap, add bpf_msg_apply_bytes() helper
A single sendmsg or sendfile system call can contain multiple logical
messages that a BPF program may want to read and apply a verdict. But,
without an apply_bytes helper any verdict on the data applies to all
bytes in the sendmsg/sendfile. Alternatively, a BPF program may only
care to read the first N bytes of a msg. If the payload is large say
MB or even GB setting up and calling the BPF program repeatedly for
all bytes, even though the verdict is already known, creates
unnecessary overhead.

To allow BPF programs to control how many bytes a given verdict
applies to we implement a bpf_msg_apply_bytes() helper. When called
from within a BPF program this sets a counter, internal to the
BPF infrastructure, that applies the last verdict to the next N
bytes. If the N is smaller than the current data being processed
from a sendmsg/sendfile call, the first N bytes will be sent and
the BPF program will be re-run with start_data pointing to the N+1
byte. If N is larger than the current data being processed the
BPF verdict will be applied to multiple sendmsg/sendfile calls
until N bytes are consumed.

Note1 if a socket closes with apply_bytes counter non-zero this
is not a problem because data is not being buffered for N bytes
and is sent as its received.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:39 +01:00
John Fastabend
4f738adba3 bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
This implements a BPF ULP layer to allow policy enforcement and
monitoring at the socket layer. In order to support this a new
program type BPF_PROG_TYPE_SK_MSG is used to run the policy at
the sendmsg/sendpage hook. To attach the policy to sockets a
sockmap is used with a new program attach type BPF_SK_MSG_VERDICT.

Similar to previous sockmap usages when a sock is added to a
sockmap, via a map update, if the map contains a BPF_SK_MSG_VERDICT
program type attached then the BPF ULP layer is created on the
socket and the attached BPF_PROG_TYPE_SK_MSG program is run for
every msg in sendmsg case and page/offset in sendpage case.

BPF_PROG_TYPE_SK_MSG Semantics/API:

BPF_PROG_TYPE_SK_MSG supports only two return codes SK_PASS and
SK_DROP. Returning SK_DROP free's the copied data in the sendmsg
case and in the sendpage case leaves the data untouched. Both cases
return -EACESS to the user. Returning SK_PASS will allow the msg to
be sent.

In the sendmsg case data is copied into kernel space buffers before
running the BPF program. The kernel space buffers are stored in a
scatterlist object where each element is a kernel memory buffer.
Some effort is made to coalesce data from the sendmsg call here.
For example a sendmsg call with many one byte iov entries will
likely be pushed into a single entry. The BPF program is run with
data pointers (start/end) pointing to the first sg element.

In the sendpage case data is not copied. We opt not to copy the
data by default here, because the BPF infrastructure does not
know what bytes will be needed nor when they will be needed. So
copying all bytes may be wasteful. Because of this the initial
start/end data pointers are (0,0). Meaning no data can be read or
written. This avoids reading data that may be modified by the
user. A new helper is added later in this series if reading and
writing the data is needed. The helper call will do a copy by
default so that the page is exclusively owned by the BPF call.

The verdict from the BPF_PROG_TYPE_SK_MSG applies to the entire msg
in the sendmsg() case and the entire page/offset in the sendpage case.
This avoids ambiguity on how to handle mixed return codes in the
sendmsg case. Again a helper is added later in the series if
a verdict needs to apply to multiple system calls and/or only
a subpart of the currently being processed message.

The helper msg_redirect_map() can be used to select the socket to
send the data on. This is used similar to existing redirect use
cases. This allows policy to redirect msgs.

Pseudo code simple example:

The basic logic to attach a program to a socket is as follows,

  // load the programs
  bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG,
		&obj, &msg_prog);

  // lookup the sockmap
  bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map");

  // get fd for sockmap
  map_fd_msg = bpf_map__fd(bpf_map_msg);

  // attach program to sockmap
  bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0);

Adding sockets to the map is done in the normal way,

  // Add a socket 'fd' to sockmap at location 'i'
  bpf_map_update_elem(map_fd_msg, &i, fd, BPF_ANY);

After the above any socket attached to "my_sock_map", in this case
'fd', will run the BPF msg verdict program (msg_prog) on every
sendmsg and sendpage system call.

For a complete example see BPF selftests or sockmap samples.

Implementation notes:

It seemed the simplest, to me at least, to use a refcnt to ensure
psock is not lost across the sendmsg copy into the sg, the bpf program
running on the data in sg_data, and the final pass to the TCP stack.
Some performance testing may show a better method to do this and avoid
the refcnt cost, but for now use the simpler method.

Another item that will come after basic support is in place is
supporting MSG_MORE flag. At the moment we call sendpages even if
the MSG_MORE flag is set. An enhancement would be to collect the
pages into a larger scatterlist and pass down the stack. Notice that
bpf_tcp_sendmsg() could support this with some additional state saved
across sendmsg calls. I built the code to support this without having
to do refactoring work. Other features TBD include ZEROCOPY and the
TCP_RECV_QUEUE/TCP_NO_QUEUE support. This will follow initial series
shortly.

Future work could improve size limits on the scatterlist rings used
here. Currently, we use MAX_SKB_FRAGS simply because this was being
used already in the TLS case. Future work could extend the kernel sk
APIs to tune this depending on workload. This is a trade-off
between memory usage and throughput performance.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:38 +01:00
John Fastabend
8c05dbf04b net: generalize sk_alloc_sg to work with scatterlist rings
The current implementation of sk_alloc_sg expects scatterlist to always
start at entry 0 and complete at entry MAX_SKB_FRAGS.

Future patches will want to support starting at arbitrary offset into
scatterlist so add an additional sg_start parameters and then default
to the current values in TLS code paths.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:38 +01:00
John Fastabend
312fc2b4c8 net: do_tcp_sendpages flag to avoid SKBTX_SHARED_FRAG
When calling do_tcp_sendpages() from in kernel and we know the data
has no references from user side we can omit SKBTX_SHARED_FRAG flag.
This patch adds an internal flag, NO_SKBTX_SHARED_FRAG that can be used
to omit setting SKBTX_SHARED_FRAG.

The flag is not exposed to userspace because the sendpage call from
the splice logic masks out all bits except MSG_MORE.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:38 +01:00
John Fastabend
ffa3566001 sockmap: convert refcnt to an atomic refcnt
The sockmap refcnt up until now has been wrapped in the
sk_callback_lock(). So its not actually needed any locking of its
own. The counter itself tracks the lifetime of the psock object.
Sockets in a sockmap have a lifetime that is independent of the
map they are part of. This is possible because a single socket may
be in multiple maps. When this happens we can only release the
psock data associated with the socket when the refcnt reaches
zero. There are three possible delete sock reference decrement
paths first through the normal sockmap process, the user deletes
the socket from the map. Second the map is removed and all sockets
in the map are removed, delete path is similar to case 1. The third
case is an asyncronous socket event such as a closing the socket. The
last case handles removing sockets that are no longer available.
For completeness, although inc does not pose any problems in this
patch series, the inc case only happens when a psock is added to a
map.

Next we plan to add another socket prog type to handle policy and
monitoring on the TX path. When we do this however we will need to
keep a reference count open across the sendmsg/sendpage call and
holding the sk_callback_lock() here (on every send) seems less than
ideal, also it may sleep in cases where we hit memory pressure.
Instead of dealing with these issues in some clever way simply make
the reference counting a refcnt_t type and do proper atomic ops.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:38 +01:00
John Fastabend
2c3682f0be sock: make static tls function alloc_sg generic sock helper
The TLS ULP module builds scatterlists from a sock using
page_frag_refill(). This is going to be useful for other ULPs
so move it into sock file for more general use.

In the process remove useless goto at end of while loop.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-19 21:14:38 +01:00
Leon Romanovsky
80cf79ae4f RDMA/verbs: Remove restrack entry from XRCD structure
XRCD object is not implemented in the restrack, so lets remove it.

Fixes: 02d8883f52 ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-19 14:10:30 -06:00
Leon Romanovsky
ed65a4dc22 RDMA/ucma: Fix use-after-free access in ucma_close
The error in ucma_create_id() left ctx in the list of contexts belong
to ucma file descriptor. The attempt to close this file descriptor causes
to use-after-free accesses while iterating over such list.

Fixes: 7521663857 ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <syzbot+dcfd344365a56fbebd0f@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-19 14:01:35 -06:00
David S. Miller
c314c7ba40 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-03-19

This series contains updates to i40e and i40evf only.

Alex fixes a potential deadlock in the configure_clsflower function in
i40evf, where we exit with the "IN_CRITICAL_TASK" bit set while
notifying the PF of flower filters.

Jan fixed an issue where it was possible to set a mode that is not
allowed which resulted in link being down, so fixed the parity between
i40e_set_link_ksettings() and i40e_get_link_ksettings().

Patryk fixes a bug where a backplane device was allowing the setting of
link settings, which is not allowed.

Shiraz fixes a crash when entering S3 because the client interface was
freeing the MSIx vectors while they are still in use.

Jake fixes up a function header comment to document a newly added
parameter.  Also cleaned up flags that were never used.

Doug fixes the incorrect return type for i40e_aq_add_cloud_filters().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-19 15:12:03 -04:00
Tejun Heo
b3a5d11199 percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
percpu_ref internally uses sched-RCU to implement the percpu -> atomic
mode switching and the documentation suggested that this could be
depended upon.  This doesn't seem like a good idea.

* percpu_ref uses sched-RCU which has different grace periods regular
  RCU.  Users may combine percpu_ref with regular RCU usage and
  incorrectly believe that regular RCU grace periods are performed by
  percpu_ref.  This can lead to, for example, use-after-free due to
  premature freeing.

* percpu_ref has a grace period when switching from percpu to atomic
  mode.  It doesn't have one between the last put and release.  This
  distinction is subtle and can lead to surprising bugs.

* percpu_ref allows starting in and switching to atomic mode manually
  for debugging and other purposes.  This means that there may not be
  any grace periods from kill to release.

This patch makes it clear that the grace periods are percpu_ref's
internal implementation detail and can't be depended upon by the
users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 10:09:44 -07:00
Paweł Jabłoński
12d80bca0b i40e: Fix the polling mechanism of GLGEN_RSTAT.DEVSTATE
This fixes the polling mechanism of GLGEN_RSTAT.DEVSTATE in the
PF Reset path when Global Reset is in progress. While the driver
is polling for the end of the PF Reset and the Global Reset is
triggered, abandon the PF Reset path and prepare for the
upcoming Global Reset.

Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:57:52 -07:00
Jacob Keller
6b9a9c26ef i40evf: remove flags that are never used
These flags were defined, but there is no use within the driver code, so
we don't need to keep them.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:56:27 -07:00
Patryk Małek
3f8c843727 i40e: Prevent setting link speed on I40E_DEV_ID_25G_B
Setting link settings on backplane devices shouldn't be allowed.
This patch adds one more device id to the list which we check
that against.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:55:00 -07:00
Doug Dziggel
85925cd0b8 i40e: Fix incorrect return types
Fix return types from i40e_status to enum i40e_status_code.

Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:52:32 -07:00
Jacob Keller
35bea90436 i40e: add doxygen comment for new mode parameter
A recent patch updated the signature for i40e_aq_set_switch_config() to
add a new parameter 'mode'. It forgot to document the parameter in the
doxygen function header comment. Add the parameter to the function
description now.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:50:51 -07:00
Shiraz Saleem
ddbb8d5dd9 i40e: Close client on suspend and restore client MSIx on resume
During suspend client MSIx vectors are freed while they are still
in use causing a crash on entering S3.

Fix this calling client close before freeing up its MSIx vectors.
Also update the client MSIx vectors on resume before client
open is called.

Fixes commit b980c0634f ("i40e: shutdown all IRQs and disable MSI-X
when suspended")

Reported-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:46:09 -07:00
Patryk Małek
88244a48d2 i40e: Prevent setting link speed on KX_X722
Setting link settings on backplane devices shouldn't be allowed.
This patch adds one more device id to the list which we check
that against.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:44:23 -07:00
Jan Sokolowski
32c23b47db i40e: Properly check allowed advertisement capabilities
The i40e_set_link_ksettings and i40e_get_link_ksettings use different
codepaths to check available and supported advertisement modes. This
creates scenarios where it's possible to set a mode that's not allowed,
resulting in a link down.

Fix setting advertisement in i40e_set_link_ksettings by calling
i40e_get_link_ksettings to check what modes are allowed.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:42:20 -07:00
Kirill Tkhai
f52ba1fef7 mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
In case of memory deficit and low percpu memory pages,
pcpu_balance_workfn() takes pcpu_alloc_mutex for a long
time (as it makes memory allocations itself and waits
for memory reclaim). If tasks doing pcpu_alloc() are
choosen by OOM killer, they can't exit, because they
are waiting for the mutex.

The patch makes pcpu_alloc() to care about killing signal
and use mutex_lock_killable(), when it's allowed by GFP
flags. This guarantees, a task does not miss SIGKILL
from OOM killer.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 09:38:50 -07:00
Tejun Heo
71546d1004 percpu: include linux/sched.h for cond_resched()
microblaze build broke due to missing declaration of the
cond_resched() invocation added recently.  Let's include linux/sched.h
explicitly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
2018-03-19 09:38:18 -07:00
Alexander Duyck
640a8af584 i40evf: Reorder configure_clsflower to avoid deadlock on error
While doing some code review I noticed that we can get into a state where
we exit with the "IN_CRITICAL_TASK" bit set while notifying the PF of
flower filters. This patch is meant to address that plus tweak the ordering
of the while loop waiting on it slightly so that we don't wait an extra
period after we have failed for the last time.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-19 09:28:03 -07:00
Boris Brezillon
7997f3b2df clk: bcm2835: Protect sections updating shared registers
CM_PLLx and A2W_XOSC_CTRL registers are accessed by different clock
handlers and must be accessed with ->regs_lock held.
Update the sections where this protection is missing.

Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-19 09:27:37 -07:00
Boris Brezillon
49012d1bf5 clk: bcm2835: Fix ana->maskX definitions
ana->maskX values are already '~'-ed in bcm2835_pll_set_rate(). Remove
the '~' in the definition to fix ANA setup.

Note that this commit fixes a long standing bug preventing one from
using an HDMI display if it's plugged after the FW has booted Linux.
This is because PLLH is used by the HDMI encoder to generate the pixel
clock.

Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-03-19 09:27:23 -07:00
Shirish S
cd2d6c92a8 drm/amd/display: fix dereferencing possible ERR_PTR()
This patch fixes static checker warning caused by
"36cc549d5986: "drm/amd/display: disable CRTCs with
NULL FB on their primary plane (V2)"

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-19 11:23:23 -05:00
Clark Zheng
219be9dda6 drm/amd/display: Refine disable VGA
bad case won't follow normal sense, it will not enable vga1 as usual, but vga2,3,4 is on.

Signed-off-by: Clark Zheng <clark.zheng@amd.com>
Reviewed-by: Tony Cheng <tony.cheng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-19 11:23:03 -05:00
Hans de Goede
d418ff56b8 libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
When commit 9c7be59fc5 ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.

This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.

Fixes: 9c7be59fc5 ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 08:36:38 -07:00
Hans de Goede
3bf7b5d6d0 libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
Commit b17e5729a6 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware

MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.

Fixes: b17e5729a6 ("libata: disable LPM for Crucial BX100 SSD 500GB...")
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 08:36:38 -07:00
Hans de Goede
62ac3f7305 libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.

It has not been tested with medium_power, but that typically has no
measurable power-savings.

Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.

In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.

Cc: stable@vger.kernel.org
Reported-and-tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 08:36:38 -07:00
Andri Yngvason
9ffd750394 can: cc770: Fix use after free in cc770_tx_interrupt()
This fixes use after free introduced by the last cc770 patch.

Signed-off-by: Andri Yngvason <andri.yngvason@marel.com>
Fixes: 746201235b ("can: cc770: Fix queue stall & dropped RTR reply")
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-03-19 10:57:29 +01:00
Daniel Drake
82bf43b291 Revert "ACPI / battery: Add quirk for Asus GL502VSK and UX305LA"
Revert commit c68f0676ef ("ACPI / battery: Add quirk for Asus
GL502VSK and UX305LA") and commit 4446823e25 ("ACPI / battery: Add
quirk for Asus UX360UA and UX410UAK").

On many many Asus products, the battery is sometimes reported as
charging or discharging even when it is full and you are on AC power.
This change quirked the kernel to avoid advertising the discharging
state when this happens on 4 laptop models, under the belief that
this was incorrect information.  I presume it originates from user
reports who are confused that their battery status icon says that it
is discharging.

However, the reported information is indeed correct, and the quirk
approach taken is inadequate and more thought is needed first.
Specifically:

 1. It only quirks discharging state, not charging

 2. There are so many different Asus products and DMI naming variants
    within those product families that behave this way; Linux could
    grow to quirk hundreds of products and still not even be close at
    "winning" this battle.

 3. Asus previously clarified that this behaviour is intentional. The
    platform will periodically do a partial discharge/charge cycle
    when the battery is full, because this is one way to extend the
    lifetime of the battery (leaving a battery at 100% charge and
    unused will decrease its usable capacity over time).

    My understanding is that any decent consumer product will have
    this behaviour, but it appears that Asus is different in that
    they expose this info through ACPI.

    However, the behaviour seems correct. The ACPI spec does not
    suggest in that the platform should hide the truth.  It lets you
    report that the battery is full of charge, and discharging, and
    with external power connected; and Asus does this.

 4. In terms of not confusing the user, this seems like something that
    could/should be handled by userspace, which can also detect these
    same (accurate) conditions in the general case.

Revert this quirk before it gets included in a release, while we look
for better approaches.

Signed-off-by: Daniel Drake <drake@endlessm.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-19 10:08:39 +01:00
Thierry Reding
192b4af6cd drm/tegra: Shutdown on driver unbind
Since commit 846c7dfc11 ("drm/atomic: Try to preserve the crtc enabled
state in drm_atomic_remove_fb, v2."), removing the last framebuffer will
no longer disable the corresponding pipeline, which causes the KMS core
to complain about leaked connectors on driver unbind.

Fix this by calling drm_atomic_helper_shutdown() on driver unbind, which
will cause all display pipelines to be shut down and therefore drop the
extra references on the connectors.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-19 09:57:32 +01:00
Thierry Reding
8dafb8301c drm/tegra: dsi: Don't disable regulator on ->exit()
The regulator is controlled as part of runtime PM, so it should not be
additionally disabled from the ->exit() callback.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-19 09:57:28 +01:00
Thierry Reding
b1d0b34b74 drm/tegra: dc: Detach IOMMU group from domain only once
Detaching from an IOMMU group multiple times can lead to a crash. This
could potentially be fixed in the IOMMU driver, but it's easy to avoid
the subsequent detach operations in this driver, so do that as well.

Signed-off-by: Thierry Reding <treding@nvidia.com>
2018-03-19 09:57:04 +01:00
Linus Torvalds
c698ca5278 Linux 4.16-rc6 2018-03-18 17:48:42 -07:00
Sylwester Nawrocki
c15619a88a dt-bindings: exynos: Document #sound-dai-cells property of the HDMI node
The #sound-dai-cells DT property is required to describe link between
the HDMI IP block and the SoC's audio subsystem.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-03-19 08:42:53 +09:00
Stefano Brivio
e3c72f3d37 selftests: pmtu: Drop prints to kernel log from pmtu_vti6_link_change_mtu
Reported-by: David Ahern <dsahern@gmail.com>
Fixes: 1fad59ea1c ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 17:13:29 -04:00
David S. Miller
ef57e31e2b Merge branch 'mv88e6xxx-auto-phy-intr'
Andrew Lunn says:

====================
Automatic PHY interrupts

Now that the mv88e6xxx driver either installs in interrupt handler, or
polls for interrupts, it is possible to always handle PHY interrupts,
rather than have phylib perform the polling. This speeds up detection
of link changes and reduces the load on the MDIO bus, which is
beneficial for PTP.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:52:59 -04:00
Andrew Lunn
6f88284f3b net: dsa: mv88e6xxx: Add MDIO interrupts for internal PHYs
When registering an MDIO bus, it is possible to pass an array of
interrupts, one per address on the bus. phylib will then associate the
interrupt to the PHY device, if no other interrupt is provided.

Some of the global2 interrupts are PHY interrupts. Place them into the
MDIO bus structure.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:52:59 -04:00
Andrew Lunn
bc3931557d net: dsa: mv88e6xxx: Add number of internal PHYs
Add to the info structure the number of internal PHYs, if they generate
interrupts. Some of the older generations of switches have internal
PHYs, but no interrupt registers. In this case, set the count to zero.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:52:58 -04:00
Andrew Lunn
adfccf1182 net: dsa: mv88e6xxx: Add missing g1 IRQ numbers
With the recent change to polling for interrupts, it is important that
the number of global 1 interrupts is listed. Without it, the driver
requests an interrupt domain for zero interrupts, which returns
EINVAL, and the probe fails.

Add two missing entries.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:52:58 -04:00
Florian Fainelli
ef44d78d89 net: dsa: mv88e6xxx: Fix missing register lock in serdes_get_stats
We can hit the register lock not held assertion with the following path:

[   34.170631] mv88e6085 0.1:00: Switch registers lock not held!
[   34.176510] CPU: 0 PID: 950 Comm: ethtool Not tainted 4.16.0-rc4 #143
[   34.182985] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
[   34.189519] Backtrace:
[   34.192033] [<8010c4b4>] (dump_backtrace) from [<8010c788>] (show_stack+0x20/0x24)
[   34.199680]  r6:9f5dc010 r5:00000011 r4:9f5dc010 r3:00000000
[   34.205434] [<8010c768>] (show_stack) from [<80679d38>] (dump_stack+0x24/0x28)
[   34.212719] [<80679d14>] (dump_stack) from [<804844a8>] (mv88e6xxx_read+0x70/0x7c)
[   34.220376] [<80484438>] (mv88e6xxx_read) from [<804870dc>] (mv88e6xxx_port_get_cmode+0x34/0x4c)
[   34.229257]  r5:a09cd128 r4:9ee31d07
[   34.232880] [<804870a8>] (mv88e6xxx_port_get_cmode) from [<80487e6c>] (mv88e6352_port_has_serdes+0x24/0x64)
[   34.242690]  r4:9f5dc010
[   34.245309] [<80487e48>] (mv88e6352_port_has_serdes) from [<804880b8>] (mv88e6352_serdes_get_stats+0x28/0x12c)
[   34.255389]  r4:00000001
[   34.257973] [<80488090>] (mv88e6352_serdes_get_stats) from [<804811e8>] (mv88e6xxx_get_ethtool_stats+0xb0/0xc0)
[   34.268156]  r10:00000000 r9:00000000 r8:00000000 r7:a09cd020 r6:00000001 r5:9f5dc01c
[   34.276052]  r4:9f5dc010
[   34.278631] [<80481138>] (mv88e6xxx_get_ethtool_stats) from [<8064f740>] (dsa_slave_get_ethtool_stats+0xbc/0xc4)

mv88e6xxx_get_ethtool_stats() calls mv88e6xxx_get_stats() which calls both
chip->info->ops->stats_get_stats(), which holds the register lock, and
chip->info->ops->serdes_get_stats() which does not. Have
chip->info->ops->serdes_get_stats() be running with the register lock held to
avoid such assertions.

Fixes: 436fe17d27 ("net: dsa: mv88e6xxx: Allow the SERDES interfaces to have statistics")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:35:44 -04:00
Florian Fainelli
a069215cf5 net: fec: Fix unbalanced PM runtime calls
When unbinding/removing the driver, we will run into the following warnings:

[  259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator
[  259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable!
[  259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
[  259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
[  259.696239] libphy: fec_enet_mii_bus: probed

Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-18 16:32:47 -04:00
Linus Torvalds
9e1909b9da Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
 "Another set of melted spectrum updates:

   - Iron out the last late microcode loading issues by actually
     checking whether new microcode is present and preventing the CPU
     synchronization to run into a timeout induced hang.

   - Remove Skylake C2 from the microcode blacklist according to the
     latest Intel documentation

   - Fix the VM86 POPF emulation which traps if VIP is set, but VIF is
     not. Enhance the selftests to catch that kind of issue

   - Annotate indirect calls/jumps for objtool on 32bit. This is not a
     functional issue, but for consistency sake its the right thing to
     do.

   - Fix a jump label build warning observed on SPARC64 which uses 32bit
     storage for the code location which is casted to 64 bit pointer w/o
     extending it to 64bit first.

   - Add two new cpufeature bits. Not really an urgent issue, but
     provides them for both x86 and x86/kvm work. No impact on the
     current kernel"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Fix CPU synchronization routine
  x86/microcode: Attempt late loading only when new microcode is present
  x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
  jump_label: Fix sparc64 warning
  x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels
  x86/vm86/32: Fix POPF emulation
  selftests/x86/entry_from_vm86: Add test cases for POPF
  selftests/x86/entry_from_vm86: Exit with 1 if we fail
  x86/cpufeatures: Add Intel PCONFIG cpufeature
  x86/cpufeatures: Add Intel Total Memory Encryption cpufeature
2018-03-18 12:03:15 -07:00
Linus Torvalds
df4fe17802 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single fix for vmalloc_fault() which uses p*d_huge() unconditionally
  whether CONFIG_HUGETLBFS is set or not. In case of CONFIG_HUGETLBFS=n
  this results in a crash as p*d_huge() returns 0 in that case"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix vmalloc_fault to use pXd_large
2018-03-18 12:01:14 -07:00
Linus Torvalds
d2149e13e5 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Three fixes for irq chip drivers:

   - Make sure the allocations in the GIC-V3 ITS driver are large enough
     to accomodate the interrupt space

   - Fix a misplaced __iomem annotation which causes a splat of 26
     sparse warnings

   - Remove an unused function in the IMX GPCV2 driver which causes
     build warnings"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/irq-imx-gpcv2: Remove unused function
  irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis
  irqchip/gic-v3-its: Fix misplaced __iomem annotations
2018-03-18 11:59:14 -07:00