mirror of
https://github.com/Fishwaldo/Star64_linux.git
synced 2025-07-23 07:12:09 +00:00
mm/gup: introduce pin_user_pages*() and FOLL_PIN
Introduce pin_user_pages*() variations of get_user_pages*() calls, and also pin_longterm_pages*() variations. For now, these are placeholder calls, until the various call sites are converted to use the correct get_user_pages*() or pin_user_pages*() API. These variants will eventually all set FOLL_PIN, which is also introduced, and thoroughly documented. pin_user_pages() pin_user_pages_remote() pin_user_pages_fast() All pages that are pinned via the above calls, must be unpinned via put_user_page(). The underlying rules are: * FOLL_PIN is a gup-internal flag, so the call sites should not directly set it. That behavior is enforced with assertions. * Call sites that want to indicate that they are going to do DirectIO ("DIO") or something with similar characteristics, should call a get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers will: * Start with "pin_user_pages" instead of "get_user_pages". That makes it easy to find and audit the call sites. * Set FOLL_PIN * For pages that are received via FOLL_PIN, those pages must be returned via put_user_page(). Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases in this documentation. (I've reworded it and expanded upon it.) Link: http://lkml.kernel.org/r/20200107224558.2362728-12-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> [Documentation] Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
3c7470b6f6
commit
eddb1c228f
4 changed files with 426 additions and 34 deletions
|
@ -1042,16 +1042,14 @@ static inline void put_page(struct page *page)
|
|||
* put_user_page() - release a gup-pinned page
|
||||
* @page: pointer to page to be released
|
||||
*
|
||||
* Pages that were pinned via get_user_pages*() must be released via
|
||||
* either put_user_page(), or one of the put_user_pages*() routines
|
||||
* below. This is so that eventually, pages that are pinned via
|
||||
* get_user_pages*() can be separately tracked and uniquely handled. In
|
||||
* particular, interactions with RDMA and filesystems need special
|
||||
* handling.
|
||||
* Pages that were pinned via pin_user_pages*() must be released via either
|
||||
* put_user_page(), or one of the put_user_pages*() routines. This is so that
|
||||
* eventually such pages can be separately tracked and uniquely handled. In
|
||||
* particular, interactions with RDMA and filesystems need special handling.
|
||||
*
|
||||
* put_user_page() and put_page() are not interchangeable, despite this early
|
||||
* implementation that makes them look the same. put_user_page() calls must
|
||||
* be perfectly matched up with get_user_page() calls.
|
||||
* be perfectly matched up with pin*() calls.
|
||||
*/
|
||||
static inline void put_user_page(struct page *page)
|
||||
{
|
||||
|
@ -1509,9 +1507,16 @@ long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
|
|||
unsigned long start, unsigned long nr_pages,
|
||||
unsigned int gup_flags, struct page **pages,
|
||||
struct vm_area_struct **vmas, int *locked);
|
||||
long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
|
||||
unsigned long start, unsigned long nr_pages,
|
||||
unsigned int gup_flags, struct page **pages,
|
||||
struct vm_area_struct **vmas, int *locked);
|
||||
long get_user_pages(unsigned long start, unsigned long nr_pages,
|
||||
unsigned int gup_flags, struct page **pages,
|
||||
struct vm_area_struct **vmas);
|
||||
long pin_user_pages(unsigned long start, unsigned long nr_pages,
|
||||
unsigned int gup_flags, struct page **pages,
|
||||
struct vm_area_struct **vmas);
|
||||
long get_user_pages_locked(unsigned long start, unsigned long nr_pages,
|
||||
unsigned int gup_flags, struct page **pages, int *locked);
|
||||
long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
|
||||
|
@ -1519,6 +1524,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
|
|||
|
||||
int get_user_pages_fast(unsigned long start, int nr_pages,
|
||||
unsigned int gup_flags, struct page **pages);
|
||||
int pin_user_pages_fast(unsigned long start, int nr_pages,
|
||||
unsigned int gup_flags, struct page **pages);
|
||||
|
||||
int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc);
|
||||
int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
|
||||
|
@ -2583,13 +2590,15 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
|
|||
#define FOLL_ANON 0x8000 /* don't do file mappings */
|
||||
#define FOLL_LONGTERM 0x10000 /* mapping lifetime is indefinite: see below */
|
||||
#define FOLL_SPLIT_PMD 0x20000 /* split huge pmd before returning */
|
||||
#define FOLL_PIN 0x40000 /* pages must be released via put_user_page() */
|
||||
|
||||
/*
|
||||
* NOTE on FOLL_LONGTERM:
|
||||
* FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
|
||||
* other. Here is what they mean, and how to use them:
|
||||
*
|
||||
* FOLL_LONGTERM indicates that the page will be held for an indefinite time
|
||||
* period _often_ under userspace control. This is contrasted with
|
||||
* iov_iter_get_pages() where usages which are transient.
|
||||
* period _often_ under userspace control. This is in contrast to
|
||||
* iov_iter_get_pages(), whose usages are transient.
|
||||
*
|
||||
* FIXME: For pages which are part of a filesystem, mappings are subject to the
|
||||
* lifetime enforced by the filesystem and we need guarantees that longterm
|
||||
|
@ -2604,11 +2613,39 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
|
|||
* Currently only get_user_pages() and get_user_pages_fast() support this flag
|
||||
* and calls to get_user_pages_[un]locked are specifically not allowed. This
|
||||
* is due to an incompatibility with the FS DAX check and
|
||||
* FAULT_FLAG_ALLOW_RETRY
|
||||
* FAULT_FLAG_ALLOW_RETRY.
|
||||
*
|
||||
* In the CMA case: longterm pins in a CMA region would unnecessarily fragment
|
||||
* that region. And so CMA attempts to migrate the page before pinning when
|
||||
* In the CMA case: long term pins in a CMA region would unnecessarily fragment
|
||||
* that region. And so, CMA attempts to migrate the page before pinning, when
|
||||
* FOLL_LONGTERM is specified.
|
||||
*
|
||||
* FOLL_PIN indicates that a special kind of tracking (not just page->_refcount,
|
||||
* but an additional pin counting system) will be invoked. This is intended for
|
||||
* anything that gets a page reference and then touches page data (for example,
|
||||
* Direct IO). This lets the filesystem know that some non-file-system entity is
|
||||
* potentially changing the pages' data. In contrast to FOLL_GET (whose pages
|
||||
* are released via put_page()), FOLL_PIN pages must be released, ultimately, by
|
||||
* a call to put_user_page().
|
||||
*
|
||||
* FOLL_PIN is similar to FOLL_GET: both of these pin pages. They use different
|
||||
* and separate refcounting mechanisms, however, and that means that each has
|
||||
* its own acquire and release mechanisms:
|
||||
*
|
||||
* FOLL_GET: get_user_pages*() to acquire, and put_page() to release.
|
||||
*
|
||||
* FOLL_PIN: pin_user_pages*() to acquire, and put_user_pages to release.
|
||||
*
|
||||
* FOLL_PIN and FOLL_GET are mutually exclusive for a given function call.
|
||||
* (The underlying pages may experience both FOLL_GET-based and FOLL_PIN-based
|
||||
* calls applied to them, and that's perfectly OK. This is a constraint on the
|
||||
* callers, not on the pages.)
|
||||
*
|
||||
* FOLL_PIN should be set internally by the pin_user_pages*() APIs, never
|
||||
* directly by the caller. That's in order to help avoid mismatches when
|
||||
* releasing pages: get_user_pages*() pages must be released via put_page(),
|
||||
* while pin_user_pages*() pages must be released via put_user_page().
|
||||
*
|
||||
* Please see Documentation/vm/pin_user_pages.rst for more information.
|
||||
*/
|
||||
|
||||
static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue