2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

10 Commits

Author SHA1 Message Date
Linus Torvalds
9ad57f6dfc Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...
2020-08-12 11:24:12 -07:00
Peter Xu
bce617edec mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b98270 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:02 -07:00
Lu Baolu
02f3effddf iommu/vt-d: Rename intel-pasid.h to pasid.h
As Intel VT-d files have been moved to its own subdirectory, the prefix
makes no sense. No functional changes.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu
8b73712115 iommu/vt-d: Add page response ops support
After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu
eb8d93ea3c iommu/vt-d: Report page request faults for guest SVA
A pasid might be bound to a page table from a VM guest via the iommu
ops.sva_bind_gpasid. In this case, when a DMA page fault is detected
on the physical IOMMU, we need to inject the page fault request into
the guest. After the guest completes handling the page fault, a page
response need to be sent back via the iommu ops.page_response().

This adds support to report a page request fault. Any external module
which is interested in handling this fault should regiester a notifier
with iommu_register_device_fault_handler().

Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu
19abcf70c2 iommu/vt-d: Add a helper to get svm and sdev for pasid
There are several places in the code that need to get the pointers of
svm and sdev according to a pasid and device. Add a helper to achieve
this for code consolidation and readability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu
dd6692f1b8 iommu/vt-d: Refactor device_to_iommu() helper
It is refactored in two ways:

- Make it global so that it could be used in other files.

- Make bus/devfn optional so that callers could ignore these two returned
values when they only want to get the coresponding iommu pointer.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Jacob Pan
d315e9e684 iommu/vt-d: Disable multiple GPASID-dev bind
For the unlikely use case where multiple aux domains from the same pdev
are attached to a single guest and then bound to a single process
(thus same PASID) within that guest, we cannot easily support this case
by refcounting the number of users. As there is only one SL page table
per PASID while we have multiple aux domains thus multiple SL page tables
for the same PASID.

Extra unbinding guest PASID can happen due to race between normal and
exception cases. Termination of one aux domain may affect others unless
we actively track and switch aux domains to ensure the validity of SL
page tables and TLB states in the shared PASID entry.

Support for sharing second level PGDs across domains can reduce the
complexity but this is not available due to the limitations on VFIO
container architecture. We can revisit this decision once sharing PGDs
are available.

Overall, the complexity and potential glitch do not warrant this unlikely
use case thereby removed by this patch.

Fixes: 56722a4398 ("iommu/vt-d: Add bind guest PASID support")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Linus Torvalds
8f02f363f7 IOMMU drivers directory structure cleanup:
- Move the Intel and AMD IOMMU drivers into their own
 	  subdirectory. Both drivers consist of several files by now and
 	  giving them their own directory unclutters the IOMMU top-level
 	  directory a bit.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl7jme4ACgkQK/BELZcB
 GuNNMw//U7AL3Qq6J8DqU+Ay+gIblxKUhWtYLVHad1+agSWmcbfy4E6iV8FqXLbP
 HnCSmA7ScgEMN+3GAve/WpWccMI3aeAgp4xI4MElz/6p4QeJXfNu9COrllif+OX7
 4fDpxXyd0fhKev4lPGZFRY8yGgvgP5ZHvDG0juoxi3bKCqiC2bkAga3itC9RPCQb
 8kBefKIb7/q+UUGGVppTvVIW0mrqWLQ1TcnfKf0hovU7yZs4i4RO+8br6Q5eNUcB
 Vb64vCV3qkQ/zPdr4vK6rvuZTPRMKkCgY4+MJr/g2/JQWuZxF1O+q+TsTYI1ISAS
 qNPRdxgNrZbSBDowg2QfQtPBHPpq3m4eNDeD+ewyQkrVt0/Eneg6Np0FG9j3tGAG
 +IS64r2E25O0tGtBIQ9Mi2TC68S0C7VtMbzx55zVcTGF0JH9T2YW4sSdRcTjVdW6
 WBFqu5fXEKk63ln3h/8JEP7zPWGp+Q3cuOChDvcmIMjCxQ84k5jOB5AIZppGIgJ9
 0nGf45t8YCvIXMbNKufYqjesJZOC2bd+Swi1MZXVlO/gSVv19O40UW+F1X0e7YOp
 MHOzsV44rE2posS/huHOLR4q0AQTdc9O1mywCCGDxNW8tlwIBHsLLJ8b9C9raIRn
 mZkq94QZQXta+WYtoGvbk6nHQ89FtBOOdEH2TSlEbvvYowpjZZE=
 =gX8z
 -----END PGP SIGNATURE-----

Merge tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu driver directory structure cleanup from Joerg Roedel:
 "Move the Intel and AMD IOMMU drivers into their own subdirectory.

  Both drivers consist of several files by now and giving them their own
  directory unclutters the IOMMU top-level directory a bit"

* tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Move Intel IOMMU driver into subdirectory
  iommu/amd: Move AMD IOMMU driver into subdirectory
2020-06-12 12:19:13 -07:00
Joerg Roedel
672cf6df9b iommu/vt-d: Move Intel IOMMU driver into subdirectory
Move all files related to the Intel IOMMU driver into its own
subdirectory.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org
2020-06-10 17:46:43 +02:00