Commit Graph

82526 Commits

Author SHA1 Message Date
Linus Torvalds
36db171cc7 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Bigger kernel side changes:

   - Add backwards writing capability to the perf ring-buffer code,
     which is preparation for future advanced features like robust
     'overwrite support' and snapshot mode.  (Wang Nan)

   - Add pause and resume ioctls for the perf ringbuffer (Wang Nan)

   - x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)

   - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
     Zijlstra)

   - x86 AUX (Intel PT) related enhancements and updates (Alexander
     Shishkin)

   - x86 MSR PMU driver enhancements and updates (Huang Rui)

   - ... and lots of other changes spread out over 40+ commits.

  Biggest tooling side changes:

   - 'perf trace' features and enhancements.  (Arnaldo Carvalho de Melo)

   - BPF tooling updates (Wang Nan)

   - 'perf sched' updates (Jiri Olsa)

   - 'perf probe' updates (Masami Hiramatsu)

   - ... plus 200+ other enhancements, fixes and cleanups to tools/

  The merge commits, the shortlog and the changelogs contain a lot more
  details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
  perf/core: Disable the event on a truncated AUX record
  perf/x86/intel/pt: Generate PMI in the STOP region as well
  perf buildid-cache: Use lsdir() for looking up buildid caches
  perf symbols: Use lsdir() for the search in kcore cache directory
  perf tools: Use SBUILD_ID_SIZE where applicable
  perf tools: Fix lsdir to set errno correctly
  perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
  perf trace: Move flock op beautifier to tools/perf/trace/beauty/
  perf build: Add build-test for debug-frame on arm/arm64
  perf build: Add build-test for libunwind cross-platforms support
  perf script: Fix export of callchains with recursion in db-export
  perf script: Fix callchain addresses in db-export
  perf script: Fix symbol insertion behavior in db-export
  perf symbols: Add dso__insert_symbol function
  perf scripting python: Use Py_FatalError instead of die()
  perf tools: Remove xrealloc and ALLOC_GROW
  perf help: Do not use ALLOC_GROW in add_cmd_list
  perf pmu: Make pmu_formats_string to check return value of strbuf
  perf header: Make topology checkers to check return value of strbuf
  perf tools: Make alias handler to check return value of strbuf
  ...
2016-05-16 14:08:43 -07:00
Linus Torvalds
3469d261ea Merge branch 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull support for killable rwsems from Ingo Molnar:
 "This, by Michal Hocko, implements down_write_killable().

  The main usecase will be to update mm_sem usage sites to use this new
  API, to allow the mm-reaper introduced in commit aac4536355 ("mm,
  oom: introduce oom reaper") to tear down oom victim address spaces
  asynchronously with minimum latencies and without deadlock worries"

[ The vfs will want it too as the inode lock is changed from a mutex to
  a rwsem due to the parallel lookup and readdir updates ]

* 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Fix comment on register clobbering
  locking/rwsem: Fix down_write_killable()
  locking/rwsem, x86: Add frame annotation for call_rwsem_down_write_failed_killable()
  locking/rwsem: Provide down_write_killable()
  locking/rwsem, x86: Provide __down_write_killable()
  locking/rwsem, s390: Provide __down_write_killable()
  locking/rwsem, ia64: Provide __down_write_killable()
  locking/rwsem, alpha: Provide __down_write_killable()
  locking/rwsem: Introduce basis for down_write_killable()
  locking/rwsem, sparc: Drop superfluous arch specific implementation
  locking/rwsem, sh: Drop superfluous arch specific implementation
  locking/rwsem, xtensa: Drop superfluous arch specific implementation
  locking/rwsem: Drop explicit memory barriers
  locking/rwsem: Get rid of __down_write_nested()
2016-05-16 13:41:02 -07:00
Linus Torvalds
1c19b68a27 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - pvqspinlock statistics fixes (Davidlohr Bueso)

   - flip atomic_fetch_or() arguments (Peter Zijlstra)

   - locktorture simplification (Paul E.  McKenney)

   - documentation updates (SeongJae Park, David Howells, Davidlohr
     Bueso, Paul E McKenney, Peter Zijlstra, Will Deacon)

   - various fixes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Flip atomic_fetch_or() arguments
  locking/pvqspinlock: Robustify init_qspinlock_stat()
  locking/pvqspinlock: Avoid double resetting of stats
  lcoking/locktorture: Simplify the torture_runnable computation
  locking/Documentation: Clarify that ACQUIRE applies to loads, RELEASE applies to stores
  locking/Documentation: State purpose of memory-barriers.txt
  locking/Documentation: Add disclaimer
  locking/Documentation/lockdep: Fix spelling mistakes
  locking/lockdep: Deinline register_lock_class(), save 2328 bytes
  locking/locktorture: Fix NULL pointer dereference for cleanup paths
  locking/locktorture: Fix deboosting NULL pointer dereference
  locking/Documentation: Mention smp_cond_acquire()
  locking/Documentation: Insert white spaces consistently
  locking/Documentation: Fix formatting inconsistencies
  locking/Documentation: Add missed subsection in TOC
  locking/Documentation: Fix missed s/lock/acquire renames
  locking/Documentation: Clarify relationship of barrier() to control dependencies
2016-05-16 13:17:56 -07:00
Linus Torvalds
49817c3343 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Drop the unused EFI_SYSTEM_TABLES efi.flags bit and ensure the
     ARM/arm64 EFI System Table mapping is read-only (Ard Biesheuvel)

   - Add a comment to explain that one of the code paths in the x86/pat
     code is only executed for EFI boot (Matt Fleming)

   - Improve Secure Boot status checks on arm64 and handle unexpected
     errors (Linn Crosetto)

   - Remove the global EFI memory map variable 'memmap' as the same
     information is already available in efi::memmap (Matt Fleming)

   - Add EFI Memory Attribute table support for ARM/arm64 (Ard
     Biesheuvel)

   - Add EFI GOP framebuffer support for ARM/arm64 (Ard Biesheuvel)

   - Add EFI Bootloader Control driver for storing reboot(2) data in EFI
     variables for consumption by bootloaders (Jeremy Compostella)

   - Add Core EFI capsule support (Matt Fleming)

   - Add EFI capsule char driver (Kweh, Hock Leong)

   - Unify EFI memory map code for ARM and arm64 (Ard Biesheuvel)

   - Add generic EFI support for detecting when firmware corrupts CPU
     status register bits (like IRQ flags) when performing EFI runtime
     service calls (Mark Rutland)

  ... and other misc cleanups"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  efivarfs: Make efivarfs_file_ioctl() static
  efi: Merge boolean flag arguments
  efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()
  efibc: Fix excessive stack footprint warning
  efi/capsule: Make efi_capsule_pending() lockless
  efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver
  efi/runtime-wrappers: Remove ARCH_EFI_IRQ_FLAGS_MASK #ifdef
  x86/efi: Enable runtime call flag checking
  arm/efi: Enable runtime call flag checking
  arm64/efi: Enable runtime call flag checking
  efi/runtime-wrappers: Detect firmware IRQ flag corruption
  efi/runtime-wrappers: Remove redundant #ifdefs
  x86/efi: Move to generic {__,}efi_call_virt()
  arm/efi: Move to generic {__,}efi_call_virt()
  arm64/efi: Move to generic {__,}efi_call_virt()
  efi/runtime-wrappers: Add {__,}efi_call_virt() templates
  efi/arm-init: Reserve rather than unmap the memory map for ARM as well
  efi: Add misc char driver interface to update EFI firmware
  x86/efi: Force EFI reboot to process pending capsules
  efi: Add 'capsule' update support
  ...
2016-05-16 13:06:27 -07:00
Linus Torvalds
230e51f211 Merge branch 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core signal updates from Ingo Molnar:
 "These updates from Stas Sergeev and Andy Lutomirski, improve the
  sigaltstack interface by extending its ABI with the SS_AUTODISARM
  feature, which makes it possible to use swapcontext() in a sighandler
  that works on sigaltstack.  Without this flag, the subsequent signal
  will corrupt the state of the switched-away sighandler.

  The inspiration is more robust dosemu signal handling"

* 'core-signals-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)
  signals/sigaltstack: Report current flag bits in sigaltstack()
  selftests/sigaltstack: Fix the sigaltstack test on old kernels
  signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
  selftests/sigaltstack: Add new testcase for sigaltstack(SS_ONSTACK|SS_AUTODISARM)
  signals/sigaltstack: Implement SS_AUTODISARM flag
  signals/sigaltstack: Prepare to add new SS_xxx flags
  signals/sigaltstack, x86/signals: Unify the x86 sigaltstack check with other architectures
2016-05-16 12:25:25 -07:00
Linus Torvalds
a3871bd434 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes are:

   - Documentation updates, including fixes to the design-level
     requirements documentation and a fixed version of the design-level
     data-structure documentation.  These fixes include removing
     cartoons and getting rid of the html/htmlx duplication.

   - Further improvements to the new-age expedited grace periods.

   - Miscellaneous fixes.

   - Torture-test changes, including a new rcuperf module for measuring
     RCU grace-period performance and scalability, which is useful for
     the expedited-grace-period changes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  rcutorture: Add boot-time adjustment of leaf fanout
  rcutorture: Add irqs-disabled test for call_rcu()
  rcutorture: Dump trace buffer upon shutdown
  rcutorture: Don't rebuild identical kernel
  rcutorture: Add OS-jitter capability
  documentation: Add documentation for RCU's major data structures
  rcutorture: Convert test duration to seconds early
  torture: Kill qemu, not parent process
  torture: Clarify refusal to run more than one torture test
  rcutorture: Consider FROZEN hotplug notifier transitions
  rcutorture: Remove redundant initialization to zero
  rcuperf: Do not wake up shutdown wait queue if "shutdown" is false.
  rcutorture: Add largish-system rcuperf scenario
  rcutorture: Avoid RCU CPU stall warning and RT throttling
  rcutorture: Add rcuperf holdoff boot parameter to reduce interference
  rcutorture: Make scripts analyze rcuperf trace data, if present
  rcutorture: Make rcuperf collect expedited event-trace data
  rcutorture: Print measure of batching efficiency
  rcutorture: Set rcuperf writer kthreads to real-time priority
  rcutorture: Bind rcuperf reader/writer kthreads to CPUs
  ...
2016-05-16 12:02:08 -07:00
Linus Torvalds
0052af4411 Merge branch 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/lib update from Ingo Molnar:
 "This contains a single commit that removes an unused facility that the
  scheduler used to make use of"

* 'core-lib-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/proportions: Remove unused code
2016-05-16 11:36:02 -07:00
Linus Torvalds
6ba5b85fd4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Overlayfs fixes from Miklos, assorted fixes from me.

  Stable fodder of varying severity, all sat in -next for a while"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: ignore permissions on underlying lookup
  vfs: add lookup_hash() helper
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  get_rock_ridge_filename(): handle malformed NM entries
  ecryptfs: fix handling of directory opening
  atomic_open(): fix the handling of create_error
  fix the copy vs. map logics in blk_rq_map_user_iov()
  do_splice_to(): cap the size before passing to ->splice_read()
2016-05-14 11:59:43 -07:00
Linus Torvalds
1410b74e40 Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "During v4.6-rc1 cgroup namespace support was merged.  There is an
  issue where it's impossible to tell whether a given cgroup mount point
  is bind mounted or namespaced.  Serge has been working on the issue
  but it took longer than expected to resolve, so the late pull request.

  Given that it's a completely new feature and the patches don't touch
  anything else, the risk seems acceptable.  However, if this is too
  late, an alternative is plugging new cgroup ns creation for v4.6 and
  retrying for v4.7"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix compile warning
  kernfs: kernfs_sop_show_path: don't return 0 after seq_dentry call
  cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces
  kernfs_path_from_node_locked: don't overwrite nlen
2016-05-13 16:26:46 -07:00
Linus Torvalds
c3548b730d Merge tag 'regulator-fix-v4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "A small collection of driver specific fixes for the regulator
  subsysetem:

   - Fix handling of probe deferral for GPIO regulators

   - Fix a typo in the module alias for DA9053

   - Fix the definition of BUCK9 in the S2MPS11 driver.  This change
     looks larger than it is because an irregularity in the hardware
     means that the macro used to define bucks 6-10 needs duplicating
     and tweaking to have a separate macro for 9

   - Fix a series of errors in the definitions of the LDOs the AXP20x
     regulators, some of which had always been present and some of which
     were introduced in the merge window"

* tag 'regulator-fix-v4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: da9063: Correct module alias prefix to fix module autoloading
  regulator: axp20x: Fix axp22x ldo_io registration error on cold boot
  regulator: axp20x: Fix axp22x ldo_io voltage ranges
  regulator: axp20x: Fix LDO4 linear voltage range
  regulator: s2mps11: Fix invalid selector mask and voltages for buck9
  regulator: gpio: check return value of of_get_named_gpio
2016-05-13 09:46:00 -07:00
Mark Brown
9689dab30a Merge remote-tracking branches 'regulator/fix/axp20x', 'regulator/fix/da9063', 'regulator/fix/gpio' and 'regulator/fix/s2mps11' into regulator-linus 2016-05-13 11:11:08 +01:00
Andrea Arcangeli
6d0a07edd1 mm: thp: calculate the mapcount correctly for THP pages during WP faults
This will provide fully accuracy to the mapcount calculation in the
write protect faults, so page pinning will not get broken by false
positive copy-on-writes.

total_mapcount() isn't the right calculation needed in
reuse_swap_page(), so this introduces a page_trans_huge_mapcount()
that is effectively the full accurate return value for page_mapcount()
if dealing with Transparent Hugepages, however we only use the
page_trans_huge_mapcount() during COW faults where it strictly needed,
due to its higher runtime cost.

This also provide at practical zero cost the total_mapcount
information which is needed to know if we can still relocate the page
anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we
can reuse the page no matter if it's a pte or a pmd_trans_huge
triggering the fault, but we can only relocate the page anon_vma to
the local vma->anon_vma if we're sure it's only this "vma" mapping the
whole THP physical range.

Kirill A. Shutemov discovered the problem with moving the page
anon_vma to the local vma->anon_vma in a previous version of this
patch and another problem in the way page_move_anon_rmap() was called.

Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a
previous version, because reuse_swap_page must be a macro to call
page_trans_huge_mapcount from swap.h, so this uses a macro again
instead of an inline function. With this change at least it's a less
dangerous usage than it was before, because "page" is used only once
now, while with the previous code reuse_swap_page(page++) would have
called page_mapcount on page+1 and it would have increased page twice
instead of just once.

Dean Luick noticed an uninitialized variable that could result in a
rmap inefficiency for the non-THP case in a previous version.

Mike Marciniszyn said:

: Our RDMA tests are seeing an issue with memory locking that bisects to
: commit 61f5d698cc ("mm: re-enable THP")
:
: The test program registers two rather large MRs (512M) and RDMA
: writes data to a passive peer using the first and RDMA reads it back
: into the second MR and compares that data.  The sizes are chosen randomly
: between 0 and 1024 bytes.
:
: The test will get through a few (<= 4 iterations) and then gets a
: compare error.
:
: Tracing indicates the kernel logical addresses associated with the individual
: pages at registration ARE correct , the data in the "RDMA read response only"
: packets ARE correct.
:
: The "corruption" occurs when the packet crosse two pages that are not physically
: contiguous.   The second page reads back as zero in the program.
:
: It looks like the user VA at the point of the compare error no longer points to
: the same physical address as was registered.
:
: This patch totally resolves the issue!

Link: http://lkml.kernel.org/r/1462547040-1737-2-git-send-email-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Tested-by: Josh Collier <josh.d.collier@intel.com>
Cc: Marc Haber <mh+linux-kernel@zugschlus.de>
Cc: <stable@vger.kernel.org>	[4.5]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-12 15:52:50 -07:00
Ingo Molnar
d2950158d0 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11 16:56:38 +02:00
Al Viro
e4d35be584 Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
Miklos Szeredi
3c9fe8cdff vfs: add lookup_hash() helper
Overlayfs needs lookup without inode_permission() and already has the name
hash (in form of dentry->d_name on overlayfs dentry).  It also doesn't
support filesystems with d_op->d_hash() so basically it only needs
the actual hashed lookup from lookup_one_len_unlocked()

So add a new helper that does unlocked lookup of a hashed name.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-10 23:56:28 -04:00
Miklos Szeredi
54d5ca871e vfs: add vfs_select_inode() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
2016-05-10 23:55:01 -04:00
Jamal Hadi Salim
d99079e2fb export tc ife uapi header
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10 01:08:39 -04:00
Mikko Rapeli
4a91cb61bb uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h
glibc's net/if.h contains copies of definitions from linux/if.h and these
conflict and cause build failures if both files are included by application
source code. Changes in uapi headers, which fixed header file dependencies to
include linux/if.h when it was needed, e.g. commit 1ffad83d, made the
net/if.h and linux/if.h incompatibilities visible as build failures for
userspace applications like iproute2 and xtables-addons.

This patch fixes compile errors when glibc net/if.h is included before
linux/if.h:

./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’
./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’
./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’
./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’
./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’
./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’
./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’
./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’
./linux/if.h:252:8: error: redefinition of ‘struct ifconf’
./linux/if.h:203:8: error: redefinition of ‘struct ifreq’
./linux/if.h:169:8: error: redefinition of ‘struct ifmap’
./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’
./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’
./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’
./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’
./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’
./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’
./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’
./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’

The cases where linux/if.h is included before net/if.h need a similar fix in
the glibc side, or the order of include files can be changed userspace
code as a workaround.

This change was tested in x86 userspace on Debian unstable with
scripts/headers_compile_test.sh:

$ make headers_install && \
  cd usr/include && ../../scripts/headers_compile_test.sh -l -k
...
cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu -o /dev/null ./linux/if.h_libc_before_kernel.h
PASSED libc before kernel test: ./linux/if.h

Reported-by: Jan Engelhardt <jengelh@inai.de>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: Stephen Hemminger <shemming@brocade.com>
Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 21:29:31 -04:00
Linus Torvalds
26acc792c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Check klogctl failure correctly, from Colin Ian King.

 2) Prevent OOM when under memory pressure in flowcache, from Steffen
    Klassert.

 3) Fix info leak in llc and rtnetlink ifmap code, from Kangjie Lu.

 4) Memory barrier and multicast handling fixes in bnxt_en, from Michael
    Chan.

 5) Endianness bug in mlx5, from Daniel Jurgens.

 6) Fix disconnect handling in VSOCK, from Ian Campbell.

 7) Fix locking of netdev list walking in get_bridge_ifindices(), from
    Nikolay Aleksandrov.

 8) Bridge multicast MLD parser can look at wrong packet offsets, fix
    from Linus Lüssing.

 9) Fix chip hang in qede driver, from Sudarsana Reddy Kalluru.

10) Fix missing setting of encapsulation before inner handling completes
    in udp_offload code, from Jarno Rajahalme.

11) Missing rollbacks during LAG join and flood configuration failures
    in mlxsw driver, from Ido Schimmel.

12) Fix error code checks in netxen driver, from Dan Carpenter.

13) Fix key size in new macsec driver, from Sabrina Dubroca.

14) Fix mlx5/VXLAN dependencies, from Arnd Bergmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  net/mlx5e: make VXLAN support conditional
  Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
  macsec: key identifier is 128 bits, not 64
  Documentation/networking: more accurate LCO explanation
  macvtap: segmented packet is consumed
  tools: bpf_jit_disasm: check for klogctl failure
  qede: uninitialized variable in qede_start_xmit()
  netxen: netxen_rom_fast_read() doesn't return -1
  netxen: reversed condition in netxen_nic_set_link_parameters()
  netxen: fix error handling in netxen_get_flash_block()
  mlxsw: spectrum: Add missing rollback in flood configuration
  mlxsw: spectrum: Fix rollback order in LAG join failure
  udp_offload: Set encapsulation before inner completes.
  udp_tunnel: Remove redundant udp_tunnel_gro_complete().
  qede: prevent chip hang when increasing channels
  net: ipv6: tcp reset, icmp need to consider L3 domain
  bridge: fix igmp / mld query parsing
  net: bridge: fix old ioctl unlocked net device walk
  VSOCK: do not disconnect socket when peer has shutdown SEND only
  net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  ...
2016-05-09 12:11:37 -07:00
Josh Poimboeuf
8634de6d25 compiler-gcc: require gcc 4.8 for powerpc __builtin_bswap16()
gcc support for __builtin_bswap16() was supposedly added for powerpc in
gcc 4.6, and was then later added for other architectures in gcc 4.8.

However, Stephen Rothwell reported that attempting to use it on powerpc
in gcc 4.6 fails with:

  lib/vsprintf.c:160:2: error: initializer element is not constant
  lib/vsprintf.c:160:2: error: (near initialization for 'decpair[0]')
  lib/vsprintf.c:160:2: error: initializer element is not constant
  lib/vsprintf.c:160:2: error: (near initialization for 'decpair[1]')
  ...

I'm not entirely sure what those errors mean, but I don't see them on
gcc 4.8.  So let's consider gcc 4.8 to be the official starting point
for __builtin_bswap16().

Arnd Bergmann adds:
 "I found the commit in gcc-4.8 that replaced the powerpc-specific
  implementation of __builtin_bswap16 with an architecture-independent
  one.  Apparently the powerpc version (gcc-4.6 and 4.7) just mapped to
  the lhbrx/sthbrx instructions, so it ended up not being a constant,
  though the intent of the patch was mainly to add support for the
  builtin to x86:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52624

  has the patch that went into gcc-4.8 and more information."

Fixes: 7322dd755e ("byteswap: try to avoid __builtin_constant_p gcc bug")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09 11:54:29 -07:00
Serge E. Hallyn
4f41fc5962 cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces
Patch summary:

When showing a cgroupfs entry in mountinfo, show the path of the mount
root dentry relative to the reader's cgroup namespace root.

Short explanation (courtesy of mkerrisk):

If we create a new cgroup namespace, then we want both /proc/self/cgroup
and /proc/self/mountinfo to show cgroup paths that are correctly
virtualized with respect to the cgroup mount point.  Previous to this
patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo
does not.

Long version:

When a uid 0 task which is in freezer cgroup /a/b, unshares a new cgroup
namespace, and then mounts a new instance of the freezer cgroup, the new
mount will be rooted at /a/b.  The root dentry field of the mountinfo
entry will show '/a/b'.

 cat > /tmp/do1 << EOF
 mount -t cgroup -o freezer freezer /mnt
 grep freezer /proc/self/mountinfo
 EOF

 unshare -Gm  bash /tmp/do1
 > 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
 > 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer

The task's freezer cgroup entry in /proc/self/cgroup will simply show
'/':

 grep freezer /proc/self/cgroup
 9:freezer:/

If instead the same task simply bind mounts the /a/b cgroup directory,
the resulting mountinfo entry will again show /a/b for the dentry root.
However in this case the task will find its own cgroup at /mnt/a/b,
not at /mnt:

 mount --bind /sys/fs/cgroup/freezer/a/b /mnt
 130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer

In other words, there is no way for the task to know, based on what is
in mountinfo, which cgroup directory is its own.

Example (by mkerrisk):

First, a little script to save some typing and verbiage:

echo -e "\t/proc/self/cgroup:\t$(cat /proc/self/cgroup | grep freezer)"
cat /proc/self/mountinfo | grep freezer |
        awk '{print "\tmountinfo:\t\t" $4 "\t" $5}'

Create cgroup, place this shell into the cgroup, and look at the state
of the /proc files:

2653
2653                         # Our shell
14254                        # cat(1)
        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer

Create a shell in new cgroup and mount namespaces. The act of creating
a new cgroup namespace causes the process's current cgroups directories
to become its cgroup root directories. (Here, I'm using my own version
of the "unshare" utility, which takes the same options as the util-linux
version):

Look at the state of the /proc files:

        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /sys/fs/cgroup/freezer

The third entry in /proc/self/cgroup (the pathname of the cgroup inside
the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which
is rooted at /a/b in the outer namespace.

However, the info in /proc/self/mountinfo is not for this cgroup
namespace, since we are seeing a duplicate of the mount from the
old mount namespace, and the info there does not correspond to the
new cgroup namespace. However, trying to create a new mount still
doesn't show us the right information in mountinfo:

                                      # propagating to other mountns
        /proc/self/cgroup:      7:freezer:/
        mountinfo:              /a/b    /mnt/freezer

The act of creating a new cgroup namespace caused the process's
current freezer directory, "/a/b", to become its cgroup freezer root
directory. In other words, the pathname directory of the directory
within the newly mounted cgroup filesystem should be "/",
but mountinfo wrongly shows us "/a/b". The consequence of this is
that the process in the cgroup namespace cannot correctly construct
the pathname of its cgroup root directory from the information in
/proc/PID/mountinfo.

With this patch, the dentry root field in mountinfo is shown relative
to the reader's cgroup namespace.  So the same steps as above:

        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /../..  /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /mnt/freezer

cgroup.clone_children  freezer.parent_freezing  freezer.state      tasks
cgroup.procs           freezer.self_freezing    notify_on_release
3164
2653                   # First shell that placed in this cgroup
3164                   # Shell started by 'unshare'
14197                  # cat(1)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09 12:15:03 -04:00
Sabrina Dubroca
8acca6aceb macsec: key identifier is 128 bits, not 64
The MACsec standard mentions a key identifier for each key, but
doesn't specify anything about it, so I arbitrarily chose 64 bits.

IEEE 802.1X-2010 specifies MKA (MACsec Key Agreement), and defines the
key identifier to be 128 bits (96 bits "member identifier" + 32 bits
"key number").

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 00:09:01 -04:00
Julia Lawall
1cfd63166c efi: Merge boolean flag arguments
The parameters atomic and duplicates of efivar_init always have opposite
values.  Drop the parameter atomic, replace the uses of !atomic with
duplicates, and update the call sites accordingly.

The code using duplicates is slightly reorganized with an 'else', to avoid
duplicating the lock code.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Saurabh Sengar <saurabh.truth@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462570771-13324-5-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-07 07:06:13 +02:00
Ingo Molnar
35dc9ec107 Merge branch 'linus' into efi/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-07 07:00:07 +02:00
Jarno Rajahalme
229740c631 udp_offload: Set encapsulation before inner completes.
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them.  Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done.  This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.

This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 18:25:26 -04:00
Jarno Rajahalme
43b8448cd7 udp_tunnel: Remove redundant udp_tunnel_gro_complete().
The setting of the UDP tunnel GSO type is already performed by
udp[46]_gro_complete().

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 18:25:26 -04:00
Linus Torvalds
01ec716761 Merge tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
 "Fixes for problems introduced or discovered recently (intel_pstate,
  sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
  generic device properties framework) and one fix for a hotplug-related
  deadlock in ACPICA that's been there forever, but is nasty enough.

  Specifics:

   - Fix for a recent regression in the intel_pstate driver causing it
     to fail to restore the HWP (HW-managed P-states) configuration of
     the boot CPU after suspend-to-RAM (Rafael Wysocki).

   - Fix for two recent regressions in the intel_pstate driver, one that
     can trigger a divide by zero if the driver is accessed via sysfs
     before it manages to take the first sample and one causing it to
     fail to update a structure field used in a trace point, so the
     information coming from it is less useful (Rafael Wysocki).

   - Fix for a problem in the sti-cpufreq driver introduced during the
     4.5 cycle that causes it to break CPU PM in multi-platform kernels
     by registering cpufreq-dt (which subsequently doesn't work)
     unconditionally and preventing the driver that would actually work
     from registering (Sudeep Holla).

   - Stable-candidate fix for an ARM64 cpuidle issue causing idle state
     usage counters to be incorrectly updated for idle states that were
     not entered due to errors (James Morse).

   - Fix for a recently introduced issue in the OPP (Operating
     Performance Points) framework causing it to print bogus error
     messages for missing optional regulators (Viresh Kumar).

   - Fix for a recently introduced issue in the generic device
     properties framework that may cause it to attempt to dereferece and
     invalid pointer in some cases (Heikki Krogerus).

   - Fix for a deadlock in the ACPICA core that may be triggered by
     device (eg Thunderbolt) hotplug (Prarit Bhargava)"

* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / OPP: Remove useless check
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  intel_pstate: Fix intel_pstate_get()
  cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
  cpufreq: st: enable selective initialization based on the platform
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 11:58:45 -07:00
Rafael J. Wysocki
7c21b38ca9 Merge branches 'acpica-fixes' and 'device-properties-fixes'
* acpica-fixes:
  ACPICA: Dispatcher: Update thread ID for recursive method calls

* device-properties-fixes:
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 13:15:52 +02:00
Linus Torvalds
9caa7e7848 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "14 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  byteswap: try to avoid __builtin_constant_p gcc bug
  lib/stackdepot: avoid to return 0 handle
  mm: fix kcompactd hang during memory offlining
  modpost: fix module autoloading for OF devices with generic compatible property
  proc: prevent accessing /proc/<PID>/environ until it's ready
  mm/zswap: provide unique zpool name
  mm: thp: kvm: fix memory corruption in KVM with THP enabled
  MAINTAINERS: fix Rajendra Nayak's address
  mm, cma: prevent nr_isolated_* counters from going negative
  mm: update min_free_kbytes from khugepaged after core initialization
  huge pagecache: mmap_sem is unlocked when truncation splits pmd
  rapidio/mport_cdev: fix uapi type definitions
  mm: memcontrol: let v2 cgroups follow changes in system swappiness
  mm: thp: correct split_huge_pages file permission
2016-05-05 20:48:35 -07:00
Arnd Bergmann
7322dd755e byteswap: try to avoid __builtin_constant_p gcc bug
This is another attempt to avoid a regression in wwn_to_u64() after that
started using get_unaligned_be64(), which in turn ran into a bug on
gcc-4.9 through 6.1.

The regression got introduced due to the combination of two separate
workarounds (commits e3bde9568d: "include/linux/unaligned: force
inlining of byteswap operations" and ef3fb2422f: "scsi: fc: use
get/put_unaligned64 for wwn access") that each try to sidestep distinct
problems with gcc behavior (code growth and increased stack usage).

Unfortunately after both have been applied, a more serious gcc bug has
been uncovered, leading to incorrect object code that discards part of a
function and causes undefined behavior.

As part of this problem is how __builtin_constant_p gets evaluated on an
argument passed by reference into an inline function, this avoids the
use of __builtin_constant_p() for all architectures that set
CONFIG_ARCH_USE_BUILTIN_BSWAP.  Most architectures do not set
ARCH_SUPPORTS_OPTIMIZED_INLINING, which means they probably do not
suffer from the problem in the qla2xxx driver, but they might still run
into it elsewhere.

Both of the original workarounds were only merged in the 4.6 kernel, and
the bug that is fixed by this patch should only appear if both are
there, so we probably don't need to backport the fix.  On the other
hand, it works by simplifying the code path and should not have any
negative effects.

[arnd@arndb.de: fix older gcc warnings]
  (http://lkml.kernel.org/r/12243652.bxSxEgjgfk@wuerfel)
Link: https://lkml.org/lkml/headers/2016/4/12/1103
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
Fixes: e3bde9568d ("include/linux/unaligned: force inlining of byteswap operations")
Fixes: ef3fb2422f ("scsi: fc: use get/put_unaligned64 for wwn access")
Link: http://lkml.kernel.org/r/1780465.XdtPJpi8Tt@wuerfel
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Josh Poimboeuf <jpoimboe@redhat.com> # on gcc-5.3
Tested-by: Quinn Tran <quinn.tran@qlogic.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Jan Hubicka <hubicka@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Andrea Arcangeli
127393fbe5 mm: thp: kvm: fix memory corruption in KVM with THP enabled
After the THP refcounting change, obtaining a compound pages from
get_user_pages() no longer allows us to assume the entire compound page
is immediately mappable from a secondary MMU.

A secondary MMU doesn't want to call get_user_pages() more than once for
each compound page, in order to know if it can map the whole compound
page.  So a secondary MMU needs to know from a single get_user_pages()
invocation when it can map immediately the entire compound page to avoid
a flood of unnecessary secondary MMU faults and spurious
atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
users).

Ideally instead of the page->_mapcount < 1 check, get_user_pages()
should return the granularity of the "page" mapping in the "mm" passed
to get_user_pages().  However it's non trivial change to pass the "pmd"
status belonging to the "mm" walked by get_user_pages up the stack (up
to the caller of get_user_pages).  So the fix just checks if there is
not a single pte mapping on the page returned by get_user_pages, and in
turn if the caller can assume that the whole compound page is mapped in
the current "mm" (in a pmd_trans_huge()).  In such case the entire
compound page is safe to map into the secondary MMU without additional
get_user_pages() calls on the surrounding tail/head pages.  In addition
of being faster, not having to run other get_user_pages() calls also
reduces the memory footprint of the secondary MMU fault in case the pmd
split happened as result of memory pressure.

Without this fix after a MADV_DONTNEED (like invoked by QEMU during
postcopy live migration or balloning) or after generic swapping (with a
failure in split_huge_page() that would only result in pmd splitting and
not a physical page split), KVM would map the whole compound page into
the shadow pagetables, despite regular faults or userfaults (like
UFFDIO_COPY) may map regular pages into the primary MMU as result of the
pte faults, leading to the guest mode and userland mode going out of
sync and not working on the same memory at all times.

Any other secondary MMU notifier manager (KVM is just one of the many
MMU notifier users) will need the same information if it doesn't want to
run a flood of get_user_pages_fast and it can support multiple
granularity in the secondary MMU mappings, so I think it is justified to
be exposed not just to KVM.

The other option would be to move transparent_hugepage_adjust to
mm/huge_memory.c but that currently has all kind of KVM data structures
in it, so it's definitely not a cut-and-paste work, so I couldn't do a
fix as cleaner as this one for 4.6.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Alexandre Bounine
4e1016dac1 rapidio/mport_cdev: fix uapi type definitions
Fix problems in uapi definitions reported by Gabriel Laskar: (see
https://lkml.org/lkml/2016/4/5/205 for details)

 - move public header file rio_mport_cdev.h to include/uapi/linux directory
 - change types in data structures passed as IOCTL parameters
 - improve parameter checking in some IOCTL service routines

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Gabriel Laskar <gabriel@lse.epita.fr>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Johannes Weiner
4550c4e157 mm: memcontrol: let v2 cgroups follow changes in system swappiness
Cgroup2 currently doesn't have a per-cgroup swappiness setting.  We
might want to add one later - that's a different discussion - but until
we do, the cgroups should always follow the system setting.  Otherwise
it will be unchangeably set to whatever the ancestor inherited from the
system setting at the time of cgroup creation.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org>	[4.5]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Linus Torvalds
85f397a97a Merge tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic syscall fix from Arnd Bergmann:
 "My last pull request for asm-generic had just one patch that added two
  new system calls to asm/unistd.h, but unfortunately it turned out to
  be wrong, pointing arch/tile compat mode at the native handlers rather
  than the compat ones.

  This was spotted by Yury Norov, who is working on ILP32 mode for
  arch/arm64, which would have the same problem when merged.  This fixes
  the table to use the correct compat syscalls, like the other 64-bit
  architectures do.

  I'll try to find the time to come up with a solution that prevents
  this problem from happening again, by allowing all future system calls
  to just get added in a single file for use by all architectures"

* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: use compat version for preadv2 and pwritev2
2016-05-05 15:40:38 -07:00
Mark Rutland
5101ef20f0 perf/arm: Special-case hetereogeneous CPUs
Commit:

  2665784850 ("perf/core: Verify we have a single perf_hw_context PMU")

forcefully prevents multiple PMUs from sharing perf_hw_context, as this
generally doesn't make sense. It is a common bug for uncore PMUs to
use perf_hw_context rather than perf_invalid_context, which this detects.

However, systems exist with heterogeneous CPUs (and hence heterogeneous
HW PMUs), for which sharing perf_hw_context is necessary, and possible
in some limited cases.

To make this work we have to perform some gymnastics, as we did in these
commits:

  66eb579e66 ("perf: allow for PMU-specific event filtering")
  c904e32a69 ("arm: perf: filter unschedulable events")

To allow those systems to work, we must allow PMUs for heterogeneous
CPUs to share perf_hw_context, though we must still disallow sharing
otherwise to detect the common misuse of perf_hw_context.

This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates
the core logic to account for this, and makes use of it in the arm_pmu
code that is used for systems with heterogeneous CPUs. Comments are
added to make the rationale clear and hopefully avoid accidental abuse.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:59 +02:00
Alexander Shishkin
375637bc52 perf/core: Introduce address range filtering
Many instruction tracing PMUs out there support address range-based
filtering, which would, for example, generate trace data only for a
given range of instruction addresses, which is useful for tracing
individual functions, modules or libraries. Other PMUs may also
utilize this functionality to allow filtering to or filtering out
code at certain address ranges.

This patch introduces the interface for userspace to specify these
filters and for the PMU drivers to apply these filters to hardware
configuration.

The user interface is an ASCII string that is passed via an ioctl()
and specifies (in the form of an ASCII string) address ranges within
certain object files or within kernel. There is no special treatment
for kernel modules yet, but it might be a worthy pursuit.

The PMU driver interface basically adds two extra callbacks to the
PMU driver structure, one of which validates the filter configuration
proposed by the user against what the hardware is actually capable of
doing and the other one translates hardware-independent filter
configuration into something that can be programmed into the
hardware.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-6-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:57 +02:00
Ingo Molnar
1a618c2cfe Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:12:37 +02:00
Peter Zijlstra
a1cc5bcfcf locking/atomics: Flip atomic_fetch_or() arguments
All the atomic operations have their arguments the wrong way around;
make atomic_fetch_or() consistent and flip them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:58:52 +02:00
Yury Norov
1f93e9f231 asm-generic: use compat version for preadv2 and pwritev2
Compat architectures that does not use generic unistd (mips, s390),
declare compat version in their syscall tables for preadv2 and
pwritev2. Generic unistd syscall table should do it as well.

[arnd: this initially slipped through the review and an
 incorrect patch got merged. arch/tile/ is the only architecture
 that could be affected for their 32-bit compat mode, every
 other architecture we support today is fine.]

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-05-05 00:42:20 +02:00
David S. Miller
32b583a0cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2016-05-04

1) The flowcache can hit an OOM condition if too
   many entries are in the gc_list. Fix this by
   counting the entries in the gc_list and refuse
   new allocations if the value is too high.

2) The inner headers are invalid after a xfrm transformation,
   so reset the skb encapsulation field to ensure nobody tries
   access the inner headers. Otherwise tunnel devices stacked
   on top of xfrm may build the outer headers based on wrong
   informations.

3) Add pmtu handling to vti, we need it to report
   pmtu informations for local generated packets.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 16:35:31 -04:00
Linus Torvalds
41143b774a Merge tag 'for-linus-4.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen regression fixes from David Vrabel:

 - Fix two regressions causing crashes in 32-bit PV guests

 - Fix a regression in the evtchn driver

* tag 'for-linus-4.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/evtchn: fix ring resize when binding new events
  xen/balloon: Fix crash when ballooning on x86 32 bit PAE
  xen: Fix page <-> pfn conversion on 32 bit systems
2016-05-04 11:00:05 -07:00
Andy Lutomirski
91c6180572 signals/sigaltstack: Change SS_AUTODISARM to (1U << 31)
Using bit 4 divides the space of available bits strangely.  Use bit
31 instead so that we have a better chance of keeping flag and mode
bits separate in the long run.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/bb996508a600af14b406810c3d58fe0e0d0afe0d.1462296606.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04 08:34:14 +02:00
Andy Lutomirski
c876eeab64 signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
If a signal stack is set up with SS_AUTODISARM, then the kernel
inherently avoids incorrectly resetting the signal stack if signals
recurse: the signal stack will be reset on the first signal
delivery.  This means that we don't need check the stack pointer
when delivering signals if SS_AUTODISARM is set.

This will make segmented x86 programs more robust: currently there's
a hole that could be triggered if ESP/RSP appears to point to the
signal stack but actually doesn't due to a nonzero SS base.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/c46bee4654ca9e68c498462fd11746e2bd0d98c8.1462296606.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04 08:34:13 +02:00
Linus Torvalds
7391daf2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Some straggler bug fixes:

   1) Batman-adv DAT must consider VLAN IDs when choosing candidate
      nodes, from Antonio Quartulli.

   2) Fix botched reference counting of vlan objects and neigh nodes in
      batman-adv, from Sven Eckelmann.

   3) netem can crash when it sees GSO packets, the fix is to segment
      then upon ->enqueue.  Fix from Neil Horman with help from Eric
      Dumazet.

   4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew
      Finlay.

   5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5,
      since it can sleep.  Fix also from Matthew Finlay.

   6) Check mdiobus_scan() return values properly in pxa168_eth and macb
      drivers.  From Sergei Shtylyov.

   7) If the netdevice doesn't support checksumming, disable
      segmentation.  From Alexandery Duyck.

   8) Fix races between RDS tcp accept and sending, from Sowmini
      Varadhan.

   9) In macb driver, probe MDIO bus before we register the netdev,
      otherwise we can try to open the device before it is really ready
      for that.  Fix from Florian Fainelli.

  10) Netlink attribute size for ILA "tunnels" not calculated properly,
      fix from Nicolas Dichtel"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6/ila: fix nlsize calculation for lwtunnel
  net: macb: Probe MDIO bus before registering netdev
  RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
  RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
  vxlan: Add checksum check to the features check function
  net: Disable segmentation if checksumming is not supported
  net: mvneta: Remove superfluous SMP function call
  macb: fix mdiobus_scan() error check
  pxa168_eth: fix mdiobus_scan() error check
  net/mlx5e: Use workqueue for vxlan ops
  net/mlx5e: Implement a mlx5e workqueue
  net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
  net/mlx5: Unmap only the relevant IO memory mapping
  netem: Segment GSO packets on enqueue
  batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node
  batman-adv: Fix reference counting of vlan object for tt_local_entry
  batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event
  batman-adv: fix DAT candidate selection (must use vid)
2016-05-03 15:07:50 -07:00
Alexander Duyck
af67eb9e7e vxlan: Add checksum check to the features check function
We need to perform an additional check on the inner headers to determine if
we can offload the checksum for them.  Previously this check didn't occur
so we would generate an invalid frame in the case of an IPv6 header
encapsulated inside of an IPv4 tunnel.  To fix this I added a secondary
check to vxlan_features_check so that we can verify that we can offload the
inner checksum.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:00:54 -04:00
Stas Sergeev
2a74213838 signals/sigaltstack: Implement SS_AUTODISARM flag
This patch implements the SS_AUTODISARM flag that can be OR-ed with
SS_ONSTACK when forming ss_flags.

When this flag is set, sigaltstack will be disabled when entering
the signal handler; more precisely, after saving sas to uc_stack.
When leaving the signal handler, the sigaltstack is restored by
uc_stack.

When this flag is used, it is safe to switch from sighandler with
swapcontext(). Without this flag, the subsequent signal will corrupt
the state of the switched-away sighandler.

To detect the support of this functionality, one can do:

  err = sigaltstack(SS_DISABLE | SS_AUTODISARM);
  if (err && errno == EINVAL)
	unsupported();

Signed-off-by: Stas Sergeev <stsp@list.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460665206-13646-4-git-send-email-stsp@list.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03 08:37:59 +02:00
Stas Sergeev
407bc16ad1 signals/sigaltstack: Prepare to add new SS_xxx flags
This patch adds SS_FLAG_BITS - the mask that splits sigaltstack
mode values and bit-flags. Since there is no bit-flags yet, the
mask is defined to 0. The flags are added by subsequent patches.
With every new flag, the mask should have the appropriate bit cleared.

This makes sure if some flag is tried on a kernel that doesn't
support it, the -EINVAL error will be returned, because such a
flag will be treated as an invalid mode rather than the bit-flag.

That way the existence of the particular features can be probed
at run-time.

This change was suggested by Andy Lutomirski:

  https://lkml.org/lkml/2016/3/6/158

Signed-off-by: Stas Sergeev <stsp@list.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460665206-13646-3-git-send-email-stsp@list.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03 08:37:59 +02:00
Linus Torvalds
689de1d6ca Minimal fix-up of bad hashing behavior of hash_64()
This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: stable@vger.kernel.org
Cc: George Spelvin <linux@horizon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-02 13:01:51 -07:00
Linus Torvalds
9c5d1bc2b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) MODULE_FIRMWARE firmware string not correct for iwlwifi 8000 chips,
    from Sara Sharon.

 2) Fix SKB size checks in batman-adv stack on receive, from Sven
    Eckelmann.

 3) Leak fix on mac80211 interface add error paths, from Johannes Berg.

 4) Cannot invoke napi_disable() with BH disabled in myri10ge driver,
    fix from Stanislaw Gruszka.

 5) Fix sign extension problem when computing feature masks in
    net_gso_ok(), from Marcelo Ricardo Leitner.

 6) lan78xx driver doesn't count packets and packet lengths in its
    statistics properly, fix from Woojung Huh.

 7) Fix the buffer allocation sizes in pegasus USB driver, from Petko
    Manolov.

 8) Fix refcount overflows in bpf, from Alexei Starovoitov.

 9) Unified dst cache handling introduced a preempt warning in
    ip_tunnel, fix by resetting rather then setting the cached route.
    From Paolo Abeni.

10) Listener hash collision test fix in soreuseport, from Craig Gallak

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  tipc: only process unicast on intended node
  cxgb3: fix out of bounds read
  net/smscx5xx: use the device tree for mac address
  soreuseport: Fix TCP listener hash collision
  net: l2tp: fix reversed udp6 checksum flags
  ip_tunnel: fix preempt warning in ip tunnel creation/updating
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  drivers: net: cpsw: use of_phy_connect() in fixed-link case
  dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusive
  drivers: net: cpsw: don't ignore phy-mode if phy-handle is used
  drivers: net: cpsw: fix segfault in case of bad phy-handle
  drivers: net: cpsw: fix parsing of phy-handle DT property in dual_emac config
  MAINTAINERS: net: Change maintainer for GRETH 10/100/1G Ethernet MAC device driver
  gre: reject GUE and FOU in collect metadata mode
  pegasus: fixes reported packet length
  pegasus: fixes URB buffer allocation size;
  ...
2016-05-02 09:40:42 -07:00
Tim Bingham
2c94b53738 net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
Prior to commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham <tbingham@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 21:34:01 -04:00