Pull livepatching updates from Jiri Kosina:
- support for something we call 'atomic replace', and allows for much
better handling of cumulative patches (which is something very useful
for distros), from Jason Baron with help of Petr Mladek and Joe
Lawrence
- improvement of handling of tasks blocking finalization, from Miroslav
Benes
- update of MAINTAINERS file to reflect move towards group
maintainership
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: (22 commits)
livepatch/selftests: use "$@" to preserve argument list
livepatch: Module coming and going callbacks can proceed with all listed patches
livepatch: Proper error handling in the shadow variables selftest
livepatch: return -ENOMEM on ptr_id() allocation failure
livepatch: Introduce klp_for_each_patch macro
livepatch: core: Return EOPNOTSUPP instead of ENOSYS
selftests/livepatch: add DYNAMIC_DEBUG config dependency
livepatch: samples: non static warnings fix
livepatch: update MAINTAINERS
livepatch: Remove signal sysfs attribute
livepatch: Send a fake signal periodically
selftests/livepatch: introduce tests
livepatch: Remove ordering (stacking) of the livepatches
livepatch: Atomic replace and cumulative patches documentation
livepatch: Remove Nop structures when unused
livepatch: Add atomic replace
livepatch: Use lists to manage patches, objects and functions
livepatch: Simplify API by removing registration step
livepatch: Don't block the removal of patches loaded after a forced transition
livepatch: Consolidate klp_free functions
...
Here is the big char/misc driver patch pull request for 5.1-rc1.
The largest thing by far is the new habanalabs driver for their AI
accelerator chip. For now it is in the drivers/misc directory but will
probably move to a new directory soon along with other drivers of this
type.
Other than that, just the usual set of individual driver updates and
fixes. There's an "odd" merge in here from the DRM tree that they asked
me to do as the MEI driver is starting to interact with the i915 driver,
and it needed some coordination. All of those patches have been
properly acked by the relevant subsystem maintainers.
All of these have been in linux-next with no reported issues, most for
quite some time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXH+dPQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym1fACgvpZAxjNzoRQJ6f06tc8ujtPk9rUAnR+tCtrZ
9e3l7H76oe33o96Qjhor
=8A2k
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver patch pull request for 5.1-rc1.
The largest thing by far is the new habanalabs driver for their AI
accelerator chip. For now it is in the drivers/misc directory but will
probably move to a new directory soon along with other drivers of this
type.
Other than that, just the usual set of individual driver updates and
fixes. There's an "odd" merge in here from the DRM tree that they
asked me to do as the MEI driver is starting to interact with the i915
driver, and it needed some coordination. All of those patches have
been properly acked by the relevant subsystem maintainers.
All of these have been in linux-next with no reported issues, most for
quite some time"
* tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits)
habanalabs: adjust Kconfig to fix build errors
habanalabs: use %px instead of %p in error print
habanalabs: use do_div for 64-bit divisions
intel_th: gth: Fix an off-by-one in output unassigning
habanalabs: fix little-endian<->cpu conversion warnings
habanalabs: use NULL to initialize array of pointers
habanalabs: fix little-endian<->cpu conversion warnings
habanalabs: soft-reset device if context-switch fails
habanalabs: print pointer using %p
habanalabs: fix memory leak with CBs with unaligned size
habanalabs: return correct error code on MMU mapping failure
habanalabs: add comments in uapi/misc/habanalabs.h
habanalabs: extend QMAN0 job timeout
habanalabs: set DMA0 completion to SOB 1007
habanalabs: fix validation of WREG32 to DMA completion
habanalabs: fix mmu cache registers init
habanalabs: disable CPU access on timeouts
habanalabs: add MMU DRAM default page mapping
habanalabs: Dissociate RAZWI info from event types
misc/habanalabs: adjust Kconfig to fix build errors
...
This adds a new kernel module for analysis of vmalloc allocator. It is
only enabled as a module. There are two main reasons this module should
be used for: performance evaluation and stressing of vmalloc subsystem.
It consists of several test cases. As of now there are 8. The module
has five parameters we can specify to change its the behaviour.
1) run_test_mask - set of tests to be run
id: 1, name: fix_size_alloc_test
id: 2, name: full_fit_alloc_test
id: 4, name: long_busy_list_alloc_test
id: 8, name: random_size_alloc_test
id: 16, name: fix_align_alloc_test
id: 32, name: random_size_align_alloc_test
id: 64, name: align_shift_alloc_test
id: 128, name: pcpu_alloc_test
By default all tests are in run test mask. If you want to select some
specific tests it is possible to pass the mask. For example for first,
second and fourth tests we go 11 value.
2) test_repeat_count - how many times each test should be repeated
By default it is one time per test. It is possible to pass any number.
As high the value is the test duration gets increased.
3) test_loop_count - internal test loop counter. By default it is set
to 1000000.
4) single_cpu_test - use one CPU to run the tests
By default this parameter is set to false. It means that all online
CPUs execute tests. By setting it to 1, the tests are executed by
first online CPU only.
5) sequential_test_order - run tests in sequential order
By default this parameter is set to false. It means that before running
tests the order is shuffled. It is possible to make it sequential, just
set it to 1.
Performance analysis:
In order to evaluate performance of vmalloc allocations, usually it
makes sense to use only one CPU that runs tests, use sequential order,
number of repeat tests can be different as well as set of test mask.
For example if we want to run all tests, to use one CPU and repeat each
test 3 times. Insert the module passing following parameters:
single_cpu_test=1 sequential_test_order=1 test_repeat_count=3
with following output:
<snip>
Summary: fix_size_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 901177 usec
Summary: full_fit_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 1039341 usec
Summary: long_busy_list_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 11775763 usec
Summary: random_size_alloc_test passed 3: failed: 0 repeat: 3 loops: 1000000 avg: 6081992 usec
Summary: fix_align_alloc_test passed: 3 failed: 0 repeat: 3, loops: 1000000 avg: 2003712 usec
Summary: random_size_align_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 2895689 usec
Summary: align_shift_alloc_test passed: 0 failed: 3 repeat: 3 loops: 1000000 avg: 573 usec
Summary: pcpu_alloc_test passed: 3 failed: 0 repeat: 3 loops: 1000000 avg: 95802 usec
All test took CPU0=192945605995 cycles
<snip>
The align_shift_alloc_test is expected to be failed.
Stressing:
In order to stress the vmalloc subsystem we run all available test cases
on all available CPUs simultaneously. In order to prevent constant behaviour
pattern, the test cases array is shuffled by default to randomize the order
of test execution.
For example if we want to run all tests(default), use all online CPUs(default)
with shuffled order(default) and to repeat each test 30 times. The command
would be like:
modprobe vmalloc_test test_repeat_count=30
Expected results are the system is alive, there are no any BUG_ONs or Kernel
Panics the tests are completed, no memory leaks.
[urezki@gmail.com: fix 32-bit builds]
Link: http://lkml.kernel.org/r/20190106214839.ffvjvmrn52uqog7k@pc636
[urezki@gmail.com: make CONFIG_TEST_VMALLOC depend on CONFIG_MMU]
Link: http://lkml.kernel.org/r/20190219085441.s6bg2gpy4esny5vw@pc636
Link: http://lkml.kernel.org/r/20190103142108.20744-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use after scope bugs detector seems to be almost entirely useless for
the linux kernel. It exists over two years, but I've seen only one
valid bug so far [1]. And the bug was fixed before it has been
reported. There were some other use-after-scope reports, but they were
false-positives due to different reasons like incompatibility with
structleak plugin.
This feature significantly increases stack usage, especially with GCC <
9 version, and causes a 32K stack overflow. It probably adds
performance penalty too.
Given all that, let's remove use-after-scope detector entirely.
While preparing this patch I've noticed that we mistakenly enable
use-after-scope detection for clang compiler regardless of
CONFIG_KASAN_EXTRA setting. This is also fixed now.
[1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com>
Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Will Deacon <will.deacon@arm.com> [arm64]
Cc: Qian Cai <cai@lca.pw>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_DEBUG_INFO_SPLIT and CONFIG_DEBUG_INFO_DWARF4 enable extra
dwarf options if supported. You never know if they are really enabled
since Makefile may silently turn them off.
The actual behavior will match to the kernel configuration by
testing those compiler flags in the Kconfig stage.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
This is a follow up to the commit cf65a0f6f6
("dma-mapping: move all DMA mapping code to kernel/dma")
which moved source code of DMA API to kernel/dma folder. Since there is
no file left in the lib that require DMA API debugging options move the
latter to kernel/dma as well.
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The livepatch selftest scripts turn on dynamic_debug of livepatch
kernel source to determine expected behavior. TEST_LIVEPATCH should
therefore include DYNAMIC_DEBUG in its list of dependencies.
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Add a few livepatch modules and simple target modules that the included
regression suite can run tests against:
- basic livepatching (multiple patches, atomic replace)
- pre/post (un)patch callbacks
- shadow variable API
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tested-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Alice Ferrazzi <alice.ferrazzi@gmail.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
After the transition to kprobes, symbols are resolved at runtime. This
means there is no need to have all the Kconfig and header logic to
avoid build failures. This also paves the way to having arbitrary test
locations.
Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Kees Cook <keescook@chromium.org>
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJcJieuAAoJED2LAQed4NsGHlIP/1s0fQ86XD9dIMyHzAO0gh2f
7rylfe2kEXJgIzJ0DyZdLu4iZtwbkEUqTQrRS1abriNGVemPkfBAnZdM5d92lOQX
3iREa700AJ2xo7V7gYZ6AbhZoG3p0S9U9Q2qE5S+tFTe8c2Gy4xtjnODF+Vel85r
S0P8tF5sE1/d00lm+yfMI/CJVfDjyNaMm+aVEnL0kZTPiRkaktjWgo6Fc2p4z1L5
HFmMMP6/iaXmRZ+tHJGPQ2AT70GFVZw5ePxPcl50EotUP25KHbuUdzs8wDpYm3U/
rcESVsIFpgqHWmTsdBk6dZk0q8yFZNkMlkaP/aYukVZpUn/N6oAXgTFckYl8dmQL
fQBkQi6DTfr9EBPVbj18BKm7xI3Y4DdQ2fzTfYkJ2XwNRGFA5r9N3sjd7ZTVGjxC
aeeMHCwvGdSx1x8PeZAhZfsUHW8xVDMSQiT713+ljBY+6cwzA+2NF0kP7B6OAqwr
ETFzd4Xu2/lZcL7gQRH8WU3L2S5iedmDG6RnZgJMXI0/9V4qAA+nlsWaCgnl1TgA
mpxYlLUMrd6AUJevE34FlnyFdk8IMn9iKRFsvF0f3doO5C7QzTVGqFdJu5a0CuWO
4NBJvZjFT8/4amoWLfnDlfApWXzTfwLbKG+r6V2F30fLuXpYg5LxWhBoGRPYLZSq
oi4xN1Mpx3TvXz6WcKVZ
=r3Fl
-----END PGP SIGNATURE-----
Merge tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
kconfig: add static qualifiers to fix gconf warnings
kconfig: split the lexer out of zconf.y
kconfig: split some C files out of zconf.y
kconfig: convert to SPDX License Identifier
kconfig: remove keyword lookup table entirely
kconfig: update current_pos in the second lexer
kconfig: switch to ASSIGN_VAL state in the second lexer
kconfig: stop associating kconf_id with yylval
kconfig: refactor end token rules
kconfig: stop supporting '.' and '/' in unquoted words
treewide: surround Kconfig file paths with double quotes
microblaze: surround string default in Kconfig with double quotes
kconfig: use T_WORD instead of T_VARIABLE for variables
kconfig: use specific tokens instead of T_ASSIGN for assignments
kconfig: refactor scanning and parsing "option" properties
kconfig: use distinct tokens for type and default properties
kconfig: remove redundant token defines
kconfig: rename depends_list to comment_option_list
...
Kmemleak scan can be cpu intensive and can stall user tasks at times. To
prevent this, add config DEBUG_KMEMLEAK_AUTO_SCAN to enable/disable auto
scan on boot up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN as
this is meant for only first automatic scan.
Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Signed-off-by: Sachin Nikam <snikam@nvidia.com>
Signed-off-by: Prateek <prpatel@nvidia.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.
I do not see a good reason to complicate Kconfig for the room of
ambiguity.
The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.
Make it treewide consistent now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Sometimes we want to print a series of printk() messages to consoles
without being disturbed by concurrent printk() from interrupts and/or
other threads. But we can't enforce printk() callers to use their local
buffers because we need to ask them to make too much changes. Also, even
buffering up to one line inside printk() might cause failing to emit
an important clue under critical situation.
Therefore, instead of trying to help buffering, let's try to help
reconstructing messages by saving caller information as of calling
log_store() and adding it as "[T$thread_id]" or "[C$processor_id]"
upon printing to consoles.
Some examples for console output:
[ 1.222773][ T1] x86: Booting SMP configuration:
[ 2.779635][ T1] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 5.069193][ T268] Fusion MPT base driver 3.04.20
[ 9.316504][ C2] random: fast init done
[ 13.413336][ T3355] Initialized host personality
Some examples for /dev/kmsg output:
6,496,1222773,-,caller=T1;x86: Booting SMP configuration:
6,968,2779635,-,caller=T1;pci 0000:00:01.0: PCI bridge to [bus 01]
SUBSYSTEM=pci
DEVICE=+pci:0000:00:01.0
6,1353,5069193,-,caller=T268;Fusion MPT base driver 3.04.20
5,1526,9316504,-,caller=C2;random: fast init done
6,1575,13413336,-,caller=T3355;Initialized host personality
Note that this patch changes max length of messages which can be printed
by printk() or written to /dev/kmsg interface from 992 bytes to 976 bytes,
based on an assumption that userspace won't try to write messages hitting
that border line to /dev/kmsg interface.
Link: http://lkml.kernel.org/r/93f19e57-5051-c67d-9af4-b17624062d44@i-love.sakura.ne.jp
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
This lib tracks objects which could be of two types:
1) root object
2) nested object - with a "delta" which differentiates it from
the associated root object
The objects are tracked by a hashtable and reference-counted. User is
responsible of implementing callbacks to create/destroy root entity
related to each root object and callback to create/destroy nested object
delta.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All architecures use memblock for early memory management. There is no need
for the CONFIG_HAVE_MEMBLOCK configuration option.
[rppt@linux.vnet.ibm.com: of/fdt: fixup #ifdefs]
Link: http://lkml.kernel.org/r/20180919103457.GA20545@rapoport-lnx
[rppt@linux.vnet.ibm.com: csky: fixups after bootmem removal]
Link: http://lkml.kernel.org/r/20180926112744.GC4628@rapoport-lnx
[rppt@linux.vnet.ibm.com: remove stale #else and the code it protects]
Link: http://lkml.kernel.org/r/1538067825-24835-1-git-send-email-rppt@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1536927045-23536-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull XArray conversion from Matthew Wilcox:
"The XArray provides an improved interface to the radix tree data
structure, providing locking as part of the API, specifying GFP flags
at allocation time, eliminating preloading, less re-walking the tree,
more efficient iterations and not exposing RCU-protected pointers to
its users.
This patch set
1. Introduces the XArray implementation
2. Converts the pagecache to use it
3. Converts memremap to use it
The page cache is the most complex and important user of the radix
tree, so converting it was most important. Converting the memremap
code removes the only other user of the multiorder code, which allows
us to remove the radix tree code that supported it.
I have 40+ followup patches to convert many other users of the radix
tree over to the XArray, but I'd like to get this part in first. The
other conversions haven't been in linux-next and aren't suitable for
applying yet, but you can see them in the xarray-conv branch if you're
interested"
* 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits)
radix tree: Remove multiorder support
radix tree test: Convert multiorder tests to XArray
radix tree tests: Convert item_delete_rcu to XArray
radix tree tests: Convert item_kill_tree to XArray
radix tree tests: Move item_insert_order
radix tree test suite: Remove multiorder benchmarking
radix tree test suite: Remove __item_insert
memremap: Convert to XArray
xarray: Add range store functionality
xarray: Move multiorder_check to in-kernel tests
xarray: Move multiorder_shrink to kernel tests
xarray: Move multiorder account test in-kernel
radix tree test suite: Convert iteration test to XArray
radix tree test suite: Convert tag_tagged_items to XArray
radix tree: Remove radix_tree_clear_tags
radix tree: Remove radix_tree_maybe_preload_order
radix tree: Remove split/join code
radix tree: Remove radix_tree_update_node_t
page cache: Finish XArray conversion
dax: Convert page fault handlers to XArray
...
Here is the big set of char/misc patches for 4.20-rc1.
Loads of things here, we have new code in all of these driver
subsystems:
fpga
stm
extcon
nvmem
eeprom
hyper-v
gsmi
coresight
thunderbolt
vmw_balloon
goldfish
soundwire
along with lots of fixes and minor changes to other small drivers.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW9Le5A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn+BQCfZ6DtCIgqo0UW3dLV8Fd0wya9kw0AoNglzJJ6
YRZiaSdRiggARpNdh3ME
=97BX
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of char/misc patches for 4.20-rc1.
Loads of things here, we have new code in all of these driver
subsystems:
- fpga
- stm
- extcon
- nvmem
- eeprom
- hyper-v
- gsmi
- coresight
- thunderbolt
- vmw_balloon
- goldfish
- soundwire
along with lots of fixes and minor changes to other small drivers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (245 commits)
Documentation/security-bugs: Clarify treatment of embargoed information
lib: Fix ia64 bootloader linkage
MAINTAINERS: Clarify UIO vs UIOVEC maintainer
docs/uio: fix a grammar nitpick
docs: fpga: document programming fpgas using regions
fpga: add devm_fpga_region_create
fpga: bridge: add devm_fpga_bridge_create
fpga: mgr: add devm_fpga_mgr_create
hv_balloon: Replace spin_is_locked() with lockdep
sgi-xp: Replace spin_is_locked() with lockdep
eeprom: New ee1004 driver for DDR4 memory
eeprom: at25: remove unneeded 'at25_remove'
w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size).
misc: mic: scif: remove set but not used variables 'src_dma_addr, dst_dma_addr'
misc: mic: fix a DMA pool free failure
platform: goldfish: pipe: Add a blank line to separate varibles and code
platform: goldfish: pipe: Remove redundant casting
platform: goldfish: pipe: Call misc_deregister if init fails
platform: goldfish: pipe: Move the file-scope goldfish_pipe_dev variable into the driver state
platform: goldfish: pipe: Move the file-scope goldfish_pipe_miscdev variable into the driver state
...
Pull ARM updates from Russell King:
"The main item in this pull request are the Spectre variant 1.1 fixes
from Julien Thierry.
A few other patches to improve various areas, and removal of some
obsolete mcount bits and a redundant kbuild conditional"
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8802/1: Call syscall_trace_exit even when system call skipped
ARM: 8797/1: spectre-v1.1: harden __copy_to_user
ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
ARM: 8793/1: signal: replace __put_user_error with __put_user
ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
ARM: 8789/1: signal: copy registers using __copy_to_user()
ARM: 8801/1: makefile: use ARMv3M mode for RiscPC
ARM: 8800/1: use choice for kernel unwinders
ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional
ARM: 8788/1: ftrace: remove old mcount support
ARM: 8786/1: Debug kernel copy by printing
The xa_load function brings with it a lot of infrastructure; xa_empty(),
xa_is_err(), and large chunks of the XArray advanced API that are used
to implement xa_load.
As the test-suite demonstrates, it is possible to use the XArray functions
on a radix tree. The radix tree functions depend on the GFP flags being
stored in the root of the tree, so it's not possible to use the radix
tree functions on an XArray.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
This adds a helper to paste 2 pointer arrays together, useful for merging
various types of attribute arrays. There are a few places in the kernel
tree where this is open coded, and I just added one more in the STM class.
The naming is inspired by memset_p() and memcat(), and partial credit for
it goes to Andy Shevchenko.
This patch adds the function wrapped in a type-enforcing macro and a test
module.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While in theory multiple unwinders could be compiled in, it does
not make sense in practise. Use a choice to make the unwinder
selection mutually exclusive and mandatory.
Already before this commit it has not been possible to deselect
FRAME_POINTER. Remove the obsolete comment.
Furthermore, to produce a meaningful backtrace with FRAME_POINTER
enabled the kernel needs a specific function prologue:
mov ip, sp
stmfd sp!, {fp, ip, lr, pc}
sub fp, ip, #4
To get to the required prologue gcc uses apcs and no-sched-prolog.
This compiler options are not available on clang, and clang is not
able to generate the required prologue. Make the FRAME_POINTER
config symbol depending on !clang.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Fix three typos in CONFIG_WARN_ALL_UNSEEDED_RANDOM help text.
Link: http://lkml.kernel.org/r/20180830194505.4778-1-thibaut@sautereau.fr
Signed-off-by: Thibaut Sautereau <thibaut@sautereau.fr>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull IDA updates from Matthew Wilcox:
"A better IDA API:
id = ida_alloc(ida, GFP_xxx);
ida_free(ida, id);
rather than the cumbersome ida_simple_get(), ida_simple_remove().
The new IDA API is similar to ida_simple_get() but better named. The
internal restructuring of the IDA code removes the bitmap
preallocation nonsense.
I hope the net -200 lines of code is convincing"
* 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits)
ida: Change ida_get_new_above to return the id
ida: Remove old API
test_ida: check_ida_destroy and check_ida_alloc
test_ida: Convert check_ida_conv to new API
test_ida: Move ida_check_max
test_ida: Move ida_check_leaf
idr-test: Convert ida_check_nomem to new API
ida: Start new test_ida module
target/iscsi: Allocate session IDs from an IDA
iscsi target: fix session creation failure handling
drm/vmwgfx: Convert to new IDA API
dmaengine: Convert to new IDA API
ppc: Convert vas ID allocation to new IDA API
media: Convert entity ID allocation to new IDA API
ppc: Convert mmu context allocation to new IDA API
Convert net_namespace to new IDA API
cb710: Convert to new IDA API
rsxx: Convert to new IDA API
osd: Convert to new IDA API
sd: Convert to new IDA API
...
It seems contributors follow the style of Kconfig entries where explicit
'default n' is present. The default 'default' is 'n' already, thus, drop
these lines from Kconfig to make it more clear.
Link: http://lkml.kernel.org/r/20180719085131.79541-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We haven't had lots of deprecation warnings lately, but the rdma use of
it made them flare up again.
They are not useful. They annoy everybody, and nobody ever does
anything about them, because it's always "somebody elses problem". And
when people start thinking that warnings are normal, they stop looking
at them, and the real warnings that mean something go unnoticed.
If you want to get rid of a function, just get rid of it. Convert every
user to the new world order.
And if you can't do that, then don't annoy everybody else with your
marking that says "I couldn't be bothered to fix this, so I'll just spam
everybody elses build logs with warnings about my laziness".
Make a kernelnewbies wiki page about things that could be cleaned up,
write a blog post about it, or talk to people on the mailing lists. But
don't add warnings to the kernel build about cleanup that you think
should happen but you aren't doing yourself.
Don't. Just don't.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights:
- Gustavo A. R. Silva keeps working on the implicit switch fallthru
changes.
- Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
Luca Coelho.
- Re-enable ASPM in r8169, from Kai-Heng Feng.
- Add virtual XFRM interfaces, which avoids all of the limitations of
existing IPSEC tunnels. From Steffen Klassert.
- Convert GRO over to use a hash table, so that when we have many
flows active we don't traverse a long list during accumluation.
- Many new self tests for routing, TC, tunnels, etc. Too many
contributors to mention them all, but I'm really happy to keep
seeing this stuff.
- Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.
- Lots of cleanups and fixes in L2TP code from Guillaume Nault.
- Add IPSEC offload support to netdevsim, from Shannon Nelson.
- Add support for slotting with non-uniform distribution to netem
packet scheduler, from Yousuk Seung.
- Add UDP GSO support to mlx5e, from Boris Pismenny.
- Support offloading of Team LAG in NFP, from John Hurley.
- Allow to configure TX queue selection based upon RX queue, from
Amritha Nambiar.
- Support ethtool ring size configuration in aquantia, from Anton
Mikaev.
- Support DSCP and flowlabel per-transport in SCTP, from Xin Long.
- Support list based batching and stack traversal of SKBs, this is
very exciting work. From Edward Cree.
- Busyloop optimizations in vhost_net, from Toshiaki Makita.
- Introduce the ETF qdisc, which allows time based transmissions. IGB
can offload this in hardware. From Vinicius Costa Gomes.
- Add parameter support to devlink, from Moshe Shemesh.
- Several multiplication and division optimizations for BPF JIT in
nfp driver, from Jiong Wang.
- Lots of prepatory work to make more of the packet scheduler layer
lockless, when possible, from Vlad Buslov.
- Add ACK filter and NAT awareness to sch_cake packet scheduler, from
Toke Høiland-Jørgensen.
- Support regions and region snapshots in devlink, from Alex Vesker.
- Allow to attach XDP programs to both HW and SW at the same time on
a given device, with initial support in nfp. From Jakub Kicinski.
- Add TLS RX offload and support in mlx5, from Ilya Lesokhin.
- Use PHYLIB in r8169 driver, from Heiner Kallweit.
- All sorts of changes to support Spectrum 2 in mlxsw driver, from
Ido Schimmel.
- PTP support in mv88e6xxx DSA driver, from Andrew Lunn.
- Make TCP_USER_TIMEOUT socket option more accurate, from Jon
Maxwell.
- Support for templates in packet scheduler classifier, from Jiri
Pirko.
- IPV6 support in RDS, from Ka-Cheong Poon.
- Native tproxy support in nf_tables, from Máté Eckl.
- Maintain IP fragment queue in an rbtree, but optimize properly for
in-order frags. From Peter Oskolkov.
- Improvde handling of ACKs on hole repairs, from Yuchung Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
hv/netvsc: Fix NULL dereference at single queue mode fallback
net: filter: mark expected switch fall-through
xen-netfront: fix warn message as irq device name has '/'
cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
net: dsa: mv88e6xxx: missing unlock on error path
rds: fix building with IPV6=m
inet/connection_sock: prefer _THIS_IP_ to current_text_addr
net: dsa: mv88e6xxx: bitwise vs logical bug
net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
ieee802154: hwsim: using right kind of iteration
net: hns3: Add vlan filter setting by ethtool command -K
net: hns3: Set tx ring' tc info when netdev is up
net: hns3: Remove tx ring BD len register in hns3_enet
net: hns3: Fix desc num set to default when setting channel
net: hns3: Fix for phy link issue when using marvell phy driver
net: hns3: Fix for information of phydev lost problem when down/up
net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
net: hns3: Add support for serdes loopback selftest
bnxt_en: take coredump_record structure off stack
...
Move the source statements of arch-independent Kconfig files instead of
duplicating the includes in every arch/$(SRCARCH)/Kconfig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbdFsfAAoJED2LAQed4NsGxHsP/1tmA57OOOj8oGxO2OXhXVbr
Q0MZqCoV4bqMvK/hgCQdl9f+tp0m+j12x4xDLdVf4OqnTXMbqvPDu3uQVKvaj/k1
gHhsFA1tFgSbuJ8InltUsrPEQqbceeJsj50xHVAKijqI6LYeRPPSU7aE9obn+OzH
n2nd5sLKvMI/dqdJvW6i5KPydqTH3r3iA7D+ne/XQj0s0EMXvXUPmDT1+ijTnM4a
yfm6W5p7L/c3Ugf1Pz5PfnPl4BxBwZMfW5ie/UO8j5C6Rl0iPaOGuuHurocaaJb3
MefR/7NEAR3G8MhJyL2+70jbbwhjpqR2b5ooz1vpuulPHxjeU45BY60XIBWq1afR
ewsc12MMCYB695ieYWoHdaWgxD/jhffyRuajfpkXKIZEMgDxS03sMhdULXENVMx1
M0ZQ01g/NLWt9ti9DY3eTKB3ymOhnBa1sa77nGGUHkITq4DQKwPX1J9FP/HT6RNt
uOvzeH5kGzc7tqOlZAO0kHbwhQG1uqGcd78IYd4lgf/XfkSgDERTWjnJmnQbwr9m
3PFuST2u8eyO+8Lh1MK76TXOEkXsHMdFugPmb6SlgtMEPKGVLDPlsj52o/LFtgzl
eygfMiBFr2+ttkZ6IpNcpmQ4IztmDpz6XoMk3PqDAfUTUSYpCnq1gAEuff/eisCM
Odva1ZZaeQ7WpxhsP8rr
=gsQJ
-----END PGP SIGNATURE-----
Merge tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig consolidation from Masahiro Yamada:
"Consolidation of Kconfig files by Christoph Hellwig.
Move the source statements of arch-independent Kconfig files instead
of duplicating the includes in every arch/$(SRCARCH)/Kconfig"
* tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: add a Memory Management options" menu
kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt
kconfig: use a menu in arch/Kconfig to reduce clutter
kconfig: include kernel/Kconfig.preempt from init/Kconfig
Kconfig: consolidate the "Kernel hacking" menu
kconfig: include common Kconfig files from top-level Kconfig
kconfig: remove duplicate SWAP symbol defintions
um: create a proper drivers Kconfig
um: cleanup Kconfig files
um: stop abusing KBUILD_KCONFIG
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJbct5hAAoJEFKgDEdIgJTysFoP/jRoozkNpUr4fqcbN+1GCSjY
/vaWMJ6cQBHm9b5AE+EYwomTqx5gkaaAw8oR3tnHvBa2NGsE4xzO2LUyuH95FgMo
cE74MGS492xeCskEBHk0pbrdD+DOuIIqWlQcDP0mH0OhlBiuW/+j72HhDfOU5yTv
uE/X+SyxnYgWW9rKGrkmHqVwJPiEgjtr1tsfsewWVF4MNndCAFYeOgpFZBfhBdnq
g57IRjTHLnkAClFrOzUpGIZ/tvN6iyENug860EORYFzSO2SVZ3Jm7v8AcXLQ6lj0
LKmBAoFWeJ95rqJS+rBq3s0hWQLtbVJgQyz2ujN5+a+oZ/+brtQmTHjD9Nri0cSf
cKUmSfMX4tqdMcGjc7B2RqAtdthst2kYy6RXlp4BzYcdoxDr4U45VTqS8gZQuPSt
cdLUgoUEEX7nt1zlQER6GDzWc3FVKDXSHbDM+4OykFF3k1JCgoDq0kemAmsZw2A6
2QGs7RqBKYidR7DB+beZj2YV0ihRkykwb5c+NUASJV5mIB8TjjmOEw52Q0Fc9Hl1
WbA0wm/+pNG7ur9hF0VZ7zy9Rfxgx2Ec5ekWGTgt8xog3B1Dsqix5RpGRDX/f1u8
/dJxHD70pzADwdXXF5hrdCRtCXsTd/27SnWREYhMhfqwu/awc3boOKonOITCYmvv
7eiffcIaUbfiKsIhgrf+
=TTFG
-----END PGP SIGNATURE-----
Merge tag 'printk-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Different vendors have a different expectation about a console
quietness. Make it configurable to reduce bike-shedding about the
upstream default
- Decide about the message visibility when the message is stored. It
avoids races caused by a delayed console handling
- Always store printk() messages into the per-CPU buffers again in NMI.
The only exception is when flushing trace log in panic(). There the
risk of loosing messages is worth an eventual reordering
- Handle invalid %pO printf modifiers correctly
- Better handle %p printf modifier tests before crng is initialized
- Some clean up
* tag 'printk-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
lib/vsprintf: Do not handle %pO[^F] as %px
printk: Fix warning about unused suppress_message_printing
printk/nmi: Prevent deadlock when accessing the main log buffer in NMI
printk: Create helper function to queue deferred console handling
printk: Split the code for storing a message into the log buffer
printk: Clean up syslog_print_all()
printk: Remove unnecessary kmalloc() from syslog during clear
printk: Make CONSOLE_LOGLEVEL_QUIET configurable
printk: make sure to print log on console.
lib/test_printf.c: accept "ptrval" as valid result for plain 'p' tests
Almost all architectures include it. Add a ARCH_NO_PREEMPT symbol to
disable preempt support for alpha, hexagon, non-coldfire m68k and
user mode Linux.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to
the top-level Kconfig. For two architectures that means moving their
arch-specific symbols in that menu into a new arch Kconfig.debug file,
and for a few more creating a dummy file so that we can include it
unconditionally.
Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where
it belongs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Add tests for the bitfield helpers. The constant ones will all
be folded to nothing by the compiler (if everything is correct
in the header file), and the variable ones do some tests against
open-coding the necessary shifts.
A few test cases that should fail/warn compilation are provided
under ifdef.
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The goal of passing the "quiet" option to the kernel is for the kernel
to be quiet unless something really is wrong.
Sofar passing quiet has been (mostly) equivalent to passing
loglevel=4 on the kernel commandline. Which means to show any messages
with a level of KERN_ERR or higher severity on the console.
In practice this often does not result in a quiet boot though, since
there are many false-positive or otherwise harmless error messages printed,
defeating the purpose of the quiet option. Esp. the ACPICA code is really
bad wrt this, but there are plenty of others too.
This commit makes CONSOLE_LOGLEVEL_QUIET configurable.
This for example will allow distros which want quiet to really mean quiet
to set CONSOLE_LOGLEVEL_QUIET so that only messages with a higher severity
then KERN_ERR (CRIT, ALERT, EMERG) get printed, avoiding an endless game
of whack-a-mole silencing harmless error messages.
Link: http://lkml.kernel.org/r/20180619115726.3098-1-hdegoede@redhat.com
To: Petr Mladek <pmladek@suse.com>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
As Documentation/kbuild/kconfig-language.txt notes, 'select' should be
be used with care - it forces a lower limit of another symbol, ignoring
the dependency. Currently, KCOV can select GCC_PLUGINS even if arch
does not select HAVE_GCC_PLUGINS. This could cause the unmet direct
dependency.
Now that Kconfig can test compiler capability, let's handle this in a
more sophisticated way.
There are two ways to enable KCOV; use the compiler that natively
supports -fsanitize-coverage=trace-pc, or build the SANCOV plugin if
the compiler has ability to build GCC plugins. Hence, the correct
dependency for KCOV is:
depends on CC_HAS_SANCOV_TRACE_PC || GCC_PLUGINS
You do not need to build the SANCOV plugin if the compiler already
supports -fsanitize-coverage=trace-pc. Hence, the select should be:
select GCC_PLUGIN_SANCOV if !CC_HAS_SANCOV_TRACE_PC
With this, GCC_PLUGIN_SANCOV is selected only when necessary, so
scripts/Makefile.gcc-plugins can be cleaner.
I also cleaned up Kconfig and scripts/Makefile.kcov as well.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsYJ1gWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJlCTEACwdEeriAd2VwxknnsstojGD/3g
8TTFA19vSu4Gxa6WiDkjGoSmIlfhXTlZo1Nlmencv16ytSvIVDNLUIB3uDxUIv1J
2+dyHML9JpXYHHR7zLXXnGFJL0wazqjbsD3NYQgXqmun7EVVYnOsAlBZ7h/Lwiej
jzEJd8DaHT3TA586uD3uggiFvQU0yVyvkDCDONIytmQx+BdtGdg9TYCzkBJaXuDZ
YIthyKDvxIw5nh/UaG3L+SKo73tUr371uAWgAfqoaGQQCWe+mxnWL4HkCKsjFzZL
u9ouxxF/n6pij3E8n6rb0i2fCzlsTDdDF+aqV1rQ4I4hVXCFPpHUZgjDPvBWbj7A
m6AfRHVNnOgI8HGKqBGOfViV+2kCHlYeQh3pPW33dWzy/4d/uq9NIHKxE63LH+S4
bY3oO2ela8oxRyvEgXLjqmRYGW1LB/ZU7FS6Rkx2gRzo4k8Rv+8K/KzUHfFVRX61
jEbiPLzko0xL9D53kcEn0c+BhofK5jgeSWxItdmfuKjLTW4jWhLRlU+bcUXb6kSS
S3G6aF+L+foSUwoq63AS8QxCuabuhreJSB+BmcGUyjthCbK/0WjXYC6W/IJiRfBa
3ZTxBC/2vP3uq/AGRNh5YZoxHL8mSxDfn62F+2cqlJTTKR/O+KyDb1cusyvk3H04
KCDVLYPxwQQqK1Mqig==
=/3L8
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"This adds the new overflow checking helpers and adds them to the
2-factor argument allocators. And this adds the saturating size
helpers and does a treewide replacement for the struct_size() usage.
Additionally this adds the overflow testing modules to make sure
everything works.
I'm still working on the treewide replacements for allocators with
"simple" multiplied arguments:
*alloc(a * b, ...) -> *alloc_array(a, b, ...)
and
*zalloc(a * b, ...) -> *calloc(a, b, ...)
as well as the more complex cases, but that's separable from this
portion of the series. I expect to have the rest sent before -rc1
closes; there are a lot of messy cases to clean up.
Summary:
- Introduce arithmetic overflow test helper functions (Rasmus)
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)"
* tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
treewide: Use struct_size() for devm_kmalloc() and friends
treewide: Use struct_size() for vmalloc()-family
treewide: Use struct_size() for kmalloc()-family
device: Use overflow helpers for devm_kmalloc()
mm: Use overflow helpers in kvmalloc()
mm: Use overflow helpers in kmalloc_array*()
test_overflow: Add memory allocation overflow tests
overflow.h: Add allocation size calculation helpers
test_overflow: Report test failures
test_overflow: macrofy some more, do more tests for free
lib: add runtime test of check_*_overflow functions
compiler.h: enable builtin overflow checkers and add fallback code
This adds a small module for testing that the check_*_overflow
functions work as expected, whether implemented in C or using gcc
builtins.
Example output:
test_overflow: u8 : 18 tests
test_overflow: s8 : 19 tests
test_overflow: u16: 17 tests
test_overflow: s16: 17 tests
test_overflow: u32: 17 tests
test_overflow: s32: 17 tests
test_overflow: u64: 17 tests
test_overflow: s64: 21 tests
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
[kees: add output to commit log, drop u64 tests on 32-bit]
Signed-off-by: Kees Cook <keescook@chromium.org>
Drivers/subsystems creating scatterlists for DMA should be taking care
to respect the scatter-gather limitations of the appropriate device, as
described by dma_parms. A DMA API implementation cannot feasibly split
a scatterlist into *more* entries than originally passed, so it is not
well defined what they should do when given a segment larger than the
limit they are also required to respect.
Conversely, devices which are less limited than the rather conservative
defaults, or indeed have no limitations at all (e.g. GPUs with their own
internal MMU), should be encouraged to set appropriate dma_parms, as
they may get more efficient DMA mapping performance out of it.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This way we have one central definition of it, and user can select it as
needed. Note that we now also always select it when CONFIG_DMA_API_DEBUG
is select, which fixes some incorrect checks in a few network drivers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
There is no arch specific code required for dma-debug, so there is no
need to opt into the support either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Keep all of the SOFTLOCKUP kconfig symbols together (instead of
injecting the HARDLOCKUP symbols in the midst of them) so that the
config tools display them with their dependencies.
Tested with 'make {menuconfig/nconfig/gconfig/xconfig}'.
Link: http://lkml.kernel.org/r/6be2d9ed-4656-5b94-460d-7f051e2c7570@infradead.org
Fixes: 05a4a95279 ("kernel/watchdog: split up config options")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- add a shell script to get Clang version
- improve portability of build scripts
- drop always-enabled CONFIG_THIN_ARCHIVE and remove unused code
- rename built-in.o which is now thin archive to built-in.a
- process clean/build targets one by one to get along with -j option
- simplify ld-option
- improve building with CONFIG_TRIM_UNUSED_KSYMS
- define KBUILD_MODNAME even for objects shared among multiple modules
- avoid linking multiple instances of same objects from composite objects
- move <linux/compiler_types.h> to c_flags to include it only for C files
- clean-up various Makefiles
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaw6eWAAoJED2LAQed4NsGrK8QAJmbYg83TTNoOgQRK/7Lg+sj
KL1+RGFxmdHRVOqG5n18L7Q4LmTD19tUFNQImrQTTrKrbH2vbMSTF2PfzdmDRwMz
R5vW5+wsagfhSttOce/GR4p9+bM9XEclzEa3liqNVQxijOFXmkV14pn0x5anYfeB
ABthxFFHcVn3exP/q3lmq048x1yNE71wUU5WQIWf6V/ZKf+++wQU8r7HpnATWYeO
vtf8gZq+xyLLjhxoJF6n6olSZXI7Yhz4jV2G68/VroS312AUFWPogK+cSshWGlSw
zGixM1q55oj9CXjZ37nR6pTzQhSZLf/uHX5beatlpeoJ4Hho6HlIABvx2oEnat7b
o5RW64RB0gVJqlYZdKxL29HNrovr9tlWPTaIPGFRvWDpl3c1w+rMKXE+5hwu8jMJ
2jgxd5FZCgBaDsAKojmeQR7PAo2ffAdUO0Dj/SuAaMOpuHWHcnJk9kIN2PUrC+Sf
d/H2soT9x+60KbQmtCEo5VfEN8bvNP3+ZSnadEG/gRN2IIa5FZAUQykU+i50gAvj
tuKAokdRGZHvXM+buYFBfN6RbhVCXzBF/fAG7r37QVR2u1zaUszmgFOUqERhTQfm
RNnyeAs9G9rljtna/AD7cIOdKTg8oETPISxt8Y6EzNMpI8PhF0aGoxso3yD19oH1
M+fq55RigsR48Kic40hY
=N5BL
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- add a shell script to get Clang version
- improve portability of build scripts
- drop always-enabled CONFIG_THIN_ARCHIVE and remove unused code
- rename built-in.o which is now thin archive to built-in.a
- process clean/build targets one by one to get along with -j option
- simplify ld-option
- improve building with CONFIG_TRIM_UNUSED_KSYMS
- define KBUILD_MODNAME even for objects shared among multiple modules
- avoid linking multiple instances of same objects from composite
objects
- move <linux/compiler_types.h> to c_flags to include it only for C
files
- clean-up various Makefiles
* tag 'kbuild-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
kbuild: get <linux/compiler_types.h> out of <linux/kconfig.h>
kbuild: clean up link rule of composite modules
kbuild: clean up archive rule of built-in.a
kbuild: remove partial section mismatch detection for built-in.a
net: liquidio: clean up Makefile for simpler composite object handling
lib: zstd: clean up Makefile for simpler composite object handling
kbuild: link $(real-obj-y) instead of $(obj-y) into built-in.a
kbuild: rename real-objs-y/m to real-obj-y/m
kbuild: move modname and modname-multi close to modname_flags
kbuild: simplify modname calculation
kbuild: fix modname for composite modules
kbuild: define KBUILD_MODNAME even if multiple modules share objects
kbuild: remove unnecessary $(subst $(obj)/, , ...) in modname-multi
kbuild: Use ls(1) instead of stat(1) to obtain file size
kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS
kbuild: move include/config/ksym/* to include/ksym/*
kbuild: move CONFIG_TRIM_UNUSED_KSYMS code unneeded for external module
kbuild: restore autoksyms.h touch to the top Makefile
kbuild: move 'scripts' target below
kbuild: remove wrong 'touch' in adjust_autoksyms.sh
...
This removes the entire architecture code for blackfin, cris, frv, m32r,
metag, mn10300, score, and tile, including the associated device drivers.
I have been working with the (former) maintainers for each one to ensure
that my interpretation was right and the code is definitely unused in
mainline kernels. Many had fond memories of working on the respective
ports to start with and getting them included in upstream, but also saw
no point in keeping the port alive without any users.
In the end, it seems that while the eight architectures are extremely
different, they all suffered the same fate: There was one company
in charge of an SoC line, a CPU microarchitecture and a software
ecosystem, which was more costly than licensing newer off-the-shelf
CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
that all the SoC product lines are still around, but have not used the
custom CPU architectures for several years at this point. In contrast,
CPU instruction sets that remain popular and have actively maintained
kernel ports tend to all be used across multiple licensees.
The removal came out of a discussion that is now documented at
https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
marking any ports as deprecated but remove them all at once after I made
sure that they are all unused. Some architectures (notably tile, mn10300,
and blackfin) are still being shipped in products with old kernels,
but those products will never be updated to newer kernel releases.
After this series, we still have a few architectures without mainline
gcc support:
- unicore32 and hexagon both have very outdated gcc releases, but the
maintainers promised to work on providing something newer. At least
in case of hexagon, this will only be llvm, not gcc.
- openrisc, risc-v and nds32 are still in the process of finishing their
support or getting it added to mainline gcc in the first place.
They all have patched gcc-7.3 ports that work to some degree, but
complete upstream support won't happen before gcc-8.1. Csky posted
their first kernel patch set last week, their situation will be similar.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
3iIVHovno/JuJnTOE8LY
=fQ8z
-----END PGP SIGNATURE-----
Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pul removal of obsolete architecture ports from Arnd Bergmann:
"This removes the entire architecture code for blackfin, cris, frv,
m32r, metag, mn10300, score, and tile, including the associated device
drivers.
I have been working with the (former) maintainers for each one to
ensure that my interpretation was right and the code is definitely
unused in mainline kernels. Many had fond memories of working on the
respective ports to start with and getting them included in upstream,
but also saw no point in keeping the port alive without any users.
In the end, it seems that while the eight architectures are extremely
different, they all suffered the same fate: There was one company in
charge of an SoC line, a CPU microarchitecture and a software
ecosystem, which was more costly than licensing newer off-the-shelf
CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
seems that all the SoC product lines are still around, but have not
used the custom CPU architectures for several years at this point. In
contrast, CPU instruction sets that remain popular and have actively
maintained kernel ports tend to all be used across multiple licensees.
[ See the new nds32 port merged in the previous commit for the next
generation of "one company in charge of an SoC line, a CPU
microarchitecture and a software ecosystem" - Linus ]
The removal came out of a discussion that is now documented at
https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
marking any ports as deprecated but remove them all at once after I
made sure that they are all unused. Some architectures (notably tile,
mn10300, and blackfin) are still being shipped in products with old
kernels, but those products will never be updated to newer kernel
releases.
After this series, we still have a few architectures without mainline
gcc support:
- unicore32 and hexagon both have very outdated gcc releases, but the
maintainers promised to work on providing something newer. At least
in case of hexagon, this will only be llvm, not gcc.
- openrisc, risc-v and nds32 are still in the process of finishing
their support or getting it added to mainline gcc in the first
place. They all have patched gcc-7.3 ports that work to some
degree, but complete upstream support won't happen before gcc-8.1.
Csky posted their first kernel patch set last week, their situation
will be similar
[ Palmer Dabbelt points out that RISC-V support is in mainline gcc
since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]"
This really says it all:
2498 files changed, 95 insertions(+), 467668 deletions(-)
* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
MAINTAINERS: UNICORE32: Change email account
staging: iio: remove iio-trig-bfin-timer driver
tty: hvc: remove tile driver
tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
serial: remove tile uart driver
serial: remove m32r_sio driver
serial: remove blackfin drivers
serial: remove cris/etrax uart drivers
usb: Remove Blackfin references in USB support
usb: isp1362: remove blackfin arch glue
usb: musb: remove blackfin port
usb: host: remove tilegx platform glue
pwm: remove pwm-bfin driver
i2c: remove bfin-twi driver
spi: remove blackfin related host drivers
watchdog: remove bfin_wdt driver
can: remove bfin_can driver
mmc: remove bfin_sdh driver
input: misc: remove blackfin rotary driver
input: keyboard: remove bf54x driver
...
Two config options in the lock debugging menu that are probably the most
frequently used, as far as I am concerned, is the PROVE_LOCKING and
LOCK_STAT. From a UI perspective, they should be front and center. So
these two options are now moved to the top of the lock debugging menu.
The DEBUG_WW_MUTEX_SLOWPATH option is also added to the PROVE_LOCKING
umbrella.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-4-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are a couples of lock debugging Kconfig options that depends on
the following support options:
- TRACE_IRQFLAGS_SUPPORT
- STACKTRACE_SUPPORT
- LOCKDEP_SUPPORT
That makes those lock debugging options harder to read and understand.
So a new LOCK_DEBUGGING_SUPPORT option is added that is equivalent to
the above three options together. That makes the Kconfig.debug file
more readable.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-3-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For a rwsem, locking can either be exclusive or shared. The corresponding
exclusive or shared unlock must be used. Otherwise, the protected data
structures may get corrupted or the lock may be in an inconsistent state.
In order to detect such anomaly, a new configuration option DEBUG_RWSEMS
is added which can be enabled to look for such mismatches and print
warnings that that happens.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A lot of Kconfig symbols have architecture specific dependencies.
In those cases that depend on architectures we have already removed,
they can be omitted.
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Incremental linking is gone, so rename built-in.o to built-in.a, which
is the usual extension for archive files.
This patch does two things, first is a simple search/replace:
git grep -l 'built-in\.o' | xargs sed -i 's/built-in\.o/built-in\.a/g'
The second is to invert nesting of nested text manipulations to avoid
filtering built-in.a out from libs-y2:
-libs-y2 := $(filter-out %.a, $(patsubst %/, %/built-in.a, $(libs-y)))
+libs-y2 := $(patsubst %/, %/built-in.a, $(filter-out %.a, $(libs-y)))
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
These patches remove the metag architecture and tightly dependent
drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
based metag toolchain we have been using is hitting compiler bugs, so
now seems a good time to drop it altogether.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlqdcgQACgkQbAtpk944
dno/1BAAvaiRcKcNxMrYkxG+Wn4r68odu7+E1dy99AaUnvPFT42R5XLMOv4BCu/Y
bhMQ14lMJ9ZBKdYg9E97ulTV0YFhCBHuEWDyDnk/G3CVAEvdPuAQ6ktHDZxRQBFK
JoTUKky53OZbWU9KhLeWpFg4F4E64FBm1kyAkqhs8pPM/LwmrxwIG2sxdTTqkhkc
b+6ABf2NKtmQwHXWmKWCB8rmXMzulYth2ePC/r9MVj92xGKxADsiFArZk4kmoIUb
H5eZ8FkemtUEfZp600dsGR/ffaTBwZJ3SULSkAklUnrcvdIRM+Fu8osG8O8yQKTd
H7xnmtTJ2kCnhhuUMxt6v8WrDbKB8JdFxFOpXW93YKpKAkiGMvoUEZjlwPYIqWxL
xtnDb9Rv+uZ4RpqZf9AtE4Td8lHTH7OZ78RDs9eMo6n1ZIr5CwcLaM2k5skAeyPr
yt1lXePhXFqSS+OpOV6hn95ROqlkuZgvPfkcdNpCJPfM4SpfRLlUjIVqiVK0LDRk
FAkk0VIfzjjNuyV9yr2XXuw90DerhFUgUl6ZYggkgf6umOHhZQdDTFr8gsfvaLm1
1k1banUEF1tpDcUeShylDvqNmVSZZC6siTQMA7T0zjbjYJD25hJWLpFEcPkx/Anp
4oGQNNoe4WgJIrJAoTJTiBVwC/xLDeZV6b5t2pOXBlH+v2eKgMg=
=zDIl
-----END PGP SIGNATURE-----
Merge tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag into asm-generic
Remove metag architecture
These patches remove the metag architecture and tightly dependent
drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
based metag toolchain we have been using is hitting compiler bugs, so
now seems a good time to drop it altogether.
* tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
i2c: img-scb: Drop METAG dependency
media: img-ir: Drop METAG dependency
watchdog: imgpdc: Drop METAG dependency
MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE
tty: Remove metag DA TTY and console driver
clocksource: Remove metag generic timer driver
irqchip: Remove metag irqchip drivers
Drop a bunch of metag references
docs: Remove remaining references to metag
docs: Remove metag docs
metag: Remove arch/metag/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Now that arch/metag/ has been removed, drop a bunch of metag references
in various codes across the whole tree:
- VM_GROWSUP and __VM_ARCH_SPECIFIC_1.
- MT_METAG_* ELF note types.
- METAG Kconfig dependencies (FRAME_POINTER) and ranges
(MAX_STACK_SIZE_MB).
- metag cases in tools (checkstack.pl, recordmcount.c, perf).
Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-mm@kvack.org
Cc: linux-metag@vger.kernel.org
Commit d3deafaa8b ("lib/: make RUNTIME_TESTS a menuconfig to ease
disabling it all") causes a regression when using runtime tests due to
it defaults RUNTIME_TESTING_MENU to not set.
Link: http://lkml.kernel.org/r/20180214133015.10090-1-anders.roxell@linaro.org
Fixes: d3deafaa8b ("lib/: make RUNTIME_TESTS a menuconfig to easedisabling it all")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Cc: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Makefile changes:
- enable unused-variable warning that was wrongly disabled for clang
Kconfig changes:
- warn blank 'help' and fix existing instances
- fix 'choice' behavior to not write out invisible symbols
- fix misc weirdness
Coccinell changes:
- fix false positive of free after managed memory alloc detection
- improve performance of NULL dereference detection
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJafl86AAoJED2LAQed4NsGrBEP/23mRl+8dSYkTmlczP7stZHL
tGkMQIOj85usPV4JIxYHgge6XhU0CBFXNGnDnxGeNEtwnBr5QQNxuS2nPh7rHJIN
zX5rX/vwO9lzn2FFEdKrk8bito1IgnUHUvN/d0ikPWzY7GaMy6WrIWgyThllsLNE
W7hp3cpUQOhLL9PKXwglZ/oS4iTpEs0DwN93UXU7cp7zyRa0XtFfPf7/IJ2KY+Yl
a2TEsUuZ/slJoxLhacr6+TBAgqUyewWIs0nAGdjP2EVlSjxZJQYFEQq4KnLUO2gV
wLHH2snsZSBDfPDp0M9OOb737HE17NRmuLjWxUZZOMFz8tvfUT1454zhVAN2OtSQ
cP0RqVRrFiS721oxacZpAxKFrd7o4ugUHpftJMPQAq70T9JFFbapfCLvd+OblOb/
CWmDOOR37tvop5OCuaqaSMq7a+ZQt2cO5fogiEDdnjZkk2AH5GgsAHJIrl7hH4OT
P9UMcxaWSGbutdVkM4cMUmYMuAJjiFhx1fiD+hevB1KvemXRXrqhCb0wV+GRdcoU
MXGvOGVw5WyF/vFdjpjkY7KeCgpU3BTWH3pFC2a5vUCDqgD8yndwFghJMDfSjl6d
46DIqknyveq234GK/Yz5khlbY094yL8JrJU2duva/9fGV86tgOr29xgMK28Lpyh8
AYRGO9XgmehZrEHcAQ57
=jy5l
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
"Makefile changes:
- enable unused-variable warning that was wrongly disabled for clang
Kconfig changes:
- warn about blank 'help' and fix existing instances
- fix 'choice' behavior to not write out invisible symbols
- fix misc weirdness
Coccinell changes:
- fix false positive of free after managed memory alloc detection
- improve performance of NULL dereference detection"
* tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
kconfig: remove const qualifier from sym_expand_string_value()
kconfig: add xrealloc() helper
kconfig: send error messages to stderr
kconfig: echo stdin to stdout if either is redirected
kconfig: remove check_stdin()
kconfig: remove 'config*' pattern from .gitignnore
kconfig: show '?' prompt even if no help text is available
kconfig: do not write choice values when their dependency becomes n
coccinelle: deref_null: avoid useless computation
coccinelle: devm_free: reduce false positives
kbuild: clang: disable unused variable warnings only when constant
kconfig: Warn if help text is blank
nios2: kconfig: Remove blank help text
arm: vt8500: kconfig: Remove blank help text
MIPS: kconfig: Remove blank help text
MIPS: BCM63XX: kconfig: Remove blank help text
lib/Kconfig.debug: Remove blank help text
Staging: rtl8192e: kconfig: Remove blank help text
Staging: rtl8192u: kconfig: Remove blank help text
mmc: kconfig: Remove blank help text
...
We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which can
easily cause an overflow of the kernel stack, e.g.
drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes
To reduce this risk, -fsanitize-address-use-after-scope is now split out
into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
frames that are smaller than 2 kilobytes most of the time on x86_64. An
earlier version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.
All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
and CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can
bring back that default now. KASAN_EXTRA=y still causes lots of
warnings but now defaults to !COMPILE_TEST to disable it in
allmodconfig, and it remains disabled in all other defconfigs since it
is a new option. I arbitrarily raise the warning limit for KASAN_EXTRA
to 3072 to reduce the noise, but an allmodconfig kernel still has around
50 warnings on gcc-7.
I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).
With earlier versions of this patch series, I also had patches to address
the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation.
That annotation now got replaced with a gcc-8 bugfix (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
older compilers, which means that KASAN_EXTRA is now just as bad as
before and will lead to an instant stack overflow in a few extreme
cases.
This reverts parts of commit 3f181b4d86 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
should be merged first to avoid introducing warnings in an allmodconfig
build:
3cd890dbe2 ("media: dvb-frontends: fix i2c access helpers for KASAN")
16c3ada89c ("media: r820t: fix r820t_write_reg for KASAN")
Do we really need to backport this?
I think we do: without this patch, enabling KASAN will lead to
unavoidable kernel stack overflow in certain device drivers when built
with gcc-7 or higher on linux-4.10+ or any version that contains a
backport of commit c5caf21ab0. Most people are probably still on
older compilers, but it will get worse over time as they upgrade their
distros.
The warnings we get on kernels older than this should all be for code
that uses dangerously large stack frames, though most of them do not
cause an actual stack overflow by themselves.The asan-stack option was
added in linux-4.0, and commit 3f181b4d86 ("lib/Kconfig.debug:
disable -Wframe-larger-than warnings with KASAN=y") effectively turned
off the warning for allmodconfig kernels, so I would like to see this
fix backported to any kernels later than 4.0.
I have done dozens of fixes for individual functions with stack frames
larger than 2048 bytes with asan-stack, and I plan to make sure that
all those fixes make it into the stable kernels as well (most are
already there).
Part of the complication here is that asan-stack (from 4.0) was
originally assumed to always require much larger stacks, but that
turned out to be a combination of multiple gcc bugs that we have now
worked around and fixed, but sanitize-address-use-after-scope (from
v4.10) has a much higher inherent stack usage and also suffers from at
least three other problems that we have analyzed but not yet fixed
upstream, each of them makes the stack usage more severe than it should
be.
Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No need to get into the submenu to disable all related config entries.
This makes it easier to disable all RUNTIME_TESTS config options without
entering the submenu. It will also enable one to see that en/dis-abled
state from the outside menu.
This is only intended to change menuconfig UI, not change the config
dependencies.
Link: http://lkml.kernel.org/r/20171209162742.7363-1-vincent.legoll@gmail.com
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Blank help texts are probably either a typo, a Kconfig misunderstanding,
or some kind of half-committing to adding a help text (in which case a
TODO comment would be clearer, if the help text really can't be added
right away).
Best to remove them, IMO.
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
Pull STRICT_DEVMEM default from Ingo Molnar:
"Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to
follow the distro status quo"
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
Support in-kernel fault-injection framework via debugfs.
This allows you to inject a conditional error to specified
function using debugfs interfaces.
Here is the result of test script described in
Documentation/fault-injection/fault-injection.txt
===========
# ./test_fail_function.sh
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
btrfs-progs v4.4
See http://btrfs.wiki.kernel.org for more information.
Label: (null)
UUID: bfa96010-12e9-4360-aed0-42eec7af5798
Node size: 16384
Sector size: 4096
Filesystem size: 1001.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 58.00MiB
System: DUP 12.00MiB
SSD detected: no
Incompat features: extref, skinny-metadata
Number of devices: 1
Devices:
ID SIZE PATH
1 1001.00MiB /dev/loop2
mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
SUCCESS!
===========
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Since error-injection framework is not limited to be used
by kprobes, nor bpf. Other kernel subsystems can use it
freely for checking safeness of error-injection, e.g.
livepatch, ftrace etc.
So this separate error-injection framework from kprobes.
Some differences has been made:
- "kprobe" word is removed from any APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to
ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
feature. It is automatically enabled if the arch supports
error injection feature for kprobe or ftrace etc.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.
If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.
Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...
This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.
Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.
( Note that it might make sense to maintain it out of tree and go through
the false positives every now and then and see whether new bugs were
introduced. )
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Distros have been shipping with CONFIG_STRICT_DEVMEM=y for years now. It
is probably time to flip this default for x86 and arm64.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20171201201000.GA44539@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The flag enables Clang instrumentation of comparison operations
(currently not supported by GCC). This instrumentation is needed by the
new KCOV device to collect comparison operands.
Link: http://lkml.kernel.org/r/20171011095459.70721-2-glider@google.com
Signed-off-by: Victor Chibotaru <tchibo@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
find_bit functions are widely used in the kernel, including hot paths.
This module tests performance of those functions in 2 typical scenarios:
randomly filled bitmap with relatively equal distribution of set and
cleared bits, and sparse bitmap which has 1 set bit for 500 cleared
bits.
On ThunderX machine:
Start testing find_bit() with random-filled bitmap
find_next_bit: 240043 cycles, 164062 iterations
find_next_zero_bit: 312848 cycles, 163619 iterations
find_last_bit: 193748 cycles, 164062 iterations
find_first_bit: 177720874 cycles, 164062 iterations
Start testing find_bit() with sparse bitmap
find_next_bit: 3633 cycles, 656 iterations
find_next_zero_bit: 620399 cycles, 327025 iterations
find_last_bit: 3038 cycles, 656 iterations
find_first_bit: 691407 cycles, 656 iterations
[arnd@arndb.de: use correct format string for find-bit tests]
Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Clement Courbet <courbet@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of driver core / debugfs patches for 4.15-rc1.
Not many here, mostly all are debugfs fixes to resolve some
long-reported problems with files going away with references to them in
userspace. There's also some SPDX cleanups for the debugfs code, as
well as a few other minor driver core changes for issues reported by
people.
All of these have been in linux-next for a week or more with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWg2NCA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymUNgCfYq434CFh+YtwITBNYdqkFYFf0ZAAn3qfhh2+
M3rmZzwk2MKBvNQ2npvt
=/8+Y
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core / debugfs patches for 4.15-rc1.
Not many here, mostly all are debugfs fixes to resolve some
long-reported problems with files going away with references to them
in userspace. There's also some SPDX cleanups for the debugfs code, as
well as a few other minor driver core changes for issues reported by
people.
All of these have been in linux-next for a week or more with no
reported issues"
* tag 'driver-core-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Fix device link deferred probe
debugfs: Remove redundant license text
debugfs: add SPDX identifiers to all debugfs files
debugfs: defer debugfs_fsdata allocation to first usage
debugfs: call debugfs_real_fops() only after debugfs_file_get()
debugfs: purge obsolete SRCU based removal protection
IB/hfi1: convert to debugfs_file_get() and -put()
debugfs: convert to debugfs_file_get() and -put()
debugfs: debugfs_real_fops(): drop __must_hold sparse annotation
debugfs: implement per-file removal protection
debugfs: add support for more elaborate ->d_fsdata
driver core: Move device_links_purge() after bus_remove_device()
arch_topology: Fix section miss match warning due to free_raw_capacity()
driver-core: pr_err() strings should end with newlines
Pull x86 core updates from Ingo Molnar:
"Note that in this cycle most of the x86 topics interacted at a level
that caused them to be merged into tip:x86/asm - but this should be a
temporary phenomenon, hopefully we'll back to the usual patterns in
the next merge window.
The main changes in this cycle were:
Hardware enablement:
- Add support for the Intel UMIP (User Mode Instruction Prevention)
CPU feature. This is a security feature that disables certain
instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)
[ Note that this is disabled by default for now, there are some
smaller enhancements in the pipeline that I'll follow up with in
the next 1-2 days, which allows this to be enabled by default.]
- Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
feature, on top of SME (Secure Memory Encryption) support that was
added in v4.14. (Tom Lendacky, Brijesh Singh)
- Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)
Other changes:
- A big series of entry code simplifications and enhancements (Andy
Lutomirski)
- Make the ORC unwinder default on x86 and various objtool
enhancements. (Josh Poimboeuf)
- 5-level paging enhancements (Kirill A. Shutemov)
- Micro-optimize the entry code a bit (Borislav Petkov)
- Improve the handling of interdependent CPU features in the early
FPU init code (Andi Kleen)
- Build system enhancements (Changbin Du, Masahiro Yamada)
- ... plus misc enhancements, fixes and cleanups"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
x86/build: Make the boot image generation less verbose
selftests/x86: Add tests for the STR and SLDT instructions
selftests/x86: Add tests for User-Mode Instruction Prevention
x86/traps: Fix up general protection faults caused by UMIP
x86/umip: Enable User-Mode Instruction Prevention at runtime
x86/umip: Force a page fault when unable to copy emulated result to user
x86/umip: Add emulation code for UMIP instructions
x86/cpufeature: Add User-Mode Instruction Prevention definitions
x86/insn-eval: Add support to resolve 16-bit address encodings
x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
x86/insn-eval: Add support to resolve 32-bit address encodings
x86/insn-eval: Compute linear address in several utility functions
resource: Fix resource_size.cocci warnings
X86/KVM: Clear encryption attribute when SEV is active
X86/KVM: Decrypt shared per-cpu variables when SEV is active
percpu: Introduce DEFINE_PER_CPU_DECRYPTED
x86: Add support for changing memory encryption attribute in early boot
x86/io: Unroll string I/O when SEV is active
x86/boot: Add early boot support when running with SEV active
...
Purge the SRCU based file removal race protection in favour of the new,
refcount based debugfs_file_get()/debugfs_file_put() API.
Fixes: 49d200deaa ("debugfs: prevent access to removed files' private data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull locking fixes from Ingo Molnar:
"Two lockdep fixes for bugs introduced by the cross-release dependency
tracking feature - plus a commit that disables it because performance
regressed in an absymal fashion on some systems"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Disable cross-release features for now
locking/selftest: Avoid false BUG report
locking/lockdep: Fix stacktrace mess
Johan Hovold reported a big lockdep slowdown on his system, caused by lockdep:
> I had noticed that the BeagleBone Black boot time appeared to have
> increased significantly with 4.14 and yesterday I finally had time to
> investigate it.
>
> Boot time (from "Linux version" to login prompt) had in fact doubled
> since 4.13 where it took 17 seconds (with my current config) compared to
> the 35 seconds I now see with 4.14-rc4.
>
> I quick bisect pointed to lockdep and specifically the following commit:
>
> 28a903f63e ("locking/lockdep: Handle non(or multi)-acquisition of a crosslock")
Because the final v4.14 release is close, disable the cross-release lockdep
features for now.
Bisected-by: Johan Hovold <johan@kernel.org>
Debugged-by: Johan Hovold <johan@kernel.org>
Reported-by: Johan Hovold <johan@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: kernel-team@lge.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Cc: linux-omap@vger.kernel.org
Link: http://lkml.kernel.org/r/20171014072659.f2yr6mhm5ha3eou7@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rename the unwinder config options from:
CONFIG_ORC_UNWINDER
CONFIG_FRAME_POINTER_UNWINDER
CONFIG_GUESS_UNWINDER
to:
CONFIG_UNWINDER_ORC
CONFIG_UNWINDER_FRAME_POINTER
CONFIG_UNWINDER_GUESS
... in order to give them a more logical config namespace.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Expand the "Runtime testing" menu by including more entries inside it
instead of after it. This is just Kconfig symbol movement.
This causes the (arch-independent) Runtime tests to be presented
(listed) all in one place instead of in multiple places.
Link: http://lkml.kernel.org/r/c194e5c4-2042-bf94-a2d8-7aa13756e257@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The parisc architecture has larger stack frames than most other
architectures on 32-bit kernels.
Increase the maximum allowed stack frame to 1280 bytes for parisc to
avoid warnings in the do_sys_poll() and pat_memconfig() functions.
Signed-off-by: Helge Deller <deller@gmx.de>
Add a test module that allows testing that CONFIG_DEBUG_VIRTUAL works
correctly, at least that it can catch invalid calls to virt_to_phys()
against the non-linear kernel virtual address map.
Link: http://lkml.kernel.org/r/20170808164035.26725-1-f.fainelli@gmail.com
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull locking updates from Ingo Molnar:
- Add 'cross-release' support to lockdep, which allows APIs like
completions, where it's not the 'owner' who releases the lock, to be
tracked. It's all activated automatically under
CONFIG_PROVE_LOCKING=y.
- Clean up (restructure) the x86 atomics op implementation to be more
readable, in preparation of KASAN annotations. (Dmitry Vyukov)
- Fix static keys (Paolo Bonzini)
- Add killable versions of down_read() et al (Kirill Tkhai)
- Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)
- Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)
- Remove smp_mb__before_spinlock() and convert its usages, introduce
smp_mb__after_spinlock() (Peter Zijlstra)
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
locking/lockdep/selftests: Fix mixed read-write ABBA tests
sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
smp: Avoid using two cache lines for struct call_single_data
locking/lockdep: Untangle xhlock history save/restore from task independence
locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
futex: Remove duplicated code and fix undefined behaviour
Documentation/locking/atomic: Finish the document...
locking/lockdep: Fix workqueue crossrelease annotation
workqueue/lockdep: 'Fix' flush_work() annotation
locking/lockdep/selftests: Add mixed read-write ABBA tests
mm, locking/barriers: Clarify tlb_flush_pending() barriers
locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
locking/lockdep: Explicitly initialize wq_barrier::done::map
locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
locking/refcounts, x86/asm: Implement fast refcount overflow protection
locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
...
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
The hardlockup detector on x86 uses a performance counter based on unhalted
CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the
performance counter period, so the hrtimer should fire 2-3 times before the
performance counter NMI fires. The NMI code checks whether the hrtimer
fired since the last invocation. If not, it assumess a hard lockup.
The calculation of those periods is based on the nominal CPU
frequency. Turbo modes increase the CPU clock frequency and therefore
shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x
nominal frequency) the perf/NMI period is shorter than the hrtimer period
which leads to false positives.
A simple fix would be to shorten the hrtimer period, but that comes with
the side effect of more frequent hrtimer and softlockup thread wakeups,
which is not desired.
Implement a low pass filter, which checks the perf/NMI period against
kernel time. If the perf/NMI fires before 4/5 of the watchdog period has
elapsed then the event is ignored and postponed to the next perf/NMI.
That solves the problem and avoids the overhead of shorter hrtimer periods
and more frequent softlockup thread wakeups.
Fixes: 58687acba5 ("lockup_detector: Combine nmi_watchdog and softlockup detector")
Reported-and-tested-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dzickus@redhat.com
Cc: prarit@redhat.com
Cc: ak@linux.intel.com
Cc: babu.moger@oracle.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@redhat.com
Cc: stable@vger.kernel.org
Cc: atomlin@redhat.com
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos
The syntax to turn Kconfig options into non-interactive ones is to not offer
interactive prompt help texts. Remove them.
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
'complete' is an adjective and LOCKDEP_COMPLETE sounds like 'lockdep is complete',
so pick a better name that uses a noun.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/1502960261-16206-3-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lockdep doesn't have to be made to work with crossrelease and just works
with them. Reword the title so that what the option does is clear.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/1502960261-16206-2-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Crossrelease support added the CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETE
options. It makes little sense to enable them when PROVE_LOCKING is disabled.
Make them non-interative options and part of PROVE_LOCKING to simplify the UI.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/1502960261-16206-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Although wait_for_completion() and its family can cause deadlock, the
lock correctness validator could not be applied to them until now,
because things like complete() are usually called in a different context
from the waiting context, which violates lockdep's assumption.
Thanks to CONFIG_LOCKDEP_CROSSRELEASE, we can now apply the lockdep
detector to those completion operations. Applied it.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: boqun.feng@gmail.com
Cc: kernel-team@lge.com
Cc: kirill@shutemov.name
Cc: npiggin@gmail.com
Cc: walken@google.com
Cc: willy@infradead.org
Link: http://lkml.kernel.org/r/1502089981-21272-10-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lockdep is a runtime locking correctness validator that detects and
reports a deadlock or its possibility by checking dependencies between
locks. It's useful since it does not report just an actual deadlock but
also the possibility of a deadlock that has not actually happened yet.
That enables problems to be fixed before they affect real systems.
However, this facility is only applicable to typical locks, such as
spinlocks and mutexes, which are normally released within the context in
which they were acquired. However, synchronization primitives like page
locks or completions, which are allowed to be released in any context,
also create dependencies and can cause a deadlock.
So lockdep should track these locks to do a better job. The 'crossrelease'
implementation makes these primitives also be tracked.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: boqun.feng@gmail.com
Cc: kernel-team@lge.com
Cc: kirill@shutemov.name
Cc: npiggin@gmail.com
Cc: walken@google.com
Cc: willy@infradead.org
Link: http://lkml.kernel.org/r/1502089981-21272-6-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A couple of Kconfig changes which make it much easier to switch to the
new CONFIG_ORC_UNWINDER:
1) Remove x86 dependencies on CONFIG_FRAME_POINTER for lockdep,
latencytop, and fault injection. x86 has a 'guess' unwinder which
just scans the stack for kernel text addresses. It's not 100%
accurate but in many cases it's good enough. This allows those users
who don't want the text overhead of the frame pointer or ORC
unwinders to still use these features. More importantly, this also
makes it much more straightforward to disable frame pointers.
2) Make CONFIG_ORC_UNWINDER depend on !CONFIG_FRAME_POINTER. While it
would be possible to have both enabled, it doesn't really make sense
to do so. So enforce a sane configuration to prevent the user from
making a dumb mistake.
With these changes, when you disable CONFIG_FRAME_POINTER, "make
oldconfig" will ask if you want to enable CONFIG_ORC_UNWINDER.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/9985fb91ce5005fe33ea5cc2a20f14bd33c61d03.1500938583.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y.
It plugs into the existing x86 unwinder framework.
It relies on objtool to generate the needed .orc_unwind and
.orc_unwind_ip sections.
For more details on why ORC is used instead of DWARF, see
Documentation/x86/orc-unwinder.txt - but the short version is
that it's a simplified, fundamentally more robust debugninfo
data structure, which also allows up to two orders of magnitude
faster lookups than the DWARF unwinder - which matters to
profiling workloads like perf.
Thanks to Andy Lutomirski for the performance improvement ideas:
splitting the ORC unwind table into two parallel arrays and creating a
fast lookup table to search a subset of the unwind table.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com
[ Extended the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllqXNUACgkQ8vlZVpUN
gaPtAgf/aUbXZuWYsDQzslHsbzEWi+qz4QgL885/w4L00pEImTTp91Q06SDxWhtB
KPvGnZHS3IofxBh2DC+6AwN6dPMoWDCfYhhO6po3FSz0DiPRIQCTuvOb8fhKY1X7
rTdDq2xtDxPGxJ25bMJtlrgzH2XlXPpVyPUeoc9uh87zUK5aesXpUn9kBniRexoz
ume+M/cDzPKkwNQpbLq8vzhNjoWMVv0FeW2akVvrjkkWko8nZLZ0R/kIyKQlRPdG
LZDXcz0oTHpDS6+ufEo292ZuWm2IGer2YtwHsKyCAsyEWsUqBz2yurtkSj3mAVyC
hHafyS+5WNaGdgBmg0zJxxwn5qxxLg==
=ua7p
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random updates from Ted Ts'o:
"Add wait_for_random_bytes() and get_random_*_wait() functions so that
callers can more safely get random bytes if they can block until the
CRNG is initialized.
Also print a warning if get_random_*() is called before the CRNG is
initialized. By default, only one single-line warning will be printed
per boot. If CONFIG_WARN_ALL_UNSEEDED_RANDOM is defined, then a
warning will be printed for each function which tries to get random
bytes before the CRNG is initialized. This can get spammy for certain
architecture types, so it is not enabled by default"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: reorder READ_ONCE() in get_random_uXX
random: suppress spammy warnings about unseeded randomness
random: warn when kernel uses unseeded randomness
net/route: use get_random_int for random counter
net/neighbor: use get_random_u32 for 32-bit hash random
rhashtable: use get_random_u32 for hash_rnd
ceph: ensure RNG is seeded before using
iscsi: ensure RNG is seeded before use
cifs: use get_random_u32 for 32-bit lock random
random: add get_random_{bytes,u32,u64,int,long,once}_wait family
random: add wait_for_random_bytes() API
Unfortunately, on some models of some architectures getting a fully
seeded CRNG is extremely difficult, and so this can result in dmesg
getting spammed for a surprisingly long time. This is really bad from
a security perspective, and so architecture maintainers really need to
do what they can to get the CRNG seeded sooner after the system is
booted. However, users can't do anything actionble to address this,
and spamming the kernel messages log will only just annoy people.
For developers who want to work on improving this situation,
CONFIG_WARN_UNSEEDED_RANDOM has been renamed to
CONFIG_WARN_ALL_UNSEEDED_RANDOM. By default the kernel will always
print the first use of unseeded randomness. This way, hopefully the
security obsessed will be happy that there is _some_ indication when
the kernel boots there may be a potential issue with that architecture
or subarchitecture. To see all uses of unseeded randomness,
developers can enable CONFIG_WARN_ALL_UNSEEDED_RANDOM.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This adds a new stress test driver for kmod: the kernel module loader.
The new stress test driver, test_kmod, is only enabled as a module right
now. It should be possible to load this as built-in and load tests
early (refer to the force_init_test module parameter), however since a
lot of test can get a system out of memory fast we leave this disabled
for now.
Using a system with 1024 MiB of RAM can *easily* get your kernel OOM
fast with this test driver.
The test_kmod driver exposes API knobs for us to fine tune simple
request_module() and get_fs_type() calls. Since these API calls only
allow each one parameter a test driver for these is rather simple.
Other factors that can help out test driver though are the number of
calls we issue and knowing current limitations of each. This exposes
configuration as much as possible through userspace to be able to build
tests directly from userspace.
Since it allows multiple misc devices its will eventually (once we add a
knob to let us create new devices at will) also be possible to perform
more tests in parallel, provided you have enough memory.
We only enable tests we know work as of right now.
Demo screenshots:
# tools/testing/selftests/kmod/kmod.sh
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0002_driver: OK! - loading kmod test
kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0002_fs: OK! - loading kmod test
kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0003: OK! - loading kmod test
kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0004: OK! - loading kmod test
kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
XXX: add test restult for 0007
Test completed
You can also request for specific tests:
# tools/testing/selftests/kmod/kmod.sh -t 0001
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
Test completed
Lastly, the current available number of tests:
# tools/testing/selftests/kmod/kmod.sh --help
Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
Valid tests: 0001-0009
0001 - Simple test - 1 thread for empty string
0002 - Simple test - 1 thread for modules/filesystems that do not exist
0003 - Simple test - 1 thread for get_fs_type() only
0004 - Simple test - 2 threads for get_fs_type() only
0005 - multithreaded tests with default setup - request_module() only
0006 - multithreaded tests with default setup - get_fs_type() only
0007 - multithreaded tests with default setup test request_module() and get_fs_type()
0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()
The following test cases currently fail, as such they are not currently
enabled by default:
# tools/testing/selftests/kmod/kmod.sh -t 0008
# tools/testing/selftests/kmod/kmod.sh -t 0009
To be sure to run them as intended please unload both of the modules:
o test_module
o xfs
And ensure they are not loaded on your system prior to testing them. If
you use these paritions for your rootfs you can change the default test
driver used for get_fs_type() by exporting it into your environment. For
example of other test defaults you can override refer to kmod.sh
allow_user_defaults().
Behind the scenes this is how we fine tune at a test case prior to
hitting a trigger to run it:
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
Finally to trigger:
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config
The kmod.sh script uses the above constructs to build different test cases.
A bit of interpretation of the current failures follows, first two
premises:
a) When request_module() is used userspace figures out an optimized
version of module order for us. Once it finds the modules it needs, as
per depmod symbol dep map, it will finit_module() the respective
modules which are needed for the original request_module() request.
b) We have an optimization in place whereby if a kernel uses
request_module() on a module already loaded we never bother userspace
as the module already is loaded. This is all handled by kernel/kmod.c.
A few things to consider to help identify root causes of issues:
0) kmod 19 has a broken heuristic for modules being assumed to be
built-in to your kernel and will return 0 even though request_module()
failed. Upgrade to a newer version of kmod.
1) A get_fs_type() call for "xfs" will request_module() for "fs-xfs",
not for "xfs". The optimization in kernel described in b) fails to
catch if we have a lot of consecutive get_fs_type() calls. The reason
is the optimization in place does not look for aliases. This means two
consecutive get_fs_type() calls will bump kmod_concurrent, whereas
request_module() will not.
This one explanation why test case 0009 fails at least once for
get_fs_type().
2) If a module fails to load --- for whatever reason (kmod_concurrent
limit reached, file not yet present due to rootfs switch, out of
memory) we have a period of time during which module request for the
same name either with request_module() or get_fs_type() will *also*
fail to load even if the file for the module is ready.
This explains why *multiple* NULLs are possible on test 0009.
3) finit_module() consumes quite a bit of memory.
4) Filesystems typically also have more dependent modules than other
modules, its important to note though that even though a get_fs_type()
call does not incur additional kmod_concurrent bumps, since userspace
loads dependencies it finds it needs via finit_module_fd(), it *will*
take much more memory to load a module with a lot of dependencies.
Because of 3) and 4) we will easily run into out of memory failures with
certain tests. For instance test 0006 fails on qemu with 1024 MiB of RAM.
It panics a box after reaping all userspace processes and still not
having enough memory to reap.
[arnd@arndb.de: add dependencies for test module]
Link: http://lkml.kernel.org/r/20170630154834.3689272-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20170628223155.26472-3-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.
LOCKUP_DETECTOR implies the general boot, sysctl, and programming
interfaces for the lockup detectors.
An architecture that wants to use a hard lockup detector must define
HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH.
Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the
minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and
does not implement the LOCKUP_DETECTOR interfaces.
sparc is unusual in that it has started to implement some of the
interfaces, but not fully yet. It should probably be converted to a full
HAVE_HARDLOCKUP_DETECTOR_ARCH.
[npiggin@gmail.com: fix]
Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com
Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Babu Moger <babu.moger@oracle.com> [sparc]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The existing tools/testing/selftests/sysctl/ tests include two test
cases, but these use existing production kernel sysctl interfaces. We
want to expand test coverage but we can't just be looking for random
safe production values to poke at, that's just insane!
Instead just dedicate a test driver for debugging purposes and port the
existing scripts to use it. This will make it easier for further tests
to be added.
Subsequent patches will extend our test coverage for sysctl.
The stress test driver uses a new license (GPL on Linux, copyleft-next
outside of Linux). Linus was fine with this [0] and later due to Ted's
and Alans's request ironed out an "or" language clause to use [1] which
is already present upstream.
[0] https://lkml.kernel.org/r/CA+55aFyhxcvD+q7tp+-yrSFDKfR0mOHgyEAe=f_94aKLsOu0Og@mail.gmail.com
[1] https://lkml.kernel.org/r/1495234558.7848.122.camel@linux.intel.com
Link: http://lkml.kernel.org/r/20170630224431.17374-2-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "lib/interval_tree_test: some debugging improvements".
Here are some patches that update the interval_tree_test module allowing
users to pass finer grained options to run the actual test.
This patch (of 4):
It is a tristate after all, and also serves well for quick debugging.
Link: http://lkml.kernel.org/r/20170518174936.20265-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
around. Highlights include:
- Conversion of a bunch of security documentation into RST
- The conversion of the remaining DocBook templates by The Amazing
Mauro Machine. We can now drop the entire DocBook build chain.
- The usual collection of fixes and minor updates.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZWkGAAAoJEI3ONVYwIuV6rf0P/0B3JTiVPKS/WUx53+jzbAi4
1BN7dmmuMxE1bWpgdEq+ac4aKxm07iAojuntuMj0qz/ZB1WARcmvEqqzI5i4wfq9
5MrLduLkyuWfr4MOPseKJ2VK83p8nkMOiO7jmnBsilu7fE4nF+5YY9j4cVaArfMy
cCQvAGjQzvej2eiWMGUSLHn4QFKh00aD7cwKyBVsJ08b27C9xL0J2LQyCDZ4yDgf
37/MH3puEd3HX/4qAwLonIxT3xrIrrbDturqLU7OSKcWTtGZNrYyTFbwR3RQtqWd
H8YZVg2Uyhzg9MYhkbQ2E5dEjUP4mkegcp6/JTINH++OOPpTbdTJgirTx7VTkSf1
+kL8t7+Ayxd0FH3+77GJ5RMj8LUK6rj5cZfU5nClFQKWXP9UL3IelQ3Nl+SpdM8v
ZAbR2KjKgH9KS6+cbIhgFYlvY+JgPkOVruwbIAc7wXVM3ibk1sWoBOFEujcbueWh
yDpQv3l1UX0CKr3jnevJoW26LtEbGFtC7gSKZ+3btyeSBpWFGlii42KNycEGwUW0
ezlwryDVHzyTUiKllNmkdK4v73mvPsZHEjgmme4afKAIiUilmcUF4XcqD86hISFT
t+UJLA/zEU+0sJe26o2nK6GNJzmo4oCtVyxfhRe26Ojs1n80xlYgnZRfuIYdd31Z
nwLBnwDCHAOyX91WXp9G
=cVjZ
-----END PGP SIGNATURE-----
Merge tag 'docs-4.13' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"There has been a fair amount of activity in the docs tree this time
around. Highlights include:
- Conversion of a bunch of security documentation into RST
- The conversion of the remaining DocBook templates by The Amazing
Mauro Machine. We can now drop the entire DocBook build chain.
- The usual collection of fixes and minor updates"
* tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
scripts/kernel-doc: handle DECLARE_HASHTABLE
Documentation: atomic_ops.txt is core-api/atomic_ops.rst
Docs: clean up some DocBook loose ends
Make the main documentation title less Geocities
Docs: Use kernel-figure in vidioc-g-selection.rst
Docs: fix table problems in ras.rst
Docs: Fix breakage with Sphinx 1.5 and upper
Docs: Include the Latex "ifthen" package
doc/kokr/howto: Only send regression fixes after -rc1
docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
doc: Document suitability of IBM Verse for kernel development
Doc: fix a markup error in coding-style.rst
docs: driver-api: i2c: remove some outdated information
Documentation: DMA API: fix a typo in a function name
Docs: Insert missing space to separate link from text
doc/ko_KR/memory-barriers: Update control-dependencies example
Documentation, kbuild: fix typo "minimun" -> "minimum"
docs: Fix some formatting issues in request-key.rst
doc: ReSTify keys-trusted-encrypted.txt
doc: ReSTify keys-request-key.txt
...
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- Add CONFIG_REFCOUNT_FULL=y to allow the disabling of the 'full'
(robustness checked) refcount_t implementation with slightly lower
runtime overhead. (Kees Cook)
The lighter weight variant is the default. The two variants use the
same API. Having this variant was a precondition by some
maintainers to merge refcount_t cleanups.
- Add lockdep support for rtmutexes (Peter Zijlstra)
- liblockdep fixes and improvements (Sasha Levin, Ben Hutchings)
- ... misc fixes and improvements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
locking/refcount: Remove the half-implemented refcount_sub() API
locking/refcount: Create unchecked atomic_t implementation
locking/rtmutex: Don't initialize lockdep when not required
locking/selftest: Add RT-mutex support
locking/selftest: Remove the bad unlock ordering test
rt_mutex: Add lockdep annotations
MAINTAINERS: Claim atomic*_t maintainership
locking/x86: Remove the unused atomic_inc_short() methd
tools/lib/lockdep: Remove private kernel headers
tools/lib/lockdep: Hide liblockdep output from test results
tools/lib/lockdep: Add dummy current_gfp_context()
tools/include: Add IS_ERR_OR_NULL to err.h
tools/lib/lockdep: Add empty __is_[module,kernel]_percpu_address
tools/lib/lockdep: Include err.h
tools/include: Add (mostly) empty include/linux/sched/mm.h
tools/lib/lockdep: Use LDFLAGS
tools/lib/lockdep: Remove double-quotes from soname
tools/lib/lockdep: Fix object file paths used in an out-of-tree build
tools/lib/lockdep: Fix compilation for 4.11
tools/lib/lockdep: Don't mix fd-based and stream IO
...
This enables an important dmesg notification about when drivers have
used the crng without it being seeded first. Prior, these errors would
occur silently, and so there hasn't been a great way of diagnosing these
types of bugs for obscure setups. By adding this as a config option, we
can leave it on by default, so that we learn where these issues happen,
in the field, will still allowing some people to turn it off, if they
really know what they're doing and do not want the log entries.
However, we don't leave it _completely_ by default. An earlier version
of this patch simply had `default y`. I'd really love that, but it turns
out, this problem with unseeded randomness being used is really quite
present and is going to take a long time to fix. Thus, as a compromise
between log-messages-for-all and nobody-knows, this is `default y`,
except it is also `depends on DEBUG_KERNEL`. This will ensure that the
curious see the messages while others don't have to.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
RCU's debugging Kconfig options are in the unintuitive location
lib/Kconfig.debug, and there are enough of them that it would be good for
them to be more centralized. This commit therefore extracts RCU's Kconfig
options from init/Kconfig into a new kernel/rcu/Kconfig.debug file.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
RCU's debugfs tracing used to be the only reasonable low-level debug
information available, but ftrace and event tracing has since surpassed
the RCU debugfs level of usefulness. This commit therefore removes
RCU's debugfs tracing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The sparse-based checking for non-RCU accesses to RCU-protected pointers
has been around for a very long time, and it is now the only type of
sparse-based checking that is optional. This commit therefore makes
it unconditional.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
The PROVE_RCU_REPEATEDLY Kconfig option was initially added due to
the volume of messages from PROVE_RCU: Doing just one per boot would
have required excessive numbers of boots to locate them all. However,
PROVE_RCU messages are now relatively rare, so there is no longer any
reason to need more than one such message per boot. This commit therefore
removes the PROVE_RCU_REPEATEDLY Kconfig option.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
The RCU_TORTURE_TEST_SLOW_PREINIT, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY,
RCU_TORTURE_TEST_SLOW_PREINIT_DELAY, RCU_TORTURE_TEST_SLOW_INIT,
RCU_TORTURE_TEST_SLOW_INIT_DELAY, RCU_TORTURE_TEST_SLOW_CLEANUP,
and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY Kconfig options are only
useful for torture testing, and there are the rcutree.gp_cleanup_delay,
rcutree.gp_init_delay, and rcutree.gp_preinit_delay kernel boot parameters
that rcutorture can use instead. The effect of these parameters is to
artificially slow down grace period initialization and cleanup in order
to make some types of race conditions happen more often.
This commit therefore simplifies Tree RCU a bit by removing the Kconfig
options and adding the corresponding kernel parameters to rcutorture's
.boot files instead. However, this commit also leaves out the kernel
parameters for TREE02, TREE04, and TREE07 in order to have about the
same number of tests slowed as not slowed. TREE01, TREE03, TREE05,
and TREE06 are slowed, and the rest are not slowed.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that (PI) futexes have their own private RT-mutex interface and
implementation we can easily add lockdep annotations to the existing
RT-mutex interface.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The filesystem documentation was moved from DocBook to
Documentation/filesystems/. Update it at the sources.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Extract the linked list sorting test code into its own source file, to
allow to compile it either to a loadable module, or builtin into the
kernel.
Link: http://lkml.kernel.org/r/1488287219-15832-4-git-send-email-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow to compile the array-based sort test code either to a loadable
module, or builtin into the kernel.
Link: http://lkml.kernel.org/r/1488287219-15832-3-git-send-email-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
guide for user-space API documents, rather sparsely populated at the
moment, but it's a start. Markus improved the infrastructure for
converting diagrams. Mauro has converted much of the USB documentation
over to RST. Plus the usual set of fixes, improvements, and tweaks.
There's a bit more than the usual amount of reaching out of Documentation/
to fix comments elsewhere in the tree; I have acks for those where I could
get them.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZB1elAAoJEI3ONVYwIuV6wUIQAJSM/4rNdj6z+GXeWhRfbsOo
vqqVYluvXQIJaaqdsy9dgcfThhOXWYsPyVF6Xd+bDJpwF3BMZYbX1CI1Mo3kRD+7
9+Pf68cYSHRoU3l/sFI8q0zfKbHtmFteIvnRQoFtRaExqgTR8glUfxNDyN9XuNAZ
3naS4qMZivM4gjMcSpIB/wFOQpV+6qVIs6VTFLdCC8wodT3W/Wmb+bqrCVJ0twbB
t8jJeYHt2wsiTdqrKU+VilAUAZ1Lby+DNfeWrO18rC1ohktPyUzOGg8JmTKUBpVO
qj1OJwD6abuaNh/J9bXsh8u0OrVrBKWjVrhq9IFYDlm92fu3Bgr6YeoaVPEpcklt
jdlgZnWs9/oXa6d32aMc9F7mP9a0Q1qikFTYINhaHQZCb4VDRuQ9hCSuqWm5jlVy
lmVAoxLa0zSdOoXaYuO3HC99ku1cIn814CXMDz/IwKXkqUCV+zl+H3AGkvxGyQ5M
eblw2TnQnc6e1LRcxt5bgpFR1JYMbCJhu0U5XrNFueQV8ReB15dvL7h4y21dWJKF
2Sr83rwfG1rpZQiSqCjOXxIzuXbEGH3+a+zCDV5IHhQRt/VNDOt2hgmcyucSSJ5h
5GRFYgTlGvoT/6LdIT39QooHB+4tSDRtEQ6lh0q2ZtVV2rfG/I6/PR5sUbWM65SN
vAfctRm2afHLhdonSX5O
=41m+
-----END PGP SIGNATURE-----
Merge tag 'docs-4.12' of git://git.lwn.net/linux
Pull documentation update from Jonathan Corbet:
"A reasonably busy cycle for documentation this time around. There is a
new guide for user-space API documents, rather sparsely populated at
the moment, but it's a start. Markus improved the infrastructure for
converting diagrams. Mauro has converted much of the USB documentation
over to RST. Plus the usual set of fixes, improvements, and tweaks.
There's a bit more than the usual amount of reaching out of
Documentation/ to fix comments elsewhere in the tree; I have acks for
those where I could get them"
* tag 'docs-4.12' of git://git.lwn.net/linux: (74 commits)
docs: Fix a couple typos
docs: Fix a spelling error in vfio-mediated-device.txt
docs: Fix a spelling error in ioctl-number.txt
MAINTAINERS: update file entry for HSI subsystem
Documentation: allow installing man pages to a user defined directory
Doc/PM: Sync with intel_powerclamp code behavior
zr364xx.rst: usb/devices is now at /sys/kernel/debug/
usb.rst: move documentation from proc_usb_info.txt to USB ReST book
convert philips.txt to ReST and add to media docs
docs-rst: usb: update old usbfs-related documentation
arm: Documentation: update a path name
docs: process/4.Coding.rst: Fix a couple of document refs
docs-rst: fix usb cross-references
usb: gadget.h: be consistent at kernel doc macros
usb: composite.h: fix two warnings when building docs
usb: get rid of some ReST doc build errors
usb.rst: get rid of some Sphinx errors
usb/URB.txt: convert to ReST and update it
usb/persist.txt: convert to ReST and add to driver-api book
usb/hotplug.txt: convert to ReST and add to driver-api book
...
AVR32 architecture has been removed from the Linux kernel sources, hence
clean up the architecture related symbols in lib/Kconfig.debug.
Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the
kernel text, data, and bss fit in the required 32MB limit, but this
option is not set for every config that enables lockdep.
A 4.10 kernel fails to boot with the console output
Kernel: Using 8 locked TLB entries for main kernel image.
hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f
Program terminated
with these config options
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
CONFIG_PROVE_LOCKING=n
To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and
enable this option with CONFIG_LOCKDEP=y so we get the reduced memory
usage every time lockdep is turned on.
Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if
CONFIG_LOCKDEP is set to 'y'. When other lockdep-related config options
that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or
CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also
enabled.
Fixes: e6b5f1be7a ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bunch of documentation files have moved, correct the paths.
Signed-off-by: Hans Holmberg <hans@pixelmunchies.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull locking fixes from Ingo Molnar:
"The main change is the uninlining of large refcount_t APIs, plus a
header dependency fix.
Note that the uninlining allowed us to enable the underflow/overflow
warnings unconditionally and remove the debug Kconfig switch: this
might trigger new warnings in buggy code and turn
crashes/use-after-free bugs into less harmful memory leaks"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/refcounts: Add missing kernel.h header to have UINT_MAX defined
locking/refcounts: Out-of-line everything
Along with the addition made to Kconfig.debug, the prior existing but
permanently disabled test function has been slightly refactored.
Patch has been tested using QEMU 2.1.2 with a .config obtained through
'make defconfig' (x86_64) and manually enabling the option.
[arnd@arndb.de: move sort self-test into a separate file]
Link: http://lkml.kernel.org/r/20170112110657.3123790-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/HE1PR09MB0394B0418D504DCD27167D4FD49B0@HE1PR09MB0394.eurprd09.prod.outlook.com
Signed-off-by: Kostenzer Felix <fkostenzer@live.at>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow to compile the atomic64 test code either to a loadable module, or
builtin into the kernel.
Link: http://lkml.kernel.org/r/1483470276-10517-3-git-send-email-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus asked to please make this real C code.
And since size then isn't an issue what so ever anymore, remove the
debug knob and make all WARN()s unconditional.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dwindsor@gmail.com
Cc: elena.reshetova@intel.com
Cc: gregkh@linuxfoundation.org
Cc: ishkamiel@gmail.com
Cc: keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Here is the big tty/serial driver patchset for 4.11-rc1.
Not much here, but a lot of little fixes and individual serial driver
updates all over the subsystem. Majority are for the sh-sci driver and
platform (the arch-specific changes have acks from the maintainer).
The start of the "serial bus" code is here as well, but nothing is
converted to use it yet. That work is still ongoing, hopefully will
start to show up across different subsystems for 4.12 (bluetooth is one
major place that will be used.)
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWK2lDg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylwfwCgyExa4x8Lur2nZyu8cgcaVRU68VAAoNQh0WJt
EZdEhEkRMt4d64j+ApYI
=iv7x
-----END PGP SIGNATURE-----
Merge tag 'tty-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big tty/serial driver patchset for 4.11-rc1.
Not much here, but a lot of little fixes and individual serial driver
updates all over the subsystem. Majority are for the sh-sci driver and
platform (the arch-specific changes have acks from the maintainer).
The start of the "serial bus" code is here as well, but nothing is
converted to use it yet. That work is still ongoing, hopefully will
start to show up across different subsystems for 4.12 (bluetooth is
one major place that will be used.)
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (109 commits)
tty: pl011: Work around QDF2400 E44 stuck BUSY bit
atmel_serial: Use the fractional divider when possible
tty: Remove extra include in HVC console tty framework
serial: exar: Enable MSI support
serial: exar: Move register defines from uapi header to consumer site
serial: pci: Remove unused pci_boards entries
serial: exar: Move Commtech adapters to 8250_exar as well
serial: exar: Fix feature control register constants
serial: exar: Fix initialization of EXAR registers for ports > 0
serial: exar: Fix mapping of port I/O resources
serial: sh-sci: fix hardware RX trigger level setting
tty/serial: atmel: ensure state is restored after suspending
serial: 8250_dw: Avoid "too much work" from bogus rx timeout interrupt
serdev: ttyport: check whether tty_init_dev() fails
serial: 8250_pci: make pciserial_detach_ports() static
ARM: dts: STiH410-b2260: Enable HW flow-control
ARM: dts: STiH407-family: Use new Pinctrl groups
ARM: dts: STiH407-pinctrl: Add Pinctrl group for HW flow-control
ARM: dts: STiH410-b2260: Identify the UART RTS line
dt-bindings: serial: Update 'uart-has-rtscts' description
...
- Errata workarounds for Qualcomm's Falkor CPU
- Qualcomm L2 Cache PMU driver
- Qualcomm SMCCC firmware quirk
- Support for DEBUG_VIRTUAL
- CPU feature detection for userspace via MRS emulation
- Preliminary work for the Statistical Profiling Extension
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJYpIxqAAoJELescNyEwWM0xdwH/AsTYAXPZDMdRnrQUyV0Fd2H
/9pMzww6dHXEmCMKkImf++otUD6S+gTCJTsj7kEAXT5sZzLk27std5lsW7R9oPjc
bGQMalZy+ovLR1gJ6v072seM3In4xph/qAYOpD8Q0AfYCLHjfMMArQfoLa8Esgru
eSsrAgzVAkrK7XHi3sYycUjr9Hac9tvOOuQ3SaZkDz4MfFIbI4b43+c1SCF7wgT9
tQUHLhhxzGmgxjViI2lLYZuBWsIWsE+algvOe1qocvA9JEIXF+W8NeOuCjdL8WwX
3aoqYClC+qD/9+/skShFv5gM5fo0/IweLTUNIHADXpB6OkCYDyg+sxNM+xnEWQU=
=YrPg
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
- Errata workarounds for Qualcomm's Falkor CPU
- Qualcomm L2 Cache PMU driver
- Qualcomm SMCCC firmware quirk
- Support for DEBUG_VIRTUAL
- CPU feature detection for userspace via MRS emulation
- Preliminary work for the Statistical Profiling Extension
- Misc cleanups and non-critical fixes
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits)
arm64/kprobes: consistently handle MRS/MSR with XZR
arm64: cpufeature: correctly handle MRS to XZR
arm64: traps: correctly handle MRS/MSR with XZR
arm64: ptrace: add XZR-safe regs accessors
arm64: include asm/assembler.h in entry-ftrace.S
arm64: fix warning about swapper_pg_dir overflow
arm64: Work around Falkor erratum 1003
arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2
arm64: arch_timer: document Hisilicon erratum 161010101
arm64: use is_vmalloc_addr
arm64: use linux/sizes.h for constants
arm64: uaccess: consistently check object sizes
perf: add qcom l2 cache perf events driver
arm64: remove wrong CONFIG_PROC_SYSCTL ifdef
ARM: smccc: Update HVC comment to describe new quirk parameter
arm64: do not trace atomic operations
ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device()
ACPI/IORT: Fix iort_node_get_id() mapping entries indexing
arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
perf: xgene: Include module.h
...
Pull networking updates from David Miller:
"Highlights:
1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
Varadhan.
2) Simplify classifier state on sk_buff in order to shrink it a bit.
From Willem de Bruijn.
3) Introduce SIPHASH and it's usage for secure sequence numbers and
syncookies. From Jason A. Donenfeld.
4) Reduce CPU usage for ICMP replies we are going to limit or
suppress, from Jesper Dangaard Brouer.
5) Introduce Shared Memory Communications socket layer, from Ursula
Braun.
6) Add RACK loss detection and allow it to actually trigger fast
recovery instead of just assisting after other algorithms have
triggered it. From Yuchung Cheng.
7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
9) Export MPLS packet stats via netlink, from Robert Shearman.
10) Significantly improve inet port bind conflict handling, especially
when an application is restarted and changes it's setting of
reuseport. From Josef Bacik.
11) Implement TX batching in vhost_net, from Jason Wang.
12) Extend the dummy device so that VF (virtual function) features,
such as configuration, can be more easily tested. From Phil
Sutter.
13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
Dumazet.
14) Add new bpf MAP, implementing a longest prefix match trie. From
Daniel Mack.
15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
16) Add new aquantia driver, from David VomLehn.
17) Add bpf tracepoints, from Daniel Borkmann.
18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
Florian Fainelli.
19) Remove custom busy polling in many drivers, it is done in the core
networking since 4.5 times. From Eric Dumazet.
20) Support XDP adjust_head in virtio_net, from John Fastabend.
21) Fix several major holes in neighbour entry confirmation, from
Julian Anastasov.
22) Add XDP support to bnxt_en driver, from Michael Chan.
23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
25) Support GRO in IPSEC protocols, from Steffen Klassert"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
Revert "ath10k: Search SMBIOS for OEM board file extension"
net: socket: fix recvmmsg not returning error from sock_error
bnxt_en: use eth_hw_addr_random()
bpf: fix unlocking of jited image when module ronx not set
arch: add ARCH_HAS_SET_MEMORY config
net: napi_watchdog() can use napi_schedule_irqoff()
tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
net/hsr: use eth_hw_addr_random()
net: mvpp2: enable building on 64-bit platforms
net: mvpp2: switch to build_skb() in the RX path
net: mvpp2: simplify MVPP2_PRS_RI_* definitions
net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
net: mvpp2: remove unused register definitions
net: mvpp2: simplify mvpp2_bm_bufs_add()
net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
net: mvpp2: release reference to txq_cpu[] entry after unmapping
net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
...
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- Implement wraparound-safe refcount_t and kref_t types based on
generic atomic primitives (Peter Zijlstra)
- Improve and fix the ww_mutex code (Nicolai Hähnle)
- Add self-tests to the ww_mutex code (Chris Wilson)
- Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
Bueso)
- Micro-optimize the current-task logic all around the core kernel
(Davidlohr Bueso)
- Tidy up after recent optimizations: remove stale code and APIs,
clean up the code (Waiman Long)
- ... plus misc fixes, updates and cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
fork: Fix task_struct alignment
locking/spinlock/debug: Remove spinlock lockup detection code
lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
lkdtm: Convert to refcount_t testing
kref: Implement 'struct kref' using refcount_t
refcount_t: Introduce a special purpose refcount type
sched/wake_q: Clarify queue reinit comment
sched/wait, rcuwait: Fix typo in comment
locking/mutex: Fix lockdep_assert_held() fail
locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
locking/rwsem: Reinit wake_q after use
locking/rwsem: Remove unnecessary atomic_long_t casts
jump_labels: Move header guard #endif down where it belongs
locking/atomic, kref: Implement kref_put_lock()
locking/ww_mutex: Turn off __must_check for now
locking/atomic, kref: Avoid more abuse
locking/atomic, kref: Use kref_get_unless_zero() more
locking/atomic, kref: Kill kref_sub()
locking/atomic, kref: Add kref_read()
locking/atomic, kref: Add KREF_INIT()
...
Pull RCU updates from Ingo Molnar:
"The RCU changes in this cycle are:
- Dynticks updates, consolidating open-coded counter accesses into a
well-defined API
- SRCU updates: Simplify algorithm, add formal verification
- Documentation updates
- Miscellaneous fixes
- Torture-test updates
Most of the diffstat comes from the relatively large documentation
update"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
srcu: Reduce probability of SRCU ->unlock_count[] counter overflow
rcutorture: Add CBMC-based formal verification for SRCU
srcu: Force full grace-period ordering
srcu: Implement more-efficient reader counts
rcu: Adjust FQS offline checks for exact online-CPU detection
rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead
rcu: Abstract extended quiescent state determination
rcu: Abstract dynticks extended quiescent state enter/exit operations
rcu: Add lockdep checks to synchronous expedited primitives
rcu: Eliminate unused expedited_normal counter
llist: Clarify comments about when locking is needed
rcu: Fix comment in rcu_organize_nocb_kthreads()
rcu: Enable RCU tracepoints by default to aid in debugging
rcu: Make rcu_cpu_starting() use its "cpu" argument
rcu: Add comment headers to expedited-grace-period counter functions
rcu: Don't wake rcuc/X kthreads on NOCB CPUs
rcu: Re-enable TASKS_RCU for User Mode Linux
rcu: Once again use NMI-based stack traces in stall warnings
rcu: Remove short-term CPU kicking
rcu: Add long-term CPU kicking
...
Currently CONFIG_TIMER_STATS exposes process information across namespaces:
kernel/time/timer_list.c print_timer():
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
/proc/timer_list:
#11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Xing Gao <xgao01@email.wm.edu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jessica Frazelle <me@jessfraz.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Provide refcount_t, an atomic_t like primitive built just for
refcounting.
It provides saturation semantics such that overflow becomes impossible
and thereby 'spurious' use-after-free is avoided.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This introduces a infrastructure for management of linear priority
areas. Priority order in an array matters, however order of items inside
a priority group does not matter.
As an initial implementation, L-sort algorithm is used. It is quite
trivial. More advanced algorithm called P-sort will be introduced as a
follow-up. The infrastructure is prepared for other algos.
Alongside this, a testing module is introduced as well.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While debugging a performance issue I needed to understand why
RCU sofitrqs were firing so frequently.
Unfortunately, the RCU callback tracepoints are hidden behind
CONFIG_RCU_TRACE which defaults to off in the upstream kernel and is
likely to also be disabled in enterprise distribution configs.
Enable it by default for CONFIG_TREE_RCU. However, we must keep it
disabled for tiny RCU, because it would otherwise pull in a large
amount of code that would make tiny RCU less than tiny.
I ran some file system metadata intensive workloads (git checkout,
FS-Mark) on a variety of machines with this patch and saw no
detectable change in performance.
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
many embedded boards have a disconnected TTL level serial which can
generate some garbage that can lead to spurious false sysrq detects.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
DEBUG_VIRTUAL currently depends on DEBUG_KERNEL && X86. arm64 is getting
the same support. Rather than add a list of architectures, switch this
to ARCH_HAS_DEBUG_VIRTUAL and let architectures select it as
appropriate.
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The build of frv allmodconfig was failing with the errors like:
/tmp/cc0JSPc3.s: Assembler messages:
/tmp/cc0JSPc3.s:1839: Error: symbol `.LSLT0' is already defined
/tmp/cc0JSPc3.s:1842: Error: symbol `.LASLTP0' is already defined
/tmp/cc0JSPc3.s:1969: Error: symbol `.LELTP0' is already defined
/tmp/cc0JSPc3.s:1970: Error: symbol `.LELT0' is already defined
Commit 866ced950b ("kbuild: Support split debug info v4") introduced
splitting the debug info and keeping that in a separate file. Somehow,
the frv-linux gcc did not like that and I am guessing that instead of
splitting it started copying. The first report about this is at:
https://lists.01.org/pipermail/kbuild-all/2015-July/010527.html.
I will try and see if this can work with frv and if still fails I will
open a bug report with gcc. But meanwhile this is the easiest option to
solve build failure of frv.
Fixes: 866ced950b ("kbuild: Support split debug info v4")
Link: http://lkml.kernel.org/r/1482062348-5352-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function, or as a
general PRF for short input use cases, such as sequence numbers or RNG
chaining.
For the first usage:
There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then preceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector. Currently
hashtables use jhash, which is fast but not secure, and some kind of
rotating key scheme (or none at all, which isn't good). SipHash is meant
as a replacement for jhash in these cases.
There are a modicum of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.
While SipHash is extremely fast for a cryptographically secure function,
it is likely a bit slower than the insecure jhash, and so replacements
will be evaluated on a case-by-case basis based on whether or not the
difference in speed is negligible and whether or not the current jhash usage
poses a real security risk.
For the second usage:
A few places in the kernel are using MD5 or SHA1 for creating secure
sequence numbers, syn cookies, port numbers, or fast random numbers.
SipHash is a faster and more fitting, and more secure replacement for MD5
in those situations. Replacing MD5 and SHA1 with SipHash for these uses is
obvious and straight-forward, and so is submitted along with this patch
series. There shouldn't be much of a debate over its efficacy.
Dozens of languages are already using this internally for their hash
tables and PRFs. Some of the BSDs already use this in their kernels.
SipHash is a widely known high-speed solution to a widely known set of
problems, and it's time we catch-up.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hotcpu_notifier(), cpu_notifier(), __hotcpu_notifier(), __cpu_notifier(),
register_hotcpu_notifier(), register_cpu_notifier(),
__register_hotcpu_notifier(), __register_cpu_notifier(),
unregister_hotcpu_notifier(), unregister_cpu_notifier(),
__unregister_hotcpu_notifier(), __unregister_cpu_notifier()
are unused now. Remove them and all related code.
Remove also the now pointless cpu notifier error injection mechanism. The
states can be executed step by step and error rollback is the same as cpu
down, so any state transition can be tested w/o requiring the notifier
error injection.
Some CPU hotplug states are kept as they are (ab)used for hotplug state
tracking.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161221192112.005642358@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Documentation on development tools was moved to Documentation/devl-tools
and sphinxified (renamed from .txt to .rst).
References in lib/Kconfig.debug need to be updated to the new location.
Link: http://lkml.kernel.org/r/1476698152-29340-2-git-send-email-andreas.platschek@opentech.at
Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's another busy cycle for the docs tree, as the sphinx conversion
continues. Highlights include:
- Further work on PDF output, which remains a bit of a pain but should be
more solid now.
- Five more DocBook template files converted to Sphinx. Only 27 to go...
Lots of plain-text files have also been converted and integrated.
- Images in binary formats have been replaced with more source-friendly
versions.
- Various bits of organizational work, including the renaming of various
files discussed at the kernel summit.
- New documentation for the device_link mechanism.
...and, of course, lots of typo fixes and small updates.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYTbl7AAoJEI3ONVYwIuV63NIP/REwzThnGWFJMRSuq8Ieq2r9
sFSQsaGTGlhyKiDoEooo+SO/Za3uTonjK+e7WZg8mhdiEdamta5aociU/71C1Yy/
T9ur0FhcGblrvZ1NidSDvCLwuECZOMMei7mgLZ9a+KCpc4ANqqTVZSUm1blKcqhF
XelhVXxBa0ar35l/pVzyCxkdNXRWXv+MJZE8hp5XAdTdr11DS7UY9zrZdH31axtf
BZlbYJrvB8WPydU6myTjRpirA17Hu7uU64MsL3bNIEiRQ+nVghEzQC8uxeUCvfVx
r0H5AgGGQeir+e8GEv2T20SPZ+dumXs+y/HehKNb3jS3gV0mo+pKPeUhwLIxr+Zh
QY64gf+jYf5ISHwAJRnU0Ima72ehObzSbx9Dko10nhq2OvbR5f83gjz9t9jKYFU7
RDowICA8lwqyRbHRoVfyoW8CpVhWFpMFu3yNeJMckeTish3m7ANqzaWslbsqIP5G
zxgFMIrVVSbeae+sUeygtEJAnWI09aZ4tuaUXYtGWwu6ikC/3aV6DryP4bthG2LF
A19uV4nMrLuuh8g2wiTHHjMfjYRwvSn+f9yaolwJhwyNDXQzRPy+ZJ3W/6olOkXC
bAxTmVRCW5GA/fmSrfXmW1KbnxlWfP2C62hzZQ09UHxzTHdR97oFLDQdZhKo1uwf
pmSJR0hVeRUmA4uw6+Su
=A0EV
-----END PGP SIGNATURE-----
Merge tag 'docs-4.10' of git://git.lwn.net/linux
Pull documentation update from Jonathan Corbet:
"These are the documentation changes for 4.10.
It's another busy cycle for the docs tree, as the sphinx conversion
continues. Highlights include:
- Further work on PDF output, which remains a bit of a pain but
should be more solid now.
- Five more DocBook template files converted to Sphinx. Only 27 to
go... Lots of plain-text files have also been converted and
integrated.
- Images in binary formats have been replaced with more
source-friendly versions.
- Various bits of organizational work, including the renaming of
various files discussed at the kernel summit.
- New documentation for the device_link mechanism.
... and, of course, lots of typo fixes and small updates"
* tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits)
dma-buf: Extract dma-buf.rst
Update Documentation/00-INDEX
docs: 00-INDEX: document directories/files with no docs
docs: 00-INDEX: remove non-existing entries
docs: 00-INDEX: add missing entries for documentation files/dirs
docs: 00-INDEX: consolidate process/ and admin-guide/ description
scripts: add a script to check if Documentation/00-INDEX is sane
Docs: change sh -> awk in REPORTING-BUGS
Documentation/core-api/device_link: Add initial documentation
core-api: remove an unexpected unident
ppc/idle: Add documentation for powersave=off
Doc: Correct typo, "Introdution" => "Introduction"
Documentation/atomic_ops.txt: convert to ReST markup
Documentation/local_ops.txt: convert to ReST markup
Documentation/assoc_array.txt: convert to ReST markup
docs-rst: parse-headers.pl: cleanup the documentation
docs-rst: fix media cleandocs target
docs-rst: media/Makefile: reorganize the rules
docs-rst: media: build SVG from graphviz files
docs-rst: replace bayer.png by a SVG image
...
With CONFIG_DEVMEM not set, CONFIG_STRICT_DEVMEM will be useless even if
it is set =y, thus let's update the dependency in Kconfig.
Link: http://lkml.kernel.org/r/20161006051217.GA31027@dhcp-128-65.nay.redhat.com
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a configuration option to set the default console loglevel. This
is, as before, still possible to override at runtime through bootargs
(loglevel=<x>), sysrq and /proc/printk.
There are cases where adding additional arguments on the commandline is
impractical, and changing the default for the kernel when being built
makes more sense. Provide such a method here, for those who choose to
do so.
Also, while touching this code, clarify the difference between
MESSAGE_LOGLEVEL_DEFAULT and CONSOLE_LOGLEVEL_DEFAULT.
Link: http://lkml.kernel.org/r/1479676829-30031-1-git-send-email-olof@lixom.net
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull RCU updates from Paul E. McKenney:
- Documentation updates, yet again just simple changes.
- Miscellaneous fixes, including a change to call_rcu()'s
rcu_head alignment check.
- Security-motivated list consistency checks, which are
disabled by default behind DEBUG_LIST.
- Torture-test updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYHmoCAAoJEHm+PkMAQRiG7RMIAI2i7Y5hpL5yCxK5AFaL4u/G
KxXfp1B1UanUTgjOmd7zGqtDYcFX9t7GTTUFixQ7/9Opr4PD9qbnatoDGSc3xjbT
msDgA1B78F1/Q3kHWfeGq32MihQ4mj5NwUCo+igUcUvvWG7mHgzErj/Nh5RoobQX
p/izdpTbrw3GX6xXB8olbG7XWHaVye/+TT3q6+gmgm8I/QEujcLeGoycE0zlhPN8
FG/JX76At/+ZM2Py7Oxo3k+oKL9CHrtOQYDp/wN0uslV5eYvvkZz0/M1HMOGZt+c
gZU5jzM17K7C4Nzo06WAuBU9wUBGc25m+cPicLlOmljnzfU+f50SKaDjZq3p7QI=
=2KUF
-----END PGP SIGNATURE-----
Merge tag 'v4.9-rc4' into sound
Bring in -rc4 patches so I can successfully merge the sound doc changes.
This new config parameter limits the space used for "Lock debugging:
prove locking correctness" by about 4MB. The current sparc systems have
the limitation of 32MB size for kernel size including .text, .data and
.bss sections. With PROVE_LOCKING feature, the kernel size could grow
beyond this limit and causing system boot-up issues. With this option,
kernel limits the size of the entries of lock_chains, stack_trace etc.,
so that kernel fits in required size limit. This is not visible to user
and only used for sparc.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit brings the RCU_TRACE Kconfig option's help text up to date
by noting that it enables additional event tracing as well as debugfs.
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
[ paulmck: Do some wordsmithing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The CONFIG_DEBUG_LIST option is normally meant for kernel developers
rather than production machines and is guarded by CONFIG_DEBUG_KERNEL.
In contrast, the newly added CONFIG_BUG_ON_DATA_CORRUPTION is meant
for security hardening and may be used on systems that intentionally
do not enable CONFIG_DEBUG_KERNEL. In this configuration, we get
a warning from Kconfig about the mismatched dependencies:
warning: (BUG_ON_DATA_CORRUPTION) selects DEBUG_LIST which has unmet direct dependencies (DEBUG_KERNEL)
This annotates the DEBUG_LIST option to be selectable by
BUG_ON_DATA_CORRUPTION when DEBUG_KERNEL is disabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 40cd725cfc7f ("bug: Provide toggle for BUG on data corruption")
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Commit 484f29c7430b3 ("bug: Provide toggle for BUG on data corruption")
added a Kconfig select statement on CONFIG_DEBUG_LIST, but the CONFIG_
prefix is only used in Make and C(PP) syntax. Remove the CONFIG_ prefix
to correctly select the Kconfig option DEBUG_LIST.
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
The kernel checks for cases of data structure corruption under some
CONFIGs (e.g. CONFIG_DEBUG_LIST). When corruption is detected, some
systems may want to BUG() immediately instead of letting the system run
with known corruption. Usually these kinds of manipulation primitives can
be used by security flaws to gain arbitrary memory write control. This
provides a new config CONFIG_BUG_ON_DATA_CORRUPTION and a corresponding
macro CHECK_DATA_CORRUPTION for handling these situations. Notably, even
if not BUGing, the kernel should not continue processing the corrupted
structure.
This is inspired by similar hardening by Syed Rameez Mustafa in MSM
kernels, and in PaX and Grsecurity, which is likely in response to earlier
removal of the BUG calls in commit 924d9addb9 ("list debugging: use
WARN() instead of BUG()").
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
When building with the latent_entropy plugin, set the default
CONFIG_FRAME_WARN to 2048, since some __init functions have many basic
blocks that, when instrumented by the latent_entropy plugin, grow beyond
1024 byte stack size on 32-bit builds.
Link: http://lkml.kernel.org/r/20161018211216.GA39687@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The previous patch renamed several files that are cross-referenced
along the Kernel documentation. Adjust the links to point to
the right places.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This update consists of:
- Fixes and improvements to existing tests
- Moving code from Documentation to selftests, samples, and tools.
Moves dnotify_test, prctl, ptp, vDSO, ia64, watchdog, and networking
tests from Documentation to selftests.
Moves mic/mpssd, misc-devices/mei, timers, watchdog, auxdisplay, and
blackfin examples from Documentation to samples.
Moves accounting, laptops/dslm, and pcmcia/crc32hash tools from
Documentation to tools.
Deletes BUILD_DOCSRC and its dependencies.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX/6zUAAoJEAsCRMQNDUMczIEP/0kH+yjJ3El4GYIokspR1/UU
++sy4XMzrD1UPy90v+ftcg4ss5R80r0v7EZ59k1UjDJSZ6WATHHGoZKCS2Dy3xcq
i+0vm7Bawh7YWrXD3TunwaL97lwb2DdVTSxRXuU4Hfv+oVynUfh/+ZlCH6RCM2nm
ZJE5PDYiq4nTVSRqFB2FyRE6yay5dPvpQ2ArwnSEw+ku4C+ZdGTGCWzS+aZBwZM/
ykePkGLVRXz9FsWTCmipJzYu0Z/M4xEGlfXQZiiLG2HicbJNP6AqJImbQrANm+TW
RFigYpofdhr9XG5TKTLIudaRt9qB6BE0mYEApZXH8U7NrHElfO9BBMEwzajl0V/2
q/r5iej/CJult3zsfkhdHo7GLXpOaDLyoXiUI6UTgL0XOdWLAWTqDYx4JJz9sXxp
B9dwKJeP5HLipk6FMkAHgJM90JKQFd/nLDKxeWexbMu/b/yQ2C9AR7NpdQ+c1X7I
8W8UNEi/fnK75+r4t3NfeD2/5boq/jwujSKEMDQm/3R8L8EFYYb/TRoujFn89Na3
wbZLV3hBL+KQ5lRyIx7X8RKyVJv1nlo9Wh57ItJed6zvGp5EmsI8w+DER2RfbO2c
HR2JPDKSxmU8O2WBfDW5QoiPQH8Lssd147Ir0UFE7mwBXgWWsmxJxDpufizAXwyJ
qnELJ9X3UFIdydtoObLr
=60kH
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.9-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This update consists of:
- Fixes and improvements to existing tests
- Moving code from Documentation to selftests, samples, and tools:
* Moves dnotify_test, prctl, ptp, vDSO, ia64, watchdog, and
networking tests from Documentation to selftests.
* Moves mic/mpssd, misc-devices/mei, timers, watchdog, auxdisplay,
and blackfin examples from Documentation to samples.
* Moves accounting, laptops/dslm, and pcmcia/crc32hash tools from
Documentation to tools.
* Deletes BUILD_DOCSRC and its dependencies"
* tag 'linux-kselftest-4.9-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (21 commits)
selftests/futex: Check ANSI terminal color support
Doc: update 00-INDEX files to reflect the runnable code move
samples: move blackfin gptimers-example from Documentation
tools: move pcmcia crc32hash tool from Documentation
tools: move laptops dslm tool from Documentation
tools: move accounting tool from Documentation
samples: move auxdisplay example code from Documentation
samples: move watchdog example code from Documentation
samples: move timers example code from Documentation
samples: move misc-devices/mei example code from Documentation
samples: move mic/mpssd example code from Documentation
selftests: Move networking/timestamping from Documentation
selftests: move watchdog tests from Documentation/watchdog
selftests: move ia64 tests from Documentation/ia64
selftests: move vDSO tests from Documentation/vDSO
selftests: move ptp tests from Documentation/ptp
selftests: move prctl tests from Documentation/prctl
selftests: move dnotify_test from Documentation/filesystems
selftests/timers: Add missing error code assignment before test
selftests/zram: replace ZRAM_LZ4_COMPRESS
...
Move blackfin gptimers-example to samples and remove it from Documentation
Makefile. Update samples Kconfig and Makefile to build gptimers-example.
blackfin is the last CONFIG_BUILD_DOCSRC target in Documentation/Makefile.
Hence this patch also includes changes to remove CONFIG_BUILD_DOCSRC from
Makefile and lib/Kconfig.debug and updates VIDEO_PCI_SKELETON dependency
on BUILD_DOCSRC.
Documentation/Makefile is not deleted to avoid braking make htmldocs and
make distclean.
Acked-by: Michal Marek <mmarek@suse.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Fix the indefinitiley -> indefinitely typo in Kconfig.debug.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160922205513.17821-1-vivien.didelot@savoirfairelinux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:
1) "copy_from_user() buffer size is too small" compile warning/error
This is a static warning which happens when object size and copy size
are both const, and copy size > object size. I didn't see any false
positives for this one. So the function warning attribute seems to
be working fine here.
Note this scenario is always a bug and so I think it should be
changed to *always* be an error, regardless of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.
2) "copy_from_user() buffer size is not provably correct" compile warning
This is another static warning which happens when I enable
__compiletime_object_size() for new compilers (and
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size
is const, but copy size is *not*. In this case there's no way to
compare the two at build time, so it gives the warning. (Note the
warning is a byproduct of the fact that gcc has no way of knowing
whether the overflow function will be called, so the call isn't dead
code and the warning attribute is activated.)
So this warning seems to only indicate "this is an unusual pattern,
maybe you should check it out" rather than "this is a bug".
I get 102(!) of these warnings with allyesconfig and the
__compiletime_object_size() gcc check removed. I don't know if there
are any real bugs hiding in there, but from looking at a small
sample, I didn't see any. According to Kees, it does sometimes find
real bugs. But the false positive rate seems high.
3) "Buffer overflow detected" runtime warning
This is a runtime warning where object size is const, and copy size >
object size.
All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:
2fb0815c9e ("gcc4: disable __compiletime_object_size for GCC 4.6+")
That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size(). But in fact,
__compiletime_object_size() seems to be working fine. The false
positives were instead triggered by #2 above. (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)
So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.
Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.
Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge yet more updates from Andrew Morton:
- the rest of ocfs2
- various hotfixes, mainly MM
- quite a bit of misc stuff - drivers, fork, exec, signals, etc.
- printk updates
- firmware
- checkpatch
- nilfs2
- more kexec stuff than usual
- rapidio updates
- w1 things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
ipc: delete "nr_ipc_ns"
kcov: allow more fine-grained coverage instrumentation
init/Kconfig: add clarification for out-of-tree modules
config: add android config fragments
init/Kconfig: ban CONFIG_LOCALVERSION_AUTO with allmodconfig
relay: add global mode support for buffer-only channels
init: allow blacklisting of module_init functions
w1:omap_hdq: fix regression
w1: add helper macro module_w1_family
w1: remove need for ida and use PLATFORM_DEVID_AUTO
rapidio/switches: add driver for IDT gen3 switches
powerpc/fsl_rio: apply changes for RIO spec rev 3
rapidio: modify for rev.3 specification changes
rapidio: change inbound window size type to u64
rapidio/idt_gen2: fix locking warning
rapidio: fix error handling in mbox request/release functions
rapidio/tsi721_dma: advance queue processing from transfer submit call
rapidio/tsi721: add messaging mbox selector parameter
rapidio/tsi721: add PCIe MRRS override parameter
rapidio/tsi721_dma: add channel mask and queue size parameters
...
For more targeted fuzzing, it's better to disable kernel-wide
instrumentation and instead enable it on a per-subsystem basis. This
follows the pattern of UBSAN and allows you to compile in the kcov
driver without instrumenting the whole kernel.
To instrument a part of the kernel, you can use either
# for a single file in the current directory
KCOV_INSTRUMENT_filename.o := y
or
# for all the files in the current directory (excluding subdirectories)
KCOV_INSTRUMENT := y
or
# (same as above)
ccflags-y += $(CFLAGS_KCOV)
or
# for all the files in the current directory (including subdirectories)
subdir-ccflags-y += $(CFLAGS_KCOV)
Link: http://lkml.kernel.org/r/1464008380-11405-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kbuild updates from Michal Marek:
- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.
- reduction of the gcc commandline by Arnd Bergmann.
- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada
- bin2c fix by Michael Tautschnig
- setlocalversion fix by Wolfram Sang
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support
Currently, we store each page's allocation stacktrace on corresponding
page_ext structure and it requires a lot of memory. This causes the
problem that memory tight system doesn't work well if page_owner is
enabled. Moreover, even with this large memory consumption, we cannot
get full stacktrace because we allocate memory at boot time and just
maintain 8 stacktrace slots to balance memory consumption. We could
increase it to more but it would make system unusable or change system
behaviour.
To solve the problem, this patch uses stackdepot to store stacktrace.
It obviously provides memory saving but there is a drawback that
stackdepot could fail.
stackdepot allocates memory at runtime so it could fail if system has
not enough memory. But, most of allocation stack are generated at very
early time and there are much memory at this time. So, failure would
not happen easily. And, one failure means that we miss just one page's
allocation stacktrace so it would not be a big problem. In this patch,
when memory allocation failure happens, we store special stracktrace
handle to the page that is failed to save stacktrace. With it, user can
guess memory usage properly even if failure happens.
Memory saving looks as following. (4GB memory system with page_owner)
(before the patch -> after the patch)
static allocation:
92274688 bytes -> 25165824 bytes
dynamic allocation after boot + kernel build:
0 bytes -> 327680 bytes
total:
92274688 bytes -> 25493504 bytes
72% reduction in total.
Note that implementation looks complex than someone would imagine
because there is recursion issue. stackdepot uses page allocator and
page_owner is called at page allocation. Using stackdepot in page_owner
could re-call page allcator and then page_owner. That is a recursion.
To detect and avoid it, whenever we obtain stacktrace, recursion is
checked and page_owner is set to dummy information if found. Dummy
information means that this page is allocated for page_owner feature
itself (such as stackdepot) and it's understandable behavior for user.
[iamjoonsoo.kim@lge.com: mm-page_owner-use-stackdepot-to-store-stacktrace-v3]
Link: http://lkml.kernel.org/r/1464230275-25791-6-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1466150259-27727-7-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1464230275-25791-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since adding the gcc plugin development headers is required for the
gcc plugin support, we should ease into this new kernel build dependency
more slowly. For now, disable the gcc plugins under COMPILE_TEST so that
all*config builds will skip it.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
This commit removes CONFIG_RCU_TORTURE_TEST_RUNNABLE in favor of the
already-existing rcutorture.torture_runnable kernel boot parameter.
It also converts an #ifdef into IS_ENABLED(), saving a few lines of code.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit applies the infamous IS_ENABLED() macro to eliminate a #ifdef.
It also eliminates the RCU_PERF_TEST_RUNNABLE Kconfig option in favor
of the already-existing rcuperf.perf_runnable kernel boot parameter.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The sancov gcc plugin inserts a __sanitizer_cov_trace_pc() call
at the start of basic blocks.
This plugin is a helper plugin for the kcov feature. It supports
all gcc versions with plugin support (from gcc-4.5 on).
It is based on the gcc commit "Add fuzzing coverage support" by Dmitry Vyukov
(https://gcc.gnu.org/viewcvs/gcc?limit_changes=0&view=revision&revision=231296).
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
It appears that somehow I missed a test of the latest UUID rework which
landed in the kernel. Present a small test module to avoid such cases
in the future.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull string hash improvements from George Spelvin:
"This series does several related things:
- Makes the dcache hash (fs/namei.c) useful for general kernel use.
(Thanks to Bruce for noticing the zero-length corner case)
- Converts the string hashes in <linux/sunrpc/svcauth.h> to use the
above.
- Avoids 64-bit multiplies in hash_64() on 32-bit platforms. Two
32-bit multiplies will do well enough.
- Rids the world of the bad hash multipliers in hash_32.
This finishes the job started in commit 689de1d6ca ("Minimal
fix-up of bad hashing behavior of hash_64()")
The vast majority of Linux architectures have hardware support for
32x32-bit multiply and so derive no benefit from "simplified"
multipliers.
The few processors that do not (68000, h8/300 and some models of
Microblaze) have arch-specific implementations added. Those
patches are last in the series.
- Overhauls the dcache hash mixing.
The patch in commit 0fed3ac866 ("namei: Improve hash mixing if
CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
Replaced with a much more careful design that's simultaneously
faster and better. (My own invention, as there was noting suitable
in the literature I could find. Comments welcome!)
- Modify the hash_name() loop to skip the initial HASH_MIX(). This
would let us salt the hash if we ever wanted to.
- Sort out partial_name_hash().
The hash function is declared as using a long state, even though
it's truncated to 32 bits at the end and the extra internal state
contributes nothing to the result. And some callers do odd things:
- fs/hfs/string.c only allocates 32 bits of state
- fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes
- Modify bytemask_from_count to handle inputs of 1..sizeof(long)
rather than 0..sizeof(long)-1. This would simplify users other
than full_name_hash"
Special thanks to Bruce Fields for testing and finding bugs in v1. (I
learned some humbling lessons about "obviously correct" code.)
On the arch-specific front, the m68k assembly has been tested in a
standalone test harness, I've been in contact with the Microblaze
maintainers who mostly don't care, as the hardware multiplier is never
omitted in real-world applications, and I haven't heard anything from
the H8/300 world"
* 'hash' of git://ftp.sciencehorizons.net/linux:
h8300: Add <asm/hash.h>
microblaze: Add <asm/hash.h>
m68k: Add <asm/hash.h>
<linux/hash.h>: Add support for architecture-specific functions
fs/namei.c: Improve dcache hash function
Eliminate bad hash multipliers from hash_32() and hash_64()
Change hash_64() return value to 32 bits
<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
fs/namei.c: Add hashlen_string() function
Pull out string hash to <linux/stringhash.h>
This is just the infrastructure; there are no users yet.
This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.
That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis <alistai@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
Here's the "big" driver core update for 4.7-rc1.
Mostly just debugfs changes, the long-known and messy races with removing
debugfs files should be fixed thanks to the great work of Nicolai Stange. We
also have some isa updates in here (the x86 maintainers told me to take it
through this tree), a new warning when we run out of dynamic char major
numbers, and a few other assorted changes, details in the shortlog.
All have been in linux-next for some time with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlc/0mwACgkQMUfUDdst+ynjXACgjNxR5nMUiM8ZuuD0i4Xj7VXd
hnIAoM08+XDCv41noGdAcKv+2WZVZWMC
=i+0H
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the "big" driver core update for 4.7-rc1.
Mostly just debugfs changes, the long-known and messy races with
removing debugfs files should be fixed thanks to the great work of
Nicolai Stange. We also have some isa updates in here (the x86
maintainers told me to take it through this tree), a new warning when
we run out of dynamic char major numbers, and a few other assorted
changes, details in the shortlog.
All have been in linux-next for some time with no reported issues"
* tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
Revert "base: dd: don't remove driver_data in -EPROBE_DEFER case"
gpio: ws16c48: Utilize the ISA bus driver
gpio: 104-idio-16: Utilize the ISA bus driver
gpio: 104-idi-48: Utilize the ISA bus driver
gpio: 104-dio-48e: Utilize the ISA bus driver
watchdog: ebc-c384_wdt: Utilize the ISA bus driver
iio: stx104: Utilize the module_isa_driver and max_num_isa_dev macros
iio: stx104: Add X86 dependency to STX104 Kconfig option
Documentation: Add ISA bus driver documentation
isa: Implement the max_num_isa_dev macro
isa: Implement the module_isa_driver macro
pnp: pnpbios: Add explicit X86_32 dependency to PNPBIOS
isa: Decouple X86_32 dependency from the ISA Kconfig option
driver-core: use 'dev' argument in dev_dbg_ratelimited stub
base: dd: don't remove driver_data in -EPROBE_DEFER case
kernfs: Move faulting copy_user operations outside of the mutex
devcoredump: add scatterlist support
debugfs: unproxify files created through debugfs_create_u32_array()
debugfs: unproxify files created through debugfs_create_blob()
debugfs: unproxify files created through debugfs_create_bool()
...
Nothing prevents a dentry found by path lookup before a return of
__debugfs_remove() to actually get opened after that return. Now, after
the return of __debugfs_remove(), there are no guarantees whatsoever
regarding the memory the corresponding inode's file_operations object
had been kept in.
Since __debugfs_remove() is seldomly invoked, usually from module exit
handlers only, the race is hard to trigger and the impact is very low.
A discussion of the problem outlined above as well as a suggested
solution can be found in the (sub-)thread rooted at
http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk
("Yet another pipe related oops.")
Basically, Greg KH suggests to introduce an intermediate fops and
Al Viro points out that a pointer to the original ones may be stored in
->d_fsdata.
Follow this line of reasoning:
- Add SRCU as a reverse dependency of DEBUG_FS.
- Introduce a srcu_struct object for the debugfs subsystem.
- In debugfs_create_file(), store a pointer to the original
file_operations object in ->d_fsdata.
- Make debugfs_remove() and debugfs_remove_recursive() wait for a
SRCU grace period after the dentry has been delete()'d and before they
return to their callers.
- Introduce an intermediate file_operations object named
"debugfs_open_proxy_file_operations". It's ->open() functions checks,
under the protection of a SRCU read lock, whether the dentry is still
alive, i.e. has not been d_delete()'d and if so, tries to acquire a
reference on the owning module.
On success, it sets the file object's ->f_op to the original
file_operations and forwards the ongoing open() call to the original
->open().
- For clarity, rename the former debugfs_file_operations to
debugfs_noop_file_operations -- they are in no way canonical.
The choice of SRCU over "normal" RCU is justified by the fact, that the
former may also be used to protect ->i_private data from going away
during the execution of a file's readers and writers which may (and do)
sleep.
Finally, introduce the fs/debugfs/internal.h header containing some
declarations internal to the debugfs implementation.
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit adds a new rcuperf module that carries out simple performance
tests of RCU grace periods.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
On parisc and metag the stack grows upwards, so for those we need to
scan the stack downwards in order to calculate how much stack a process
has used.
Tested on a 64bit parisc kernel.
Signed-off-by: Helge Deller <deller@gmx.de>
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing). Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system. A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/). However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.
kcov does not aim to collect as much coverage as possible. It aims to
collect more or less stable coverage that is function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts and instrumentation of some inherently non-deterministic or
non-interesting parts of kernel is disbled (e.g. scheduler, locking).
Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes. Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch). I've
dropped the second mode for simplicity.
This patch adds the necessary support on kernel side. The complimentary
compiler support was added in gcc revision 231296.
We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:
https://github.com/google/syzkaller/wiki/Found-Bugs
We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation". For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.
Why not gcov. Typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
typical coverage can be just a dozen of basic blocks (e.g. an invalid
input). In such context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on total number of basic
blocks/edges in program (in case of kernel it is about 2M). Cost of
kcov depends only on number of executed basic blocks/edges. On top of
that, kernel requires per-thread coverage because there are always
background threads and unrelated processes that also produce coverage.
With inlined gcov instrumentation per-thread coverage is not possible.
kcov exposes kernel PCs and control flow to user-space which is
insecure. But debugfs should not be mapped as user accessible.
Based on a patch by Quentin Casasnovas.
[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull 'objtool' stack frame validation from Ingo Molnar:
"This tree adds a new kernel build-time object file validation feature
(ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
It was written by and is maintained by Josh Poimboeuf.
The motivation: there's a category of hard to find kernel bugs, most
of them in assembly code (but also occasionally in C code), that
degrades the quality of kernel stack dumps/backtraces. These bugs are
hard to detect at the source code level. Such bugs result in
incorrect/incomplete backtraces most of time - but can also in some
rare cases result in crashes or other undefined behavior.
The build time correctness checking is done via the new 'objtool'
user-space utility that was written for this purpose and which is
hosted in the kernel repository in tools/objtool/. The tool's (very
simple) UI and source code design is shaped after Git and perf and
shares quite a bit of infrastructure with tools/perf (which tooling
infrastructure sharing effort got merged via perf and is already
upstream). Objtool follows the well-known kernel coding style.
Objtool does not try to check .c or .S files, it instead analyzes the
resulting .o generated machine code from first principles: it decodes
the instruction stream and interprets it. (Right now objtool supports
the x86-64 architecture.)
From tools/objtool/Documentation/stack-validation.txt:
"The kernel CONFIG_STACK_VALIDATION option enables a host tool named
objtool which runs at compile time. It has a "check" subcommand
which analyzes every .o file and ensures the validity of its stack
metadata. It enforces a set of rules on asm code and C inline
assembly code so that stack traces can be reliable.
Currently it only checks frame pointer usage, but there are plans to
add CFI validation for C files and CFI generation for asm files.
For each function, it recursively follows all possible code paths
and validates the correct frame pointer state at each instruction.
It also follows code paths involving special sections, like
.altinstructions, __jump_table, and __ex_table, which can add
alternative execution paths to a given instruction (or set of
instructions). Similarly, it knows how to follow switch statements,
for which gcc sometimes uses jump tables."
When this new kernel option is enabled (it's disabled by default), the
tool, if it finds any suspicious assembly code pattern, outputs
warnings in compiler warning format:
warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer
... so that scripts that pick up compiler warnings will notice them.
All known warnings triggered by the tool are fixed by the tree, most
of the commits in fact prepare the kernel to be warning-free. Most of
them are bugfixes or cleanups that stand on their own, but there are
also some annotations of 'special' stack frames for justified cases
such entries to JIT-ed code (BPF) or really special boot time code.
There are two other long-term motivations behind this tool as well:
- To improve the quality and reliability of kernel stack frames, so
that they can be used for optimized live patching.
- To create independent infrastructure to check the correctness of
CFI stack frames at build time. CFI debuginfo is notoriously
unreliable and we cannot use it in the kernel as-is without extra
checking done both on the kernel side and on the build side.
The quality of kernel stack frames matters to debuggability as well,
so IMO we can merge this without having to consider the live patching
or CFI debuginfo angle"
* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
objtool: Only print one warning per function
objtool: Add several performance improvements
tools: Copy hashtable.h into tools directory
objtool: Fix false positive warnings for functions with multiple switch statements
objtool: Rename some variables and functions
objtool: Remove superflous INIT_LIST_HEAD
objtool: Add helper macros for traversing instructions
objtool: Fix false positive warnings related to sibling calls
objtool: Compile with debugging symbols
objtool: Detect infinite recursion
objtool: Prevent infinite recursion in noreturn detection
objtool: Detect and warn if libelf is missing and don't break the build
tools: Support relative directory path for 'O='
objtool: Support CROSS_COMPILE
x86/asm/decoder: Use explicitly signed chars
objtool: Enable stack metadata validation on 64-bit x86
objtool: Add CONFIG_STACK_VALIDATION option
objtool: Add tool to perform compile-time stack metadata validation
x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
sched: Always inline context_switch()
...
Pull networking updates from David Miller:
"Highlights:
1) Support more Realtek wireless chips, from Jes Sorenson.
2) New BPF types for per-cpu hash and arrap maps, from Alexei
Starovoitov.
3) Make several TCP sysctls per-namespace, from Nikolay Borisov.
4) Allow the use of SO_REUSEPORT in order to do per-thread processing
of incoming TCP/UDP connections. The muxing can be done using a
BPF program which hashes the incoming packet. From Craig Gallek.
5) Add a multiplexer for TCP streams, to provide a messaged based
interface. BPF programs can be used to determine the message
boundaries. From Tom Herbert.
6) Add 802.1AE MACSEC support, from Sabrina Dubroca.
7) Avoid factorial complexity when taking down an inetdev interface
with lots of configured addresses. We were doing things like
traversing the entire address less for each address removed, and
flushing the entire netfilter conntrack table for every address as
well.
8) Add and use SKB bulk free infrastructure, from Jesper Brouer.
9) Allow offloading u32 classifiers to hardware, and implement for
ixgbe, from John Fastabend.
10) Allow configuring IRQ coalescing parameters on a per-queue basis,
from Kan Liang.
11) Extend ethtool so that larger link mode masks can be supported.
From David Decotigny.
12) Introduce devlink, which can be used to configure port link types
(ethernet vs Infiniband, etc.), port splitting, and switch device
level attributes as a whole. From Jiri Pirko.
13) Hardware offload support for flower classifiers, from Amir Vadai.
14) Add "Local Checksum Offload". Basically, for a tunneled packet
the checksum of the outer header is 'constant' (because with the
checksum field filled into the inner protocol header, the payload
of the outer frame checksums to 'zero'), and we can take advantage
of that in various ways. From Edward Cree"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
bonding: fix bond_get_stats()
net: bcmgenet: fix dma api length mismatch
net/mlx4_core: Fix backward compatibility on VFs
phy: mdio-thunder: Fix some Kconfig typos
lan78xx: add ndo_get_stats64
lan78xx: handle statistics counter rollover
RDS: TCP: Remove unused constant
RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
net: smc911x: convert pxa dma to dmaengine
team: remove duplicate set of flag IFF_MULTICAST
bonding: remove duplicate set of flag IFF_MULTICAST
net: fix a comment typo
ethernet: micrel: fix some error codes
ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
bpf, dst: add and use dst_tclassid helper
bpf: make skb->tc_classid also readable
net: mvneta: bm: clarify dependencies
cls_bpf: reset class and reuse major in da
ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
ldmvsw: Add ldmvsw.c driver code
...
Make it possible to write a target state to the per cpu state file, so we can
switch between states.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.022814799@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add a CONFIG_STACK_VALIDATION option which will run "objtool check" for
each .o file to ensure the validity of its stack metadata.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/92baab69a6bf9bc7043af0bfca9fb964a1d45546.1456719558.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
drivers/net/phy/bcm7xxx.c
drivers/net/phy/marvell.c
drivers/net/vxlan.c
All three conflicts were cases of simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is mainly testing bitmap construction and conversion to/from u32[]
for now.
Tested:
qemu i386, x86_64, ppc, ppc64 BE and LE, ARM.
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Workqueue used to guarantee local execution for work items queued
without explicit target CPU. The guarantee is gone now which can
break some usages in subtle ways. To flush out those cases, this
patch implements a debug feature which forces round-robin CPU
selection for all such work items.
The debug feature defaults to off and can be enabled with a kernel
parameter. The default can be flipped with a debug config option.
If you hit this commit during bisection, please refer to 041bd12e27
("Revert "workqueue: make sure delayed work run in local cpu"") for
more information and ping me.
Signed-off-by: Tejun Heo <tj@kernel.org>
Merge third patch-bomb from Andrew Morton:
"I'm pretty much done for -rc1 now:
- the rest of MM, basically
- lib/ updates
- checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit
- cpu_mask simplifications
- kexec, rapidio, MAINTAINERS etc, etc.
- more dma-mapping cleanups/simplifications from hch"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits)
MAINTAINERS: add/fix git URLs for various subsystems
mm: memcontrol: add "sock" to cgroup2 memory.stat
mm: memcontrol: basic memory statistics in cgroup2 memory controller
mm: memcontrol: do not uncharge old page in page cache replacement
Documentation: cgroup: add memory.swap.{current,max} description
mm: free swap cache aggressively if memcg swap is full
mm: vmscan: do not scan anon pages if memcg swap limit is hit
swap.h: move memcg related stuff to the end of the file
mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
mm: vmscan: pass memcg to get_scan_count()
mm: memcontrol: charge swap to cgroup2
mm: memcontrol: clean up alloc, online, offline, free functions
mm: memcontrol: flatten struct cg_proto
mm: memcontrol: rein in the CONFIG space madness
net: drop tcp_memcontrol.c
mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
mm: memcontrol: allow to disable kmem accounting for cgroup2
mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
mm: memcontrol: separate kmem code from legacy tcp accounting code
...
Larry Finger reports:
"My PowerBook G4 Aluminum with a 32-bit PPC processor fails to boot for
the 4.4-git series".
This is likely due to X still needing /dev/mem access on this platform.
CONFIG_IO_STRICT_DEVMEM is not yet safe to turn on when
CONFIG_STRICT_DEVMEM=y.
Remove the default so that old configurations do not change behavior.
Fixes: 90a545e981 ("restrict /dev/mem to idle io memory ranges")
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Link: http://marc.info/?l=linux-kernel&m=145332012023825&w=2
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
UBSAN uses compile-time instrumentation to catch undefined behavior
(UB). Compiler inserts code that perform certain kinds of checks before
operations that could cause UB. If check fails (i.e. UB detected)
__ubsan_handle_* function called to print error message.
So the most of the work is done by compiler. This patch just implements
ubsan handlers printing errors.
GCC has this capability since 4.9.x [1] (see -fsanitize=undefined
option and its suboptions).
However GCC 5.x has more checkers implemented [2].
Article [3] has a bit more details about UBSAN in the GCC.
[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/
Issues which UBSAN has found thus far are:
Found bugs:
* out-of-bounds access - 97840cb67f ("netfilter: nfnetlink: fix
insufficient validation in nfnetlink_bind")
undefined shifts:
* d48458d4a7 ("jbd2: use a better hash function for the revoke
table")
* 10632008b9 ("clockevents: Prevent shift out of bounds")
* 'x << -1' shift in ext4 -
http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com>
* undefined rol32(0) -
http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com>
* undefined dirty_ratelimit calculation -
http://lkml.kernel.org/r/<566594E2.3050306@odin.com>
* undefined roundown_pow_of_two(0) -
http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com>
* [WONTFIX] undefined shift in __bpf_prog_run -
http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com>
WONTFIX here because it should be fixed in bpf program, not in kernel.
signed overflows:
* 32a8df4e0b ("sched: Fix odd values in effective_load()
calculations")
* mul overflow in ntp -
http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com>
* incorrect conversion into rtc_time in rtc_time64_to_tm() -
http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com>
* unvalidated timespec in io_getevents() -
http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com>
* [NOTABUG] signed overflow in ktime_add_safe() -
http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com>
[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As illustrated by commit a3afe70b83 ("[S390] latencytop s390
support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to
advertise an implementation of save_stack_trace_tsk.
However, as of 9212ddb5ea ("stacktrace: provide save_stack_trace_tsk()
weak alias") a dummy implementation is provided if STACKTRACE=y. Given
that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Helge Deller <deller@gmx.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a third argument to macros which create function
definitions for page flags. This argument defines how page-flags
helpers behave on compound functions.
For now we define four policies:
- PF_ANY: the helper function operates on the page it gets, regardless
if it's non-compound, head or tail.
- PF_HEAD: the helper function operates on the head page of the
compound page if it gets tail page.
- PF_NO_TAIL: only head and non-compond pages are acceptable for this
helper function.
- PF_NO_COMPOUND: only non-compound pages are acceptable for this
helper function.
For now we use policy PF_ANY for all helpers, which matches current
behaviour.
We do not enforce the policy for TESTPAGEFLAG, because we have flags
checked for random pages all over the kernel. Noticeable exception to
this is PageTransHuge() which triggers VM_BUG_ON() for tail page.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1/ Media error handling: The 'badblocks' implementation that originated
in md-raid is up-levelled to a generic capability of a block device.
This initial implementation is limited to being consulted in the pmem
block-i/o path. Later, 'badblocks' will be consulted when creating
dax mappings.
2/ Raw block device dax: For virtualization and other cases that want
large contiguous mappings of persistent memory, add the capability to
dax-mmap a block device directly.
3/ Increased /dev/mem restrictions: Add an option to treat all io-memory
as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access while a driver is
actively using an address range. This behavior is controlled via the
new CONFIG_IO_STRICT_DEVMEM option and can be overridden by the
existing "iomem=relaxed" kernel command line option.
4/ Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
block device shutdown crash fix, and other small libnvdimm fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWlrhjAAoJEB7SkWpmfYgCFbAQALKsQfFwT6JFS+zlPgiNpbqw
2VMNKEH0AfGYGj96mT02j2q+vSUmXLMIDMTsbe0sDdtwFZtQbFmhmryzPWUVppSu
KGTlLPW8vuEhQVs91+UI3BQKkvpi0+tbR8hPOh9W6QhjpRT+lyHFKnsNR5HZy5wB
K4/VMaT5ffd5/pXRTjkYiPQYTwWyfcvNjICj0YtqhPvOwS031m77JpFsWJ8HSpEX
K99VlzNUPMXd1pYkHmFNXWw52fhRGNhwAEomLeKMdQfKms+KnbKp8BOSA0aCqU8E
kpujQcilDXJwykFQZOFI3Z5Dxvrv8lxFTU8HRMBvo3ESzfTWjfqcvyjGOjDUcruw
ihESFSJtdZzhrBiMnf9RRqSpMFJvAT8MVT6Q4D3mZUHCMPbUqFJsQjMPt9hEH3ho
4F0D2lesOCkubUKFTZmjMoDb+szuKbVhYK8TeFVVEhizinc/Aj0NKuazJqi+CXB/
xh0ER4ZxD8wvzqFFWvS5UvR1G9I5fr7+3jGRUrqGLHlSdeXP9dkEg28ao3QbWk3x
1dPOen6ZqQ9WJ/E7eGmXbVEz2R4Xd79hMXQzdQwmKDk/KbxRoAp7hyU8BslAyrBf
HCdmVt+RAgrxZYfFRXuLhqwEBThJnNrgZA3qu74FUpkpFg6xRUu1bAYBiF7N+bFi
82b5UbMkveBTtkXjJoiR
=7V5r
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"The bulk of this has appeared in -next and independently received a
build success notification from the kbuild robot. The 'for-4.5/block-
dax' topic branch was rebased over the weekend to drop the "block
device end-of-life" rework that Al would like to see re-implemented
with a notifier, and to address bug reports against the badblocks
integration.
There is pending feedback against "libnvdimm: Add a poison list and
export badblocks" received last week. Linda identified some localized
fixups that we will handle incrementally.
Summary:
- Media error handling: The 'badblocks' implementation that
originated in md-raid is up-levelled to a generic capability of a
block device. This initial implementation is limited to being
consulted in the pmem block-i/o path. Later, 'badblocks' will be
consulted when creating dax mappings.
- Raw block device dax: For virtualization and other cases that want
large contiguous mappings of persistent memory, add the capability
to dax-mmap a block device directly.
- Increased /dev/mem restrictions: Add an option to treat all
io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access
while a driver is actively using an address range. This behavior
is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be
overridden by the existing "iomem=relaxed" kernel command line
option.
- Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
block device shutdown crash fix, and other small libnvdimm fixes"
* tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits)
block: kill disk_{check|set|clear|alloc}_badblocks
libnvdimm, pmem: nvdimm_read_bytes() badblocks support
pmem, dax: disable dax in the presence of bad blocks
pmem: fail io-requests to known bad blocks
libnvdimm: convert to statically allocated badblocks
libnvdimm: don't fail init for full badblocks list
block, badblocks: introduce devm_init_badblocks
block: clarify badblocks lifetime
badblocks: rename badblocks_free to badblocks_exit
libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h
libnvdimm: Add a poison list and export badblocks
nfit_test: Enable DSMs for all test NFITs
md: convert to use the generic badblocks code
block: Add badblock management for gendisks
badblocks: Add core badblock management code
block: fix del_gendisk() vs blkdev_ioctl crash
block: enable dax for raw block devices
block: introduce bdev_file_inode()
restrict /dev/mem to idle io memory ranges
arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
...
Pull networking updates from Davic Miller:
1) Support busy polling generically, for all NAPI drivers. From Eric
Dumazet.
2) Add byte/packet counter support to nft_ct, from Floriani Westphal.
3) Add RSS/XPS support to mvneta driver, from Gregory Clement.
4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
Frederic Sowa.
5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.
6) Add support for VLAN device bridging to mlxsw switch driver, from
Ido Schimmel.
7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.
8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.
9) Reorganize wireless drivers into per-vendor directories just like we
do for ethernet drivers. From Kalle Valo.
10) Provide a way for administrators "destroy" connected sockets via the
SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.
11) Add support to add/remove multicast routes via netlink, from Nikolay
Aleksandrov.
12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.
13) Add forwarding and packet duplication facilities to nf_tables, from
Pablo Neira Ayuso.
14) Dead route support in MPLS, from Roopa Prabhu.
15) TSO support for thunderx chips, from Sunil Goutham.
16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.
17) Rationalize, consolidate, and more completely document the checksum
offloading facilities in the networking stack. From Tom Herbert.
18) Support aborting an ongoing scan in mac80211/cfg80211, from
Vidyullatha Kanchanapally.
19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
net: bnxt: always return values from _bnxt_get_max_rings
net: bpf: reject invalid shifts
phonet: properly unshare skbs in phonet_rcv()
dwc_eth_qos: Fix dma address for multi-fragment skbs
phy: remove an unneeded condition
mdio: remove an unneed condition
mdio_bus: NULL dereference on allocation error
net: Fix typo in netdev_intersect_features
net: freescale: mac-fec: Fix build error from phy_device API change
net: freescale: ucc_geth: Fix build error from phy_device API change
bonding: Prevent IPv6 link local address on enslaved devices
IB/mlx5: Add flow steering support
net/mlx5_core: Export flow steering API
net/mlx5_core: Make ipv4/ipv6 location more clear
net/mlx5_core: Enable flow steering support for the IB driver
net/mlx5_core: Initialize namespaces only when supported by device
net/mlx5_core: Set priority attributes
net/mlx5_core: Connect flow tables
net/mlx5_core: Introduce modify flow table command
net/mlx5_core: Managing root flow table
...
- Optimize boot time by detecting cards simultaneously
- Make runtime resume default behavior for MMC/SD
- Enable MMC/SD/SDIO devices to suspend/resume asynchronously
- Allow more than 8 partitions per card
- Introduce MMC_CAP2_NO_SDIO to prevent unsupported SDIO commands
- Support the standard DT wakeup-source property
- Fix driver strength switching for HS200 and HS400
- Fix switch command timeout
- Fix invalid vdd in voltage switch power cycle for SDIO
MMC host:
- sdhci: Restore behavior when setting VDD via external regulator
- sdhci: A couple of changes/fixes related to the dma support
- sdhci-tegra: Add Tegra210 support
- sdhci-tegra: Support for UHS-I cards including tuning support
- sdhci-of-at91: Add PM support
- sh_mmcif: Rework dma channel handling
- mvsdio: Delete platform data code path
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWk8SJAAoJEP4mhCVzWIwpCfoQAMS5lU1sWfiQmIEBAlTmhFXD
RdJ6VB2wZvbBXyXeSqpuhDxmPQkGFBbKDoz8SbLPhuvM0E4h+yZ7/QP5g7jghd5h
3HtsNZxlFS/lVuGGTWxwpKyY55NeFzeGzeSIJm5r4asyOyiWg2XRGkivn0kvMnUx
Pxkv2yHatVc6l570TkHrhW+iAx72Ochba2IR1C88lc8WTnYc7bFmB3w6qUoNDaKP
+Ma0QU6f0nwUGXK5lGW7RX8NGpmW7usqMT3O98i9Z28IBIvV/WGUFEwlZvhR9Jpe
SpdH6DSD+b7fwtultwipseYzo7XwhEUBsWfYyg4O/LU5qza63WQC0Ab8fM1RKAyc
Fzuyb3S8CrYGGlAYJoLIYRcK2CsnbLuLg0OoM5pMWYFl+l/jel2P9vq/Z6tSeFaD
cZFqDycbTkZ4A4dpEnf94RTucZJIMcxX6a7/M1r9oUQ5qhhC5IT74gzEZYhh7VV+
d+zIcyq1KXr17kJBx73jruaE5zJrEyyOD4Dw44zYFFbO8nsUFMy2bBoKc3spakzj
aAZhoTeUmQzrUo+sG7Edw7qJPyrBuANTvWIf714ZZ95mKYHWNOV7GufhuktSKcJF
QV7Xr+Igqb0Yh5Li06xhueRQn/uuLUhplyi+5UjYWkszXVKlTydrT4NdCxFO6EGG
fNKukKH1jHK6gFtI8yaU
=p5RF
-----END PGP SIGNATURE-----
Merge tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Optimize boot time by detecting cards simultaneously
- Make runtime resume default behavior for MMC/SD
- Enable MMC/SD/SDIO devices to suspend/resume asynchronously
- Allow more than 8 partitions per card
- Introduce MMC_CAP2_NO_SDIO to prevent unsupported SDIO commands
- Support the standard DT wakeup-source property
- Fix driver strength switching for HS200 and HS400
- Fix switch command timeout
- Fix invalid vdd in voltage switch power cycle for SDIO
MMC host:
- sdhci: Restore behavior when setting VDD via external regulator
- sdhci: A couple of changes/fixes related to the dma support
- sdhci-tegra: Add Tegra210 support
- sdhci-tegra: Support for UHS-I cards including tuning support
- sdhci-of-at91: Add PM support
- sh_mmcif: Rework dma channel handling
- mvsdio: Delete platform data code path"
* tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc: (52 commits)
mmc: dw_mmc: remove the unused quirks
mmc: sdhci-pci: use to_pci_dev()
mmc: cb710: use to_platform_device()
mmc: tegra: use correct accessor for misc ctrl register
mmc: tegra: enable UHS-I modes
mmc: tegra: implement UHS tuning
mmc: tegra: disable SPI_MODE_CLKEN
mmc: tegra: implement module external clock change
mmc: sdhci: restore behavior when setting VDD via external regulator
mmc: It is not an error for the card to be removed while suspended
mmc: block: Allow more than 8 partitions per card
mmc: core: Optimize boot time by detecting cards simultaneously
mmc: dw_mmc: use resource_size_t to store physical address
mmc: core: fix __mmc_switch timeout caused by preempt
mmc: usdhi6rol0: handle NULL data in timeout
mmc: of_mmc_spi: Add IRQF_ONESHOT to interrupt flags
mmc: mediatek: change some dev_err to dev_dbg
mmc: enable MMC/SD/SDIO device to suspend/resume asynchronously
mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
mmc: sdhci: 64-bit DMA actually has 4-byte alignment
...
This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
semantics by default. If userspace really believes it is safe to access
the memory region it can also perform the extra step of disabling an
active driver. This protects device address ranges with read side
effects and otherwise directs userspace to use the driver.
Persistent memory presents a large "mistake surface" to /dev/mem as now
accidental writes can corrupt a filesystem.
In general if a device driver is busily using a memory region it already
informs other parts of the kernel to not touch it via
request_mem_region(). /dev/mem should honor the same safety restriction
by default. Debugging a device driver from userspace becomes more
difficult with this enabled. Any application using /dev/mem or mmap of
sysfs pci resources will now need to perform the extra step of either:
1/ Disabling the driver, for example:
echo <device id> > /dev/bus/<parent bus>/drivers/<driver name>/unbind
2/ Rebooting with "iomem=relaxed" on the command line
3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n
Traditional users of /dev/mem like dosemu are unaffected because the
first 1MB of memory is not subject to the IO_STRICT_DEVMEM restriction.
Legacy X configurations use /dev/mem to talk to graphics hardware, but
that functionality has since moved to kernel graphics drivers.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Let all the archs that implement devmem_is_allowed() opt-in to a common
definition of CONFIG_STRICT_DEVM in lib/Kconfig.debug.
Cc: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko: drop 'default y' for s390]
Acked-by: Ingo Molnar <mingo@redhat.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fault-injection capability for MMC IO uses debugfs entries to configure
the attributes.
FAULT_INJECTION_DEBUG_FS must be enabled to use FAIL_MMC_REQUEST.
Replace FAULT_INJECTION with FAULT_INJECTION_DEBUG_FS.
Also remove 'select DEBUG_FS' since FAULT_INJECTION_DEBUG_FS depends on
it.
Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING. These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.
To alleviate the situation, this patch implements workqueue lockup
detector. It periodically monitors all worker_pools periodically and,
if any pool failed to make forward progress longer than the threshold
duration, triggers warning and dumps workqueue state as follows.
BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: check_lifetime, neigh_periodic_work
workqueue cgroup_pidlist_destroy: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
pending: cgroup_pidlist_destroy_work_fn
...
The detection mechanism is controller through kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
v2: Decoupled from softlockup control knobs.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
This module allows to insert errors in some of netdevice's notifier
events. All network drivers use these notifiers to signal various events
and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
afterwards. Until recently I had to run failure tests by injecting
a custom module, but now this infrastructure makes it trivial to test
these failure paths. Some of the recent bugs I fixed were found using
this module.
Here's an example:
$ cd /sys/kernel/debug/notifier-error-inject/netdev
$ echo -22 > actions/NETDEV_CHANGEMTU/error
$ ip link set eth0 mtu 1024
RTNETLINK answers: Invalid argument
CC: Akinobu Mita <akinobu.mita@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev <netdev@vger.kernel.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWP91+AAoJENkgDmzRrbjxBAAQAJDFEKrMmdhyX56R058RFW1q
pYK3XrHrVMCrJRa80UH6MStvCzkR5yYU7q81XAOfl+/9TR3IIi6EPMIC6wYSyiXC
JpfnUISEJAuDMYOT19xeFDt2c7oknJnkOM7QWQt6ypY5sGWVHQ3KQUmkqlzaxQ5C
Oen9CfFttugmmpO6KDCfIxtMvxkQ1LM6SoTAKTu7LamcVsBCp5It2Me9UwGUxADj
1Phq14U8heJ9ScNYkroutEkWgyZLFJOZExUuNEIMwyooXmWQmZzBiwVwQ72WjstG
2jj3ZiLucVYvBM4k8qnGnlMR4IkymcYlXD1YJ0X7tvBFnp7UGXFKLt2NSqfOskLC
2fRPETf4PLHebZeNN/J/WKJ7qKzsBsS49KjFjJ2vm4+P6sScmcDGXw4eMyLTYfnJ
dRbuRtZpnJV4S1vss/STjehOA8A8/fURXQwb80AUzzEEfmjujZWCMYVhfqO91+kx
XsbtSciek+Abxyh9Ow9xHgVnMcsXgmZMkpODv4Gjc/4R6Uu6XRSVK04jvkuoLVi5
t4VC00NK0WY2PFVK3qGYE5ZejPTOu59UGRLwxDqZ0QmXF36Yun9f//hSDWpM10BO
Ah92OybEnny4tij7/0xz7Krg7u8BQ+at0TAmxrw4Xu9VqbnRqcpJy9Q04e52mwTu
G6ztYV2tOGMEh5lK2k0S
=WtDm
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"Nothing exciting, minor tweaks and cleanups"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
scripts: [modpost] add new sections to white list
modpost: Add flag -E for making section mismatches fatal
params: don't ignore the rest of cmdline if parse_one() fails
modpost: abort if a module symbol is too long
This adds a simple module for testing the kernel's printf facilities.
Previously, some %p extensions have caused a wrong return value in case
the entire output didn't fit and/or been unusable in kasprintf(). This
should help catch such issues. Also, it should help ensure that changes
to the formatting algorithms don't break anything.
I'm not sure if we have a struct dentry or struct file lying around at
boot time or if we can fake one, but most %p extensions should be
testable, as should the ordinary number and string formatting.
The nature of vararg functions means we can't use a more conventional
table-driven approach.
For now, this is mostly a skeleton; contributions are very
welcome. Some tests are/will be slightly annoying to write, since the
expected output depends on stuff like CONFIG_*, sizeof(long), runtime
values etc.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Martin Kletzander <mkletzan@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the kernel compiled with KASAN=y, GCC adds redzones for each
variable on stack. This enlarges function's stack frame and causes:
'warning: the frame size of X bytes is larger than Y bytes'
The worst case I've seen for now is following:
../net/wireless/nl80211.c: In function `nl80211_send_wiphy':
../net/wireless/nl80211.c:1731:1: warning: the frame size of 5448 bytes is larger than 2048 bytes [-Wframe-larger-than=]
That kind of warning becomes useless with KASAN=y. It doesn't
necessarily indicate that there is some problem in the code, thus we
should turn it off.
(The KASAN=y stack size in increased from 16k to 32k for this reason)
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Abylay Ospan <aospan@netup.ru>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Kozlov Sergey <serjk@netup.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The section mismatch warning can be easy to miss during the kernel build
process. Allow it to be marked as fatal to be easily caught and prevent
bugs from slipping in.
Setting CONFIG_SECTION_MISMATCH_WARN_ONLY=y causes these warnings to be
non-fatal, since there are a number of section mismatches when using
allmodconfig on some architectures, and we do not want to break these
builds by default.
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Change-Id: Ic346706e3297c9f0d790e3552aa94e5cff9897a6
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Pull locking and atomic updates from Ingo Molnar:
"Main changes in this cycle are:
- Extend atomic primitives with coherent logic op primitives
(atomic_{or,and,xor}()) and deprecate the old partial APIs
(atomic_{set,clear}_mask())
The old ops were incoherent with incompatible signatures across
architectures and with incomplete support. Now every architecture
supports the primitives consistently (by Peter Zijlstra)
- Generic support for 'relaxed atomics':
- _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
- atomic_read_acquire()
- atomic_set_release()
This came out of porting qwrlock code to arm64 (by Will Deacon)
- Clean up the fragile static_key APIs that were causing repeat bugs,
by introducing a new one:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case. To be able to know the 'type' of the static key we encode it
in the jump entry (by Peter Zijlstra)
- Static key self-tests (by Jason Baron)
- qrwlock optimizations (by Waiman Long)
- small futex enhancements (by Davidlohr Bueso)
- ... and misc other changes"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
jump_label/x86: Work around asm build bug on older/backported GCCs
locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
locking/static_keys: Make verify_keys() static
jump label, locking/static_keys: Update docs
locking/static_keys: Provide a selftest
jump_label: Provide a self-test
s390/uaccess, locking/static_keys: employ static_branch_likely()
x86, tsc, locking/static_keys: Employ static_branch_likely()
locking/static_keys: Add selftest
locking/static_keys: Add a new static_key interface
locking/static_keys: Rework update logic
locking/static_keys: Add static_key_{en,dis}able() helpers
...
No one uses this anymore, and this is not the first time the
idea of replacing it with a (now possible) userspace side.
Lock stealing logic was removed long ago in when the lock
was granted to the highest prio.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1435782588-4177-2-git-send-email-dave@stgolabs.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Although futexes are well known for being a royal pita,
we really have very little debugging capabilities - except
for relying on tglx's eye half the time.
By simply making use of the existing fault-injection machinery,
we can improve this situation, allowing generating artificial
uaddress faults and deadlock scenarios. Of course, when this is
disabled in production systems, the overhead for failure checks
is practically zero -- so this is very cheap at the same time.
Future work would be nice to now enhance trinity to make use of
this.
There is a special tunable 'ignore-private', which can filter
out private futexes. Given the tsk->make_it_fail filter and
this option, pi futexes can be narrowed down pretty closely.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <darren@dvhart.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Link: http://lkml.kernel.org/r/1435645562-975-3-git-send-email-dave@stgolabs.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The CONFIG_RCU_CPU_STALL_INFO has been default-y for a couple of
releases with no complaints, so it is time to eliminate this Kconfig
option entirely, so that the long-form RCU CPU stall warnings cannot
be disabled. This commit does just that.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit applies some warning-omission micro-optimizations to RCU's
various extended-quiescent-state functions, which are on the kernel/user
hotpath for CONFIG_NO_HZ_FULL=y.
Reported-by: Rik van Riel <riel@redhat.com>
Reported by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, Kconfig will ask the user whether TASKS_RCU should be set.
This is silly because Kconfig already has all the information that it
needs to set this parameter. This commit therefore directly drives
the value of TASKS_RCU via "select" statements. Which means that
as subsystems require TASKS_RCU, those subsystems will need to add
"select" statements of their own.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Grace-period scans of the rcu_node combining tree normally
proceed quite quickly, so that it is very difficult to reproduce
races against them. This commit therefore allows grace-period
pre-initialization and cleanup to be artificially slowed down,
increasing race-reproduction probability. A pair of pairs of new
Kconfig parameters are provided, RCU_TORTURE_TEST_SLOW_PREINIT to
enable the slowing down of propagating CPU-hotplug changes up the
combining tree along with RCU_TORTURE_TEST_SLOW_PREINIT_DELAY to
specify the delay in jiffies, and RCU_TORTURE_TEST_SLOW_CLEANUP
to enable the slowing down of the end-of-grace-period cleanup scan
along with RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY to specify the delay
in jiffies. Boot-time parameters named rcutree.gp_preinit_delay and
rcutree.gp_cleanup_delay allow these delays to be specified at boot time.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pull RCU fix from Ingo Molnar:
"An RCU Kconfig fix that eliminates an annoying interactive kconfig
question for CONFIG_RCU_TORTURE_TEST_SLOW_INIT"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Control grace-period delays directly from value
Pull RCU fix from Paul E. McKenney:
"This series contains a single change that fixes Kconfig asking pointless
questions."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In a misguided attempt to avoid an #ifdef, the use of the
gp_init_delay module parameter was conditioned on the corresponding
RCU_TORTURE_TEST_SLOW_INIT Kconfig variable, using IS_ENABLED() at
the point of use in the code. This meant that the compiler always saw
the delay, which meant that RCU_TORTURE_TEST_SLOW_INIT_DELAY had to be
unconditionally defined. This in turn caused "make oldconfig" to ask
pointless questions about the value of RCU_TORTURE_TEST_SLOW_INIT_DELAY
in cases where it was not even used.
This commit avoids these pointless questions by defining gp_init_delay
under #ifdef. In one branch, gp_init_delay is initialized to
RCU_TORTURE_TEST_SLOW_INIT_DELAY and is also a module parameter (thus
allowing boot-time modification), and in the other branch gp_init_delay
is a const variable initialized by default to zero.
This approach also simplifies the code at the delay point by eliminating
the IS_DEFINED(). Because gp_init_delay is constant zero in the no-delay
case intended for production use, the "gp_init_delay > 0" check causes
the delay to become dead code, as desired in this case. In addition,
this commit replaces magic constant "10" with the preprocessor variable
PER_RCU_NODE_PERIOD, which controls the number of grace periods that
are allowed to elapse at full speed before a delay is inserted.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Merge first patchbomb from Andrew Morton:
- arch/sh updates
- ocfs2 updates
- kernel/watchdog feature
- about half of mm/
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits)
Documentation: update arch list in the 'memtest' entry
Kconfig: memtest: update number of test patterns up to 17
arm: add support for memtest
arm64: add support for memtest
memtest: use phys_addr_t for physical addresses
mm: move memtest under mm
mm, hugetlb: abort __get_user_pages if current has been oom killed
mm, mempool: do not allow atomic resizing
memcg: print cgroup information when system panics due to panic_on_oom
mm: numa: remove migrate_ratelimited
mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
mm: split ET_DYN ASLR from mmap ASLR
s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
mm: expose arch_mmap_rnd when available
s390: standardize mmap_rnd() usage
powerpc: standardize mmap_rnd() usage
mips: extract logic for mmap_rnd()
arm64: standardize mmap_rnd() usage
x86: standardize mmap_rnd() usage
arm: factor out mmap ASLR into mmap_rnd
...
Additional test patterns for memtest were introduced since commit
63823126c2 ("x86: memtest: add additional (regular) test patterns"),
but looks like Kconfig was not updated that time.
Update Kconfig entry with the actual number of maximum test patterns.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memtest is a simple feature which fills the memory with a given set of
patterns and validates memory contents, if bad memory regions is detected
it reserves them via memblock API. Since memblock API is widely used by
other architectures this feature can be enabled outside of x86 world.
This patch set promotes memtest to live under generic mm umbrella and
enables memtest feature for arm/arm64.
It was reported that this patch set was useful for tracking down an issue
with some errant DMA on an arm64 platform.
This patch (of 6):
There is nothing platform dependent in the core memtest code, so other
platforms might benefit from this feature too.
[linux@roeck-us.net: MEMTEST depends on MEMBLOCK]
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull RCU changes from Ingo Molnar:
"The main changes in this cycle were:
- changes permitting use of call_rcu() and friends very early in
boot, for example, before rcu_init() is invoked.
- add in-kernel API to enable and disable expediting of normal RCU
grace periods.
- improve RCU's handling of (hotplug-) outgoing CPUs.
- NO_HZ_FULL_SYSIDLE fixes.
- tiny-RCU updates to make it more tiny.
- documentation updates.
- miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well
cpu: Defer smpboot kthread unparking until CPU known to scheduler
rcu: Associate quiescent-state reports with grace period
rcu: Yet another fix for preemption and CPU hotplug
rcu: Add diagnostics to grace-period cleanup
rcutorture: Default to grace-period-initialization delays
rcu: Handle outgoing CPUs on exit from idle loop
cpu: Make CPU-offline idle-loop transition point more precise
rcu: Eliminate ->onoff_mutex from rcu_node structure
rcu: Process offlining and onlining only at grace-period start
rcu: Move rcu_report_unblock_qs_rnp() to common code
rcu: Rework preemptible expedited bitmask handling
rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs
rcutorture: Enable slow grace-period initializations
rcu: Provide diagnostic option to slow down grace-period initialization
rcu: Detect stalls caused by failure to propagate up rcu_node tree
rcu: Eliminate empty HOTPLUG_CPU ifdef
rcu: Simplify sync_rcu_preempt_exp_init()
rcu: Put all orphan-callback-related code under same comment
rcu: Consolidate offline-CPU callback initialization
...
Recently there's been requests for better sanity
checking in the time code, so that it's more clear
when something is going wrong, since timekeeping issues
could manifest in a large number of strange ways in
various subsystems.
Thus, this patch adds some extra infrastructure to
add a check to update_wall_time() to print two new
warnings:
1) if we see the call delayed beyond the 'max_cycles'
overflow point,
2) or if we see the call delayed beyond the clocksource's
'max_idle_ns' value, which is currently 50% of the
overflow point.
This extra infrastructure is conditional on
a new CONFIG_DEBUG_TIMEKEEPING option, also
added in this patch - default off.
Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1426133800-29329-5-git-send-email-john.stultz@linaro.org
[ Improved the changelog and the messages a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Given that CPU-hotplug events are now applied only at the starts of
grace periods, it makes sense to unconditionally enable slow grace-period
initialization for rcutorture testing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Grace-period initialization normally proceeds quite quickly, so
that it is very difficult to reproduce races against grace-period
initialization. This commit therefore allows grace-period
initialization to be artificially slowed down, increasing
race-reproduction probability. A pair of new Kconfig parameters are
provided, CONFIG_RCU_TORTURE_TEST_SLOW_INIT to enable the slowdowns, and
CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY to specify the number of jiffies
of slowdown to apply. A boot-time parameter named rcutree.gp_init_delay
allows boot-time delay to be specified. By default, no delay will be
applied even if CONFIG_RCU_TORTURE_TEST_SLOW_INIT is set.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
In the past, it has been useful to enable PROVE_LOCKING without also
enabling PROVE_RCU. However, experience with PROVE_RCU over the past
few years has demonstrated its usefulness, so this commit makes
PROVE_LOCKING directly imply PROVE_RCU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This provides the basic infrastructure to load kernel-specific python
helper scripts when debugging the kernel in gdb.
The loading mechanism is based on gdb loading for <objfile>-gdb.py when
opening <objfile>. Therefore, this places a corresponding link to the
main helper script into the output directory that contains vmlinux.
The main scripts will pull in submodules containing Linux specific gdb
commands and functions. To avoid polluting the source directory with
compiled python modules, we link to them from the object directory.
Due to gdb.parse_and_eval and string redirection for gdb.execute, we
depend on gdb >= 7.2.
This feature is enabled via CONFIG_GDB_SCRIPTS.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Michal Marek <mmarek@suse.cz> [kbuild stuff]
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kernel Address sanitizer (KASan) is a dynamic memory error detector. It
provides fast and comprehensive solution for finding use-after-free and
out-of-bounds bugs.
KASAN uses compile-time instrumentation for checking every memory access,
therefore GCC > v4.9.2 required. v4.9.2 almost works, but has issues with
putting symbol aliases into the wrong section, which breaks kasan
instrumentation of globals.
This patch only adds infrastructure for kernel address sanitizer. It's
not available for use yet. The idea and some code was borrowed from [1].
Basic idea:
The main idea of KASAN is to use shadow memory to record whether each byte
of memory is safe to access or not, and use compiler's instrumentation to
check the shadow memory on each memory access.
Address sanitizer uses 1/8 of the memory addressable in kernel for shadow
memory and uses direct mapping with a scale and offset to translate a
memory address to its corresponding shadow address.
Here is function to translate address to corresponding shadow address:
unsigned long kasan_mem_to_shadow(unsigned long addr)
{
return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
where KASAN_SHADOW_SCALE_SHIFT = 3.
So for every 8 bytes there is one corresponding byte of shadow memory.
The following encoding used for each shadow byte: 0 means that all 8 bytes
of the corresponding memory region are valid for access; k (1 <= k <= 7)
means that the first k bytes are valid for access, and other (8 - k) bytes
are not; Any negative value indicates that the entire 8-bytes are
inaccessible. Different negative values used to distinguish between
different kinds of inaccessible memory (redzones, freed memory) (see
mm/kasan/kasan.h).
To be able to detect accesses to bad memory we need a special compiler.
Such compiler inserts a specific function calls (__asan_load*(addr),
__asan_store*(addr)) before each memory access of size 1, 2, 4, 8 or 16.
These functions check whether memory region is valid to access or not by
checking corresponding shadow memory. If access is not valid an error
printed.
Historical background of the address sanitizer from Dmitry Vyukov:
"We've developed the set of tools, AddressSanitizer (Asan),
ThreadSanitizer and MemorySanitizer, for user space. We actively use
them for testing inside of Google (continuous testing, fuzzing,
running prod services). To date the tools have found more than 10'000
scary bugs in Chromium, Google internal codebase and various
open-source projects (Firefox, OpenSSL, gcc, clang, ffmpeg, MySQL and
lots of others): [2] [3] [4].
The tools are part of both gcc and clang compilers.
We have not yet done massive testing under the Kernel AddressSanitizer
(it's kind of chicken and egg problem, you need it to be upstream to
start applying it extensively). To date it has found about 50 bugs.
Bugs that we've found in upstream kernel are listed in [5].
We've also found ~20 bugs in out internal version of the kernel. Also
people from Samsung and Oracle have found some.
[...]
As others noted, the main feature of AddressSanitizer is its
performance due to inline compiler instrumentation and simple linear
shadow memory. User-space Asan has ~2x slowdown on computational
programs and ~2x memory consumption increase. Taking into account that
kernel usually consumes only small fraction of CPU and memory when
running real user-space programs, I would expect that kernel Asan will
have ~10-30% slowdown and similar memory consumption increase (when we
finish all tuning).
I agree that Asan can well replace kmemcheck. We have plans to start
working on Kernel MemorySanitizer that finds uses of unitialized
memory. Asan+Msan will provide feature-parity with kmemcheck. As
others noted, Asan will unlikely replace debug slab and pagealloc that
can be enabled at runtime. Asan uses compiler instrumentation, so even
if it is disabled, it still incurs visible overheads.
Asan technology is easily portable to other architectures. Compiler
instrumentation is fully portable. Runtime has some arch-dependent
parts like shadow mapping and atomic operation interception. They are
relatively easy to port."
Comparison with other debugging features:
========================================
KMEMCHECK:
- KASan can do almost everything that kmemcheck can. KASan uses
compile-time instrumentation, which makes it significantly faster than
kmemcheck. The only advantage of kmemcheck over KASan is detection of
uninitialized memory reads.
Some brief performance testing showed that kasan could be
x500-x600 times faster than kmemcheck:
$ netperf -l 30
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
no debug: 87380 16384 16384 30.00 41624.72
kasan inline: 87380 16384 16384 30.00 12870.54
kasan outline: 87380 16384 16384 30.00 10586.39
kmemcheck: 87380 16384 16384 30.03 20.23
- Also kmemcheck couldn't work on several CPUs. It always sets
number of CPUs to 1. KASan doesn't have such limitation.
DEBUG_PAGEALLOC:
- KASan is slower than DEBUG_PAGEALLOC, but KASan works on sub-page
granularity level, so it able to find more bugs.
SLUB_DEBUG (poisoning, redzones):
- SLUB_DEBUG has lower overhead than KASan.
- SLUB_DEBUG in most cases are not able to detect bad reads,
KASan able to detect both reads and writes.
- In some cases (e.g. redzone overwritten) SLUB_DEBUG detect
bugs only on allocation/freeing of object. KASan catch
bugs right before it will happen, so we always know exact
place of first bad read/write.
[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
[2] https://code.google.com/p/address-sanitizer/wiki/FoundBugs
[3] https://code.google.com/p/thread-sanitizer/wiki/FoundBugs
[4] https://code.google.com/p/memory-sanitizer/wiki/FoundBugs
[5] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel#Trophies
Based on work by Andrey Konovalov.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Test different scenarios of function calls located in lib/hexdump.c.
Currently hex_dump_to_buffer() is only tested and test data is provided
for little endian CPUs.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) More iov_iter conversion work from Al Viro.
[ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
wrong, and this pull actually adds an extra commit on top of the
branch I'm pulling to fix that up, so that the pre-merge state is
ok. - Linus ]
2) Various optimizations to the ipv4 forwarding information base trie
lookup implementation. From Alexander Duyck.
3) Remove sock_iocb altogether, from CHristoph Hellwig.
4) Allow congestion control algorithm selection via routing metrics.
From Daniel Borkmann.
5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.
6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.
7) Add xmit_more support to r8169, e1000, and e1000e drivers. From
Florian Westphal.
8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.
9) Add BPF packet actions to packet scheduler, from Jiri Pirko.
10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer.
11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
Kwok.
12) More sanely handle out-of-window dupacks, which can result in
serious ACK storms. From Neal Cardwell.
13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
Patrick McHardy, and Thomas Graf.
14) Support xmit_more in be2net, from Sathya Perla.
15) Group Policy extensions for vxlan, from Thomas Graf.
16) Remove Checksum Offload support for vxlan, from Tom Herbert.
17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From
Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
crypto: fix af_alg_make_sg() conversion to iov_iter
ipv4: Namespecify TCP PMTU mechanism
i40e: Fix for stats init function call in Rx setup
tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
ipv6: Make __ipv6_select_ident static
ipv6: Fix fragment id assignment on LE arches.
bridge: Fix inability to add non-vlan fdb entry
net: Mellanox: Delete unnecessary checks before the function call "vunmap"
cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
net: dsa: Remove redundant phy_attach()
IB/mlx4: Reset flow support for IB kernel ULPs
IB/mlx4: Always use the correct port for mirrored multicast attachments
net/bonding: Fix potential bad memory access during bonding events
tipc: remove tipc_snprintf
tipc: nl compat add noop and remove legacy nl framework
tipc: convert legacy nl stats show to nl compat
tipc: convert legacy nl net id get to nl compat
tipc: convert legacy nl net id set to nl compat
...
Pull trivial tree changes from Jiri Kosina:
"Patches from trivial.git that keep the world turning around.
Mostly documentation and comment fixes, and a two corner-case code
fixes from Alan Cox"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
kexec, Kconfig: spell "architecture" properly
mm: fix cleancache debugfs directory path
blackfin: mach-common: ints-priority: remove unused function
doubletalk: probe failure causes OOPS
ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
msdos_fs.h: fix 'fields' in comment
scsi: aic7xxx: fix comment
ARM: l2c: fix comment
ibmraid: fix writeable attribute with no store method
dynamic_debug: fix comment
doc: usbmon: fix spelling s/unpriviledged/unprivileged/
x86: init_mem_mapping(): use capital BIOS in comment
Allow the selftest on the resizable hash table to be built modular, just
like all other tests that do not depend on DEBUG_KERNEL.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RCU_CPU_STALL_INFO code has been in for quite some time, and has
proven reliable. This commit therefore enables it by default.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.
The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.
If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.
text data bss dec hex filename
2007 0 0 2007 7d7 kernel/rcu/srcu.o
Size of arch/powerpc/boot/zImage changes from
text data bss dec hex filename
831552 64180 23944 919676 e087c arch/powerpc/boot/zImage : before
829504 64180 23952 917636 e0084 arch/powerpc/boot/zImage : after
so the savings are about ~2000 bytes.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
This is the page owner tracking code which is introduced so far ago. It
is resident on Andrew's tree, though, nobody tried to upstream so it
remain as is. Our company uses this feature actively to debug memory leak
or to find a memory hogger so I decide to upstream this feature.
This functionality help us to know who allocates the page. When
allocating a page, we store some information about allocation in extra
memory. Later, if we need to know status of all pages, we can get and
analyze it from this stored information.
In previous version of this feature, extra memory is statically defined in
struct page, but, in this version, extra memory is allocated outside of
struct page. It enables us to turn on/off this feature at boottime
without considerable memory waste.
Although we already have tracepoint for tracing page allocation/free,
using it to analyze page owner is rather complex. We need to enlarge the
trace buffer for preventing overlapping until userspace program launched.
And, launched program continually dump out the trace buffer for later
analysis and it would change system behaviour with more possibility rather
than just keeping it in memory, so bad for debug.
Moreover, we can use page_owner feature further for various purposes. For
example, we can use it for fragmentation statistics implemented in this
patch. And, I also plan to implement some CMA failure debugging feature
using this interface.
I'd like to give the credit for all developers contributed this feature,
but, it's not easy because I don't know exact history. Sorry about that.
Below is people who has "Signed-off-by" in the patches in Andrew's tree.
Contributor:
Alexander Nyberg <alexn@dsv.su.se>
Mel Gorman <mgorman@suse.de>
Dave Hansen <dave@linux.vnet.ibm.com>
Minchan Kim <minchan@kernel.org>
Michal Nazarewicz <mina86@mina86.com>
Andrew Morton <akpm@linux-foundation.org>
Jungsoo Son <jungsoo.son@lge.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The CONFIG_RCU_CPU_STALL_VERBOSE Kconfig parameter causes preemptible
RCU's CPU stall warnings to dump out any preempted tasks that are blocking
the current RCU grace period. This information is useful, and the default
has been CONFIG_RCU_CPU_STALL_VERBOSE=y for some years. It is therefore
time for this commit to remove this Kconfig parameter, so that future
kernel builds will always act as if CONFIG_RCU_CPU_STALL_VERBOSE=y.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The "_MODULE" suffix is reserved for tristates compiled as loadable kernel
modules (LKM). The "TEST_MODULE" feature thereby violates this
convention. The feature is used to compile the lib/test_module.c kernel
module.
Sadly this convention is not made explicit, but the Kconfig code documents
it. The following code (./scripts/kconfig/confdata.c) is used to generate
the autoconf.h header file during the build process. When a feature is
selected as a kernel module ('m'), it is suffixed with "_MODULE" to
indicate it.
switch (*value) {
case 'n':
break;
case 'm':
suffix = "_MODULE";
/* fall through */
This causes problems for static code analysis, which assumes a consistent
use of the "_MODULE" suffix.
This patch renames the feature and its reference in a Makefile to
"TEST_LKM", which still expresses the test of a LKM.
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
Hansen)
- Various sched/idle refinements for better idle handling (Nicolas
Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)
- sched/numa updates and optimizations (Rik van Riel)
- sysbench speedup (Vincent Guittot)
- capacity calculation cleanups/refactoring (Vincent Guittot)
- Various cleanups to thread group iteration (Oleg Nesterov)
- Double-rq-lock removal optimization and various refactorings
(Kirill Tkhai)
- various sched/deadline fixes
... and lots of other changes"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
sched/fair: Delete resched_cpu() from idle_balance()
sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
sched: Improve sysbench performance by fixing spurious active migration
sched/x86: Fix up typo in topology detection
x86, sched: Add new topology for multi-NUMA-node CPUs
sched/rt: Use resched_curr() in task_tick_rt()
sched: Use rq->rd in sched_setaffinity() under RCU read lock
sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
sched: Use dl_bw_of() under RCU read lock
sched/fair: Remove duplicate code from can_migrate_task()
sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
sched: print_rq(): Don't use tasklist_lock
sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
sched: Fix the task-group check in tg_has_rt_tasks()
sched/fair: Leverage the idle state info when choosing the "idlest" cpu
sched: Let the scheduler see CPU idle states
sched/deadline: Fix inter- exclusive cpusets migrations
sched/deadline: Clear dl_entity params when setscheduling to different class
sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
...
Pull core locking updates from Ingo Molnar:
"The main updates in this cycle were:
- mutex MCS refactoring finishing touches: improve comments, refactor
and clean up code, reduce debug data structure footprint, etc.
- qrwlock finishing touches: remove old code, self-test updates.
- small rwsem optimization
- various smaller fixes/cleanups"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Revert qrwlock recusive stuff
locking/rwsem: Avoid double checking before try acquiring write lock
locking/rwsem: Move EXPORT_SYMBOL() lines to follow function definition
locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
locking/rwlock, x86: Clean up asm/spinlock*.h to remove old rwlock code
locking/semaphore: Resolve some shadow warnings
locking/selftest: Support queued rwlock
locking/lockdep: Restrict the use of recursive read_lock() with qrwlock
locking/spinlocks: Always evaluate the second argument of spin_lock_nested()
locking/Documentation: Update locking/mutex-design.txt disadvantages
locking/Documentation: Move locking related docs into Documentation/locking/
locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate
locking/mutexes: Refactor optimistic spinning code
locking/mcs: Remove obsolete comment
locking/mutexes: Document quick lock release when unlocking
locking/mutexes: Standardize arguments in lock/unlock slowpaths
locking: Remove deprecated smp_mb__() barriers
1.
the library includes a trivial set of BPF syscall wrappers:
int bpf_create_map(int key_size, int value_size, int max_entries);
int bpf_update_elem(int fd, void *key, void *value);
int bpf_lookup_elem(int fd, void *key, void *value);
int bpf_delete_elem(int fd, void *key);
int bpf_get_next_key(int fd, void *key, void *next_key);
int bpf_prog_load(enum bpf_prog_type prog_type,
const struct sock_filter_int *insns, int insn_len,
const char *license);
bpf_prog_load() stores verifier log into global bpf_log_buf[] array
and BPF_*() macros to build instructions
2.
test stubs configure eBPF infra with 'unspec' map and program types.
These are fake types used by user space testsuite only.
3.
verifier tests valid and invalid programs and expects predefined
error log messages from kernel.
40 tests so far.
$ sudo ./test_verifier
#0 add+sub+mul OK
#1 unreachable OK
#2 unreachable2 OK
#3 out of range jump OK
#4 out of range jump2 OK
#5 test1 ld_imm64 OK
...
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.
This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: aneesh.kumar@linux.vnet.ibm.com
Cc: dzickus@redhat.com
Cc: bmr@redhat.com
Cc: jcastillo@redhat.com
Cc: oleg@redhat.com
Cc: riel@redhat.com
Cc: prarit@redhat.com
Cc: jgh@redhat.com
Cc: minchan@kernel.org
Cc: mpe@ellerman.id.au
Cc: tglx@linutronix.de
Cc: rostedt@goodmis.org
Cc: hannes@cmpxchg.org
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lubomir Rintel <lkundrak@v3.sk>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1410527779-8133-4-git-send-email-atomlin@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I was puzzled why /proc/$$/stack had disappeared, until I figured out I
had disabled the last debug option that did a 'select STACKTRACE'. This
patch makes the option show up at config time, so it can be enabled
without enabling any of the more heavyweight debug options.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We really don't want distro's enabling this in their kernels. Try and
make that more clear.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Pull kbuild updates from Michal Marek:
- make clean also considers $(extra-m) and $(extra-) to be consistent
- cleanup and fixes in scripts/Makefile.host
- allow to override the name of the Python 2 executable with make
PYTHON=... (only needed for ia64 in practice)
- option to split debugingo into *.dwo files to save disk space if the
compiler supports it (CONFIG_DEBUG_INFO_SPLIT)
- option to use dwarf4 debuginfo if the compiler supports it
(CONFIG_DEBUG_INFO_DWARF4)
- fix for disabling certain warnings with clang
- fix for unneeded rebuild with dash when a command contains
backslashes
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Fix handling of backslashes in *.cmd files
kbuild, LLVMLinux: Supress warnings unless W=1-3
Kbuild: Add a option to enable dwarf4 v2
kbuild: Support split debug info v4
kbuild: allow to override Python command name
kbuild: clean-up and bug fix of scripts/Makefile.host
kbuild: clean up scripts/Makefile.host
kbuild: drop shared library support from Makefile.host
kbuild: fix a bug of C++ host program handling
kbuild: fix a typo in scripts/Makefile.host
scripts/Makefile.clean: clean also $(extra-m) and $(extra-)
Commit a8fe19ebfb ("kernel/printk: use symbolic defines for console
loglevels") makes consistent use of symbolic values for printk() log
levels.
The naming scheme used is different from the one used for
DEFAULT_MESSAGE_LOGLEVEL though. Change that symbol name to be
MESSAGE_LOGLEVEL_DEFAULT for consistency. And because the value of that
symbol comes from a similarly-named config option, rename
CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well.
Signed-off-by: Alex Elder <elder@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jan Kara <jack@suse.cz>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights:
1) Steady transitioning of the BPF instructure to a generic spot so
all kernel subsystems can make use of it, from Alexei Starovoitov.
2) SFC driver supports busy polling, from Alexandre Rames.
3) Take advantage of hash table in UDP multicast delivery, from David
Held.
4) Lighten locking, in particular by getting rid of the LRU lists, in
inet frag handling. From Florian Westphal.
5) Add support for various RFC6458 control messages in SCTP, from
Geir Ola Vaagland.
6) Allow to filter bridge forwarding database dumps by device, from
Jamal Hadi Salim.
7) virtio-net also now supports busy polling, from Jason Wang.
8) Some low level optimization tweaks in pktgen from Jesper Dangaard
Brouer.
9) Add support for ipv6 address generation modes, so that userland
can have some input into the process. From Jiri Pirko.
10) Consolidate common TCP connection request code in ipv4 and ipv6,
from Octavian Purdila.
11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.
12) Generic resizable RCU hash table, with intial users in netlink and
nftables. From Thomas Graf.
13) Maintain a name assignment type so that userspace can see where a
network device name came from (enumerated by kernel, assigned
explicitly by userspace, etc.) From Tom Gundersen.
14) Automatic flow label generation on transmit in ipv6, from Tom
Herbert.
15) New packet timestamping facilities from Willem de Bruijn, meant to
assist in measuring latencies going into/out-of the packet
scheduler, latency from TCP data transmission to ACK, etc"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
cxgb4 : Disable recursive mailbox commands when enabling vi
net: reduce USB network driver config options.
tg3: Modify tg3_tso_bug() to handle multiple TX rings
amd-xgbe: Perform phy connect/disconnect at dev open/stop
amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
net: sun4i-emac: fix memory leak on bad packet
sctp: fix possible seqlock seadlock in sctp_packet_transmit()
Revert "net: phy: Set the driver when registering an MDIO bus device"
cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
team: Simplify return path of team_newlink
bridge: Update outdated comment on promiscuous mode
net-timestamp: ACK timestamp for bytestreams
net-timestamp: TCP timestamping
net-timestamp: SCHED timestamp on entering packet scheduler
net-timestamp: add key to disambiguate concurrent datagrams
net-timestamp: move timestamp flags out of sk_flags
net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
cxgb4i : Move stray CPL definitions to cxgb4 driver
tcp: reduce spurious retransmits due to transient SACK reneging
qlcnic: Initialize dcbnl_ops before register_netdev
...
Pull timer and time updates from Thomas Gleixner:
"A rather large update of timers, timekeeping & co
- Core timekeeping code is year-2038 safe now for 32bit machines.
Now we just need to fix all in kernel users and the gazillion of
user space interfaces which rely on timespec/timeval :)
- Better cache layout for the timekeeping internal data structures.
- Proper nanosecond based interfaces for in kernel users.
- Tree wide cleanup of code which wants nanoseconds but does hoops
and loops to convert back and forth from timespecs. Some of it
definitely belongs into the ugly code museum.
- Consolidation of the timekeeping interface zoo.
- A fast NMI safe accessor to clock monotonic for tracing. This is a
long standing request to support correlated user/kernel space
traces. With proper NTP frequency correction it's also suitable
for correlation of traces accross separate machines.
- Checkpoint/restart support for timerfd.
- A few NOHZ[_FULL] improvements in the [hr]timer code.
- Code move from kernel to kernel/time of all time* related code.
- New clocksource/event drivers from the ARM universe. I'm really
impressed that despite an architected timer in the newer chips SoC
manufacturers insist on inventing new and differently broken SoC
specific timers.
[ Ed. "Impressed"? I don't think that word means what you think it means ]
- Another round of code move from arch to drivers. Looks like most
of the legacy mess in ARM regarding timers is sorted out except for
a few obnoxious strongholds.
- The usual updates and fixlets all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
timekeeping: Fixup typo in update_vsyscall_old definition
clocksource: document some basic timekeeping concepts
timekeeping: Use cached ntp_tick_length when accumulating error
timekeeping: Rework frequency adjustments to work better w/ nohz
timekeeping: Minor fixup for timespec64->timespec assignment
ftrace: Provide trace clocks monotonic
timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
seqcount: Add raw_write_seqcount_latch()
seqcount: Provide raw_read_seqcount()
timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
timekeeping: Create struct tk_read_base and use it in struct timekeeper
timekeeping: Restructure the timekeeper some more
clocksource: Get rid of cycle_last
clocksource: Move cycle_last validation to core code
clocksource: Make delta calculation a function
wireless: ath9k: Get rid of timespec conversions
drm: vmwgfx: Use nsec based interfaces
drm: i915: Use nsec based interfaces
timekeeping: Provide ktime_get_raw()
hangcheck-timer: Use ktime_get_ns()
...
Here's the big driver-core pull request for 3.17-rc1.
Largest thing in here is the dma-buf rework and fence code, that touched
many different subsystems so it was agreed it should go through this
tree to handle merge issues. There's also some firmware loading
updates, as well as tests added, and a few other tiny changes, the
changelog has the details.
All have been in linux-next for a long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlPf1XcACgkQMUfUDdst+ylREACdHLXBa02yLrRzbrONJ+nARuFv
JuQAoMN49PD8K9iMQpXqKBvZBsu+iCIY
=w8OJ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the big driver-core pull request for 3.17-rc1.
Largest thing in here is the dma-buf rework and fence code, that
touched many different subsystems so it was agreed it should go
through this tree to handle merge issues. There's also some firmware
loading updates, as well as tests added, and a few other tiny changes,
the changelog has the details.
All have been in linux-next for a long time"
* tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
ARM: imx: Remove references to platform_bus in mxc code
firmware loader: Fix _request_firmware_load() return val for fw load abort
platform: Remove most references to platform_bus device
test: add firmware_class loader test
doc: fix minor typos in firmware_class README
staging: android: Cleanup style issues
Documentation: devres: Sort managed interfaces
Documentation: devres: Add devm_kmalloc() et al
fs: debugfs: remove trailing whitespace
kernfs: kernel-doc warning fix
debugfs: Fix corrupted loop in debugfs_remove_recursive
stable_kernel_rules: Add pointer to netdev-FAQ for network patches
driver core: platform: add device binding path 'driver_override'
driver core/platform: remove unused implicit padding in platform_object
firmware loader: inform direct failure when udev loader is disabled
firmware: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN
firmware: read firmware size using i_size_read()
firmware loader: allow disabling of udev as firmware loader
reservation: add suppport for read-only access using rcu
reservation: update api and add some helpers
...
Conflicts:
drivers/base/platform.c
Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:
- big rtmutex and futex cleanup and robustification from Thomas
Gleixner
- mutex optimizations and refinements from Jason Low
- arch_mutex_cpu_relax() removal and related cleanups
- smaller lockdep tweaks"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
arch, locking: Ciao arch_mutex_cpu_relax()
locking/lockdep: Only ask for /proc/lock_stat output when available
locking/mutexes: Optimize mutex trylock slowpath
locking/mutexes: Try to acquire mutex only if it is unlocked
locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro
locking/mutexes: Correct documentation on mutex optimistic spinning
rtmutex: Make the rtmutex tester depend on BROKEN
futex: Simplify futex_lock_pi_atomic() and make it more robust
futex: Split out the first waiter attachment from lookup_pi_state()
futex: Split out the waiter check from lookup_pi_state()
futex: Use futex_top_waiter() in lookup_pi_state()
futex: Make unlock_pi more robust
rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
rtmutex: Cleanup deadlock detector debug logic
rtmutex: Confine deadlock logic to futex
rtmutex: Simplify remove_waiter()
rtmutex: Document pi chain walk
rtmutex: Clarify the boost/deboost part
rtmutex: No need to keep task ref for lock owner check
rtmutex: Simplify and document try_to_take_rtmutex()
...
Generic implementation of a resizable, scalable, concurrent hash table
based on [0]. The implementation supports both, fixed size keys specified
via an offset and length, or arbitrary keys via own hash and compare
functions.
Lookups are lockless and protected as RCU read side critical sections.
Automatic growing/shrinking based on user configurable watermarks is
available while allowing concurrent lookups to take place.
Objects to be hashed must include a struct rhash_head. The reason for not
using the existing struct hlist_head is that the expansion and shrinking
will have two buckets point to a single entry which would lead in obscure
reverse chaining behaviour.
Code includes a boot selftest if CONFIG_TEST_RHASHTABLE is defined.
[0] https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdf
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found that a lot of unresolvable variables when using gdb on the
kernel become resolvable when dwarf4 is enabled. So add a Kconfig flag
to enable it.
It definitely increases the debug information size, but on the other
hand this isn't so bad when debug fusion is used.
v2: Use cc-option
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
This is an alternative approach to lower the overhead of debug info
(as we discussed a few days ago)
gcc 4.7+ and newer binutils have a new "split debug info" debug info
model where the debug info is only written once into central ".dwo" files.
This avoids having to copy it around multiple times, from the object
files to the final executable. It lowers the disk space
requirements. In addition it defaults to compressed debug data.
More details here: http://gcc.gnu.org/wiki/DebugFission
This patch adds a new option to enable it. It has to be an option,
because it'll undoubtedly break everyone's debuginfo packaging scheme.
gdb/objdump/etc. all still work, if you have new enough versions.
I don't see big compile wins (maybe a second or two faster or so), but the
object dirs with debuginfo get significantly smaller. My standard kernel
config (slightly bigger than defconfig) shrinks from 2.9G disk space
to 1.1G objdir (with non reduced debuginfo). I presume if you are IO limited
the compile time difference will be larger.
Only problem I've seen so far is that it doesn't play well with older
versions of ccache (apparently fixed, see
https://bugzilla.samba.org/show_bug.cgi?id=10005)
v2: various fixes from Dirk Gouders. Improve commit message slightly.
v3: Fix clean rules and improve Kconfig slightly
v4: Fix merge error in last version (Sam Ravnborg)
Clarify description that it mainly helps disk size.
Cc: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Create a module that allows udelay() to be executed to ensure that
it is delaying at least as long as requested (with a little bit of
error allowed).
There are some configurations which don't have reliably udelay
due to using a loop delay with cpufreq changes which should use
a counter time based delay instead. This test aims to identify
those configurations where timing is unreliable.
Signed-off-by: David Riley <davidriley@chromium.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
This provides a simple interface to trigger the firmware_class loader
to test built-in, filesystem, and user helper modes. Additionally adds
tests via the new interface to the selftests tree.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CONFIG_PROVE_RCU_DELAY Kconfig parameter doesn't appear to be very
effective at finding race conditions, so this commit removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
[ paulmck: Remove definition and uses as noted by Paul Bolle. ]
It has been broken for quite some time. Just the recent updates made
it compile time broken. Make it depend on BROKEN instead of removing
it right away as we want a proper replacement.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull networking updates from David Miller:
1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.
2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
Benniston.
3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
Mork.
4) BPF now has a "random" opcode, from Chema Gonzalez.
5) Add more BPF documentation and improve test framework, from Daniel
Borkmann.
6) Support TCP fastopen over ipv6, from Daniel Lee.
7) Add software TSO helper functions and use them to support software
TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia.
8) Support software TSO in fec driver too, from Nimrod Andy.
9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.
10) Handle broadcasts more gracefully over macvlan when there are large
numbers of interfaces configured, from Herbert Xu.
11) Allow more control over fwmark used for non-socket based responses,
from Lorenzo Colitti.
12) Do TCP congestion window limiting based upon measurements, from Neal
Cardwell.
13) Support busy polling in SCTP, from Neal Horman.
14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.
15) Bridge promisc mode handling improvements from Vlad Yasevich.
16) Don't use inetpeer entries to implement ID generation any more, it
performs poorly, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
tcp: fixing TLP's FIN recovery
net: fec: Add software TSO support
net: fec: Add Scatter/gather support
net: fec: Increase buffer descriptor entry number
net: fec: Factorize feature setting
net: fec: Enable IP header hardware checksum
net: fec: Factorize the .xmit transmit function
bridge: fix compile error when compiling without IPv6 support
bridge: fix smatch warning / potential null pointer dereference
via-rhine: fix full-duplex with autoneg disable
bnx2x: Enlarge the dorq threshold for VFs
bnx2x: Check for UNDI in uncommon branch
bnx2x: Fix 1G-baseT link
bnx2x: Fix link for KR with swapped polarity lane
sctp: Fix sk_ack_backlog wrap-around problem
net/core: Add VF link state control policy
net/fsl: xgmac_mdio is dependent on OF_MDIO
net/fsl: Make xgmac_mdio read error message useful
net_sched: drr: warn when qdisc is not work conserving
...
Pull drm updates from Dave Airlie:
"This is the main drm merge window pull request, changes all over the
place, mostly normal levels of churn.
Highlights:
Core drm:
More cleanups, fix race on connector/encoder naming, docs updates,
object locking rework in prep for atomic modeset
i915:
mipi DSI support, valleyview power fixes, cursor size fixes,
execlist refactoring, vblank improvements, userptr support, OOM
handling improvements
radeon:
GPUVM tuning and large page size support, gart fixes, deep color
HDMI support, HDMI audio cleanups
nouveau:
- displayport rework should fix lots of issues
- initial gk20a support
- gk110b support
- gk208 fixes
exynos:
probe order fixes, HDMI changes, IPP consolidation
msm:
debugfs updates, misc fixes
ast:
ast2400 support, sync with UMS driver
tegra:
cleanups, hdmi + hw cursor for Tegra 124.
panel:
fixes existing panels add some new ones.
ipuv3:
moved from staging to drivers/gpu"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (761 commits)
drm/nouveau/disp/dp: fix tmds passthrough on dp connector
drm/nouveau/dp: probe dpcd to determine connectedness
drm/nv50-: trigger update after all connectors disabled
drm/nv50-: prepare for attaching a SOR to multiple heads
drm/gf119-/disp: fix debug output on update failure
drm/nouveau/disp/dp: make use of postcursor when its available
drm/g94-/disp/dp: take max pullup value across all lanes
drm/nouveau/bios/dp: parse lane postcursor data
drm/nouveau/dp: fix support for dpms
drm/nouveau: register a drm_dp_aux channel for each dp connector
drm/g94-/disp: add method to power-off dp lanes
drm/nouveau/disp/dp: maintain link in response to hpd signal
drm/g94-/disp: bash and wait for something after changing lane power regs
drm/nouveau/disp/dp: split link config/power into two steps
drm/nv50/disp: train PIOR-attached DP from second supervisor
drm/nouveau/disp/dp: make use of existing output data for link training
drm/gf119/disp: start removing direct vbios parsing from supervisor
drm/nv50/disp: start removing direct vbios parsing from supervisor
drm/nouveau/disp/dp: maintain receiver caps in response to hpd signal
drm/nouveau/disp/dp: create subclass for dp outputs
...
Change CONFIG_DEBUG_PI_LIST to be user-selectable, and add a title and
description. Remove the dependency on DEBUG_RT_MUTEXES since they were
changed to use rbtrees, and there are other users of plists now.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce a CONFIG_DEBUG_VM_VMACACHE option to enable counting the cache
hit rate -- exported in /proc/vmstat.
Any updates to the caching scheme needs this kind of data, thus it can
save some work re-implementing the counting all the time.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull trivial tree changes from Jiri Kosina:
"Usual pile of patches from trivial tree that make the world go round"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
staging: go7007: remove reference to CONFIG_KMOD
aic7xxx: Remove obsolete preprocessor define
of: dma: doc fixes
doc: fix incorrect formula to calculate CommitLimit value
doc: Note need of bc in the kernel build from 3.10 onwards
mm: Fix printk typo in dmapool.c
modpost: Fix comment typo "Modules.symvers"
Kconfig.debug: Grammar s/addition/additional/
wimax: Spelling s/than/that/, wording s/destinatary/recipient/
aic7xxx: Spelling s/termnation/termination/
arm64: mm: Remove superfluous "the" in comment
of: Spelling s/anonymouns/anonymous/
dma: imx-sdma: Spelling s/determnine/determine/
ath10k: Improve grammar in comments
ath6kl: Spelling s/determnine/determine/
of: Improve grammar for of_alias_get_id() documentation
drm/exynos: Spelling s/contro/control/
radio-bcm2048.c: fix wrong overflow check
doc: printk-formats: do not mention casts for u64/s64
doc: spelling error changes
...
The testsuite covers classic and internal BPF instructions.
It is particularly useful for JIT compiler developers.
Adds to "net" selftest target.
The testsuite can be used as a set of micro-benchmarks.
It measures execution time of each BPF program in nsec.
This patch adds core framework.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
lib/interval_tree.c provides a simple interface for an interval-tree
(an augmented red-black tree) but is only built when testing the generic
macros for building interval-trees. For drivers with modest needs,
export the simple interval-tree library as is.
v2: Lots of help from Michel Lespinasse to only compile the code
as required:
- make INTERVAL_TREE a config option
- make INTERVAL_TREE_TEST select the library functions
and sanitize the filenames & Makefile
- prepare interval_tree for being built as a module if required
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
[Acked for inclusion via drm/i915 by Andrew Morton.]
[danvet: switch to _GPL as per the mailing list discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This appears to be a copy/paste error. Update the description to
reflect extra rbtree debug and checks for the config option instead of
duplicating CONFIG_DEBUG_VM.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs updates from Al Viro:
"The first vfs pile, with deep apologies for being very late in this
window.
Assorted cleanups and fixes, plus a large preparatory part of iov_iter
work. There's a lot more of that, but it'll probably go into the next
merge window - it *does* shape up nicely, removes a lot of
boilerplate, gets rid of locking inconsistencie between aio_write and
splice_write and I hope to get Kent's direct-io rewrite merged into
the same queue, but some of the stuff after this point is having
(mostly trivial) conflicts with the things already merged into
mainline and with some I want more testing.
This one passes LTP and xfstests without regressions, in addition to
usual beating. BTW, readahead02 in ltp syscalls testsuite has started
giving failures since "mm/readahead.c: fix readahead failure for
memoryless NUMA nodes and limit readahead pages" - might be a false
positive, might be a real regression..."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
missing bits of "splice: fix racy pipe->buffers uses"
cifs: fix the race in cifs_writev()
ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
kill generic_file_buffered_write()
ocfs2_file_aio_write(): switch to generic_perform_write()
ceph_aio_write(): switch to generic_perform_write()
xfs_file_buffered_aio_write(): switch to generic_perform_write()
export generic_perform_write(), start getting rid of generic_file_buffer_write()
generic_file_direct_write(): get rid of ppos argument
btrfs_file_aio_write(): get rid of ppos
kill the 5th argument of generic_file_buffered_write()
kill the 4th argument of __generic_file_aio_write()
lustre: don't open-code kernel_recvmsg()
ocfs2: don't open-code kernel_recvmsg()
drbd: don't open-code kernel_recvmsg()
constify blk_rq_map_user_iov() and friends
lustre: switch to kernel_sendmsg()
ocfs2: don't open-code kernel_sendmsg()
take iov_iter stuff to mm/iov_iter.c
process_vm_access: tidy up a bit
...
This commit adds the locking counterpart to rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Make n_lock_torture_errors and torture_spinlock static
as suggested by Fengguang Wu. ]
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Because rcu_torture_random() will be used by the locking equivalent to
rcutorture, pull it out into its own module. This new module cannot
be separately configured, instead, use the Kconfig "select" statement
from the Kconfig options of tests depending on it.
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
It really isn't very interesting to have DEBUG_INFO when doing compile
coverage stuff (you wouldn't want to run the result anyway, that's kind
of the whole point of COMPILE_TEST), and it currently makes the build
take longer and use much more disk space for "all{yes,mod}config".
There's somewhat active discussion about this still, and we might end up
with some new config option for things like this (Andi points out that
the silly X86_DECODER_SELFTEST option also slows down the normal
coverage tests hugely), but I'm starting the ball rolling with this
simple one-liner.
DEBUG_INFO isn't that noticeable if you have tons of memory and a good
IO subsystem, but it hurts you a lot if you don't - for very little
upside for the common use.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To help avoid an architecture failing to correctly check kernel/user
boundaries when handling copy_to_user, copy_from_user, put_user, or
get_user, perform some simple tests and fail to load if any of them
behave unexpectedly.
Specifically, this is to make sure there is a way to notice if things
like what was fixed in commit 8404663f81 ("ARM: 7527/1: uaccess:
explicitly check __user pointer when !CPU_USE_DOMAINS") ever regresses
again, for any architecture.
Additionally, adds new "user" selftest target, which loads this module.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a pair of test modules I'd like to see in the tree. Instead of
putting these in lkdtm, where I've been adding various tests that trigger
crashes, these don't make sense there since they need to be either
distinctly separate, or their pass/fail state don't need to crash the
machine.
These live in lib/ for now, along with a few other in-kernel test modules,
and use the slightly more common "test_" naming convention, instead of
"test-". We should likely standardize on the former:
$ find . -name 'test_*.c' | grep -v /tools/ | wc -l
4
$ find . -name 'test-*.c' | grep -v /tools/ | wc -l
2
The first is entirely a no-op module, designed to allow simple testing of
the module loading and verification interface. It's useful to have a
module that has no other uses or dependencies so it can be reliably used
for just testing module loading and verification.
The second is a module that exercises the user memory access functions, in
an effort to make sure that we can quickly catch any regressions in
boundary checking (e.g. like what was recently fixed on ARM).
This patch (of 2):
When doing module loading verification tests (for example, with module
signing, or LSM hooks), it is very handy to have a module that can be
built on all systems under test, isn't auto-loaded at boot, and has no
device or similar dependencies. This creates the "test_module.ko" module
for that purpose, which only reports its load and unload to printk.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Record actively mapped pages and provide an api for asserting a given
page is dma inactive before execution proceeds. Placing
debug_dma_assert_idle() in cow_user_page() flagged the violation of the
dma-api in the NET_DMA implementation (see commit 7787380336 "net_dma:
mark broken").
The implementation includes the capability to count, in a limited way,
repeat mappings of the same page that occur without an intervening
unmap. This 'overlap' counter is limited to the few bits of tag space
in a radix tree. This mechanism is added to mitigate false negative
cases where, for example, a page is dma mapped twice and
debug_dma_assert_idle() is called after the page is un-mapped once.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes it possible to debug kernel over FireWire without the need to
recompile it.
[Stefan R: changed description from "...0" to "...N"]
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The panic_timeout value can be set via the command line option
'panic=x', or via /proc/sys/kernel/panic, however that is not
sufficient when the panic occurs before we are able to set up
these values. Thus, add a CONFIG_PANIC_TIMEOUT so that we can
set the desired value from the .config.
The default panic_timeout value continues to be 0 - wait
forever. Also adds set_arch_panic_timeout(new_timeout,
arch_default_timeout), which is intended to be used by arches in
arch_setup(). The idea being that the new_timeout is only set if
the user hasn't changed from the arch_default_timeout.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: benh@kernel.crashing.org
Cc: paulus@samba.org
Cc: ralf@linux-mips.org
Cc: mpe@ellerman.id.au
Cc: felipe.contreras@gmail.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1a1674daec27c534df409697025ac568ebcee91e.1385418410.git.jbaron@akamai.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Here's the big tty/serial driver update for 3.13-rc1.
There's some more minor n_tty work here, but nothing like previous
kernel releases. Also some new driver ids, driver updates for new
hardware, and other small things.
All of this has been in linux-next for a while with no issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlJ6xnEACgkQMUfUDdst+ylj1QCfaIzUNuFfzTmyB6K6iZrRNhCc
WPgAnRLkkEpI/3rRo7jKwGQsuIuyybUu
=r149
-----END PGP SIGNATURE-----
Merge tag 'tty-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here's the big tty/serial driver update for 3.13-rc1.
There's some more minor n_tty work here, but nothing like previous
kernel releases. Also some new driver ids, driver updates for new
hardware, and other small things.
All of this has been in linux-next for a while with no issues"
* tag 'tty-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (84 commits)
serial: omap: fix missing comma
serial: sh-sci: Enable the driver on all ARM platforms
serial: mfd: Staticize local symbols
serial: omap: fix a few checkpatch warnings
serial: omap: improve RS-485 performance
mrst_max3110: fix unbalanced IRQ issue during resume
serial: omap: Add support for optional wake-up
serial: sirf: remove duplicate defines
tty: xuartps: Fix build error when COMMON_CLK is not set
tty: xuartps: Fix build error due to missing forward declaration
tty: xuartps: Fix "may be used uninitialized" build warning
serial: 8250_pci: add Pericom PCIe Serial board Support (12d8:7952/4/8) - Chip PI7C9X7952/4/8
tty: xuartps: Update copyright information
tty: xuartps: Implement suspend/resume callbacks
tty: xuartps: Dynamically adjust to input frequency changes
tty: xuartps: Updating set_baud_rate()
tty: xuartps: Force enable the UART in xuartps_console_write
tty: xuartps: support 64 byte FIFO size
tty: xuartps: Add polled mode support for xuartps
tty: xuartps: Implement BREAK detection, add SYSRQ support
...
Without the timer debugging, the delayed kobject release will just
result in undebuggable oopses if it triggers any latent bugs. That
doesn't actually help debugging at all.
So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid
having people enable one without the other.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Turn the initial value of sysctl kernel.sysrq (SYSRQ_DEFAULT_ENABLE)
into a Kconfig variable.
Original version by Bastian Blank <waldi@debian.org>.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
No reason require rbtree test code to be a module, allow it to be builtin
(streamlines my development process)
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frame pointer on ARC doesn't serve the conventional purpose of stack
unwinding due to the typical way ABI designates it's usage.
Thus it's explicit usage on ARC is discouraged (gcc is free to use it,
for some tricky stack frames even if -fomit-frame-pointer).
Hence no point enabling it for ARC.
References: http://www.spinics.net/lists/kernel/msg1593937.html
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: linux-kernel@vger.kernel.org
Implement debugging for kobject release functions. kobjects are
reference counted, so the drop of the last reference to them is not
predictable. However, the common case is for the last reference to be
the kobject's removal from a subsystem, which results in the release
function being immediately called.
This can hide subtle bugs, which can occur when another thread holds a
reference to the kobject at the same time that a kobject is removed.
This results in the release method being delayed.
In order to make these kinds of problems more visible, the following
patch implements a delayed release; this has the effect that the
release function will be out of order with respect to the removal of
the kobject in the same manner that it would be if a reference was
being held.
This provides us with an easy way to allow driver writers to debug
their drivers and fix otherwise hidden problems.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
This removes all the uses of the __cpuinit macros from C files in
the core kernel directories (kernel, init, lib, mm, and include)
that don't really have a specific maintainer.
[1] https://lkml.org/lkml/2013/5/20/589
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Pull MIPS updates from Ralf Baechle:
"MIPS updates:
- All the things that didn't make 3.10.
- Removes the Windriver PPMC platform. Nobody will miss it.
- Remove a workaround from kernel/irq/irqdomain.c which was there
exclusivly for MIPS. Patch by Grant Likely.
- More small improvments for the SEAD 3 platform
- Improvments on the BMIPS / SMP support for the BCM63xx series.
- Various cleanups of dead leftovers.
- Platform support for the Cavium Octeon-based EdgeRouter Lite.
Two large KVM patchsets didn't make it for this pull request because
their respective authors are vacationing"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (124 commits)
MIPS: Kconfig: Add missing MODULES dependency to VPE_LOADER
MIPS: BCM63xx: CLK: Add dummy clk_{set,round}_rate() functions
MIPS: SEAD3: Disable L2 cache on SEAD-3.
MIPS: BCM63xx: Enable second core SMP on BCM6328 if available
MIPS: BCM63xx: Add SMP support to prom.c
MIPS: define write{b,w,l,q}_relaxed
MIPS: Expose missing pci_io{map,unmap} declarations
MIPS: Malta: Update GCMP detection.
Revert "MIPS: make CAC_ADDR and UNCAC_ADDR account for PHYS_OFFSET"
MIPS: APSP: Remove <asm/kspd.h>
SSB: Kconfig: Amend SSB_EMBEDDED dependencies
MIPS: microMIPS: Fix improper definition of ISA exception bit.
MIPS: Don't try to decode microMIPS branch instructions where they cannot exist.
MIPS: Declare emulate_load_store_microMIPS as a static function.
MIPS: Fix typos and cleanup comment
MIPS: Cleanup indentation and whitespace
MIPS: BMIPS: support booting from physical CPU other than 0
MIPS: Only set cpu_has_mmips if SYS_SUPPORTS_MICROMIPS
MIPS: GIC: Fix gic_set_affinity infinite loop
MIPS: Don't save/restore OCTEON wide multiplier state on syscalls.
...
Merge Kconfig menu diet patches from Dave Hansen:
"I think the "Kernel Hacking" menu has gotten a bit out of hand. It is
over 120 lines long on my system with everything enabled and options
are scattered around it haphazardly.
http://sr71.net/~dave/linux/kconfig-horror.png
Let's try to introduce some sanity. This set takes that 120 lines
down to 55 and makes it vastly easier to find some things. It's a
start.
This set stands on its own, but there is plenty of room for follow-up
patches. The arch-specific debug options still end up getting stuck
in the top-level "kernel hacking" menu. OPTIMIZE_INLINING, for
instance, could obviously go in to the "compiler options" menu, but
the fact that it is defined in arch/ in a separate Kconfig file keeps
it on its own for the moment.
The Signed-off-by's in here look funky. I changed employers while
working on this set, so I have signoffs from both email addresses"
* emailed patches from Dave Hansen <dave@sr71.net>:
hang and lockup detection menu
kconfig: consolidate printk options
group locking debugging options
consolidate compilation option configs
consolidate runtime testing configs
order memory debugging Kconfig options
consolidate per-arch stack overflow debugging options
The hard/softlockup and hung-task entries take up 6 lines
of screen real-estate when enabled. I bet folks don't
mess with these _that_ often, so move them in a group
down a level.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Same deal, take the printk-related things and hide them in a menu.
This takes another 4 items out of the top-level menu.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original posting:
http://lkml.kernel.org/r/20121214184208.D9E5804D@kernel.stglabs.ibm.com
There are quite a few of these, and we want to make sure that
there is one-stop-shopping for lock debugging.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original Post:
http://lkml.kernel.org/r/20121214184207.6E00DDEC@kernel.stglabs.ibm.com
Again, trying to come up with some common themes of the stuff in
the kernel hacking menu... There are quite a few options to
tweak compilation in some way, or perform extra compile-time
checks. Give them their own menu.
The diff here looks a bit funny... makes it look like I'm
moving debugfs even though I'm actually moving the options on
either side of it.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original posting:
http://lkml.kernel.org/r/20121214184206.FC11422F@kernel.stglabs.ibm.com
These runtime tests are great, except that there are a lot of them,
and they are very rarely needed. Give them their own menu so that
only the folks who need them will have to go looking for them.
Note that there are some other runtime tests that are not in here,
like for RCU or locking. This menu should only be used for tests
that do not have a more appropriate home.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original posting:
http://lkml.kernel.org/r/20121214184203.37E6C724@kernel.stglabs.ibm.com
There are a *LOT* of memory debugging options. They are just scattered
all over the "Kernel Hacking" menu. Sure, "memory debugging" is a very
vague term and it's going to be hard to make absolute rules about what
goes in here, but this has to be better than what we had before.
This does, however, leave out the architecture-specific memory
debugging options (like x86's DEBUG_SET_MODULE_RONX). There would need
to be some substantial changes to move those in here. Kconfig can not
easily mix arch-specific and generic options together: it really
requires a file per-architecture, and I think having an
arch/foo/Kconfig.debug-memory might be taking things a bit too far
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Original posting:
http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com
Several architectures have similar stack debugging config options.
They all pretty much do the same thing, some with slightly
differing help text.
This patch changes the architectures to instead enable a Kconfig
boolean, and then use that boolean in the generic Kconfig.debug
to present the actual menu option. This removes a bunch of
duplication and adds consistency across arches.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Injects EDEADLK conditions at pseudo-random interval, with
exponential backoff up to UINT_MAX (to ensure that every lock
operation still completes in a reasonable time).
This way we can test the wound slowpath even for ww mutex users
where contention is never expected, and the ww deadlock
avoidance algorithm is only needed for correctness against
malicious userspace. An example would be protecting kernel
modesetting properties, which thanks to single-threaded X isn't
really expected to contend, ever.
I've looked into using the CONFIG_FAULT_INJECTION
infrastructure, but decided against it for two reasons:
- EDEADLK handling is mandatory for ww mutex users and should
never affect the outcome of a syscall. This is in contrast to -ENOMEM
injection. So fine configurability isn't required.
- The fault injection framework only allows to set a simple
probability for failure. Now the probability that a ww mutex acquire
stage with N locks will never complete (due to too many injected
EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
operations for the completely uncontended case would be O(exp(N)).
The per-acuiqire ctx exponential backoff solution choosen here only
results in O(log N) overhead due to injection and so O(log N * N)
lock operations. This way we can fail with high probability (and so
have good test coverage even for fancy backoff and lock acquisition
paths) without running into patalogical cases.
Note that EDEADLK will only ever be injected when we managed to
acquire the lock. This prevents any behaviour changes for users
which rely on the EALREADY semantics.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: rostedt@goodmis.org
Cc: daniel@ffwll.ch
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130620113117.4001.21681.stgit@patser
Signed-off-by: Ingo Molnar <mingo@kernel.org>
FAULT_INJECTION_STACKTRACE_FILTER selects FRAME_POINTER but
that symbol is not available for MIPS.
Fixes the following problem on a randconfig:
warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP &&
KMEMCHECK) selects FRAME_POINTER which has unmet direct dependencies
(DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN ||
MN10300 || METAG) || ARCH_WANT_FRAME_POINTERS)
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5441/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Kconfig help text for MEMORY_NOTIFIER_ERROR_INJECT and
OF_RECONFIG_NOTIFIER_ERROR_INJECT has mismatched module names.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files. Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.
To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.
Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs. ones which emit errors. The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.
While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several places in kernel where modules unescapes input to convert
C-Style Escape Sequences into byte codes.
The patch provides generic implementation of such approach. Test cases are
also included into the patch.
[akpm@linux-foundation.org: clarify comment]
[akpm@linux-foundation.org: export get_random_int() to modules]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: William Hubbs <w.d.hubbs@gmail.com>
Cc: Chris Brannon <chris@the-brannons.com>
Cc: Kirk Reiser <kirk@braille.uwo.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:
- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()
Signed-off-by: James Hogan <james.hogan@imgtec.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJRMmVXAAoJEKHZs+irPybfivgP/inEXqJyfw59omQdjwvYcU/a
/u0MJ3UKSNS3U+HknfaFCy/Nwk1dqPLjqqyVC1V6AbUPBXlaEwGcimlNRx2uRjdq
Uh96upMLHsNuF/xiiR477g3RwY0egIJdM1R1bGi3mZ3vVrNQGF+wbni6f61xCWGz
M/4rDglpQvE79oLhYdgj6tidZtHQT0YWtERA9W90zkQWXGYmpFPKBKbfZAi5+rKQ
U6Gpg26orUugzXNaxltJEYKE8gjLTppEabx8DARnItZ4zCMy4dw5RBJ35RFvQw6e
eSmfgTy9w9WqBMY2+QMSgU0KQt1IITCzX7OlOXC0jALQJXoU0WWbOELlBVQLCwF1
T0OcR/5ZP/hIlOk5Dh+e9U3AtbASXdMtqA0ZUe78woH1CBf7Nc/0c0vRg23EdMh8
lnHDJxT/UqskoOYLI4kgWbEdLDy4uTh19U2pVi7VCo7ksLB9Bj9Xc8VSKgscSXTl
OwzN+c4Jgtu8FDFTp+Af4AT8pYGJ08j8L2ErsV2sOv3Q44U5WXdrMz3GSgwXj8+4
wZk3HvdkQVkMD5sJCUZgAswaN6BnbB0pHdCz4wMQ8jR/Ogs015Ipk64Ecym9S/4n
uES7PnDtt/4lb5EyX2ScbvdnZTAFTaaP7OOhC77BOQvbQjIW1tkAcxWJqRry86uS
iM0BFgK6Ohx3geqa5Ft0
=65cR
-----END PGP SIGNATURE-----
Merge tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull new ImgTec Meta architecture from James Hogan:
"This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:
- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of
metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()"
* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
metag: Provide dma_get_sgtable()
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
metag: copy devicetree to non-init memory
metag: cleanup metag_ksyms.c includes
metag: move mm/init.c exports out of metag_ksyms.c
metag: move usercopy.c exports out of metag_ksyms.c
metag: move setup.c exports out of metag_ksyms.c
metag: move kick.c exports out of metag_ksyms.c
metag: move traps.c exports out of metag_ksyms.c
metag: move irq enable out of irqflags.h on SMP
genksyms: fix metag symbol prefix on crc symbols
metag: hugetlb: convert to vm_unmapped_area()
metag: export clear_page and copy_page
metag: export metag_code_cache_flush_all
metag: protect more non-MMU memory regions
metag: make TXPRIVEXT bits explicit
metag: kernel/setup.c: sort includes
perf: Enable building perf tools for Meta
metag: add boot time LNKGET/LNKSET check
metag: add __init to metag_cache_probe()
...
Add [!]METAG to a couple of Kconfig dependencies in lib/Kconfig.debug.
Don't allow stack utilization instrumentation on metag, and allow
building with frame pointers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Merge misc patches from Andrew Morton:
- Florian has vanished so I appear to have become fbdev maintainer
again :(
- Joel and Mark are distracted to welcome to the new OCFS2 maintainer
- The backlight queue
- Small core kernel changes
- lib/ updates
- The rtc queue
- Various random bits
* akpm: (164 commits)
rtc: rtc-davinci: use devm_*() functions
rtc: rtc-max8997: use devm_request_threaded_irq()
rtc: rtc-max8907: use devm_request_threaded_irq()
rtc: rtc-da9052: use devm_request_threaded_irq()
rtc: rtc-wm831x: use devm_request_threaded_irq()
rtc: rtc-tps80031: use devm_request_threaded_irq()
rtc: rtc-lp8788: use devm_request_threaded_irq()
rtc: rtc-coh901331: use devm_clk_get()
rtc: rtc-vt8500: use devm_*() functions
rtc: rtc-tps6586x: use devm_request_threaded_irq()
rtc: rtc-imxdi: use devm_clk_get()
rtc: rtc-cmos: use dev_warn()/dev_dbg() instead of printk()/pr_debug()
rtc: rtc-pcf8583: use dev_warn() instead of printk()
rtc: rtc-sun4v: use pr_warn() instead of printk()
rtc: rtc-vr41xx: use dev_info() instead of printk()
rtc: rtc-rs5c313: use pr_err() instead of printk()
rtc: rtc-at91rm9200: use dev_dbg()/dev_err() instead of printk()/pr_debug()
rtc: rtc-rs5c372: use dev_dbg()/dev_warn() instead of printk()/pr_debug()
rtc: rtc-ds2404: use dev_err() instead of printk()
rtc: rtc-efi: use dev_err()/dev_warn()/pr_err() instead of printk()
...
CONFIG_EXPERT doesn't really make sense, and hides it unintentionally.
Remove superfluous "default n" pointed out by Ingo as well.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the big driver core merge for 3.9-rc1
There are two major series here, both of which touch lots of drivers all
over the kernel, and will cause you some merge conflicts:
- add a new function called devm_ioremap_resource() to properly be
able to check return values.
- remove CONFIG_EXPERIMENTAL
If you need me to provide a merged tree to handle these resolutions,
please let me know.
Other than those patches, there's not much here, some minor fixes and
updates.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlEmV0cACgkQMUfUDdst+yncCQCfbmnQZju7kzWXk6PjdFuKspT9
weAAoMCzcAtEzzc4LXuUxxG/sXBVBCjW
=yWAQ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches from Greg Kroah-Hartman:
"Here is the big driver core merge for 3.9-rc1
There are two major series here, both of which touch lots of drivers
all over the kernel, and will cause you some merge conflicts:
- add a new function called devm_ioremap_resource() to properly be
able to check return values.
- remove CONFIG_EXPERIMENTAL
Other than those patches, there's not much here, some minor fixes and
updates"
Fix up trivial conflicts
* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
base: memory: fix soft/hard_offline_page permissions
drivercore: Fix ordering between deferred_probe and exiting initcalls
backlight: fix class_find_device() arguments
TTY: mark tty_get_device call with the proper const values
driver-core: constify data for class_find_device()
firmware: Ignore abort check when no user-helper is used
firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
firmware: Make user-mode helper optional
firmware: Refactoring for splitting user-mode helper code
Driver core: treat unregistered bus_types as having no devices
watchdog: Convert to devm_ioremap_resource()
thermal: Convert to devm_ioremap_resource()
spi: Convert to devm_ioremap_resource()
power: Convert to devm_ioremap_resource()
mtd: Convert to devm_ioremap_resource()
mmc: Convert to devm_ioremap_resource()
mfd: Convert to devm_ioremap_resource()
media: Convert to devm_ioremap_resource()
iommu: Convert to devm_ioremap_resource()
drm: Convert to devm_ioremap_resource()
...
doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation.
fixes.2013.01.26a: Miscellaneous fixes.
tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to
simplify callback advancement.
tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.
Tiny RCU has historically omitted RCU CPU stall warnings in order to
reduce memory requirements, however, lack of these warnings caused
Thomas Gleixner some debugging pain recently. Therefore, this commit
adds RCU CPU stall warnings to tiny RCU if RCU_TRACE=y. This keeps
the memory footprint small, while still enabling CPU stall warnings
in kernels built to enable them.
Updated to include Josh Triplett's suggested use of RCU_STALL_COMMON
config variable to simplify #if expressions.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The RCU-related debugging Kconfig options are in two different places,
and consume too much screen real estate. This commit therefore
consolidates them into their own menu.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
CC: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
CC: James Morris <james.l.morris@oracle.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Akinobu Mita <akinobu.mita@gmail.com>
CC: Ingo Molnar <mingo@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, rcutorture traces every read-side access. This can be
problematic because even a two-minute rcutorture run on a two-CPU system
can generate 28,853,363 reads. Normally, only a failing read is of
interest, so this commit traces adjusts rcutorture's tracing to only
trace failing reads. The resulting event tracing records the time
and the ->completed value captured at the beginning of the RCU read-side
critical section, allowing correlation with other event-tracing messages.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Add fix to build problem located by Randy Dunlap based on
diagnosis by Steven Rostedt. ]
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the last of the __dev* markings from the kernel from
a variety of different, tiny, places.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull powerpc update from Benjamin Herrenschmidt:
"The main highlight is probably some base POWER8 support. There's more
to come such as transactional memory support but that will wait for
the next one.
Overall it's pretty quiet, or rather I've been pretty poor at picking
things up from patchwork and reviewing them this time around and Kumar
no better on the FSL side it seems..."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits)
powerpc+of: Rename and fix OF reconfig notifier error inject module
powerpc: mpc5200: Add a3m071 board support
powerpc/512x: don't compile any platform DIU code if the DIU is not enabled
powerpc/mpc52xx: use module_platform_driver macro
powerpc+of: Export of_reconfig_notifier_[register,unregister]
powerpc/dma/raidengine: add raidengine device
powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct
powerpc/mpc85xx: Change spin table to cached memory
powerpc/fsl-pci: Add PCI controller ATMU PM support
powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI
drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS]
powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers
powerpc: Disable relocation on exceptions when kexecing
powerpc: Enable relocation on during exceptions at boot
powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function
powerpc: Add wrappers to enable/disable relocation on exceptions
powerpc: Add set_mode hcall
powerpc: Setup relocation on exceptions for bare metal systems
powerpc: Move initial mfspr LPCR out of __init_LPCR
powerpc: Add relocation on exception vector handlers
...
This module used to inject errors in the pSeries specific dynamic
reconfiguration notifiers. Those are gone however, replaced by
generic notifiers for changes to the device-tree. So let's update
the module to deal with these instead and rename it along the way.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Pull trivial branch from Jiri Kosina:
"Usual stuff -- comment/printk typo fixes, documentation updates, dead
code elimination."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
HOWTO: fix double words typo
x86 mtrr: fix comment typo in mtrr_bp_init
propagate name change to comments in kernel source
doc: Update the name of profiling based on sysfs
treewide: Fix typos in various drivers
treewide: Fix typos in various Kconfig
wireless: mwifiex: Fix typo in wireless/mwifiex driver
messages: i2o: Fix typo in messages/i2o
scripts/kernel-doc: check that non-void fcts describe their return value
Kernel-doc: Convention: Use a "Return" section to describe return values
radeon: Fix typo and copy/paste error in comments
doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
various: Fix spelling of "asynchronous" in comments.
Fix misspellings of "whether" in comments.
eisa: Fix spelling of "asynchronous".
various: Fix spelling of "registered" in comments.
doc: fix quite a few typos within Documentation
target: iscsi: fix comment typos in target/iscsi drivers
treewide: fix typo of "suport" in various comments and Kconfig
treewide: fix typo of "suppport" in various comments
...
The RCU CPU stall warning timeout has defaulted to 60 seconds for
some years, with almost no false positives. This commit therefore
reduces the default to 21 seconds, slightly shorter than the new
soft-lockup timeout.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Add a CONFIG_DEBUG_VM_RB build option for the previously existing
DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
recursive algorithms, we can expose it a bit more.
Also extend this code to validate_mm() after stack expansion, and to check
that the vma's start and last pgoffs have not changed since the nodes were
inserted on the anon vma interval tree (as it is important that the nodes
be reindexed after each such update).
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After both prio_tree users have been converted to use red-black trees,
there is no need to keep around the prio tree library anymore.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch 1 implements support for interval trees, on top of the augmented
rbtree API. It also adds synthetic tests to compare the performance of
interval trees vs prio trees. Short answers is that interval trees are
slightly faster (~25%) on insert/erase, and much faster (~2.4 - 3x)
on search. It is debatable how realistic the synthetic test is, and I have
not made such measurements yet, but my impression is that interval trees
would still come out faster.
Patch 2 uses a preprocessor template to make the interval tree generic,
and uses it as a replacement for the vma prio_tree.
Patch 3 takes the other prio_tree user, kmemleak, and converts it to use
a basic rbtree. We don't actually need the augmented rbtree support here
because the intervals are always non-overlapping.
Patch 4 removes the now-unused prio tree library.
Patch 5 proposes an additional optimization to rb_erase_augmented, now
providing it as an inline function so that the augmented callbacks can be
inlined in. This provides an additional 5-10% performance improvement
for the interval tree insert/erase benchmark. There is a maintainance cost
as it exposes augmented rbtree users to some of the rbtree library internals;
however I think this cost shouldn't be too high as I expect the augmented
rbtree will always have much less users than the base rbtree.
I should probably add a quick summary of why I think it makes sense to
replace prio trees with augmented rbtree based interval trees now. One of
the drivers is that we need augmented rbtrees for Rik's vma gap finding
code, and once you have them, it just makes sense to use them for interval
trees as well, as this is the simpler and more well known algorithm. prio
trees, in comparison, seem *too* clever: they impose an additional 'heap'
constraint on the tree, which they use to guarantee a faster worst-case
complexity of O(k+log N) for stabbing queries in a well-balanced prio
tree, vs O(k*log N) for interval trees (where k=number of matches,
N=number of intervals). Now this sounds great, but in practice prio trees
don't realize this theorical benefit. First, the additional constraint
makes them harder to update, so that the kernel implementation has to
simplify things by balancing them like a radix tree, which is not always
ideal. Second, the fact that there are both index and heap properties
makes both tree manipulation and search more complex, which results in a
higher multiplicative time constant. As it turns out, the simple interval
tree algorithm ends up running faster than the more clever prio tree.
This patch:
Add two test modules:
- prio_tree_test measures the performance of lib/prio_tree.c, both for
insertion/removal and for stabbing searches
- interval_tree_test measures the performance of a library of equivalent
functionality, built using the augmented rbtree support.
In order to support the second test module, lib/interval_tree.c is
introduced. It is kept separate from the interval_tree_test main file
for two reasons: first we don't want to provide an unfair advantage
over prio_tree_test by having everything in a single compilation unit,
and second there is the possibility that the interval tree functionality
could get some non-test users in kernel over time.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This small module helps measure the performance of rbtree insert and
erase.
Additionally, we run a few correctness tests to check that the rbtrees
have all desired properties:
- contains the right number of nodes in the order desired,
- never two consecutive red nodes on any path,
- all paths to leaf nodes have the same number of black nodes,
- root node is black
[akpm@linux-foundation.org: fix printk warning: sparc64 cycles_t is unsigned long]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce HAVE_DEBUG_KMEMLEAK config option and select it in corresponding
architecture Kconfig files. DEBUG_KMEMLEAK now only depends on
HAVE_DEBUG_KMEMLEAK.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The main option should not appear in the resulting .config when the
dependencies aren't met (i.e. use "depends on" rather than directly
setting the default from the combined dependency values).
The sub-options should depend on the main option rather than a more
generic higher level one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Features currently supported:
- 39-bit address space for user and kernel (each)
- 4KB and 64KB page configurations
- Compat (32-bit) user applications (ARMv7, EABI only)
- Flattened Device Tree (mandated for all AArch64 platforms)
- ARM generic timers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJQabRiAAoJEGvWsS0AyF7xXgcQAK+FTXt0ikdQYMkV5AIZXb9i
xHRhuiZWx2vKyk0mCqpyGLY58GSmSb6uTBg/2P2Ej7vXdH/RB2goPzjlspfjkDL4
o8RJp7eQ07Uz3KRDYEJgMP8xKZid6KFG93RJ6TjjpKZLuDBdwiG1GP1vb0jVcWfo
ttZrj/aI8lMcqrh3Vq5qefP7GWP1OVATqeaGTiT7oo38pXwF3t237xfBr2iDGFBp
ZgIRddrxpa7JYUesfJDDDdGHvLq7Vh2jJV+io9qasBZDrtppGJIhZ0vUni2DgIi7
r4i1LcynDN4JaG0maZ4U/YQm74TCD4BqxV8GJ7zwLPTWeN+of+skjhPSLOkA+0fp
I+sWjXlv200gDfJZ9qnUld2kFpoDfJi2b7fNDouSDd2OhmVOVWG3jnVP4Z7meVSb
O8BYzWDdsAiabuwciUY3OsmW6424lT93b2v86Vncs4unKMvEjOPxYZbUxhqX8f2j
gsmWwwD/yS4THx2B6OyW9VT3I5J6miqs2Glt/GG6vPWT5AKQJn9jCxKaBGhPMPIs
xe5/GycBYjdk/Y8qRjegxFbEqzQuiRzmkeFn5jwjmBLqpGNbZDpvMaL6adhAKM5/
v6UIKa91ra4fC9N0h6G61pOc9N9DbT8wPbCbdYY0RMTMRuLDZDgAM3Bvz0r2APdD
96leNy6vx684hbkCSLJs
=buJB
-----END PGP SIGNATURE-----
Merge tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 support from Catalin Marinas:
"Linux support for the 64-bit ARM architecture (AArch64)
Features currently supported:
- 39-bit address space for user and kernel (each)
- 4KB and 64KB page configurations
- Compat (32-bit) user applications (ARMv7, EABI only)
- Flattened Device Tree (mandated for all AArch64 platforms)
- ARM generic timers"
* tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits)
arm64: ptrace: remove obsolete ptrace request numbers from user headers
arm64: Do not set the SMP/nAMP processor bit
arm64: MAINTAINERS update
arm64: Build infrastructure
arm64: Miscellaneous header files
arm64: Generic timers support
arm64: Loadable modules
arm64: Miscellaneous library functions
arm64: Performance counters support
arm64: Add support for /proc/sys/debug/exception-trace
arm64: Debugging support
arm64: Floating point and SIMD
arm64: 32-bit (compat) applications support
arm64: User access library functions
arm64: Signal handling support
arm64: VDSO support
arm64: System calls handling
arm64: ELF definitions
arm64: SMP support
arm64: DMA mapping API
...
There have been some recent bugs that were triggered only when
preemptible RCU's __rcu_read_unlock() was preempted just after setting
->rcu_read_lock_nesting to INT_MIN, which is a low-probability event.
Therefore, reproducing those bugs (to say nothing of gaining confidence
in alleged fixes) was quite difficult. This commit therefore creates
a new debug-only RCU kernel config option that forces a short delay
in __rcu_read_unlock() to increase the probability of those sorts of
bugs occurring.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This patch adds Makefile and Kconfig files required for building an
AArch64 kernel.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
This provides the ability to inject artifical errors to pSeries reconfig
notifier chain callbacks. It is controlled through debugfs interface
under /sys/kernel/debug/notifier-error-inject/pSeries-reconfig
If the notifier call chain should be failed with some events
notified, write the error code to "actions/<notifier event>/error".
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This provides the ability to inject artifical errors to memory hotplug
notifier chain callbacks. It is controlled through debugfs interface
under /sys/kernel/debug/notifier-error-inject/memory
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Example: Inject memory hotplug offline error (-12 == -ENOMEM)
# cd /sys/kernel/debug/notifier-error-inject/memory
# echo -12 > actions/MEM_GOING_OFFLINE/error
# echo offline > /sys/devices/system/memory/memoryXXX/state
bash: echo: write error: Cannot allocate memory
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This provides the ability to inject artifical errors to PM notifier chain
callbacks. It is controlled through debugfs interface under
/sys/kernel/debug/notifier-error-inject/pm
Each of the files in "error" directory represents an event which can be
failed and contains the error code. If the notifier call chain should be
failed with some events notified, write the error code to the files.
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Example: Inject PM suspend error (-12 = -ENOMEM)
# cd /sys/kernel/debug/notifier-error-inject/pm
# echo -12 > actions/PM_SUSPEND_PREPARE/error
# echo mem > /sys/power/state
bash: echo: write error: Cannot allocate memory
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rewrite existing cpu-notifier-error-inject module to use debugfs based new
framework.
This change removes cpu_up_prepare_error and cpu_down_prepare_error module
parameters which were used to specify error code to be injected. We could
keep these module parameters for backward compatibility by module_param_cb
but it seems overkill for this module.
This provides the ability to inject artifical errors to CPU notifier chain
callbacks. It is controlled through debugfs interface under
/sys/kernel/debug/notifier-error-inject/cpu
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Example1: inject CPU offline error (-1 == -EPERM)
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -1 > actions/CPU_DOWN_PREPARE/error
# echo 0 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Operation not permitted
Example2: inject CPU online error (-2 == -ENOENT)
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -2 > actions/CPU_UP_PREPARE/error
# echo 1 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: No such file or directory
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset provides kernel modules that can be used to test the error
handling of notifier call chain failures by injecting artifical errors to
the following notifier chain callbacks.
* CPU notifier
* PM notifier
* memory hotplug notifier
* powerpc pSeries reconfig notifier
Example: Inject CPU offline error (-1 == -EPERM)
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -1 > actions/CPU_DOWN_PREPARE/error
# echo 0 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Operation not permitted
The patchset also adds cpu and memory hotplug tests to
tools/testing/selftests These tests first do simple online and offline
test and then do fault injection tests if notifier error injection
module is available.
This patch:
The notifier error injection provides the ability to inject artifical
errors to specified notifier chain callbacks. It is useful to test the
error handling of notifier call chain failures.
This adds common basic functions to define which type of events can be
fail and to initialize the debugfs interface to control what error code
should be returned and which event should be failed.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQDcvUAAoJEKurIx+X31iBNTkP/jzJoKNYNlx8huB9WdG2v0J0
S5mFTrT5COT6DtglWWAvwADiNqEdvAJ11PDFVvYnVVj16wamXMMYQS0OnCh1vR1T
ShzB5hYYlsa8fcxj4ztRvN+YT8IR59RyPYbN0/gA/vtgnTyWt9Uq0xRIuCmWaHbb
EV1pL655I/3lkv4OiJnvILbiO6CmTsHMCp+rYrs6fQFLdbMSzCSZ8Pg7iqy51gyX
chiXSYGatcXW1da+RssZRsXjWMDLO+iagA2sm/azdgPGK6Z/egQXPLIEjcuQE852
jlUraBQQDCfyHQP29iOPUzidJVi9QDgY0iUKu37Y+FQoAX9O8fH8w7Ycd4gYyun5
+QcTTDT92AyWCyQhfBsZ/w2LX20MWxhRkHqzrtmSSR3U1zy4zCfIIP9tD1/7QCcz
+YnxZwb2tHnai0lYMjK2q0xsMmVWAZutUEArJrIHL068hpgdBC6+ONM+rxT+dIJO
U4X2Eje2ZC2vWZrIthoERXQMdZGgnpK8pBv5z7dwGTLGHqX7yTNwZuXxVlhHYkAw
Mw3r+Eq8lDHnRWfMle48PAD07/c9pE5LrBzkwXTiCqSKbgJ/CnBwasKTWbpO/FDF
d+Gk6ofaAnAIGLmQpluMqyI/DOix2NQnZFR4ZY3P8sUBmh+fwLU9fjtSAyiPH9aL
GXJgleqfXVyd7c5QhkEK
=K7ad
-----END PGP SIGNATURE-----
Merge tag 'please-pull-misc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull misc Itanium fixes from Tony Luck.
* tag 'please-pull-misc-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC
[IA64] Port OOM changes to ia64_do_page_fault
The stack_not_used() function in <linux/sched.h> assumes that stacks
grow downwards. This is not true on IA64 or PARISC, so this function
would walk off in the wrong direction and into the weeds.
Found on IA64 because of a compilation failure with recursive dependencies
on IA64_TASKSIZE and IA64_THREAD_INFO_SIZE.
Fixing the code is possible, but should be combined with other
infrastructure additions to set up the "canary" at the end of the stack.
Reported-by: Fengguang Wu <fengguang.wu@intel.com> (failed allmodconfig build)
Signed-off-by: Tony Luck <tony.luck@intel.com>
The output of the timestamps got lost with the conversion of the
kmsg buffer to records; restore the old behavior.
Document, that CONFIG_PRINTK_TIME now only controls the output of
the timestamps in the syslog() system call and on the console, and
not the recording of the timestamps.
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Several distros set this by default by patching panic_on_oops.
It seems to fit with the BOOTPARAM_{HARD,SOFT}_PANIC options
though, so let's add a Kconfig entry and reduce some more
upstream delta.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120411121529.GH26688@redacted.bos.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In dynamic-debug-howto.txt:
- add section: Debug Messages at Module Initialization Time
- update flags indicators in example outputs to include '='
- make flags descriptions tabular
- add item on '_' flag-char
- add dyndbg, boot-args examples
- rewrap some paragraphs with long lines
In Kconfig.debug, note that compiling with -DDEBUG enables all
pr_debug()s in that code.
In kernel-parameters.txt, add dyndbg and module.dyndbg items,
and deprecate ddebug_query.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a config option to disable various gcc compiler optimizations that
make assembler listings much harder to read. This is everything that reorders
code significantly or creates partial functions.
This is mainly to keep kernel hackers sane.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1332960678-11879-2-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Pull x86 updates from Ingo Molnar.
This touches some non-x86 files due to the sanitized INLINE_SPIN_UNLOCK
config usage.
Fixed up trivial conflicts due to just header include changes (removing
headers due to cpu_idle() merge clashing with the <asm/system.h> split).
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic/amd: Be more verbose about LVT offset assignments
x86, tls: Off by one limit check
x86/ioapic: Add io_apic_ops driver layer to allow interception
x86/olpc: Add debugfs interface for EC commands
x86: Merge the x86_32 and x86_64 cpu_idle() functions
x86/kconfig: Remove CONFIG_TR=y from the defconfigs
x86: Stop recursive fault in print_context_stack after stack overflow
x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y
x86/apic: Add separate apic_id_valid() functions for selected apic drivers
locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage
x86/kconfig: Update defconfigs
x86: Fix excessive MSR print out when show_msr is not specified
Pull sysctl updates from Eric Biederman:
- Rewrite of sysctl for speed and clarity.
Insert/remove/Lookup in sysctl are all now O(NlogN) operations, and
are no longer bottlenecks in the process of adding and removing
network devices.
sysctl is now focused on being a filesystem instead of system call
and the code can all be found in fs/proc/proc_sysctl.c. Hopefully
this means the code is now approachable.
Much thanks is owed to Lucian Grinjincu for keeping at this until
something was found that was usable.
- The recent proc_sys_poll oops found by the fuzzer during hibernation
is fixed.
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl: (36 commits)
sysctl: protect poll() in entries that may go away
sysctl: Don't call sysctl_follow_link unless we are a link.
sysctl: Comments to make the code clearer.
sysctl: Correct error return from get_subdir
sysctl: An easier to read version of find_subdir
sysctl: fix memset parameters in setup_sysctl_set()
sysctl: remove an unused variable
sysctl: Add register_sysctl for normal sysctl users
sysctl: Index sysctl directories with rbtrees.
sysctl: Make the header lists per directory.
sysctl: Move sysctl_check_dups into insert_header
sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy
sysctl: Replace root_list with links between sysctl_table_sets.
sysctl: Add sysctl_print_dir and use it in get_subdir
sysctl: Stop requiring explicit management of sysctl directories
sysctl: Add a root pointer to ctl_table_set
sysctl: Rewrite proc_sys_readdir in terms of first_entry and next_entry
sysctl: Rewrite proc_sys_lookup introducing find_entry and lookup_entry.
sysctl: Normalize the root_table data structure.
sysctl: Factor out insert_header and erase_header
...
ARCH_HAS_NMI_WATCHDOG is a macro defined by arch, but config
HARDLOCKUP_DETECTOR depends on it. This is wrong, ARCH_HAS_NMI_WATCHDOG
has to be a Kconfig config, and arch's need it should select it
explicitly.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Get rid of INLINE_SPIN_UNLOCK entirely replacing it with
UNINLINE_SPIN_UNLOCK instead of the reverse meaning.
Whoever wants to change the default spinlock inlining
behavior and uninline the spinlocks for some weird reason,
such as spinlock debugging, paravirt etc. can now all just
select UNINLINE_SPIN_UNLOCK
Original discussion at: https://lkml.org/lkml/2012/3/21/357
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20120322095502.30866.75756.sendpatchset@codeblue
[ tidied up the changelog a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)
This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.
The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:
$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf
This output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.
This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.
- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.
- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485
- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.
- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:
struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code
...
static_key_slow_inc();
...
static_key_slow_inc();
...
The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.
This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.
- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.
- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.
- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.
- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
There have been situations where RCU CPU stall warnings were caused by
issues in scheduling-clock timer initialization. To make it easier to
track these down, this commit causes the RCU CPU stall-warning messages
to print out the number of scheduling-clock interrupts taken in the
current grace period for each stalled CPU.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The RCU_TRACE kernel parameter has always been intended for debugging,
not for production use. Formalize this by moving RCU_TRACE from
init/Kconfig to lib/Kconfig.debug.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The soft and hard lockup thresholds have changed so the
corresponding Kconfig entries need to be updated accordingly.
Add a reference to watchdog_thresh while at it.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Stop validating subdirectories now that we only register leaf tables
- Cleanup and improve the duplicate filename check.
* Run the duplicate filename check under the sysctl_lock to guarantee
we never add duplicate names.
* Reduce the duplicate filename check to nearly O(M*N) where M is the
number of entries in tthe table we are registering and N is the
number of entries in the directory before we got there.
- Move the duplicate filename check into it's own function and call
it directtly from __register_sysctl_table
- Kill the config option as the sanity checks are now cheap enough
the config option is unnecessary. The original reason for the config
option was because we had a huge table used to verify the proc filename
to binary sysctl mapping. That table has now evolved into the binary_sysctl
translation layer and is no longer part of the sysctl_check code.
- Tighten up the permission checks. Guarnateeing that files only have read
or write permissions.
- Removed redudant check for parents having a procname as now everything has
a procname.
- Generalize the backtrace logic so that we print a backtrace from
any failure of __register_sysctl_table that was not caused by
a memmory allocation failure. The backtrace allows us to track
down who erroneously registered a sysctl table.
Bechmark before (CONFIG_SYSCTL_CHECK=y):
make-dummies 0 999 -> 12s
rmmod dummy -> 0.08s
Bechmark before (CONFIG_SYSCTL_CHECK=n):
make-dummies 0 999 -> 0.7s
rmmod dummy -> 0.06s
make-dummies 0 99999 -> 1m13s
rmmod dummy -> 0.38s
Benchmark after:
make-dummies 0 999 -> 0.65s
rmmod dummy -> 0.055s
make-dummies 0 9999 -> 1m10s
rmmod dummy -> 0.39s
The sysctl sanity checks now impose no measurable cost.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Ensure that memory hotplug can co-exist with kmemleak
by taking the hotplug lock before scanning the memory
banks.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Added missing _secs in the help message of config DEFAULT_HUNG_TASK_TIMEOUT.
Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (83 commits)
mmc: fix compile error when CONFIG_BLOCK is not enabled
mmc: core: Cleanup eMMC4.5 conditionals
mmc: omap_hsmmc: if multiblock reads are broken, disable them
mmc: core: add workaround for controllers with broken multiblock reads
mmc: core: Prevent too long response times for suspend
mmc: recognise SDIO cards with SDIO_CCCR_REV 3.00
mmc: sd: Handle SD3.0 cards not supporting UHS-I bus speed mode
mmc: core: support HPI send command
mmc: core: Add cache control for eMMC4.5 device
mmc: core: Modify the timeout value for writing power class
mmc: core: new discard feature support at eMMC v4.5
mmc: core: mmc sanitize feature support for v4.5
mmc: dw_mmc: modify DATA register offset
mmc: sdhci-pci: add flag for devices that can support runtime PM
mmc: omap_hsmmc: ensure pbias configuration is always done
mmc: core: Add Power Off Notify Feature eMMC 4.5
mmc: sdhci-s3c: fix potential NULL dereference
mmc: replace printk with appropriate display macro
mmc: core: Add default timeout value for CMD6
mmc: sdhci-pci: add runtime pm support
...
This adds support to inject data errors after a completed host transfer.
The mmc core will return error even though the host transfer is successful.
This simple fault injection proved to be very useful to test the
non-blocking error handling in the mmc_blk_issue_rw_rq().
Random faults can also test how the host driver handles pre_req()
and post_req() in case of errors.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits)
MAINTAINERS: linux-m32r is moderated for non-subscribers
linux@lists.openrisc.net is moderated for non-subscribers
Drop default from "DM365 codec select" choice
parisc: Kconfig: cleanup Kernel page size default
Kconfig: remove redundant CONFIG_ prefix on two symbols
cris: remove arch/cris/arch-v32/lib/nand_init.S
microblaze: add missing CONFIG_ prefixes
h8300: drop puzzling Kconfig dependencies
MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers
tty: drop superfluous dependency in Kconfig
ARM: mxc: fix Kconfig typo 'i.MX51'
Fix file references in Kconfig files
aic7xxx: fix Kconfig references to READMEs
Fix file references in drivers/ide/
thinkpad_acpi: Fix printk typo 'bluestooth'
bcmring: drop commented out line in Kconfig
btmrvl_sdio: fix typo 'btmrvl_sdio_sd6888'
doc: raw1394: Trivial typo fix
CIFS: Don't free volume_info->UNC until we are entirely done with it.
treewide: Correct spelling of successfully in comments
...
Thumb2 kernels cannot be built with frame pointers, but can use the
ARM_UNWIND feature for unwinding instead. This makes sure that all
features that rely on unwinding includeing CONFIG_LATENCYTOP and
FAULT_INJECTION_STACKTRACE_FILTER do not enable frame pointers
when the unwinder is already selected, and we always build with
the unwinder when we want a thumb2 kernel, to make sure we do not
get the frame pointers instead.
A different option would be to redefine the CONFIG_FRAME_POINTERS
option on ARM to mean builing with either frame pointers or
the unwinder, and then select which one to use based on the
CPU architecture or another user option. That would still allow
building thumb2 kernels without the unwinder but would also be
more confusing.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The patch below removes an extra "it" in the comment.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
sched: Cleanup duplicate local variable in [enqueue|dequeue]_task_fair
sched: Replace use of entity_key()
sched: Separate group-scheduling code more clearly
sched: Reorder root_domain to remove 64 bit alignment padding
sched: Do not attempt to destroy uninitialized rt_bandwidth
sched: Remove unused function cpu_cfs_rq()
sched: Fix (harmless) typo 'CONFG_FAIR_GROUP_SCHED'
sched, cgroup: Optimize load_balance_fair()
sched: Don't update shares twice on on_rq parent
sched: update correct entity's runtime in check_preempt_wakeup()
xtensa: Use generic config PREEMPT definition
h8300: Use generic config PREEMPT definition
m32r: Use generic PREEMPT config
sched: Skip autogroup when looking for all rt sched groups
sched: Simplify mutex_spin_on_owner()
sched: Remove rcu_read_lock() from wake_affine()
sched: Generalize sleep inside spinlock detection
sched: Make sleeping inside spinlock detection working in !CONFIG_PREEMPT
sched: Isolate preempt counting in its own config option
sched: Remove pointless in_atomic() definition check
...
The sleeping inside spinlock detection is actually used
for more general sleeping inside atomic sections
debugging: preemption disabled, rcu read side critical
sections, interrupts, interrupt disabled, etc...
Change the name of the config and its help section to
reflect its more general role.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Select CONFIG_PREEMPT_COUNT when we enable the sleeping inside
spinlock detection, so that the preempt offset gets correctly
incremented/decremented from preempt_disable()/preempt_enable().
This makes the preempt count eventually working in !CONFIG_PREEMPT
when that debug option is set and thus fixes the detection of explicit
preemption disabled sections under such config. Code that sleeps
in explicitly preempt disabled section can be finally spotted
in non-preemptible kernels.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. Move it
to lib/Kconfig.debug so each arch doesn't have to define it. This
obviously makes the option generic, but that's fine because the config is
already used in generic code.
It's not obvious to me that sysrq-P actually does anything caution by
keeping the most inclusive wording.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
DEBUG_PER_CPU_MAPS is used in lib/cpumask.c as well as in
inlcude/linux/cpumask.h and thus it has outgrown its use within x86 and
powerpc alone. Any arch with SMP support may want to get some more
debugging, so make this option generic.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
b43: fix comment typo reqest -> request
Haavard Skinnemoen has left Atmel
cris: typo in mach-fs Makefile
Kconfig: fix copy/paste-ism for dell-wmi-aio driver
doc: timers-howto: fix a typo ("unsgined")
perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
treewide: fix a few typos in comments
regulator: change debug statement be consistent with the style of the rest
Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
audit: acquire creds selectively to reduce atomic op overhead
rtlwifi: don't touch with treewide double semicolon removal
treewide: cleanup continuations and remove logging message whitespace
ath9k_hw: don't touch with treewide double semicolon removal
include/linux/leds-regulator.h: fix syntax in example code
tty: fix typo in descripton of tty_termios_encode_baud_rate
xtensa: remove obsolete BKL kernel option from defconfig
m68k: fix comment typo 'occcured'
arch:Kconfig.locks Remove unused config option.
treewide: remove extra semicolons
...
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
seqlock: Don't smp_rmb in seqlock reader spin loop
watchdog, hung_task_timeout: Add Kconfig configurable default
lockdep: Remove cmpxchg to update nr_chain_hlocks
lockdep: Print a nicer description for simple irq lock inversions
lockdep: Replace "Bad BFS generated tree" message with something less cryptic
lockdep: Print a nicer description for irq inversion bugs
lockdep: Print a nicer description for simple deadlocks
lockdep: Print a nicer description for normal deadlocks
lockdep: Print a nicer description for irq lock inversions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
kmemleak: Initialise kmemleak after debug_objects_mem_init()
kmemleak: Select DEBUG_FS unconditionally in DEBUG_KMEMLEAK
kmemleak: Do not return a pointer to an object that kmemleak did not get
In the past DEBUG_FS used to depend on SYSFS and DEBUG_KMEMLEAK selected
it conditionally. This is no longer the case, so always select DEBUG_FS
via DEBUG_KMEMLEAK.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The prohibition of DEBUG_OBJECTS_RCU_HEAD from !PREEMPT was due to the
fixup actions. So just produce a warning from !PREEMPT.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The RCU CPU stall warnings can now be controlled using the
rcu_cpu_stall_suppress boot-time parameter or via the same parameter
from sysfs. There is therefore no longer any reason to have
kernel config parameters for this feature. This commit therefore
removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel config parameters. The RCU_CPU_STALL_TIMEOUT parameter remains
to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter
remains to allow task-stall information to be suppressed if desired.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This patch allows the default value for sysctl_hung_task_timeout_secs
to be set at build time. The feature carries virtually no overhead,
so it makes sense to keep it enabled. On heavily loaded systems, though,
it can end up triggering stack traces when there is no bug other than
the system being underprovisioned. We use this patch to keep the hung task
facility available but disabled at boot-time.
The default of 120 seconds is preserved. As a note, commit e162b39a may
have accidentally reverted commit fb822db4, which raised the default from
120 seconds to 480 seconds.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Mandeep Singh Baines <msb@google.com>
Link: http://lkml.kernel.org/r/4DB8600C.8080000@suse.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cite Documentation/kernel-parameters.txt for an alternative to
building with PRINTK_TIME compiled in.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mm/kmemleak-test.c is used to provide an example of how kmemleak
tool works.
Memory is leaked at module unload-time, so building the test
in kernel (Y) makes the leaks impossible and the test useless.
Qualify DEBUG_KMEMLEAK_TEST config symbol with "depends on m",
to restrict module-only building.
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. simple_strto*() do not contain overflow checks and crufty,
libc way to indicate failure.
2. strict_strto*() also do not have overflow checks but the name and
comments pretend they do.
3. Both families have only "long long" and "long" variants,
but users want strtou8()
4. Both "simple" and "strict" prefixes are wrong:
Simple doesn't exactly say what's so simple, strict should not exist
because conversion should be strict by default.
The solution is to use "k" prefix and add convertors for more types.
Enter
kstrtoull()
kstrtoll()
kstrtoul()
kstrtol()
kstrtouint()
kstrtoint()
kstrtou64()
kstrtos64()
kstrtou32()
kstrtos32()
kstrtou16()
kstrtos16()
kstrtou8()
kstrtos8()
Include runtime testsuite (somewhat incomplete) as well.
strict_strto*() become deprecated, stubbed to kstrto*() and
eventually will be removed altogether.
Use kstrto*() in code today!
Note: on some archs _kstrtoul() and _kstrtol() are left in tree, even if
they'll be unused at runtime. This is temporarily solution,
because I don't want to hardcode list of archs where these
functions aren't needed. Current solution with sizeof() and
__alignof__ at least always works.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We've been burned by regressions/bugs which we later realized could have
been triaged quicker if only we'd paid closer attention to dmesg. To make
it easier to audit dmesg, we'd like to make DEFAULT_MESSAGE_LEVEL
Kconfig-settable. That way we can set it to KERN_NOTICE and audit any
messages <= KERN_WARNING.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Cc: Olof Johansson <olofj@chromium.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a cpu is considered stuck, instead of limping along and just printing
a warning, it is sometimes preferred to just panic, let kdump capture the
vmcore and reboot. This gets the machine back into a stable state quickly
while saving the info that got it into a stuck state to begin with.
Add a Kconfig option to allow users to set the hardlockup to panic
by default. Also add in a 'nmi_watchdog=nopanic' to override this.
[akpm@linux-foundation.org: fix strncmp length]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: Make DEBUG_SECTION_MISMATCH selectable, but not on by default
genksyms: Regenerate lexer and parser
genksyms: Track changes to enum constants
genksyms: simplify usage of find_symbol()
genksyms: Add helpers for building string lists
genksyms: Simplify printing of symbol types
genksyms: Simplify lexer
genksyms: Do not paste the bison header file to lex.c
modpost: fix trailing comma
KBuild: silence "'scripts/unifdef' is up to date."
kbuild: Add extra gcc checks
kbuild: reenable section mismatch analysis
unifdef: update to upstream version 2.5
CONFIG_DEBUG_SECTION_MISMATCH has also runtime effects due to the
-fno-inline-functions-called-once compiler flag, so forcing it on
everyone is not a good idea.
Signed-off-by: Michal Marek <mmarek@suse.cz>
This removes the implementation of the big kernel lock,
at last. A lot of people have worked on this in the
past, I so the credit for this patch should be with
everyone who participated in the hunt.
The names on the Cc list are the people that were the
most active in this, according to the recorded git
history, in alphabetical order.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Alan Cox <alan@linux.intel.com>
Cc: Alessio Igor Bogani <abogani@texware.it>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Jan Blunck <jblunck@infradead.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Paul Menage <menage@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
This was disabled in commit
e5f95c8 (kbuild: print only total number of section mismatces found)
because there were too many warnings. Now we're down to a reasonable
number again, so we start scaring people with the details.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.
This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel. A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).
Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The x86 arch has shifted its use of the nmi_watchdog from a
local implementation to the global one provide by
kernel/watchdog.c. This shift has caused a whole bunch of
compile problems under different config options. I attempt to
simplify things with the patch below.
In order to simplify things, I had to come to terms with the
meaning of two terms ARCH_HAS_NMI_WATCHDOG and
CONFIG_HARDLOCKUP_DETECTOR. Basically they mean the same thing,
the former on a local level and the latter on a global level.
With the old x86 nmi watchdog gone, there is no need to rely on
defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
make sense any more. x86 will now use the global
implementation.
The changes below do a few things. First it changes the few
places that relied on ARCH_HAS_NMI_WATCHDOG to use
CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
anyway, so nothing unusual here). Those pieces of code were
relying more on local apic functionality the nmi watchdog
functionality, so the change should make sense.
Second, I removed the x86 implementation of
touch_nmi_watchdog(). It isn't need now, instead x86 will rely
on kernel/watchdog.c's implementation.
Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
x86. And tweaked the include/linux/nmi.h file to tell users to
look for an externally defined touch_nmi_watchdog in the case of
ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This
changes removes some of the ugliness in that file.
Finally, I added a Kconfig dependency for
CONFIG_HARDLOCKUP_DETECTOR that said you can't have
ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR. You can
only have one nmi_watchdog.
Tested with
ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken
configs) ARCH=x86_64: allnoconfig, defconfig, allyesconfig,
(various broken configs)
Hopefully, after this patch I won't get any more compile broken
emails. :-)
v3:
changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (48 commits)
DMAENGINE: move COH901318 to arch_initcall
dma: imx-dma: fix signedness bug
dma/timberdale: simplify conditional
ste_dma40: remove channel_type
ste_dma40: remove enum for endianess
ste_dma40: remove TIM_FOR_LINK option
ste_dma40: move mode_opt to separate config
ste_dma40: move channel mode to a separate field
ste_dma40: move priority to separate field
ste_dma40: add variable to indicate valid dma_cfg
async_tx: make async_tx channel switching opt-in
move async raid6 test to lib/Kconfig.debug
dmaengine: Add Freescale i.MX1/21/27 DMA driver
intel_mid_dma: change the slave interface
intel_mid_dma: fix the WARN_ONs
intel_mid_dma: Add sg list support to DMA driver
intel_mid_dma: Allow DMAC2 to share interrupt
intel_mid_dma: Allow IRQ sharing
intel_mid_dma: Add runtime PM support
DMAENGINE: define a dummy filter function for ste_dma40
...
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: convert a BUG_ON to BUILD_BUG_ON
arch/tile: make ptrace() work properly for TILE-Gx COMPAT mode
arch/tile: support new info op generated by compiler
arch/tile: minor whitespace/naming changes for string support files
arch/tile: enable single-step support for TILE-Gx
arch/tile: parameterize system PLs to support KVM port
arch/tile: add Tilera's <arch/sim.h> header as an open-source header
arch/tile: Bomb C99 comments to C89 comments in tile's <arch/sim_def.h>
arch/tile: prevent corrupt top frame from causing backtracer runaway
arch/tile: various top-level Makefile cleanups
arch/tile: change lower bound on syscall error return to -4095
arch/tile: properly export __mb_incoherent for modules
arch/tile: provide a definition of MAP_STACK
kmemleak: add TILE to the list of supported architectures.
char: hvc: check for error case
arch/tile: Add a warning if we try to allocate too much vmalloc memory.
arch/tile: update some comments to clarify register usage.
arch/tile: use better "punctuation" for VMSPLIT_3_5G and friends
arch/tile: Use <asm-generic/syscalls.h>
tile: replace some BUG_ON checks with BUILD_BUG_ON checks
While hunting a non-existing bug in 'list_sort()', I've improved the
'list_sort_test()' function which tests the 'list_sort()' library call.
Although at the end I found a bug in my code, but not in 'list_sort()', I
think my clean-ups and improvements are worth merging because they make
the test function better.
This patch:
Make the self-tests selectable via Kconfig rather than by manual enabling
of DEBUG_LIST_SORT.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Don Mullis <don.mullis@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All percpu counters are linked to a global list on initialization and
removed from it on destruction. The list is walked during CPU up/down.
If a percpu counter is freed without being properly destroyed, the system
will oops only on the next CPU up/down making it pretty nasty to track
down. This patch adds debugobj support for percpu counters so that such
problems can be found easily.
As percpu counters don't make sense on stack and can't be statically
initialized, debugobj support is pretty simple. It's initialized and
activated on counter initialization, and deactivatd and destroyed on
counter destruction. With this patch applied, the bug fixed by commit
602586a83b (shmem: put_super must
percpu_counter_destroy) triggers the following warning on tmpfs unmount
and the system won't oops on the next cpu up/down operation.
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:259 debug_print_object+0x5c/0x70()
Hardware name: Bochs
ODEBUG: free active (active state 0) object type: percpu_counter
Modules linked in:
Pid: 3999, comm: umount Not tainted 2.6.36-rc2-work+ #5
Call Trace:
[<ffffffff81083f7f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff81084076>] warn_slowpath_fmt+0x46/0x50
[<ffffffff813b45cc>] debug_print_object+0x5c/0x70
[<ffffffff813b50e5>] debug_check_no_obj_freed+0x125/0x210
[<ffffffff811577d3>] kfree+0xb3/0x2f0
[<ffffffff81132edd>] shmem_put_super+0x1d/0x30
[<ffffffff81162e96>] generic_shutdown_super+0x56/0xe0
[<ffffffff81162f86>] kill_anon_super+0x16/0x60
[<ffffffff81162ff7>] kill_litter_super+0x27/0x30
[<ffffffff81163295>] deactivate_locked_super+0x45/0x60
[<ffffffff81163cfa>] deactivate_super+0x4a/0x70
[<ffffffff8117d446>] mntput_no_expire+0x86/0xe0
[<ffffffff8117df7f>] sys_umount+0x6f/0x360
[<ffffffff8103f01b>] system_call_fastpath+0x16/0x1b
---[ end trace cce2a341ba3611a7 ]---
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglxlinutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: introduce CONFIG_BKL.
dabusb: remove the BKL
sunrpc: remove the big kernel lock
init/main.c: remove BKL notations
blktrace: remove the big kernel lock
rtmutex-tester: make it build without BKL
dvb-core: kill the big kernel lock
dvb/bt8xx: kill the big kernel lock
tlclk: remove big kernel lock
fix rawctl compat ioctls breakage on amd64 and itanic
uml: kill big kernel lock
parisc: remove big kernel lock
cris: autoconvert trivial BKL users
alpha: kill big kernel lock
isapnp: BKL removal
s390/block: kill the big kernel lock
hpet: kill BKL, add compat_ioctl
With all the patches we have queued in the BKL removal tree, only a
few dozen modules are left that actually rely on the BKL, and even
there are lots of low-hanging fruit. We need to decide what to do
about them, this patch illustrates one of the options:
Every user of the BKL is marked as 'depends on BKL' in Kconfig,
and the CONFIG_BKL becomes a user-visible option. If it gets
disabled, no BKL using module can be built any more and the BKL
code itself is compiled out.
The one exception is file locking, which is practically always
enabled and does a 'select BKL' instead. This effectively forces
CONFIG_BKL to be enabled until we have solved the fs/lockd
mess and can apply the patch that removes the BKL from fs/locks.c.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
All the necessary functionality was already there; we just need
to make it possible to select the config option.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The prompt for "Self test for hardware accelerated raid6 recovery" does not
belong in the top level configuration menu. All the options in
crypto/async_tx/Kconfig are selected and do not depend on CRYPTO.
Kconfig.debug seems like a reasonable fit.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently disabling CONFIG_SLUB_DEBUG also disabled SYSFS support meaning
that the slabs cannot be tuned without DEBUG.
Make SYSFS support independent of CONFIG_SLUB_DEBUG
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The PROVE_RCU_REPEATEDLY has no "Say Y"/"Say N" advice, so this commit
adds it.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
When CONFIG_IRQSOFF_TRACER is set and CONFIG_PROVE_LOCKING is not, we
get the following error:
$ make oldconfig
scripts/kconfig/conf --oldconfig arch/x86/Kconfig
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
This is because IRQSOFF_TRACER selects TRACE_IRQFLAGS but TRACE_IRQFLAGS
has PROVE_LOCKING as a dependency. This code is incorrect, and
this patch changes the TRACE_IRQFLAGS to be just a simple bool that
does not depend or select anything. Instead both IRQSOFF_TRACER and
PROVE_LOCKING select it.
Reported-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently, if RCU CPU stall warnings are enabled, they are enabled
immediately upon boot. They can be manually disabled via /sys (and
also re-enabled via /sys), and are automatically disabled upon panic.
However, some users need RCU CPU stalls to be disabled at boot time,
but to be enabled without rebuilding/rebooting. For example, someone
running a real-time application in production might not want the
additional latency of RCU CPU stall detection in normal operation, but
might need to enable it at any point for fault isolation purposes.
This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel configuration parameter that maintains the current behavior
(enable at boot) by default, but allows a kernel to be configured
with RCU CPU stall detection built into the kernel, but disabled at
boot time.
Requested-by: Clark Williams <williams@redhat.com>
Requested-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Also set the default to 60 seconds, up from the previous hard-coded timeout
of 10 seconds. This allows people who care to set short timeouts, while
avoiding people with unusual configurations (make randconfig!!!) from being
bothered with spurious CPU stall warnings.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
This commit provides definitions for the __rcu annotation defined earlier.
This annotation permits sparse to check for correct use of RCU-protected
pointers. If a pointer that is annotated with __rcu is accessed
directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
or one of their variants), sparse can be made to complain. To enable
such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
kernel configuration option. Please note that these sparse complaints are
intended to be a debugging aid, -not- a code-style-enforcement mechanism.
There are special rcu_dereference_protected() and rcu_access_pointer()
accessors for use when RCU read-side protection is not required, for
example, when no other CPU has access to the data structure in question
or while the current CPU hold the update-side lock.
This patch also updates a number of docbook comments that were showing
their age.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Christopher Li <sparse@chrisli.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHED_DEBUG which has unmet direct dependencies (DEBUG_KERNEL &&
PROC_FS) warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHEDSTATS which has unmet direct dependencies (DEBUG_KERNEL && PROC_FS)
Add depends on STACKTRACE_SUPPORT for 'select STACKTRACE'.
Add depends on PROC_FS since that is where the output goes.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20100812123121.a7c99cde.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
kmemleak: Fix typo in the comment
lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
kmemleak: Add DocBook style comments to kmemleak.c
kmemleak: Introduce a default off mode for kmemleak
kmemleak: Show more information for objects found by alias
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (28 commits)
driver core: device_rename's new_name can be const
sysfs: Remove owner field from sysfs struct attribute
powerpc/pci: Remove owner field from attribute initialization in PCI bridge init
regulator: Remove owner field from attribute initialization in regulator core driver
leds: Remove owner field from attribute initialization in bd2802 driver
scsi: Remove owner field from attribute initialization in ARCMSR driver
scsi: Remove owner field from attribute initialization in LPFC driver
cgroupfs: create /sys/fs/cgroup to mount cgroupfs on
Driver core: Add BUS_NOTIFY_BIND_DRIVER
driver core: fix memory leak on one error path in bus_register()
debugfs: no longer needs to depend on SYSFS
sysfs: Fix one more signature discrepancy between sysfs implementation and docs.
sysfs: fix discrepancies between implementation and documentation
dcdbas: remove a redundant smi_data_buf_free in dcdbas_exit
dmi-id: fix a memory leak in dmi_id_init error path
sysfs: sysfs_chmod_file's attr can be const
firmware: Update hotplug script
Driver core: move platform device creation helpers to .init.text (if MODULE=n)
Driver core: reduce duplicated code for platform_device creation
Driver core: use kmemdup in platform_device_add_resources
...
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...
Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "net: Make accesses to ->br_port safe for sparse RCU"
mce: convert to rcu_dereference_index_check()
net: Make accesses to ->br_port safe for sparse RCU
vfs: add fs.h to define struct file
lockdep: Add an in_workqueue_context() lockdep-based test function
rcu: add __rcu API for later sparse checking
rcu: add an rcu_dereference_index_check()
tree/tiny rcu: Add debug RCU head objects
mm: remove all rcu head initializations
fs: remove all rcu head initializations, except on_stack initializations
powerpc: remove all rcu head initializations
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
modpost: support objects with more than 64k sections
trivial: fix a typo in a filename
frv: clean up arch/frv/Makefile
kbuild: allow assignment to {A,C}FLAGS_KERNEL on the command line
kbuild: allow assignment to {A,C,LD}FLAGS_MODULE on the command line
Kbuild: Add option to set -femit-struct-debug-baseonly
Makefile: "make kernelrelease" should show the correct full kernel version
Makefile.build: make KBUILD_SYMTYPES work again
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (49 commits)
microblaze: Add KGDB support
microblaze: Support brki rX, 0x18 for user application debugging
microblaze: Remove nop after MSRCLR/SET, MTS, MFS instructions
microblaze: Simplify syscall rutine
microblaze: Move PT_MODE saving to delay slot
microblaze: Fix _interrupt function
microblaze: Fix _user_exception function
microblaze: Put together addik instructions
microblaze: Use delay slot in syscall macros
microblaze: Save kernel mode in delay slot
microblaze: Do not mix register saving and mode setting
microblaze: Move SAVE_STATE upward
microblaze: entry.S: Macro optimization
microblaze: Optimize hw exception rutine
microblaze: Implement clear_ums macro and fix SAVE_STATE macro
microblaze: Remove additional setup for kernel_mode
microblaze: Optimize SAVE_STATE macro
microblaze: Remove additional loading
microblaze: Completely remove working with R11 register
microblaze: Do not setup BIP in _debug_exception
...
Microblaze doesn't support frame pointers. Ftrace code
uses CALLER_ADDR1 which is defined in linux/ftrace.h. For Microblaze
is 0.
Signed-off-by: Michal Simek <monstr@monstr.eu>
The last 't' of 'fault' is missing in the description of FAIL_IO_TIMEOUT.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Introduce a new DEBUG_KMEMLEAK_DEFAULT_OFF config parameter that allows
kmemleak to be disabled by default, but enabled on the command line
via: kmemleak=on. Although a reboot is required to turn it on, its still
useful to not require a re-compile.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Newer gcc has a -femit-struct-debug-baseonly option that dramatically
reduces the size of object files with debug info. What it does
is to only emit type information for structures when the structures
are defined in the same file or in a header file.
This means the type information for most headers are not included.
This is not good when the type information is actually
needed (e.g. with kgdb or systemtap)
But often kernel hackers only care about line numbers and don't
need all the type information anyways. In this case setting
the option can be a big win:
A build dir for a specific x86-64 configuration with gcc 4.5
shrunk from 2.3G to 1.2G. The compilation was also nearly a minute
faster.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
[mmarek: reformatted help text]
Signed-off-by: Michal Marek <mmarek@suse.cz>
Helps finding racy users of call_rcu(), which results in hangs because list
entries are overwritten and/or skipped.
Changelog since v4:
- Bissectability is now OK
- Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
call_rcu(). Statically initialized objects are detected with
object_is_static().
- Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
- Remove init_rcu_head() completely.
Changelog since v3:
- Include comments from Lai Jiangshan
This new patch version is based on the debugobjects with the newly introduced
"active state" tracker.
Non-initialized entries are all considered as "statically initialized". An
activation fixup (triggered by call_rcu()) takes care of performing the debug
object initialization without issuing any warning. Since we cannot increase the
size of struct rcu_head, I don't see much room to put an identifier for
statically initialized rcu_head structures. So for now, we have to live without
"activation without explicit init" detection. But the main purpose of this debug
option is to detect double-activations (double call_rcu() use of a rcu_head
before the callback is executed), which is correctly addressed here.
This also detects potential internal RCU callback corruption, which would cause
the callbacks to be executed twice.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
I used this module to test the series of modification to the cpu notifiers
code.
Example1: inject CPU offline error (-1 == -EPERM)
# modprobe cpu-notifier-error-inject cpu_down_prepare_error=-1
# echo 0 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Operation not permitted
Example2: inject CPU online error (-2 == -ENOENT)
# modprobe cpu-notifier-error-inject cpu_up_prepare_error=-2
# echo 1 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: No such file or directory
[akpm@linux-foundation.org: fix Kconfig help text]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The lockup detector is a new feature that now involves the
nmi watchdog. Drop the default y and let the user choose.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix LOCK_PREFIX_HERE for uniprocessor build
x86, atomic64: In selftest, distinguish x86-64 from 586+
x86-32: Fix atomic64_inc_not_zero return value convention
lib: Fix atomic64_inc_not_zero test
lib: Fix atomic64_add_unless return value convention
x86-32: Fix atomic64_add_unless return value convention
lib: Fix atomic64_add_unless test
x86: Implement atomic[64]_dec_if_positive()
lib: Only test atomic64_dec_if_positive on archs having it
x86-32: Rewrite 32-bit atomic64 functions in assembly
lib: Add self-test for atomic64_t
x86-32: Allow UP/SMP lock replacement in cmpxchg64
x86: Add support for lock prefix in alternatives
This new config is deemed to simplify even more the lockup detector
dependencies and can make it easier to bring a smooth sorting
between archs that support the new generic lockup detector and those
that still have their own, especially for those that are in the
middle of this migration.
Instead of checking whether we have CONFIG_LOCKUP_DETECTOR +
CONFIG_PERF_EVENTS_NMI each time an arch wants to know if it needs
to build its own lockup detector, take a shortcut with this new
config. It is enabled only if the hardlockup detection part of
the whole lockup detector is on.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
We kept CONFIG_DETECT_SOFTLOCKUP around for compatibility with
older configs. But it was enabled by default if CONFIG_DEBUG_KERNEL.
So if we want to enable CONFIG_LOCKUP_DETECTOR on configs that had
CONFIG_DETECT_SOFTLOCKUP, all we need is to have the same enabling
by default if CONFIG_DEBUG_KERNEL. We can then remove
CONFIG_DETECT_SOFTLOCKUP directly.
So tag CONFIG_LOCKUP_DETECTOR as default y. This is what we want for
most serious kernel debugging anyway.
And also forbid the lockup detector in S390 as it was for the
previous softlockup detector, event though the true reason for that
is not outlined.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Panic on softlockups was still depending on the softlockup detector.
But the latter has been merged into the lockup detector now.
Let's update this config dependency.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
The new nmi_watchdog (which uses the perf event subsystem) is very
similar in structure to the softlockup detector. Using Ingo's
suggestion, I combined the two functionalities into one file:
kernel/watchdog.c.
Now both the nmi_watchdog (or hardlockup detector) and softlockup
detector sit on top of the perf event subsystem, which is run every
60 seconds or so to see if there are any lockups.
To detect hardlockups, cpus not responding to interrupts, I
implemented an hrtimer that runs 5 times for every perf event
overflow event. If that stops counting on a cpu, then the cpu is
most likely in trouble.
To detect softlockups, tasks not yielding to the scheduler, I used the
previous kthread idea that now gets kicked every time the hrtimer fires.
If the kthread isn't being scheduled neither is anyone else and the
warning is printed to the console.
I tested this on x86_64 and both the softlockup and hardlockup paths
work.
V2:
- cleaned up the Kconfig and softlockup combination
- surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
- seperated out the softlockup case from perf event subsystem
- re-arranged the enabling/disabling nmi watchdog from proc space
- added cpumasks for hardlockup failure cases
- removed fallback to soft events if no PMU exists for hard events
V3:
- comment cleanups
- drop support for older softlockup code
- per_cpu cleanups
- completely remove software clock base hardlockup detector
- use per_cpu masking on hard/soft lockup detection
- #ifdef cleanups
- rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
- documentation additions
V4:
- documentation fixes
- convert per_cpu to __get_cpu_var
- powerpc compile fixes
V5:
- split apart warn flags for hard and soft lockups
TODO:
- figure out how to make an arch-agnostic clock2cycles call
(if possible) to feed into perf events as a sample period
[fweisbec: merged conflict patch]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The CPU_STALL_VERBOSE kernel configuration parameter was added to
2.6.34 to identify any preempted/blocked tasks that were preventing
the current grace period from completing when running preemptible
RCU. As is conventional for new configurations parameters, this
defaulted disabled. It is now time to enable it by default.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There is no need to disable lockdep after an RCU lockdep splat,
so remove the debug_lockdeps_off() from lockdep_rcu_dereference().
To avoid repeated lockdep splats, use a static variable in the inlined
rcu_dereference_check() and rcu_dereference_protected() macros so that
a given instance splats only once, but so that multiple instances can
be detected per boot.
This is controlled by a new config variable CONFIG_PROVE_RCU_REPEATEDLY,
which is disabled by default. This provides the normal lockdep behavior
by default, but permits people who want to find multiple RCU-lockdep
splats per boot to easily do so.
Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Merge reason:
Conflict between LOCK_PREFIX_HERE and relative alternatives
pointers
Resolved Conflicts:
arch/x86/include/asm/alternative.h
arch/x86/kernel/alternative.c
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
We see only one section mismatch now after thousands of randconfigs, and a
bug has been filed about that one.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add adds a debugfs interface and additional failure modes to LKDTM to
provide similar functionality to the provoke-crash driver submitted here:
http://lwn.net/Articles/371208/
Crashes can now be induced either through module parameters (as before)
or through the debugfs interface as in provoke-crash.
The patch also provides a new "direct" interface, where KPROBES are not
used, i.e., the crash is invoked directly upon write to the debugfs
file. When built without KPROBES configured, only this mode is available.
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Cc: M. Mohan Kumar <mohan@in.ibm.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was nice to enable it by default for testing - but before we
push it upstream we want it to be off - so that people can
opt-in gradually.
Cc: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266880143-24943-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits)
rcu: Fix accelerated GPs for last non-dynticked CPU
rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot
rcu: Fix accelerated grace periods for last non-dynticked CPU
rcu: Export rcu_scheduler_active
rcu: Make rcu_read_lock_sched_held() take boot time into account
rcu: Make lockdep_rcu_dereference() message less alarmist
sched, cgroups: Fix module export
rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information
rcu: Fix rcutorture mod_timer argument to delay one jiffy
rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection
rcu: Convert to raw_spinlocks
rcu: Stop overflowing signed integers
rcu: Use canonical URL for Mathieu's dissertation
rcu: Accelerate grace period if last non-dynticked CPU
rcu: Fix citation of Mathieu's dissertation
rcu: Documentation update for CONFIG_PROVE_RCU
security: Apply lockdep-based checking to rcu_dereference() uses
idr: Apply lockdep-based diagnostics to rcu_dereference() uses
radix-tree: Disable RCU lockdep checking in radix tree
vfs: Abstract rcu_dereference_check for files-fdtable use
...
Remove pointless union in the breakpoint field of hw_perf_event.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
I've forgot to add 'perf lock' line to command-list.txt,
so users of perf could not find perf lock when they type 'perf'.
Fixing command-list.txt requires document
(tools/perf/Documentation/perf-lock.txt).
But perf lock is too much "under construction" to write a
stable document, so this is something like pseudo document for now.
And I wrote description of perf lock at help section of
CONFIG_LOCK_STAT, this will navigate users of lock trace events.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This patch adds self-test on boot code for atomic64_t.
This has been used to test the later changes in this patchset.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-4-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Inspection is proving insufficient to catch all RCU misuses,
which is understandable given that rcu_dereference() might be
protected by any of four different flavors of RCU (RCU, RCU-bh,
RCU-sched, and SRCU), and might also/instead be protected by any
of a number of locking primitives. It is therefore time to
enlist the aid of lockdep.
This set of patches is inspired by earlier work by Peter
Zijlstra and Thomas Gleixner, and takes the following approach:
o Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.
o Set up separate lockdep classes for each instance of SRCU.
o Create primitives that check for being in an RCU read-side
critical section. These return exact answers if lockdep is
fully enabled, but if unsure, report being in an RCU read-side
critical section. (We want to avoid false positives!)
The primitives are:
For RCU: rcu_read_lock_held(void)
For RCU-bh: rcu_read_lock_bh_held(void)
For RCU-sched: rcu_read_lock_sched_held(void)
For SRCU: srcu_read_lock_held(struct srcu_struct *sp)
o Add rcu_dereference_check(), which takes a second argument
in which one places a boolean expression based on the above
primitives and/or lockdep_is_held().
o A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
rcu_dereference_check(). This depends on CONFIG_PROVE_LOCKING,
and should be quite helpful during the transition period while
CONFIG_PROVE_RCU-unaware patches are in flight.
The existing rcu_dereference() primitive does no checking, but
upcoming patches will change that.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-1-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Determines if an arch has setup arch specific perf_events and
nmi_watchdog code. This should restrict compiles to only those
arches ready.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
LKML-Reference: <1266013161-31197-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It wont even build on other platforms just yet - so restrict it
to x86 for now.
Cc: Don Zickus <dzickus@redhat.com>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
These are the bits that enable the new nmi_watchdog and safely
isolate the old nmi_watchdog. Only one or the other can run,
not both at the same time.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: gorcunov@gmail.com
Cc: aris@redhat.com
Cc: peterz@infradead.org
LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'kmemleak' of git://linux-arm.org/linux-2.6:
kmemleak: fix kconfig for crc32 build error
kmemleak: Reduce the false positives by checking for modified objects
kmemleak: Show the age of an unreferenced object
kmemleak: Release the object lock before calling put_object()
kmemleak: Scan the _ftrace_events section in modules
kmemleak: Simplify the kmemleak_scan_area() function prototype
kmemleak: Do not use off-slab management with SLAB_NOLEAKTRACE
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
security/tomoyo: Remove now unnecessary handling of security_sysctl.
security/tomoyo: Add a special case to handle accesses through the internal proc mount.
sysctl: Drop & in front of every proc_handler.
sysctl: Remove CTL_NONE and CTL_UNNUMBERED
sysctl: kill dead ctl_handler definitions.
sysctl: Remove the last of the generic binary sysctl support
sysctl net: Remove unused binary sysctl code
sysctl security/tomoyo: Don't look at ctl_name
sysctl arm: Remove binary sysctl support
sysctl x86: Remove dead binary sysctl support
sysctl sh: Remove dead binary sysctl support
sysctl powerpc: Remove dead binary sysctl support
sysctl ia64: Remove dead binary sysctl support
sysctl s390: Remove dead sysctl binary support
sysctl frv: Remove dead binary sysctl support
sysctl mips/lasat: Remove dead binary sysctl support
sysctl drivers: Remove dead binary sysctl support
sysctl crypto: Remove dead binary sysctl support
sysctl security/keys: Remove dead binary sysctl support
sysctl kernel: Remove binary sysctl logic
...
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
rcu: Make RCU's CPU-stall detector be default
rcu: Add expedited grace-period support for preemptible RCU
rcu: Enable fourth level of TREE_RCU hierarchy
rcu: Rename "quiet" functions
rcu: Re-arrange code to reduce #ifdef pain
rcu: Eliminate unneeded function wrapping
rcu: Fix grace-period-stall bug on large systems with CPU hotplug
rcu: Eliminate __rcu_pending() false positives
rcu: Further cleanups of use of lastcomp
rcu: Simplify association of forced quiescent states with grace periods
rcu: Accelerate callback processing on CPUs not detecting GP end
rcu: Mark init-time-only rcu_bootup_announce() as __init
rcu: Simplify association of quiescent states with grace periods
rcu: Rename dynticks_completed to completed_fqs
rcu: Enable synchronize_sched_expedited() fastpath
rcu: Remove inline from forward-referenced functions
rcu: Fix note_new_gpnum() uses of ->gpnum
rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter
rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter
rcu: Cleanup: balance rcu_irq_enter()/rcu_irq_exit() calls
...
The RCU_CPU_STALL_DETECTOR costs almost nothing and has located
some bugs that might otherwise have been difficult to track
down. Make it be default for the TREE RCU implementations.
The vmlinux size impact is limited (on 64-bit x86 defconfig):
text data bss dec hex filename
8440248 1260076 995588 10695912 a334e8 vmlinux.before
8440774 1260060 995588 10696422 a336e6 vmlinux.after
+526 bytes - acceptable default cost.
For RAM starved systems, TINY_RCU does not support CPU-stall detection
and is much smaller, but then again it is a uniprocessor...
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12597846162906-git-send-email->
[ v2: added image size calculations to the changelog ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add debugobject support to track the life time of work_structs.
While at it, remove duplicate definition of
INIT_DELAYED_WORK_ON_STACK().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Now that the sys_sysctl is now a compatibility wrapper around
/proc/sys we can remove much of sysctl_check and reduce it
to a few remaining sanity checks. This completely decouples
it from the binary sysctl system call.
Little things like ensuring that the sysctl has not already
been registered are all that remain.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
We don't need an explicit PPC64 in the DEBUG_PREEMPT dependancies as all
PPC platforms now support TRACE_IRQFLAGS_SUPPORT.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Also increase the maximum possible kmemleak early log entries since
2000 are not sufficient on s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sam suggested moving STRIP_ASM_SYMS into the Kernel hacking menu
from the General Setup menu. It makes more sense there.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (134 commits)
powerpc/nvram: Enable use Generic NVRAM driver for different size chips
powerpc/iseries: Fix oops reading from /proc/iSeries/mf/*/cmdline
powerpc/ps3: Workaround for flash memory I/O error
powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
powerpc/perf_counters: Reduce stack usage of power_check_constraints
powerpc: Fix bug where perf_counters breaks oprofile
powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
powerpc/irq: Improve nanodoc
powerpc: Fix some late PowerMac G5 with PCIe ATI graphics
powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
powerpc/book3e: Add missing page sizes
powerpc/pseries: Fix to handle slb resize across migration
powerpc/powermac: Thermal control turns system off too eagerly
powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
powerpc/405ex: support cuImage via included dtb
powerpc/405ex: provide necessary fixup function to support cuImage
powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC
powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.
powerpc/44x: Update Arches defconfig
powerpc/44x: Update Arches dts
...
Fix up conflicts in drivers/char/agp/uninorth-agp.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
powerpc64: convert to dynamic percpu allocator
sparc64: use embedding percpu first chunk allocator
percpu: kill lpage first chunk allocator
x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
percpu: update embedding first chunk allocator to handle sparse units
percpu: use group information to allocate vmap areas sparsely
vmalloc: implement pcpu_get_vm_areas()
vmalloc: separate out insert_vmalloc_vm()
percpu: add chunk->base_addr
percpu: add pcpu_unit_offsets[]
percpu: introduce pcpu_alloc_info and pcpu_group_info
percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
percpu: add @align to pcpu_fc_alloc_fn_t
percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
percpu: drop @static_size from first chunk allocators
percpu: generalize first chunk allocator selection
percpu: build first chunk allocators selectively
percpu: rename 4k first chunk allocator to page
percpu: improve boot messages
percpu: fix pcpu_reclaim() locking
...
Fix trivial conflict as by Tejun Heo in kernel/sched.c
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
rcu: Move end of special early-boot RCU operation earlier
rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
rcu: Remove lockdep annotations from RCU's _notrace() API members
rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
rcu: Add CPU-offline processing for single-node configurations
rcu: Add "notrace" to RCU function headers used by ftrace
rcu: Remove CONFIG_PREEMPT_RCU
rcu: Merge preemptable-RCU functionality into hierarchical RCU
rcu: Simplify rcu_pending()/rcu_check_callbacks() API
rcu: Use debugfs_remove_recursive() simplify code.
rcu: Merge per-RCU-flavor initialization into pre-existing macro
rcu: Fix online/offline indication for rcudata.csv trace file
rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
rcu: Renamings to increase RCU clarity
rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
rcu: Fix typo in rcu_irq_exit() comment header
rcu: Make rcupreempt_trace.c look at offline CPUs
...
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management. The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).
Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.
This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):
http://www.kerneloops.org/oops.php?number=252883
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Create a kernel/rcutree_plugin.h file that contains definitions
for preemptable RCU (or, under the #else branch of the #ifdef,
empty definitions for the classic non-preemptable semantics).
These definitions fit into plugins defined in kernel/rcutree.c
for this purpose.
This variant of preemptable RCU uses a new algorithm whose
read-side expense is roughly that of classic hierarchical RCU
under CONFIG_PREEMPT. This new algorithm's update-side expense
is similar to that of classic hierarchical RCU, and, in absence
of read-side preemption or blocking, is exactly that of classic
hierarchical RCU. Perhaps more important, this new algorithm
has a much simpler implementation, saving well over 1,000 lines
of code compared to mainline's implementation of preemptable
RCU, which will hopefully be retired in favor of this new
algorithm.
The simplifications are obtained by maintaining per-task
nesting state for running tasks, and using a simple
lock-protected algorithm to handle accounting when tasks block
within RCU read-side critical sections, making use of lessons
learned while creating numerous user-level RCU implementations
over the past 18 months.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference: <12509746134003-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Very lightly tested, doesn't crash the kernel.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes. As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.
Conflicts:
arch/alpha/include/asm/percpu.h
arch/mn10300/kernel/vmlinux.lds.S
include/linux/percpu-defs.h
(feature suggested by Sergey Senozhatsky)
Kmemleak needs to track all the memory allocations but some of these
happen before kmemleak is initialised. These are stored in an internal
buffer which may be exceeded in some kernel configurations. This patch
adds a configuration option with a default value of 400 and also removes
the stack dump when the early log buffer is exceeded.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Some archs (alpha and s390) need to use weak definitions for percpu
variables in modules so that the compiler generates external
references for them.
This patch implements weak percpu definitions which arch can enable by
defining ARCH_NEEDS_WEAK_PER_CPU in arch percpu header file. This
weak definition adds the following two restrictions on percpu variable
definitions.
1. percpu symbols must be unique whether static or not
2. percpu variables can't be defined inside a function
To ensure that these restrictions are observed in generic code, config
option DEBUG_FORCE_WEAK_PER_CPU enables weak percpu definitions for
all cases.
This patch is inspired by Ivan Kokshaysky's alpha percpu patch.
[ Impact: stricter rules for percpu variables, one more debug config option ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Howells <dhowells@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Selecting DEBUG_SLAB or SLUB_DEBUG by the KMEMLEAK menu entry may cause
issues with other dependencies (KMEMCHECK). These configuration options
aren't strictly needed by kmemleak but they may increase the chances of
finding leaks. This patch also updates the KMEMLEAK config entry help
text.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: Select frame pointers on x86
dma-debug: be more careful when building reference entries
dma-debug: check for sg_call_ents in best-fit algorithm too
x86 stack traces are a piece of crap without frame pointers, and its not
like the 'performance gain' of not having stack pointers matters when you
selected lockdep.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <new-submission>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Kconfig options of kmemcheck are hidden under arch/x86 which makes porting
to other architectures harder. To fix that, move the Kconfig bits to
lib/Kconfig.kmemcheck and introduce a CONFIG_HAVE_ARCH_KMEMCHECK config option
that architectures can define.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
let it rip!
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
This patch adds a loadable module that deliberately leaks memory. It
is used for testing various memory leaking scenarios.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds the Kconfig.debug and Makefile entries needed for
building kmemleak into the kernel.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On Tue, Apr 21, 2009 at 01:55:53PM +0200, Stefan Richter wrote:
> Robert P. J. Day wrote:
> > lib/Kconfig.debug: select PRINTK_DEBUG
> >
> > should that perhaps refer to "DYNAMIC_PRINTK_DEBUG"? since there is
> > no such thing as a PRINTK_DEBUG Kconfig variable.
>
> Looks like a rudiment from an earlier version of Jason's "driver core:
> basic infrastructure for per-module dynamic debug messages",
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=346e15beb5343c2eb8216d820f2ed8f150822b08
> Search an LKML archive for '+#ifdef CONFIG_PRINTK_DEBUG'.
>
> Jason, should it be deleted or replaced by something?
We re-named 'DYNAMIC_PRINTK_DEBUG' to 'DYNAMIC_DEBUG' in 2.6.30....
'PRINTK_DEBUG' as pointed out never existed. So, it appears to be
extraneous, and should be removed. thanks for pointing it out.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Always omit frame pointers on s390. They aren't too useful for the
kernel since we have already the kernel stack backchain which allows
us to walk the kernel stack.
So eleminate the extra code for frame pointers. Only allow the extra
code for the function tracer since the gcc compile options -pg and
-fomit-frame-pointer are incompatible.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't offer a default-y option when the user has turned off
CONFIG_DETECT_SOFTLOCKUP already.
Do offer it as 'y' only if DETECT_SOFTLOCKUP is on already.
This makes it match previous behavior - where the hung-task check was
embedded i CONFIG_DETECT_SOFTLOCKUP code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
s390. This patch implements it for the rest of the architectures by
filling the pages with poison byte patterns after free_pages() and
verifying the poison patterns before alloc_pages().
This generic one cannot detect invalid page accesses immediately but
invalid read access may cause invalid dereference by poisoned memory and
invalid write access can be detected after a long delay.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
dma-debug: make memory range checks more consistent
dma-debug: warn of unmapping an invalid dma address
dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
dma-debug/x86: register pci bus for dma-debug leak detection
dma-debug: add a check dma memory leaks
dma-debug: add checks for kernel text and rodata
dma-debug: print stacktrace of mapping path on unmap error
dma-debug: Documentation update
dma-debug: x86 architecture bindings
dma-debug: add function to dump dma mappings
dma-debug: add checks for sync_single_sg_*
dma-debug: add checks for sync_single_range_*
dma-debug: add checks for sync_single_*
dma-debug: add checking for [alloc|free]_coherent
dma-debug: add add checking for map/unmap_sg
dma-debug: add checking for map/unmap_page/single
dma-debug: add core checking functions
dma-debug: add debugfs interface
dma-debug: add kernel command line parameters
dma-debug: add initialization code
...
Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c
This patch combines Greg Bank's dprintk() work with the existing dynamic
printk patchset, we are now calling it 'dynamic debug'.
The new feature of this patchset is a richer /debugfs control file interface,
(an example output from my system is at the bottom), which allows fined grained
control over the the debug output. The output can be controlled by function,
file, module, format string, and line number.
for example, enabled all debug messages in module 'nf_conntrack':
echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
to disable them:
echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
A further explanation can be found in the documentation patch.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BUILD_DOCSRC should be controlled by "config" instead of "menuconfig".
I have no idea how I managed to use "menuconfig" here.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch also makes the frame pointer default to y only if
!ARM_UNWIND. LOCKDEP no longer selects FRAME_POINTER if ARM_UNWIND is
enabled.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
debugobjects: add and use INIT_WORK_ON_STACK
rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR
relay: fix lock imbalance in relay_late_setup_files
oprofile: fix uninitialized use of struct op_entry
rcu: move Kconfig menu
softlock: fix false panic which can occur if softlockup_thresh is reduced
rcu: add __cpuinit to rcu_init_percpu_data()
Impact: remove the old CONFIG_RCU_CPU_STALL_DETECTOR
tree_rcu introduce CONFIG_RCU_CPU_STALL_DETECTOR again.
These two are the same exactly except:
the old one "depends on CLASSIC_RCU"
the new one "depends on CLASSIC_RCU || TREE_RCU"
This patch remove the old one.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit da4276b829 changed a dependency
for FRAME_POINTER from X86 to ARCH_WANT_FRAME_POINTERS, but didn't
actually define it.
This patch adds the definition for ARCH_WANT_FRAME_POINTERS. Without it,
FRAME_POINTER can't be enabled on x86.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Decoupling allows:
* hung tasks check to happen at very low priority
* hung tasks check and softlockup to be enabled/disabled independently
at compile and/or run-time
* individual panic settings to be enabled disabled independently
at compile and/or run-time
* softlockup threshold to be reduced without increasing hung tasks
poll frequency (hung task check is expensive relative to softlock watchdog)
* hung task check to be zero over-head when disabled at run-time
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
x86: fix section mismatch warnings in mcheck/mce_amd_64.c
x86: offer frame pointers in all build modes
x86: remove duplicated #include's
x86: k8 numa register active regions later
x86: update Alan Cox's email addresses
x86: rename all fields of mpc_table mpc_X to X
x86: rename all fields of mpc_oemtable oem_X to X
x86: rename all fields of mpc_bus mpc_X to X
x86: rename all fields of mpc_cpu mpc_X to X
x86: rename all fields of mpc_intsrc mpc_X to X
x86: rename all fields of mpc_lintsrc mpc_X to X
x86: rename all fields of mpc_iopic mpc_X to X
x86: irqinit_64.c init_ISA_irqs should be static
Documentation/x86/boot.txt: payload length was changed to payload_length
x86: setup_percpu.c fix style problems
x86: irqinit_64.c fix style problems
x86: irqinit_32.c fix style problems
x86: i8259.c fix style problems
x86: irq_32.c fix style problems
x86: ioport.c fix style problems
...
Make VMAs per mm_struct as for MMU-mode linux. This solves two problems:
(1) In SYSV SHM where nattch for a segment does not reflect the number of
shmat's (and forks) done.
(2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
that a VMA might be shared and already have its vm_mm assigned to another
process or a dead process.
A new struct (vm_region) is introduced to track a mapped region and to remember
the circumstances under which it may be shared and the vm_list_struct structure
is discarded as it's no longer required.
This patch makes the following additional changes:
(1) Regions are now allocated with alloc_pages() rather than kmalloc() and
with no recourse to __GFP_COMP, so the pages are not composite. Instead,
each page has a reference on it held by the region. Anything else that is
interested in such a page will have to get a reference on it to retain it.
When the pages are released due to unmapping, each page is passed to
put_page() and will be freed when the page usage count reaches zero.
(2) Excess pages are trimmed after an allocation as the allocation must be
made as a power-of-2 quantity of pages.
(3) VMAs are added to the parent MM's R/B tree and mmap lists. As an MM may
end up with overlapping VMAs within the tree, the VMA struct address is
appended to the sort key.
(4) Non-anonymous VMAs are now added to the backing inode's prio list.
(5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
the backing region. The VMA and region structs will be split if
necessary.
(6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
segment instead of all the attachments at that addresss. Multiple
shmat()'s return the same address under NOMMU-mode instead of different
virtual addresses as under MMU-mode.
(7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
(8) /proc/maps is now the global list of mapped regions, and may list bits
that aren't actually mapped anywhere.
(9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
of RAM currently allocated by mmap to hold mappable regions that can't be
mapped directly. These are copies of the backing device or file if not
anonymous.
These changes make NOMMU mode more similar to MMU mode. The downside is that
NOMMU mode requires some extra memory to track things over NOMMU without this
patch (VMAs are no longer shared, and there are now region structs).
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
CONFIG_FRAME_POINTERS=y results in much better debug info for the
kernel (clear and precise backtraces), with the only drawback being
a ~1% increase in kernel size.
So offer it unconditionally and enable it by default.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: avoid leaking caches or refcounts on sysfs error
slab: Fix comment on #endif
slab: remove GFP_THISNODE clearing from alloc_slabmgmt()
slub: Add might_sleep_if() to slab_alloc()
SLUB: failslab support
slub: Fix incorrect use of loose
slab: Update the kmem_cache_create documentation regarding the name parameter
slub: make early_kmem_cache_node_alloc void
slab: unsigned slabp->inuse cannot be less than 0
slub - fix get_object_page comment
SLUB: Replace __builtin_return_address(0) with _RET_IP_.
SLUB: cleanup - define macros instead of hardcoded numbers
Currently fault-injection capability for SLAB allocator is only
available to SLAB. This patch makes it available to SLUB, too.
[penberg@cs.helsinki.fi: unify slab and slub implementations]
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
This patch fixes a long-standing performance bug in classic RCU that
results in massive internal-to-RCU lock contention on systems with
more than a few hundred CPUs. Although this patch creates a separate
flavor of RCU for ease of review and patch maintenance, it is intended
to replace classic RCU.
This patch still handles stress better than does mainline, so I am still
calling it ready for inclusion. This patch is against the -tip tree.
Nevertheless, experience on an actual 1000+ CPU machine would still be
most welcome.
Most of the changes noted below were found while creating an rcutiny
(which should permit ejecting the current rcuclassic) and while doing
detailed line-by-line documentation.
Updates from v9 (http://lkml.org/lkml/2008/12/2/334):
o Fixes from remainder of line-by-line code walkthrough,
including comment spelling, initialization, undesirable
narrowing due to type conversion, removing redundant memory
barriers, removing redundant local-variable initialization,
and removing redundant local variables.
I do not believe that any of these fixes address the CPU-hotplug
issues that Andi Kleen was seeing, but please do give it a whirl
in case the machine is smarter than I am.
A writeup from the walkthrough may be found at the following
URL, in case you are suffering from terminal insomnia or
masochism:
http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf
o Made rcutree tracing use seq_file, as suggested some time
ago by Lai Jiangshan.
o Added a .csv variant of the rcudata debugfs trace file, to allow
people having thousands of CPUs to drop the data into
a spreadsheet. Tested with oocalc and gnumeric. Updated
documentation to suit.
Updates from v8 (http://lkml.org/lkml/2008/11/15/139):
o Fix a theoretical race between grace-period initialization and
force_quiescent_state() that could occur if more than three
jiffies were required to carry out the grace-period
initialization. Which it might, if you had enough CPUs.
o Apply Ingo's printk-standardization patch.
o Substitute local variables for repeated accesses to global
variables.
o Fix comment misspellings and redundant (but harmless) increments
of ->n_rcu_pending (this latter after having explicitly added it).
o Apply checkpatch fixes.
Updates from v7 (http://lkml.org/lkml/2008/10/10/291):
o Fixed a number of problems noted by Gautham Shenoy, including
the cpu-stall-detection bug that he was having difficulty
convincing me was real. ;-)
o Changed cpu-stall detection to wait for ten seconds rather than
three in order to reduce false positive, as suggested by Ingo
Molnar.
o Produced a design document (http://lwn.net/Articles/305782/).
The act of writing this document uncovered a number of both
theoretical and "here and now" bugs as noted below.
o Fix dynticks_nesting accounting confusion, simplify WARN_ON()
condition, fix kerneldoc comments, and add memory barriers
in dynticks interface functions.
o Add more data to tracing.
o Remove unused "rcu_barrier" field from rcu_data structure.
o Count calls to rcu_pending() from scheduling-clock interrupt
to use as a surrogate timebase should jiffies stop counting.
o Fix a theoretical race between force_quiescent_state() and
grace-period initialization. Yes, initialization does have to
go on for some jiffies for this race to occur, but given enough
CPUs...
Updates from v6 (http://lkml.org/lkml/2008/9/23/448):
o Fix a number of checkpatch.pl complaints.
o Apply review comments from Ingo Molnar and Lai Jiangshan
on the stall-detection code.
o Fix several bugs in !CONFIG_SMP builds.
o Fix a misspelled config-parameter name so that RCU now announces
at boot time if stall detection is configured.
o Run tests on numerous combinations of configurations parameters,
which after the fixes above, now build and run correctly.
Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):
o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a
changeset some time ago, and finally got around to retesting
this option).
o Fix some tracing bugs in rcupreempt that caused incorrect
totals to be printed.
o I now test with a more brutal random-selection online/offline
script (attached). Probably more brutal than it needs to be
on the people reading it as well, but so it goes.
o A number of optimizations and usability improvements:
o Make rcu_pending() ignore the grace-period timeout when
there is no grace period in progress.
o Make force_quiescent_state() avoid going for a global
lock in the case where there is no grace period in
progress.
o Rearrange struct fields to improve struct layout.
o Make call_rcu() initiate a grace period if RCU was
idle, rather than waiting for the next scheduling
clock interrupt.
o Invoke rcu_irq_enter() and rcu_irq_exit() only when
idle, as suggested by Andi Kleen. I still don't
completely trust this change, and might back it out.
o Make CONFIG_RCU_TRACE be the single config variable
manipulated for all forms of RCU, instead of the prior
confusion.
o Document tracing files and formats for both rcupreempt
and rcutree.
Updates from v4 for those missing v5 given its bad subject line:
o Separated dynticks interface so that NMIs and irqs call separate
functions, greatly simplifying it. In particular, this code
no longer requires a proof of correctness. ;-)
o Separated dynticks state out into its own per-CPU structure,
avoiding the duplicated accounting.
o The case where a dynticks-idle CPU runs an irq handler that
invokes call_rcu() is now correctly handled, forcing that CPU
out of dynticks-idle mode.
o Review comments have been applied (thank you all!!!).
For but one example, fixed the dynticks-ordering issue that
Manfred pointed out, saving me much debugging. ;-)
o Adjusted rcuclassic and rcupreempt to handle dynticks changes.
Attached is an updated patch to Classic RCU that applies a hierarchy,
greatly reducing the contention on the top-level lock for large machines.
This passes 10-hour concurrent rcutorture and online-offline testing on
128-CPU ppc64 without dynticks enabled, and exposes some timekeeping
bugs in presence of dynticks (exciting working on a system where
"sleep 1" hangs until interrupted...), which were fixed in the
2.6.27 kernel. It is getting more reliable than mainline by some
measures, so the next version will be against -tip for inclusion.
See also Manfred Spraul's recent patches (or his earlier work from
2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2).
We will converge onto a common patch in the fullness of time, but are
currently exploring different regions of the design space. That said,
I have already gratefully stolen quite a few of Manfred's ideas.
This patch provides CONFIG_RCU_FANOUT, which controls the bushiness
of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on
64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,
there is no hierarchy. By default, the RCU initialization code will
adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA
architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable
this balancing, allowing the hierarchy to be exactly aligned to the
underlying hardware. Up to two levels of hierarchy are permitted
(in addition to the root node), allowing up to 16,384 CPUs on 32-bit
systems and up to 262,144 CPUs on 64-bit systems. I just know that I
am going to regret saying this, but this seems more than sufficient
for the foreseeable future. (Some architectures might wish to set
CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs.
If this becomes a real problem, additional levels can be added, but I
doubt that it will make a significant difference on real hardware.)
In the common case, a given CPU will manipulate its private rcu_data
structure and the rcu_node structure that it shares with its immediate
neighbors. This can reduce both lock and memory contention by multiple
orders of magnitude, which should eliminate the need for the strange
manipulations that are reported to be required when running Linux on
very large systems.
Some shortcomings:
o More bugs will probably surface as a result of an ongoing
line-by-line code inspection.
Patches will be provided as required.
o There are probably hangs, rcutorture failures, &c. Seems
quite stable on a 128-CPU machine, but that is kind of small
compared to 4096 CPUs. However, seems to do better than
mainline.
Patches will be provided as required.
o The memory footprint of this version is several KB larger
than rcuclassic.
A separate UP-only rcutiny patch will be provided, which will
reduce the memory footprint significantly, even compared
to the old rcuclassic. One such patch passes light testing,
and has a memory footprint smaller even than rcuclassic.
Initial reaction from various embedded guys was "it is not
worth it", so am putting it aside.
Credits:
o Manfred Spraul for ideas, review comments, and bugs spotted,
as well as some good friendly competition. ;-)
o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton
for reviews and comments.
o Thomas Gleixner for much-needed help with some timer issues
(see patches below).
o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,
Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton
Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines
alive despite my heavy abuse^Wtesting.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add .config driven boot parameter default value
Right now debugobjects can only be activated if the debug_objects
boot parameter is passed in via the boot command line.
Make this more convenient (and randomizable) by also providing
a .config method. Enable it by default. (DEBUG_OBJECTS itself
is default-off)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CONFIG_DEBUG_BLOCK_EXT_DEVT can break booting even on some modern
distros. Add BIG FAT WARNING to keep people at a safe distance.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Base infrastructure to enable per-module debug messages.
I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently, <debugfs>/dynamic_printk/modules. When, CONFIG_DYNAMIC_PRINTK_DEBUG,
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
affect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.
The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.
Future plans include extending this functionality to subsystems, that define
their own debug levels and flags.
Usage:
Dynamic debugging is controlled by the debugfs file,
<debugfs>/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:
<module_name> <enabled=0/1>
.
.
.
<module_name> : Name of the module in which the debug call resides
<enabled=0/1> : whether the messages are enabled or not
For example:
snd_hda_intel enabled=0
fixup enabled=1
driver enabled=0
Enable a module:
$echo "set enabled=1 <module_name>" > dynamic_printk/modules
Disable a module:
$echo "set enabled=0 <module_name>" > dynamic_printk/modules
Enable all modules:
$echo "set enabled=1 all" > dynamic_printk/modules
Disable all modules:
$echo "set enabled=0 all" > dynamic_printk/modules
Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.
[gkh: minor cleanups and tweaks to make the build work quietly]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'rcu-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
rcu: RCU-based detection of stalled CPUs for Classic RCU, fix
rcu: RCU-based detection of stalled CPUs for Classic RCU
rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched()
rcu: fix sparse shadowed variable warning
doc/RCU: fix pseudocode in rcuref.txt
rcuclassic: fix compiler warning
rcu: use irq-safe locks
rcuclassic: fix compilation NG
rcu: fix locking cleanup fallout
rcu: remove redundant ACCESS_ONCE definition from rcupreempt.c
rcu: fix classic RCU locking cleanup lockdep problem
rcu: trace fix possible mem-leak
rcu: just rename call_rcu_bh instead of making it a macro
rcu: remove list_for_each_rcu()
rcu: fixes to include/linux/rcupreempt.h
rcu: classic RCU locking and memory-barrier cleanups
rcu: prevent console flood when one CPU sees another AWOL via RCU
rcu, debug: detect stalled grace periods, cleanups
rcu, debug: detect stalled grace periods
rcu classic: new algorithm for callbacks-processing(v2)
...
Only works for the generic request timer handling. Allows one to
sporadically ignore request completions, thus exercising the timeout
handling.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
DEBUG_BLOCK_EXT_DEVT shuffles SCSI and IDE device numbers and root
device number set using rdev become meaningless. Root devices should
be explicitly specified using textual names. Warn about it if root
can't be found and DEBUG_BLOCK_EXT_DEVT is enabled. Also, add warning
to the help text.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It's a debug option that you would explicitly enable to test this
feature, we should default it to 'n' to prevent accidental surprises
for now.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Extended devt introduces non-contiguos device numbers. This patch
implements a debug option which forces most devt allocations to be
from the extended area and spreads them out. This is enabled by
default if DEBUG_KERNEL is set and achieves...
1. Detects code paths in kernel or userland which expect predetermined
consecutive device numbers.
2. When something goes wrong, avoid corruption as adding to the minor
of earlier partition won't lead to the wrong but valid device.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled.
This is a debugging feature to detect infinite loops in kernel code, not
something that non-kernel-hackers would be expected to care about.
This feature can detect looping CPUs in !PREEMPT builds and looping CPUs
with preemption disabled in PREEMPT builds. This is essentially a port of
this functionality from the treercu patch, replacing the stall debug patch
that is already in tip/core/rcu (commit 67182ae1c4).
The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
during some development we suspected a case where we left something
in a notifier chain that was from a module that was unloaded already...
and that sort of thing is rather hard to track down.
This patch adds a very simple sanity check (which isn't all that
expensive) to make sure the notifier we're about to call is
actually from either the kernel itself of from a still-loaded
module, avoiding a hard-to-chase-down crash.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This bug is causing random crashes
(http://bugzilla.kernel.org/show_bug.cgi?id=11414).
-fno-omit-frame-pointer is only needed on powerpc when -pg is also
supplied, and there is a gcc bug that causes incorrect code generation
on 32-bit powerpc when -fno-omit-frame-pointer is used---it uses stack
locations below the stack pointer, which is not allowed by the ABI
because those locations can and sometimes do get corrupted by an
interrupt.
This ensures that CONFIG_FRAME_POINTER is only selected by ftrace.
When CONFIG_FTRACE is enabled we also pass -mno-sched-epilog to work
around the gcc codegen bug.
Patch based on work by:
Andreas Schwab <schwab@suse.de>
Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
I noticed that sysctl_check.o was the largest object file in
a allnoconfig build in kernel/*.
36243 0 0 36243 8d93 kernel/sysctl_check.o
This is because it was default y and && EMBEDDED. But I don't
really see a need for a non kernel developer to have their
sysctls checked all the time.
So move the Kconfig into the kernel debugging section and
also drop the default y and the EMBEDDED check.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently source files in the Documentation/ sub-dir can easily bit-rot
since they are not generally buildable, either because they are hidden in
text files or because there are no Makefile rules for them. This needs to
be fixed so that the source files remain usable and good examples of code
instead of bad examples.
Add the ability to build source files that are in the Documentation/ dir.
Add to Kconfig as "BUILD_DOCSRC" config symbol.
Use "CONFIG_BUILD_DOCSRC=1 make ..." to build objects from the
Documentation/ sources. Or enable BUILD_DOCSRC in the *config system.
However, this symbol depends on HEADERS_CHECK since the header files need
to be installed (for userspace builds).
Built (using cross-tools) for x86-64, i386, alpha, ia64, sparc32,
sparc64, powerpc, sh, m68k, & mips.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
this is a diagnostic patch for Classic RCU.
The approach is to record a timestamp at the beginning
of the grace period (in rcu_start_batch()), then have
rcu_check_callbacks() complain if:
1. it is running on a CPU that has holding up grace periods for
a long time (say one second). This will identify the culprit
assuming that the culprit has not disabled hardware irqs,
instruction execution, or some such.
2. it is running on a CPU that is not holding up grace periods,
but grace periods have been held up for an even longer time
(say two seconds).
It is enabled via the default-off CONFIG_DEBUG_RCU_STALL kernel parameter.
Rather than exponential backoff, it backs off to once per 30 seconds.
My feeling upon thinking on it was that if you have stalled RCU grace
periods for that long, a few extra printk() messages are probably the
least of your worries...
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: David Witbrodt <dawitbro@sbcglobal.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Boot initialisation is very complex, with significant numbers of
architecture-specific routines, hooks and code ordering. While significant
amounts of the initialisation is architecture-independent, it trusts the data
received from the architecture layer. This is a mistake, and has resulted in
a number of difficult-to-diagnose bugs.
This patchset adds some validation and tracing to memory initialisation. It
also introduces a few basic defensive measures. The validation code can be
explicitly disabled for embedded systems.
This patch:
Add additional debugging and verification code for memory initialisation.
Once enabled, the verification checks are always run and when required
additional debugging information may be outputted via a mminit_loglevel=
command-line parameter.
The verification code is placed in a new file mm/mm_init.c. Ideally other mm
initialisation code will be moved here over time.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: fix invalid proc_handler for softlockup_panic
softlockup: fix watchdog task wakeup frequency
softlockup: fix watchdog task wakeup frequency
softlockup: show irqtrace
softlockup: print a module list on being stuck
softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression
softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds
softlockup: fix softlockup_thresh fix
softlockup: fix softlockup_thresh unaligned access and disable detection at runtime
softlockup: allow panic on lockup
This patch adds saved stack-traces to the backtrace suite of self-tests.
Note that we don't depend on or unconditionally enable CONFIG_STACKTRACE
because not all architectures may have it (and we still want to enable the
other tests for those architectures).
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add some (configurable) expensive sanity checking to catch wrong address
translations on x86.
- create linux/mmdebug.h file to be able include this file in
asm headers to not get unsolvable loops in header files
- __phys_addr on x86_32 became a function in ioremap.c since
PAGE_OFFSET, is_vmalloc_addr and VMALLOC_* non-constasts are undefined
if declared in page_32.h
- add __phys_addr_const for initializing doublefault_tss.__cr3
Tested on 386, 386pae, x86_64 and x86_64 numa=fake=2.
Contains Andi's enable numa virtual address debug patch.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch re-institutes the ability to build rcutorture directly into
the Linux kernel. The reason that this capability was removed was that
this could result in your kernel being pretty much useless, as rcutorture
would be running starting from early boot. This problem has been avoided
by (1) making rcutorture run only three seconds of every six by default,
(2) adding a CONFIG_RCU_TORTURE_TEST_RUNNABLE that permits rcutorture
to be quiesced at boot time, and (3) adding a sysctl in /proc named
/proc/sys/kernel/rcutorture_runnable that permits rcutorture to be
quiesced and unquiesced when built into the kernel.
Please note that this /proc file is -not- available when rcutorture
is built as a module. Please also note that to get the earlier
take-no-prisoners behavior, you must use the boot command line to set
rcutorture's "stutter" parameter to zero.
The rcutorture quiescing mechanism is currently quite crude: loops
in each rcutorture process that poll a global variable once per tick.
Suggestions for improvement are welcome. The default action will
be to reduce the polling rate to a few times per second.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 9aaffc898f.
That commit was a very bad idea. RCU_TORTURE found many boot timing
bugs and other sorts of bugs in the past, so excluding it from
boot images is very silly.
The option already depends on DEBUG_KERNEL and is disabled by default.
Even when it runs, the test threads are reniced. If it annoys people
we could add a runtime sysctl.
allow users to configure the softlockup detector to generate a panic
instead of a warning message.
high-availability systems might opt for this strict method (combined
with panic_timeout= boot option/sysctl), instead of generating
softlockup warnings ad infinitum.
also, automated tests work better if the system reboots reliably (into
a safe kernel) in case of a lockup.
The full spectrum of configurability is supported: boot option, sysctl
option and Kconfig option.
it's default-disabled.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If CONFIG_FTRACE is selected and /proc/sys/kernel/ftrace_enabled is
set to a non-zero value the ftrace routine will be called everytime
we enter a kernel function that is not marked with the "notrace"
attribute.
The ftrace routine will then call a registered function if a function
happens to be registered.
[ This code has been highly hacked by Steven Rostedt and Ingo Molnar,
so don't blame Arnaldo for all of this ;-) ]
Update:
It is now possible to register more than one ftrace function.
If only one ftrace function is registered, that will be the
function that ftrace calls directly. If more than one function
is registered, then ftrace will call a function that will loop
through the functions to call.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add calls to the generic object debugging infrastructure and provide fixup
functions which allow to keep the system alive when recoverable problems have
been detected by the object debugging core code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We can see an ever repeating problem pattern with objects of any kind in the
kernel:
1) freeing of active objects
2) reinitialization of active objects
Both problems can be hard to debug because the crash happens at a point where
we have no chance to decode the root cause anymore. One problem spot are
kernel timers, where the detection of the problem often happens in interrupt
context and usually causes the machine to panic.
While working on a timer related bug report I had to hack specialized code
into the timer subsystem to get a reasonable hint for the root cause. This
debug hack was fine for temporary use, but far from a mergeable solution due
to the intrusiveness into the timer code.
The code further lacked the ability to detect and report the root cause
instantly and keep the system operational.
Keeping the system operational is important to get hold of the debug
information without special debugging aids like serial consoles and special
knowledge of the bug reporter.
The problems described above are not restricted to timers, but timers tend to
expose it usually in a full system crash. Other objects are less explosive,
but the symptoms caused by such mistakes can be even harder to debug.
Instead of creating specialized debugging code for the timer subsystem a
generic infrastructure is created which allows developers to verify their code
and provides an easy to enable debug facility for users in case of trouble.
The debugobjects core code keeps track of operations on static and dynamic
objects by inserting them into a hashed list and sanity checking them on
object operations and provides additional checks whenever kernel memory is
freed.
The tracked object operations are:
- initializing an object
- adding an object to a subsystem list
- deleting an object from a subsystem list
Each operation is sanity checked before the operation is executed and the
subsystem specific code can provide a fixup function which allows to prevent
the damage of the operation. When the sanity check triggers a warning message
and a stack trace is printed.
The list of operations can be extended if the need arises. For now it's
limited to the requirements of the first user (timers).
The core code enqueues the objects into hash buckets. The hash index is
generated from the address of the object to simplify the lookup for the check
on kfree/vfree. Each bucket has it's own spinlock to avoid contention on a
global lock.
The debug code can be compiled in without being active. The runtime overhead
is minimal and could be optimized by asm alternatives. A kernel command line
option enables the debugging code.
Thanks to Ingo Molnar for review, suggestions and cleanup patches.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add option to enable -Wframe-larger-than= on gcc 4.4
gcc mainline (upcoming 4.4) added a new -Wframe-larger-than=...
option to warn at build time about too large stack frames. Add a config
option to enable this warning, since this very useful for the kernel.
I choose (somewhat arbitarily) 2048 as default warning threshold for 64bit
and 1024 as default for 32bit architectures. With some research and
fixing all the code for smaller values these defaults should be probably
lowered.
With the default allyesconfigs have some new warnings, but I think
that is all code that should be just fixed.
At some point (when gcc 4.4 is released and widely used) this should
obsolete make checkstack
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
There have been a few oopses caused by 'struct file's with NULL f_vfsmnts.
There was also a set of potentially missed mnt_want_write()s from
dentry_open() calls.
This patch provides a very simple debugging framework to catch these kinds of
bugs. It will WARN_ON() them, but should stop us from having any oopses or
mnt_writer count imbalances.
I'm quite convinced that this is a good thing because it found bugs in the
stuff I was working on as soon as I wrote it.
[hch: made it conditional on a debug option.
But it's still a little bit too ugly]
[hch: merged forced remount r/o fix from Dave and akpm's fix for the fix]
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (43 commits)
firewire: cleanups
firewire: fix synchronization of gap counts
firewire: wait until PHY configuration packet was transmitted (fix bus reset loop)
firewire: remove unused struct member
firewire: use bitwise and to get reg in handle_registers
firewire: replace more hex values with defined csr constants
firewire: reread config ROM when device reset the bus
firewire: replace static ROM cache by allocated cache
firewire: fw-ohci: work around generation bug in TI controllers (fix AV/C and more)
firewire: fw-ohci: extend logging of bus generations and node ID
firewire: fw-ohci: conditionally log busReset interrupts
firewire: fw-ohci: don't append to AT context when it's not active
firewire: fw-ohci: log regAccessFail events
firewire: fw-ohci: make sure HCControl register LPS bit is set
firewire: fw-ohci: missing PPC PMac feature calls in failure path
firewire: fw-ohci: untangle a mixed unsigned/signed expression
firewire: debug interrupt events
firewire: fw-ohci: catch self_id_count == 0
firewire: fw-ohci: add self ID error check
firewire: fw-ohci: refactor probe, remove, suspend, resume
...
This way firewire-ohci can be used for remote debugging like ohci1394.
Version with amendment from Fri, 11 Apr 2008 00:08:08 +0200.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Bernhard Kaindl <bk@suse.de>
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove DEBUG_SEMAPHORE from Kconfig
Improve semaphore documentation
Simplify semaphore implementation
Add down_timeout and change ACPI to use it
Introduce down_killable()
Generic semaphore implementation
Add semaphore.h to kernel_lock.c
Fix quota.h includes