2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

71 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
e722d8daaf profile: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs. A lot of code is removed because
the for-loop is used and create_hash_tables() is removed since its purpose
is covered by the startup / teardown hooks.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.649867675@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:41:42 +02:00
Arnd Bergmann
ade356b99a profile: hide unused functions when !CONFIG_PROC_FS
A couple of functions and variables in the profile implementation are
used only on SMP systems by the procfs code, but are unused if either
procfs is disabled or in uniprocessor kernels.  gcc prints a harmless
warning about the unused symbols:

  kernel/profile.c:243:13: error: 'profile_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_flip_buffers(void)
               ^
  kernel/profile.c:266:13: error: 'profile_discard_flip_buffers' defined but not used [-Werror=unused-function]
   static void profile_discard_flip_buffers(void)
               ^
  kernel/profile.c:330:12: error: 'profile_cpu_callback' defined but not used [-Werror=unused-function]
   static int profile_cpu_callback(struct notifier_block *info,
              ^

This adds further #ifdef to the file, to annotate exactly in which cases
they are used.  I have done several thousand ARM randconfig kernels with
this patch applied and no longer get any warnings in this file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Mel Gorman
cb2517653f sched/debug: Make schedstats a runtime tunable that is disabled by default
schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.

This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.

The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.

These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.

netperf-tcp
                           4.5.0-rc1             4.5.0-rc1
                             vanilla          nostats-v3r1
Hmean    64         560.45 (  0.00%)      575.98 (  2.77%)
Hmean    128        766.66 (  0.00%)      795.79 (  3.80%)
Hmean    256        950.51 (  0.00%)      981.50 (  3.26%)
Hmean    1024      1433.25 (  0.00%)     1466.51 (  2.32%)
Hmean    2048      2810.54 (  0.00%)     2879.75 (  2.46%)
Hmean    3312      4618.18 (  0.00%)     4682.09 (  1.38%)
Hmean    4096      5306.42 (  0.00%)     5346.39 (  0.75%)
Hmean    8192     10581.44 (  0.00%)    10698.15 (  1.10%)
Hmean    16384    18857.70 (  0.00%)    18937.61 (  0.42%)

Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.

tbench4
                                 4.5.0-rc1             4.5.0-rc1
                                   vanilla          nostats-v3r1
Hmean    mb/sec-1         500.85 (  0.00%)      522.43 (  4.31%)
Hmean    mb/sec-2         984.66 (  0.00%)     1018.19 (  3.41%)
Hmean    mb/sec-4        1827.91 (  0.00%)     1847.78 (  1.09%)
Hmean    mb/sec-8        3561.36 (  0.00%)     3611.28 (  1.40%)
Hmean    mb/sec-16       5824.52 (  0.00%)     5929.03 (  1.79%)
Hmean    mb/sec-32      10943.10 (  0.00%)    10802.83 ( -1.28%)
Hmean    mb/sec-64      15950.81 (  0.00%)    16211.31 (  1.63%)
Hmean    mb/sec-128     15302.17 (  0.00%)    15445.11 (  0.93%)
Hmean    mb/sec-256     14866.18 (  0.00%)    15088.73 (  1.50%)
Hmean    mb/sec-512     15223.31 (  0.00%)    15373.69 (  0.99%)
Hmean    mb/sec-1024    14574.25 (  0.00%)    14598.02 (  0.16%)
Hmean    mb/sec-2048    13569.02 (  0.00%)    13733.86 (  1.21%)
Hmean    mb/sec-3072    12865.98 (  0.00%)    13209.23 (  2.67%)

Small gains of 2-4% at low thread counts and otherwise flat.  The
gains on the 8-core machine were slightly different

tbench4 on 8-core i7-3770 single socket machine
Hmean    mb/sec-1        442.59 (  0.00%)      448.73 (  1.39%)
Hmean    mb/sec-2        796.68 (  0.00%)      794.39 ( -0.29%)
Hmean    mb/sec-4       1322.52 (  0.00%)     1343.66 (  1.60%)
Hmean    mb/sec-8       2611.65 (  0.00%)     2694.86 (  3.19%)
Hmean    mb/sec-16      2537.07 (  0.00%)     2609.34 (  2.85%)
Hmean    mb/sec-32      2506.02 (  0.00%)     2578.18 (  2.88%)
Hmean    mb/sec-64      2511.06 (  0.00%)     2569.16 (  2.31%)
Hmean    mb/sec-128     2313.38 (  0.00%)     2395.50 (  3.55%)
Hmean    mb/sec-256     2110.04 (  0.00%)     2177.45 (  3.19%)
Hmean    mb/sec-512     2072.51 (  0.00%)     2053.97 ( -0.89%)

In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.

hackbench-pipes
                         4.5.0-rc1             4.5.0-rc1
                           vanilla          nostats-v3r1
Amean    1        0.0637 (  0.00%)      0.0660 ( -3.59%)
Amean    4        0.1229 (  0.00%)      0.1181 (  3.84%)
Amean    7        0.1921 (  0.00%)      0.1911 (  0.52%)
Amean    12       0.3117 (  0.00%)      0.2923 (  6.23%)
Amean    21       0.4050 (  0.00%)      0.3899 (  3.74%)
Amean    30       0.4586 (  0.00%)      0.4433 (  3.33%)
Amean    48       0.5910 (  0.00%)      0.5694 (  3.65%)
Amean    79       0.8663 (  0.00%)      0.8626 (  0.43%)
Amean    110      1.1543 (  0.00%)      1.1517 (  0.22%)
Amean    141      1.4457 (  0.00%)      1.4290 (  1.16%)
Amean    172      1.7090 (  0.00%)      1.6924 (  0.97%)
Amean    192      1.9126 (  0.00%)      1.9089 (  0.19%)

Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different

pipetest
                             4.5.0-rc1             4.5.0-rc1
                               vanilla          nostats-v2r2
Min         Time        4.13 (  0.00%)        3.99 (  3.39%)
1st-qrtle   Time        4.38 (  0.00%)        4.27 (  2.51%)
2nd-qrtle   Time        4.46 (  0.00%)        4.39 (  1.57%)
3rd-qrtle   Time        4.56 (  0.00%)        4.51 (  1.10%)
Max-90%     Time        4.67 (  0.00%)        4.60 (  1.50%)
Max-93%     Time        4.71 (  0.00%)        4.65 (  1.27%)
Max-95%     Time        4.74 (  0.00%)        4.71 (  0.63%)
Max-99%     Time        4.88 (  0.00%)        4.79 (  1.84%)
Max         Time        4.93 (  0.00%)        4.83 (  2.03%)
Mean        Time        4.48 (  0.00%)        4.39 (  1.91%)
Best99%Mean Time        4.47 (  0.00%)        4.39 (  1.91%)
Best95%Mean Time        4.46 (  0.00%)        4.38 (  1.93%)
Best90%Mean Time        4.45 (  0.00%)        4.36 (  1.98%)
Best50%Mean Time        4.36 (  0.00%)        4.25 (  2.49%)
Best10%Mean Time        4.23 (  0.00%)        4.10 (  3.13%)
Best5%Mean  Time        4.19 (  0.00%)        4.06 (  3.20%)
Best1%Mean  Time        4.13 (  0.00%)        4.00 (  3.39%)

Small improvement and similar gains were seen on the UMA machine.

The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09 11:54:23 +01:00
Vlastimil Babka
96db800f5d mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit 6484eb3e2a ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fallback to current node for nid == NUMA_NO_NODE.  Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise.  In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.

The misleading name has lead to mistakes in the past, see for example
commits 5265047ac3 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f ("mm, mempolicy:
migrate_to_node should only migrate to node").

Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.

To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage.  Both functions get described in comments.

It has been also considered to really provide a convenience function for
allocations restricted to a node, but the major opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly.  The number of users would be small
anyway.

Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead.  This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.

Both differences will be rectified by the next patch.

To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers.  Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Tejun Heo
ccbd59c1c1 profile: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:38 -08:00
Fabian Frederick
f3da64d1ea kernel/profile.c: use static const char instead of static char
schedstr, sleepstr and kvmstr are only used in strcmp & strlen

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Fabian Frederick
aba871f1e9 kernel/profile.c: convert printk to pr_foo()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Linus Torvalds
467a9e1633 CPU hotplug notifiers registration fixes for 3.15-rc1
The purpose of this single series of commits from Srivatsa S Bhat (with
 a small piece from Gautham R Shenoy) touching multiple subsystems that use
 CPU hotplug notifiers is to provide a way to register them that will not
 lead to deadlocks with CPU online/offline operations as described in the
 changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
 of callback registration functions).
 
 The first three commits in the series introduce the API and document it
 and the rest simply goes through the users of CPU hotplug notifiers and
 converts them to using the new method.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
 4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
 KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
 BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
 rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
 j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
 ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
 lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
 IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
 cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
 StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
 sWl//rg76nb13dFjvF+q
 =SW7Q
 -----END PGP SIGNATURE-----

Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
 "The purpose of this single series of commits from Srivatsa S Bhat
  (with a small piece from Gautham R Shenoy) touching multiple
  subsystems that use CPU hotplug notifiers is to provide a way to
  register them that will not lead to deadlocks with CPU online/offline
  operations as described in the changelog of commit 93ae4f978c ("CPU
  hotplug: Provide lockless versions of callback registration
  functions").

  The first three commits in the series introduce the API and document
  it and the rest simply goes through the users of CPU hotplug notifiers
  and converts them to using the new method"

* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  net/iucv/iucv.c: Fix CPU hotplug callback registration
  net/core/flow.c: Fix CPU hotplug callback registration
  mm, zswap: Fix CPU hotplug callback registration
  mm, vmstat: Fix CPU hotplug callback registration
  profile: Fix CPU hotplug callback registration
  trace, ring-buffer: Fix CPU hotplug callback registration
  xen, balloon: Fix CPU hotplug callback registration
  hwmon, via-cputemp: Fix CPU hotplug callback registration
  hwmon, coretemp: Fix CPU hotplug callback registration
  thermal, x86-pkg-temp: Fix CPU hotplug callback registration
  octeon, watchdog: Fix CPU hotplug callback registration
  oprofile, nmi-timer: Fix CPU hotplug callback registration
  intel-idle: Fix CPU hotplug callback registration
  clocksource, dummy-timer: Fix CPU hotplug callback registration
  drivers/base/topology.c: Fix CPU hotplug callback registration
  acpi-cpufreq: Fix CPU hotplug callback registration
  zsmalloc: Fix CPU hotplug callback registration
  scsi, fcoe: Fix CPU hotplug callback registration
  scsi, bnx2fc: Fix CPU hotplug callback registration
  scsi, bnx2i: Fix CPU hotplug callback registration
  ...
2014-04-07 14:55:46 -07:00
Paul Gortmaker
c96d6660dc kernel: audit/fix non-modular users of module_init in core code
Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular.  So using module_init as an
alias for __initcall can be somewhat misleading.

Fix these up now, so that we can relocate module_init from init.h into
module.h in the future.  If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.

The audit targets the following module_init users for change:
 kernel/user.c                  obj-y
 kernel/kexec.c                 bool KEXEC (one instance per arch)
 kernel/profile.c               bool PROFILING
 kernel/hung_task.c             bool DETECT_HUNG_TASK
 kernel/sched/stats.c           bool SCHEDSTATS
 kernel/user_namespace.c        bool USER_NS

Note that direct use of __initcall is discouraged, vs.  one of the
priority categorized subgroups.  As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e.  slightly earlier).  However no observable impact of that
difference has been observed during testing.

Also, two instances of missing ";" at EOL are fixed in kexec.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:07 -07:00
Srivatsa S. Bhat
c270a81719 profile: Fix CPU hotplug callback registration
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

	cpu_notifier_register_begin();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);

	cpu_notifier_register_done();

Fix the profile code by using this latter form of callback registration.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20 13:43:48 +01:00
Johannes Weiner
e97ca8e5b8 mm: fix GFP_THISNODE callers and clarify
GFP_THISNODE is for callers that implement their own clever fallback to
remote nodes.  It restricts the allocation to the specified node and
does not invoke reclaim, assuming that the caller will take care of it
when the fallback fails, e.g.  through a subsequent allocation request
without GFP_THISNODE set.

However, many current GFP_THISNODE users only want the node exclusive
aspect of the flag, without actually implementing their own fallback or
triggering reclaim if necessary.  This results in things like page
migration failing prematurely even when there is easily reclaimable
memory available, unless kswapd happens to be running already or a
concurrent allocation attempt triggers the necessary reclaim.

Convert all callsites that don't implement their own fallback strategy
to __GFP_THISNODE.  This restricts the allocation a single node too, but
at the same time allows the allocator to enter the slowpath, wake
kswapd, and invoke direct reclaim if necessary, to make the allocation
happen when memory is full.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-10 17:26:19 -07:00
Paul Gortmaker
0db0628d90 kernel: delete __cpuinit usage from all core kernel files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the uses of the __cpuinit macros from C files in
the core kernel directories (kernel, init, lib, mm, and include)
that don't really have a specific maintainer.

[1] https://lkml.org/lkml/2013/5/20/589

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-14 19:36:59 -04:00
David Howells
271a15eabe proc: Supply PDE attribute setting accessor functions
Supply accessor functions to set attributes in proc_dir_entry structs.

The following are supplied: proc_set_size() and proc_set_user().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
cc: linuxppc-dev@lists.ozlabs.org
cc: linux-media@vger.kernel.org
cc: netdev@vger.kernel.org
cc: linux-wireless@vger.kernel.org
cc: linux-pci@vger.kernel.org
cc: netfilter-devel@vger.kernel.org
cc: alsa-devel@alsa-project.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01 17:29:18 -04:00
Al Viro
fbd387aea0 create_proc_cpu_mask() doesn't need an argument...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:35 -04:00
Frederic Weisbecker
ba6fdda46b profiling: Remove unused timer hook
The last remaining user was oprofile and its use has been
removed a while ago in commit bc078e4eab
("oprofile: convert oprofile from timer_hook to hrtimer").

There doesn't seem to be any upstream user of this hook
for about two years now. And I'm not even aware of any out of
tree user.

Let's remove it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1356191991-2251-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-24 15:37:26 +01:00
Nadia Yvette Chambers
6d49e352ae propagate name change to comments in kernel source
I've legally changed my name with New York State, the US Social Security
Administration, et al. This patch propagates the name change and change
in initials and login to comments in the kernel source as well.

Signed-off-by: Nadia Yvette Chambers <nyc@holomorphy.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-06 10:39:54 +01:00
Paul Gortmaker
9984de1a5a kernel: Map most files to use export.h instead of module.h
The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else.  Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

  -#include <linux/module.h>
  +#include <linux/export.h>

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:12 -04:00
Rakib Mullick
6f7bd76f05 kernel/profile.c: remove some duplicate code from profile_hits()
profile_hits() has a common check for prof_on and prof_buffer regardless
of SMP or !SMP.  So, remove some duplicate code by splitting profile_hits
into two.

[akpm@linux-foundation.org: make do_profile_hits static]
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Jesper Juhl
559fa6e76b profile: Use vzalloc() rather than vmalloc() & memset()
There's no reason to memset() manually when we have vzalloc().

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: William Irwin <wli@holomorphy.com>
LKML-Reference: <alpine.LNX.2.00.1010302150310.1572@swampdragon.chaosbits.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-31 09:47:26 +01:00
Arnd Bergmann
6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Lee Schermerhorn
3dd6b5fb43 numa: in-kernel profiling: use cpu_to_mem() for per cpu allocations
In kernel profiling requires that we be able to allocate "local" memory
for each cpu.  Use "cpu_to_mem()" instead of "cpu_to_node()" to support
memoryless nodes.

Depends on the "numa_mem_id()" patch.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:57 -07:00
Akinobu Mita
80b5184cc5 kernel/: convert cpu notifier to return encapsulate errno value
By the previous modification, the cpu notifier can return encapsulate
errno value.  This converts the cpu notifiers for kernel/*.c

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Hugh Dickins
16a2164bb0 profile: fix stats and data leakage
If the kernel is large or the profiling step small, /proc/profile
leaks data and readprofile shows silly stats, until readprofile -r
has reset the buffer: clear the prof_buffer when it is vmalloc()ed.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-14 19:45:06 -07:00
Alexey Dobriyan
583a22e7c1 kernel/profile.c: Switch /proc/irq/prof_cpu_mask to seq_file
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 20:15:40 +02:00
Mel Gorman
b62f495dad profile: suppress warning about large allocations when profile=1 is specified
When profile= is used, a large buffer is allocated early at boot.  This
can be larger than what the page allocator can provide so it prints a
warning.  However, the caller is able to handle the situation so this
patch suppresses the warning.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-29 19:10:36 -07:00
Mel Gorman
6484eb3e2a page allocator: do not check NUMA node ID when the caller knows the node is valid
Callers of alloc_pages_node() can optionally specify -1 as a node to mean
"allocate from the current node".  However, a number of the callers in
fast paths know for a fact their node is valid.  To avoid a comparison and
branch, this patch adds alloc_pages_exact_node() that only checks the nid
with VM_BUG_ON().  Callers that know their node is valid are then
converted.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Paul Mundt <lethal@linux-sh.org>	[for the SLOB NUMA bits]
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:32 -07:00
Yinghai Lu
38c7fed2f5 x86: remove some alloc_bootmem_cpumask_var calling
Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11 19:27:07 +03:00
Hugh Dickins
acd895795d profiling: fix broken profiling regression
Impact: fix broken /proc/profile on UP machines

Commit c309b917ca "cpumask: convert
kernel/profile.c" broke profiling.  prof_cpu_mask was previously
initialized to CPU_MASK_ALL, but left uninitialized in that commit.
We need to copy cpu_possible_mask (cpu_online_mask is not enough).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:50:37 +01:00
KOSAKI Motohiro
26e5438e4b profile: don't include <asm/ptrace.h> twice.
Currently, kernel/profile.c include <asm/ptrace.h> twice.  It can be
removed.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:14 -08:00
Rusty Russell
c309b917ca cpumask: convert kernel/profile.c
Impact: Reduce kernel memory usage, use new cpumask API.

Avoid a static cpumask_t for prof_cpu_mask, and an on-stack cpumask_t
in prof_cpu_mask_write_proc.  Both become cpumask_var_t.

prof_cpu_mask is only allocated when profiling is on, but the NULL
checks are optimized out by gcc for the !CPUMASK_OFFSTACK case.

Also removed some strange and unnecessary casts.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-01-01 10:12:27 +10:30
Rusty Russell
33edcf133b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-12-30 08:02:35 +10:30
Rusty Russell
29c0177e6a cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers.
Impact: change calling convention of existing cpumask APIs

Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.

These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
2008-12-13 21:20:25 +10:30
Ingo Molnar
b8307db247 Merge commit 'v2.6.28-rc7' into tracing/core 2008-12-04 09:07:19 +01:00
Al Viro
8419641450 cpuinit fixes in kernel/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 10:03:37 -08:00
Andrew Morton
60a5151320 profiling: clean up profile_nop()
Impact: cleanup

No point in inlining this.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-19 10:19:16 +01:00
Rakib Mullick
e270219f43 kernel/profile.c: fix section mismatch warning
Impact: fix section mismatch warning in kernel/profile.c

Here, profile_nop function has been called from a non-init function
create_hash_tables(void). Which generetes a section mismatch warning.
Previously, create_hash_tables(void) was a init function. So, removing
__init from create_hash_tables(void) requires profile_nop to be
non-init.

This patch makes profile_nop function inline and fixes the
following warning:

 WARNING: vmlinux.o(.text+0x6ebb6): Section mismatch in reference from
 the function create_hash_tables() to the function
 .init.text:profile_nop()
 The function create_hash_tables() references
 the function __init profile_nop().
 This is often because create_hash_tables lacks a __init
 annotation or the annotation of profile_nop is wrong.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-18 08:49:24 +01:00
Paul Mundt
ce05fcc30e kernel/profile: fix profile_init() section mismatch
profile_init() calls in to alloc_bootmem() on early initialization.  While
alloc_bootmem() is __init, the reference itself is safe in that it is
tucked below a !slab_is_available() check.  So, flag profile_init() as
__ref.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Dave Hansen
22b8ce9470 profiling: dynamically enable readprofile at runtime
Way too often, I have a machine that exhibits some kind of crappy
behavior.  The CPU looks wedged in the kernel or it is spending way too
much system time and I wonder what is responsible.

I try to run readprofile.  But, of course, Ubuntu doesn't enable it by
default.  Dang!

The reason we boot-time enable it is that it takes a big bufffer that we
generally can only bootmem alloc.  But, does it hurt to at least try and
runtime-alloc it?

To use:
echo 2 > /sys/kernel/profile

Then run readprofile like normal.

This should fix the compile issue with allmodconfig.  I've compile-tested
on a bunch more configs now including a few more architectures.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:31 -07:00
Adrian Bunk
b03f6489f9 build kernel/profile.o only when requested
Build kernel/profile.o only if CONFIG_PROFILING is enabled.

This makes CONFIG_PROFILING=n kernels smaller.

As a bonus, some profile_tick() calls and one branch from schedule() are
now eliminated with CONFIG_PROFILING=n (but I doubt these are
measurable effects).

This patch changes the effects of CONFIG_PROFILING=n, but I don't think
having more than two choices would be the better choice.

This patch also adds the name of the first parameter to the prototypes
of profile_{hits,tick}() since I anyway had to add them for the dummy
functions.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Jens Axboe
15c8b6c1aa on_each_cpu(): kill unused 'retry' parameter
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26 11:24:38 +02:00
Denis V. Lunev
c33fff0afb kernel: use non-racy method for proc entries creation
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:22 -07:00
Matthew Wilcox
a655020753 kernel: Remove unnecessary inclusions of asm/semaphore.h
None of these files use any of the functionality promised by
asm/semaphore.h.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:17:04 -04:00
Jesper Juhl
f8db694e46 Nuke a duplicate include from profile.c
Remove duplicate inclusion of linux/profile.h from kernel/profile.c

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:34 -08:00
Paolo Ciarrocchi
1ad82fd547 debug: clean up kernel/profile.c
Before:
 total: 25 errors, 13 warnings, 602 lines checked

 After:
 total: 0 errors, 2 warnings, 601 lines checked

No code changed:

kernel/profile.o:
   text    data     bss     dec     hex filename
   3048     236      24    3308     cec profile.o.before
   3048     236      24    3308     cec profile.o.after
 md5:
   2501d64748a4d350dffb11203e2a5182  profile.o.before.asm
   2501d64748a4d350dffb11203e2a5182  profile.o.after.asm

Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:33 +01:00
Mel Gorman
b3da2a73ff sched: document profile=sleep requiring CONFIG_SCHEDSTATS
profile=sleep only works if CONFIG_SCHEDSTATS is set. This patch notes
the limitation in Documentation/kernel-parameters.txt and prints a
warning at boot-time if profile=sleep is used without CONFIG_SCHEDSTAT.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Adrian Bunk
b012d346c0 make kernel/profile.c:time_hook static
{,un}register_timer_hook() is the API that should be used.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Christoph Lameter
4199cfa02b Memoryless nodes: Allow profiling data to fall back to other nodes
Processors on memoryless nodes must be able to fall back to remote nodes in
order to get a profiling buffer.  This may lead to excessive NUMA traffic but
I think we should allow this rather than failing.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Bob Picco <bob.picco@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@skynet.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:58 -07:00
Peter Chubb
cd5bfea278 fix compilation with gcc 4.2
gcc-4.2 is a lot more picky about its symbol handling.  EXPORT_SYMBOL no
longer works on symbols that are undefined or defined with static scope.

For example, with CONFIG_PROFILE off, I see:

  kernel/profile.c:206: error: __ksymtab_profile_event_unregister causes a section type conflict
  kernel/profile.c:205: error: __ksymtab_profile_event_register causes a section type conflict

This patch moves the EXPORTs inside the #ifdef CONFIG_PROFILE, so we
only try to export symbols that are defined.

Also, in kernel/kprobes.c there's an EXPORT_SYMBOL_GPL() for
jprobes_return, which if CONFIG_JPROBES is undefined is a static
inline and gives the same error.

And in drivers/acpi/resources/rsxface.c, there's an
ACPI_EXPORT_SYMBOPL() for a static symbol. If it's static, it's not
accessible from outside the compilation unit, so should bot be exported.

These three changes allow building a zx1_defconfig kernel with gcc 4.2
on IA64.

[akpm@linux-foundation.org: export jpobe_return properly]
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-11 15:47:42 -07:00
Alexey Dobriyan
e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Rafael J. Wysocki
8bb7844286 Add suspend-related notifications for CPU hotplug
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress.  This
patch introduces such notifications and causes them to be used during
suspend and resume transitions.  It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:56 -07:00