Commit graph

15102 commits

Author SHA1 Message Date
Linus Torvalds
a0b1c42951 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking update from David Miller:

 1) Checkpoint/restarted TCP sockets now can properly propagate the TCP
    timestamp offset.  From Andrey Vagin.

 2) VMWARE VM VSOCK layer, from Andy King.

 3) Much improved support for virtual functions and SR-IOV in bnx2x,
    from Ariel ELior.

 4) All protocols on ipv4 and ipv6 are now network namespace aware, and
    all the compatability checks for initial-namespace-only protocols is
    removed.  Thanks to Tom Parkin for helping deal with the last major
    holdout, L2TP.

 5) IPV6 support in netpoll and network namespace support in pktgen,
    from Cong Wang.

 6) Multiple Registration Protocol (MRP) and Multiple VLAN Registration
    Protocol (MVRP) support, from David Ward.

 7) Compute packet lengths more accurately in the packet scheduler, from
    Eric Dumazet.

 8) Use per-task page fragment allocator in skb_append_datato_frags(),
    also from Eric Dumazet.

 9) Add support for connection tracking labels in netfilter, from
    Florian Westphal.

10) Fix default multicast group joining on ipv6, and add anti-spoofing
    checks to 6to4 and 6rd.  From Hannes Frederic Sowa.

11) Make ipv4/ipv6 fragmentation memory limits more reasonable in modern
    times, rearrange inet frag datastructures for better cacheline
    locality, and move more operations outside of locking.  From Jesper
    Dangaard Brouer.

12) Instead of strict master <--> slave relationships, allow arbitrary
    scenerios with "upper device lists".  From Jiri Pirko.

13) Improve rate limiting accuracy in TBF and act_police, also from Jiri
    Pirko.

14) Add a BPF filter netfilter match target, from Willem de Bruijn.

15) Orphan and delete a bunch of pre-historic networking drivers from
    Paul Gortmaker.

16) Add TSO support for GRE tunnels, from Pravin B SHelar.  Although
    this still needs some minor bug fixing before it's %100 correct in
    all cases.

17) Handle unresolved IPSEC states like ARP, with a resolution packet
    queue.  From Steffen Klassert.

18) Remove TCP Appropriate Byte Count support (ABC), from Stephen
    Hemminger.  This was long overdue.

19) Support SO_REUSEPORT, from Tom Herbert.

20) Allow locking a socket BPF filter, so that it cannot change after a
    process drops capabilities.

21) Add VLAN filtering to bridge, from Vlad Yasevich.

22) Bring ipv6 on-par with ipv4 and do not cache neighbour entries in
    the ipv6 routes, from YOSHIFUJI Hideaki.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1538 commits)
  ipv6: fix race condition regarding dst->expires and dst->from.
  net: fix a wrong assignment in skb_split()
  ip_gre: remove an extra dst_release()
  ppp: set qdisc_tx_busylock to avoid LOCKDEP splat
  atl1c: restore buffer state
  net: fix a build failure when !CONFIG_PROC_FS
  net: ipv4: fix waring -Wunused-variable
  net: proc: fix build failed when procfs is not configured
  Revert "xen: netback: remove redundant xenvif_put"
  net: move procfs code to net/core/net-procfs.c
  qmi_wwan, cdc-ether: add ADU960S
  bonding: set sysfs device_type to 'bond'
  bonding: fix bond_release_all inconsistencies
  b44: use netdev_alloc_skb_ip_align()
  xen: netback: remove redundant xenvif_put
  net: fec: Do a sanity check on the gpio number
  ip_gre: propogate target device GSO capability to the tunnel device
  ip_gre: allow CSUM capable devices to handle packets
  bonding: Fix initialize after use for 3ad machine state spinlock
  bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
  ...
2013-02-20 18:58:50 -08:00
Linus Torvalds
8793422fd9 ACPI and power management updates for 3.9-rc1
- Rework of the ACPI namespace scanning code from Rafael J. Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.
 
 - ACPI power resources handling and ACPI device PM update from
   Rafael J. Wysocki.
 
 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng
   with contributions from Aaron Lu, Chao Guan, Jesper Juhl, and
   Tim Gardner.
 
 - Support for Intel Lynxpoint LPSS from Mika Westerberg.
 
 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.
 
 - cpuidle fixes and cleanups from Daniel Lezcano.
 
 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri
   with contributions from Stratos Karafotis and Rickard Andersson.
 
 - Intel P-states driver for Sandy Bridge processors from
   Dirk Brandewie.
 
 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.
 
 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.
 
 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.
 
 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.
 
 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.
 
 - Support for "lightweight suspend" from Zhang Rui.
 
 - Removal of the deprecated power trace API from Paul Gortmaker.
 
 - Assorted updates from Andreas Fleig, Colin Ian King,
   Davidlohr Bueso, Joseph Salisbury, Kees Cook, Li Fei,
   Nishanth Menon, ShuoX Liu, Srinivas Pandruvada, Tejun Heo,
   Thomas Renninger, and Yasuaki Ishimatsu.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJRIsArAAoJEKhOf7ml8uNsD6MP/j7C4NA+GTq6RdwoJt+Yki0K
 9Ep8I4pEuRFoN/oskv24EyQhpGJIk6UxWcJ/DWFBc+1VhmKORta7k2Idv/wlJA77
 s7AcDveA9xcDh+TVfbh87TeuiMSXiSdDZbiaQO+wMizWJAF3F84AnjiAqqqyQcSK
 bA5/Siz/vWlt9PyYDaQtHTVE4lpvPuVcQdYewsdaH2PsmUjvIg/TUzg28CTrdyvv
 eHOdBK9R0/OLQLhzRbL0VOGJ//wEl+HJRO0QEhTKPgdQ1e/VH/4Zu5WSzF8P/x4C
 s2f8U4IKQqulDuDHXtpMpelFm7hRWgsOqZLkcyXLs+0dvSM9CTPO6P0ZaImxUctk
 5daHWEsXUnCErDQawt1mcZP8l6qnxofMQIfLXyPVzvlSnHyToTmrtXa1v2u4AuL/
 hOo4MYWsFNUmRdtGFFGlExGgEDZ4G5NwiYjRBl/6XJ3v4nhnnMbuzxP8scpoe5m1
 8tjroJHZFUUs/mFU/H+oRbHzSzXPmp1sddNaTg4OpVmTn3DDh6ljnFhiItd1Ndw0
 5ldVbSe6ETq5RoK0TbzvQOeVpa9F3JfqbrXLQPqfd2iz/No41LQYG1uShRYuXKuA
 wfEcc+c9VMd3FILu05pGwBnU8VS9VbxTYMz7xDxg6b29Ywnb7u+Q1ycCk2gFYtkS
 E2oZDuyewTJxaskzYsNr
 =wijn
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

 - Rework of the ACPI namespace scanning code from Rafael J.  Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.

 - ACPI power resources handling and ACPI device PM update from Rafael
   J Wysocki.

 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng with
   contributions from Aaron Lu, Chao Guan, Jesper Juhl, and Tim Gardner.

 - Support for Intel Lynxpoint LPSS from Mika Westerberg.

 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.

 - cpuidle fixes and cleanups from Daniel Lezcano.

 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri with
   contributions from Stratos Karafotis and Rickard Andersson.

 - Intel P-states driver for Sandy Bridge processors from Dirk
   Brandewie.

 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.

 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.

 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.

 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.

 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.

 - Support for "lightweight suspend" from Zhang Rui.

 - Removal of the deprecated power trace API from Paul Gortmaker.

 - Assorted updates from Andreas Fleig, Colin Ian King, Davidlohr Bueso,
   Joseph Salisbury, Kees Cook, Li Fei, Nishanth Menon, ShuoX Liu,
   Srinivas Pandruvada, Tejun Heo, Thomas Renninger, and Yasuaki
   Ishimatsu.

* tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (267 commits)
  PM idle: remove global declaration of pm_idle
  unicore32 idle: delete stray pm_idle comment
  openrisc idle: delete pm_idle
  mn10300 idle: delete pm_idle
  microblaze idle: delete pm_idle
  m32r idle: delete pm_idle, and other dead idle code
  ia64 idle: delete pm_idle
  cris idle: delete idle and pm_idle
  ARM64 idle: delete pm_idle
  ARM idle: delete pm_idle
  blackfin idle: delete pm_idle
  sparc idle: rename pm_idle to sparc_idle
  sh idle: rename global pm_idle to static sh_idle
  x86 idle: rename global pm_idle to static x86_idle
  APM idle: register apm_cpu_idle via cpuidle
  cpufreq / intel_pstate: Add kernel command line option disable intel_pstate.
  cpufreq / intel_pstate: Change to disallow module build
  tools/power turbostat: display SMI count by default
  intel_idle: export both C1 and C1E
  ACPI / hotplug: Fix concurrency issues and memory leaks
  ...
2013-02-20 11:26:56 -08:00
Linus Torvalds
9ae46e6702 Merge branch 'for-3.9-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cpuset changes from Tejun Heo:

 - Synchornization has seen a lot of changes with focus on decoupling
   cpuset synchronization from cgroup internal locking.

   After this change, there only remain a couple of mostly trivial
   dependencies on cgroup_lock outside cgroup core proper.  cgroup_lock
   is scheduled to be unexported in this devel cycle.

   This will finally remove the fragile locking order around cgroup
   (cgroup locking wants to / should be one of the outermost but yet has
   been acquired from deep inside individual controllers).

 - At this point, Li is most knowlegeable with cpuset and taking over
   the maintainership of cpuset.

* 'for-3.9-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: drop spurious retval assignment in proc_cpuset_show()
  cpuset: fix RCU lockdep splat
  cpuset: update MAINTAINERS
  cpuset: remove cpuset->parent
  cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()
  cpuset: replace cgroup_mutex locking with cpuset internal locking
  cpuset: schedule hotplug propagation from cpuset_attach() if the cpuset is empty
  cpuset: pin down cpus and mems while a task is being attached
  cpuset: make CPU / memory hotplug propagation asynchronous
  cpuset: drop async_rebuild_sched_domains()
  cpuset: don't nest cgroup_mutex inside get_online_cpus()
  cpuset: reorganize CPU / memory hotplug handling
  cpuset: cleanup cpuset[_can]_attach()
  cpuset: introduce cpuset_for_each_child()
  cpuset: introduce CS_ONLINE
  cpuset: introduce ->css_on/offline()
  cpuset: remove fast exit path from remove_tasks_in_empty_cpuset()
  cpuset: remove unused cpuset_unlock()
2013-02-20 09:18:31 -08:00
Linus Torvalds
502b24c23b Merge branch 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "Nothing too drastic.

   - Removal of synchronize_rcu() from userland visible paths.

   - Various fixes and cleanups from Li.

   - cgroup_rightmost_descendant() added which will be used by cpuset
     changes (it will be a separate pull request)."

* 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fail if monitored file and event_control are in different cgroup
  cgroup: fix cgroup_rmdir() vs close(eventfd) race
  cpuset: fix cpuset_print_task_mems_allowed() vs rename() race
  cgroup: fix exit() vs rmdir() race
  cgroup: remove bogus comments in cgroup_diput()
  cgroup: remove synchronize_rcu() from cgroup_diput()
  cgroup: remove duplicate RCU free on struct cgroup
  sched: remove redundant NULL cgroup check in task_group_path()
  sched: split out css_online/css_offline from tg creation/destruction
  cgroup: initialize cgrp->dentry before css_alloc()
  cgroup: remove a NULL check in cgroup_exit()
  cgroup: fix bogus kernel warnings when cgroup_create() failed
  cgroup: remove synchronize_rcu() from rebind_subsystems()
  cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}()
  cgroup: use new hashtable implementation
  cgroups: fix cgroup_event_listener error handling
  cgroups: move cgroup_event_listener.c to tools/cgroup
  cgroup: implement cgroup_rightmost_descendant()
  cgroup: remove unused dummy cgroup_fork_callbacks()
2013-02-20 09:16:21 -08:00
Sha Zhengju
1c3e826482 sched/core: Remove the obsolete and unused nr_uninterruptible() function
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1361351678-8065-1-git-send-email-handai.szj@taobao.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-20 11:39:24 +01:00
Ingo Molnar
ff1fb5f6b4 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull two fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-20 11:26:21 +01:00
Linus Torvalds
ece8e0b2f9 Merge branch 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull async changes from Tejun Heo:
 "These are followups for the earlier deadlock issue involving async
  ending up waiting for itself through block requesting module[1].  The
  following changes are made by these commits.

   - Instead of requesting default elevator on each request_queue init,
     block now requests it once early during boot.

   - Kmod triggers warning if invoked from an async worker.

   - Async synchronization implementation has been reimplemented.  It's
     a lot simpler now."

* 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  async: initialise list heads to fix crash
  async: replace list of active domains with global list of pending items
  async: keep pending tasks on async_domain and remove async_pending
  async: use ULLONG_MAX for infinity cookie value
  async: bring sanity to the use of words domain and running
  async, kmod: warn on synchronous request_module() from async workers
  block: don't request module during elevator init
  init, block: try to load default elevator module early during boot
2013-02-19 22:10:26 -08:00
Linus Torvalds
67cb104b4c Merge branch 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue changes from Tejun Heo:
 "A lot of reorganization is going on mostly to prepare for worker pools
  with custom attributes so that workqueue can replace custom pool
  implementations in places including writeback and btrfs and make CPU
  assignment in crypto more flexible.

  workqueue evolved from purely per-cpu design and implementation, so
  there are a lot of assumptions regarding being bound to CPUs and even
  unbound workqueues are implemented as an extension of the model -
  workqueues running on the special unbound CPU.  Bulk of changes this
  round are about promoting worker_pools as the top level abstraction
  replacing global_cwq (global cpu workqueue).  At this point, I'm
  fairly confident about getting custom worker pools working pretty soon
  and ready for the next merge window.

  Lai's patches are replacing the convoluted mb() dancing workqueue has
  been doing with much simpler mechanism which only depends on
  assignment atomicity of long.  For details, please read the commit
  message of 0b3dae68ac ("workqueue: simplify is-work-item-queued-here
  test").  While the change ends up adding one pointer to struct
  delayed_work, the inflation in percentage is less than five percent
  and it decouples delayed_work logic a lot more cleaner from usual work
  handling, removes the unusual memory barrier dancing, and allows for
  further simplification, so I think the trade-off is acceptable.

  There will be two more workqueue related pull requests and there are
  some shared commits among them.  I'll write further pull requests
  assuming this pull request is pulled first."

* 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (37 commits)
  workqueue: un-GPL function delayed_work_timer_fn()
  workqueue: rename cpu_workqueue to pool_workqueue
  workqueue: reimplement is_chained_work() using current_wq_worker()
  workqueue: fix is_chained_work() regression
  workqueue: pick cwq instead of pool in __queue_work()
  workqueue: make get_work_pool_id() cheaper
  workqueue: move nr_running into worker_pool
  workqueue: cosmetic update in try_to_grab_pending()
  workqueue: simplify is-work-item-queued-here test
  workqueue: make work->data point to pool after try_to_grab_pending()
  workqueue: add delayed_work->wq to simplify reentrancy handling
  workqueue: make work_busy() test WORK_STRUCT_PENDING first
  workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END
  workqueue: post global_cwq removal cleanups
  workqueue: rename nr_running variables
  workqueue: remove global_cwq
  workqueue: remove worker_pool->gcwq
  workqueue: replace for_each_worker_pool() with for_each_std_worker_pool()
  workqueue: make freezing/thawing per-pool
  workqueue: make hotplug processing per-pool
  ...
2013-02-19 22:01:33 -08:00
Linus Torvalds
1eaec8212e Merge branch 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
 "This is part of on-going cleanups to remove / minimize usages of
  workqueue interfaces which are deprecated and/or misleading.

  This round drops a number of usages of [delayed_]work_pending(), which
  are dangerous as they lack any form of synchronization and thus often
  lead to buggy / unnecessary code.  There are a couple legitimate use
  cases in kernel.  Hopefully, they can be converted and
  [delayed_]work_pending() can be removed completely.  Even if not,
  removing most of misuses should make it more difficult to find
  examples of misuses and thus slow down growth of them.

  These changes are independent from other workqueue changes."

* 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  wimax/i2400m: fix i2400m->wake_tx_skb handling
  kprobes: fix wait_for_kprobe_optimizer()
  ipw2x00: simplify scan_event handling
  video/exynos: don't use [delayed_]work_pending()
  tty/max3100: don't use [delayed_]work_pending()
  x86/mce: don't use [delayed_]work_pending()
  rfkill: don't use [delayed_]work_pending()
  wl1251: don't use [delayed_]work_pending()
  thinkpad_acpi: don't use [delayed_]work_pending()
  mwifiex: don't use [delayed_]work_pending()
  sja1000: don't use [delayed_]work_pending()
2013-02-19 21:58:52 -08:00
Linus Torvalds
121027a7a6 Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two x86 kernel build changes from Ingo Molnar:
 "The first change modifies how 'make oldconfig' works on cross-bitness
  situations on x86.  It was felt the new behavior of preserving the
  bitness of the .config is more logical.  This is a leftover of the
  merge.

  The second change eliminates a Perl warning.  (There's another, more
  complete fix resulting of this warning fix, which second fix in flight
  to you via the kbuild tree, which will remove the timeconst.pl script
  altogether.)"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timeconst.pl: Eliminate Perl warning
  x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT
2013-02-19 19:12:03 -08:00
Linus Torvalds
5800700f66 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
 "Main changes:

   - Multiple MSI support added to the APIC, PCI and AHCI code - acked
     by all relevant maintainers, by Alexander Gordeev.

     The advantage is that multiple AHCI ports can have multiple MSI
     irqs assigned, and can thus spread to multiple CPUs.

     [ Drivers can make use of this new facility via the
       pci_enable_msi_block_auto() method ]

   - x86 IOAPIC code from interrupt remapping cleanups from Joerg
     Roedel:

     These patches move all interrupt remapping specific checks out of
     the x86 core code and replaces the respective call-sites with
     function pointers.  As a result the interrupt remapping code is
     better abstraced from x86 core interrupt handling code.

   - Various smaller improvements, fixes and cleanups."

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess
  x86, kvm: Fix intialization warnings in kvm.c
  x86, irq: Move irq_remapped out of x86 core code
  x86, io_apic: Introduce eoi_ioapic_pin call-back
  x86, msi: Introduce x86_msi.compose_msi_msg call-back
  x86, irq: Introduce setup_remapped_irq()
  x86, irq: Move irq_remapped() check into free_remapped_irq
  x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq()
  x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core
  x86, irq: Add data structure to keep AMD specific irq remapping information
  x86, irq: Move irq_remapping_enabled declaration to iommu code
  x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin
  x86, io_apic: Move irq_remapping_enabled checks out of check_timer()
  x86, io_apic: Convert setup_ioapic_entry to function pointer
  x86, io_apic: Introduce set_affinity function pointer
  x86, msi: Use IRQ remapping specific setup_msi_irqs routine
  x86, hpet: Introduce x86_msi_ops.setup_hpet_msi
  x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging
  x86, io_apic: Introduce x86_io_apic_ops.disable()
  x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume
  ...
2013-02-19 19:07:27 -08:00
Linus Torvalds
266d7ad7f4 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer changes from Ingo Molnar:
 "Main changes:

   - ntp: Add CONFIG_RTC_SYSTOHC: a generic RTC driver facility
     complementing the existing CONFIG_RTC_HCTOSYS, which uses NTP to
     keep the hardware clock updated.

   - posix-timers: Fix clock_adjtime to always return timex data on
     success.  This is changing the ABI, but no breakage was expected
     and found - caution is warranted nevertheless.

   - platform persistent clock improvements/cleanups.

   - clockevents: refactor timer broadcast handling to be more generic
     and less duplicated with matching architecture code (mostly ARM
     motivated.)

   - various fixes and cleanups"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/x86/hpet: Use HPET_COUNTER to specify the hpet counter in vread_hpet()
  posix-cpu-timers: Fix nanosleep task_struct leak
  clockevents: Fix generic broadcast for FEAT_C3STOP
  time, Fix setting of hardware clock in NTP code
  hrtimer: Prevent hrtimer_enqueue_reprogram race
  clockevents: Add generic timer broadcast function
  clockevents: Add generic timer broadcast receiver
  timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCK
  x86/time/rtc: Don't print extended CMOS year when reading RTC
  x86: Select HAS_PERSISTENT_CLOCK on x86
  timekeeping: Add CONFIG_HAS_PERSISTENT_CLOCK option
  rtc: Skip the suspend/resume handling if persistent clock exist
  timekeeping: Add persistent_clock_exist flag
  posix-timers: Fix clock_adjtime to always return timex data on success
  Round the calculated scale factor in set_cyc2ns_scale()
  NTP: Add a CONFIG_RTC_SYSTOHC configuration
  MAINTAINERS: Update John Stultz's email
  time: create __getnstimeofday for WARNless calls
2013-02-19 19:05:45 -08:00
Linus Torvalds
bcbd818c06 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull preparatory smp/hotplug patches from Ingo Molnar:
 "Some early preparatory changes for the WIP hotplug rework by Thomas
  Gleixner."

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  stop_machine: Use smpboot threads
  stop_machine: Store task reference in a separate per cpu variable
  smpboot: Allow selfparking per cpu threads
2013-02-19 19:04:55 -08:00
Linus Torvalds
d652e1eb8e Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "Main changes:

   - scheduler side full-dynticks (user-space execution is undisturbed
     and receives no timer IRQs) preparation changes that convert the
     cputime accounting code to be full-dynticks ready, from Frederic
     Weisbecker.

   - Initial sched.h split-up changes, by Clark Williams

   - select_idle_sibling() performance improvement by Mike Galbraith:

        " 1 tbench pair (worst case) in a 10 core + SMT package:

          pre   15.22 MB/sec 1 procs
          post 252.01 MB/sec 1 procs "

  - sched_rr_get_interval() ABI fix/change.  We think this detail is not
    used by apps (so it's not an ABI in practice), but lets keep it
    under observation.

  - misc RT scheduling cleanups, optimizations"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
  cputime: Remove irqsave from seqlock readers
  sched, powerpc: Fix sched.h split-up build failure
  cputime: Restore CPU_ACCOUNTING config defaults for PPC64
  sched/rt: Move rt specific bits into new header file
  sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
  sched: Move sched.h sysctl bits into separate header
  sched: Fix signedness bug in yield_to()
  sched: Fix select_idle_sibling() bouncing cow syndrome
  sched/rt: Further simplify pick_rt_task()
  sched/rt: Do not account zero delta_exec in update_curr_rt()
  cputime: Safely read cputime of full dynticks CPUs
  kvm: Prepare to add generic guest entry/exit callbacks
  cputime: Use accessors to read task cputime stats
  cputime: Allow dynamic switch between tick/virtual based cputime accounting
  cputime: Generic on-demand virtual cputime accounting
  cputime: Move default nsecs_to_cputime() to jiffies based cputime file
  cputime: Librarize per nsecs resolution cputime definitions
  cputime: Avoid multiplication overflow on utime scaling
  context_tracking: Export context state for generic vtime
  ...

Fix up conflict in kernel/context_tracking.c due to comment additions.
2013-02-19 18:19:48 -08:00
Linus Torvalds
8f55cea410 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar:
 "There are lots of improvements, the biggest changes are:

  Main kernel side changes:

   - Improve uprobes performance by adding 'pre-filtering' support, by
     Oleg Nesterov.

   - Make some POWER7 events available in sysfs, equivalent to what was
     done on x86, from Sukadev Bhattiprolu.

   - tracing updates by Steve Rostedt - mostly misc fixes and smaller
     improvements.

   - Use perf/event tracing to report PCI Express advanced errors, by
     Tony Luck.

   - Enable northbridge performance counters on AMD family 15h, by Jacob
     Shin.

   - This tracing commit:

        tracing: Remove the extra 4 bytes of padding in events

     changes the ABI.  All involved parties (PowerTop in particular)
     seem to agree that it's safe to do now with the introduction of
     libtraceevent, but the devil is in the details ...

  Main tooling side changes:

   - Add 'event group view', from Namyung Kim:

     To use it, 'perf record' should group events when recording.  And
     then perf report parses the saved group relation from file header
     and prints them together if --group option is provided.  You can
     use the 'perf evlist' command to see event group information:

        $ perf record -e '{ref-cycles,cycles}' noploop 1
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]

        $ perf evlist --group
        {ref-cycles,cycles}

     With this example, default perf report will show you each event
     separately.

     You can use --group option to enable event group view:

        $ perf report --group
        ...
        # group: {ref-cycles,cycles}
        # ========
        # Samples: 7K of event 'anon group { ref-cycles, cycles }'
        # Event count (approx.): 6876107743
        #
        #         Overhead  Command      Shared Object                      Symbol
        # ................  .......  .................  ..........................
            99.84%  99.76%  noploop  noploop            [.] main
             0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
             0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
             0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
             0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
             0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
             0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
             0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
             0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
             0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
             0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time

     As you can see the Overhead column now contains both of ref-cycles
     and cycles and header line shows group information also - 'anon
     group { ref-cycles, cycles }'.  The output is sorted by period of
     group leader first.

   - Initial GTK+ annotate browser, from Namhyung Kim.

   - Add option for runtime switching perf data file in perf report,
     just press 's' and a menu with the valid files found in the current
     directory will be presented, from Feng Tang.

   - Add support to display whole group data for raw columns, from Jiri
     Olsa.

   - Add per processor socket count aggregation in perf stat, from
     Stephane Eranian.

   - Add interval printing in 'perf stat', from Stephane Eranian.

   - 'perf test' improvements

   - Add support for wildcards in tracepoint system name, from Jiri
     Olsa.

   - Add anonymous huge page recognition, from Joshua Zhu.

   - perf build-id cache now can show DSOs present in a perf.data file
     that are not in the cache, to integrate with build-id servers being
     put in place by organizations such as Fedora.

   - perf top now shares more of the evsel config/creation routines with
     'record', paving the way for further integration like 'top'
     snapshots, etc.

   - perf top now supports DWARF callchains.

   - Fix mmap limitations on 32-bit, fix from David Miller.

   - 'perf bench numa mem' NUMA performance measurement suite

   - ... and lots of fixes, performance improvements, cleanups and other
     improvements I failed to list - see the shortlog and git log for
     details."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits)
  perf/x86/amd: Enable northbridge performance counters on AMD family 15h
  perf/hwbp: Fix cleanup in case of kzalloc failure
  perf tools: Fix build with bison 2.3 and older.
  perf tools: Limit unwind support to x86 archs
  perf annotate: Make it to be able to skip unannotatable symbols
  perf gtk/annotate: Fail early if it can't annotate
  perf gtk/annotate: Show source lines with gray color
  perf gtk/annotate: Support multiple event annotation
  perf ui/gtk: Implement basic GTK2 annotation browser
  perf annotate: Fix warning message on a missing vmlinux
  perf buildid-cache: Add --update option
  uprobes/perf: Avoid uprobe_apply() whenever possible
  uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE
  uprobes/perf: Teach trace_uprobe/perf code to pre-filter
  uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's
  uprobes: Introduce uprobe_apply()
  perf: Introduce hw_perf_event->tp_target and ->tp_list
  uprobes/perf: Always increment trace_uprobe->nhit
  uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe
  uprobes/tracing: Introduce is_trace_uprobe_enabled()
  ...
2013-02-19 17:49:41 -08:00
Linus Torvalds
b7133a9a10 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq core changes from Ingo Molnar:
 "The biggest changes are the IRQ-work and printk changes from Frederic
  Weisbecker, which prepare the code for 'full dynticks' (the ability to
  stop or slow down the periodic tick arbitrarily, not just in idle time
  as today):

   - Don't stop tick with irq works pending.  This fix is generally
     useful and concerns archs that can't raise self IPIs.

   - Flush irq works before CPU offlining.

   - Introduce "lazy" irq works that can wait for the next tick to be
     executed, unless it's stopped.

   - Implement klogd wake up using irq work.  This removes the ad-hoc
     printk_tick()/printk_needs_cpu() hooks and make it working even in
     dynticks mode.

   - Cleanups and fixes."

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Export enable/disable_percpu_irq()
  arch Kconfig: Remove references to IRQ_PER_CPU
  irq_work: Remove return value from the irq_work_queue() function
  genirq: Avoid deadlock in spurious handling
  printk: Wake up klogd using irq_work
  irq_work: Make self-IPIs optable
  irq_work: Warn if there's still work on cpu_down
  irq_work: Flush work on CPU_DYING
  irq_work: Don't stop the tick with pending works
  nohz: Add API to check tick state
  irq_work: Remove CONFIG_HAVE_IRQ_WORK
  irq_work: Fix racy check on work pending flag
  irq_work: Fix racy IRQ_WORK_BUSY flag setting
2013-02-19 17:47:58 -08:00
Linus Torvalds
e84cf5d0fd Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
 "SRCU changes:

   - These include debugging aids, updates that move towards the goal of
     permitting srcu_read_lock() and srcu_read_unlock() to be used from
     idle and offline CPUs, and a few small fixes.

  Changes to rcutorture and to RCU documentation:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/188

  Enhancements to uniprocessor handling in tiny RCU:

   - Posted to LKML at https://lkml.org/lkml/2013/1/27/2

  Tag RCU callbacks with grace-period number to simplify callback
  advancement:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/203

  Miscellaneous fixes:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/204"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
  srcu: Update synchronize_srcu_expedited()'s comments
  srcu: Update synchronize_srcu()'s comments
  srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
  srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
  srcu: Simple cleanup for cleanup_srcu_struct()
  srcu: Add might_sleep() annotation to synchronize_srcu()
  srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
  rcu: Allow rcutorture to be built at low optimization levels
  rcu: Make rcutorture's shuffler task shuffle recently added tasks
  rcu: Allow TREE_PREEMPT_RCU on UP systems
  rcu: Provide RCU CPU stall warnings for tiny RCU
  context_tracking: Add comments on interface and internals
  rcu: Remove obsolete Kconfig option from comment
  rcu: Remove unused code originally used for context tracking
  rcu: Consolidate debugging Kconfig options
  rcu: Correct 'optimized' to 'optimize' in header comment
  rcu: Trace callback acceleration
  rcu: Tag callback lists with corresponding grace-period number
  rcutorture: Don't compare ptr with 0
  ...
2013-02-19 17:45:20 -08:00
Konstantin Khlebnikov
1438ade567 workqueue: un-GPL function delayed_work_timer_fn()
commit d8e794dfd5 ("workqueue: set
delayed_work->timer function on initialization") exports function
delayed_work_timer_fn() only for GPL modules. This makes delayed-works
unusable for non-GPL modules, because initialization macro now requires
GPL symbol. For example schedule_delayed_work() available for non-GPL.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # 3.7
2013-02-19 10:09:13 -08:00
Thomas Gleixner
fe2b05f7ca futex: Revert "futex: Mark get_robust_list as deprecated"
This reverts commit ec0c4274e3.

get_robust_list() is in use and a removal would break existing user
space. With the permission checks in place it's not longer a security
hole. Remove the deprecation warnings.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: akpm@linux-foundation.org
Cc: paul.gortmaker@windriver.com
Cc: davej@redhat.com
Cc: keescook@chromium.org
Cc: stable@vger.kernel.org
Cc: ebiederm@xmission.com
2013-02-19 08:43:38 +01:00
Thomas Gleixner
a6c0c943a1 ntp: Make ntp_lock raw
seconds_overflow() is called from hard interrupt context even on
Preempt-RT. This requires the lock to be a raw_spinlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:43:16 +01:00
Ben Greear
c054060683 lockdep: Print more info when MAX_LOCK_DEPTH is exceeded
This helps debug cases where a lock is acquired over and
over without being released.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Greear <greearb@candelatech.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1360176979-4421-1-git-send-email-greearb@candelatech.com
[ Changed the printout ordering. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:42:44 +01:00
Namhyung Kim
c06b4f1947 watchdog: Use local_clock for get_timestamp()
The get_timestamp() function is always called with current cpu,
thus using local_clock() would be more appropriate and it makes
the code shorter and cleaner IMHO.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1356576585-28782-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:42:40 +01:00
Srivatsa S. Bhat
f86f755482 lockdep: Rename print_unlock_inbalance_bug() to print_unlock_imbalance_bug()
Fix the typo in the function name (s/inbalance/imbalance)

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: rostedt@goodmis.org
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/20130108130547.32733.79507.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:42:39 +01:00
Thomas Gleixner
cdc4e86b58 cputime: Remove irqsave from seqlock readers
The reader side code has no requirement to disable interrupts while
sampling data. The sequence counter is enough to ensure consistency.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:05:53 +01:00
David S. Miller
6338a53a2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net into net
Pull in 'net' to take in the bug fixes that didn't make it into
3.8-final.

Also, deal with the semantic conflict of the change made to
net/ipv6/xfrm6_policy.c   A missing rt6->n neighbour release
was added to 'net', but in 'net-next' we no longer cache the
neighbour entries in the ipv6 routes so that change is not
appropriate there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 23:34:21 -05:00
Steven Rostedt (Red Hat)
8c189ea64e ftrace: Call ftrace cleanup module notifier after all other notifiers
Commit: c1bf08ac "ftrace: Be first to run code modification on modules"

changed ftrace module notifier's priority to INT_MAX in order to
process the ftrace nops before anything else could touch them
(namely kprobes). This was the correct thing to do.

Unfortunately, the ftrace module notifier also contains the ftrace
clean up code. As opposed to the set up code, this code should be
run *after* all the module notifiers have run in case a module is doing
correct clean-up and unregisters its ftrace hooks. Basically, ftrace
needs to do clean up on module removal, as it needs to know about code
being removed so that it doesn't try to modify that code. But after it
removes the module from its records, if a ftrace user tries to remove
a probe, that removal will fail due as the record of that code segment
no longer exists.

Nothing really bad happens if the probe removal is called after ftrace
did the clean up, but the ftrace removal function will return an error.
Correct code (such as kprobes) will produce a WARN_ON() if it fails
to remove the probe. As people get annoyed by frivolous warnings, it's
best to do the ftrace clean up after everything else.

By splitting the ftrace_module_notifier into two notifiers, one that
does the module load setup that is run at high priority, and the other
that is called for module clean up that is run at low priority, the
problem is solved.

Cc: stable@vger.kernel.org
Reported-by: Frank Ch. Eigler <fche@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-02-18 23:09:26 -05:00
Chris Metcalf
36a5df85e9 genirq: Export enable/disable_percpu_irq()
These functions are used by the tilegx onchip network driver, and it's
useful to be able to load that driver as a module.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Link: http://lkml.kernel.org/r/201302012043.r11KhNZF024371@farm-0021.internal.tilera.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-18 21:42:25 +01:00
Li Zefan
f169007b27 cgroup: fail if monitored file and event_control are in different cgroup
If we pass fd of memory.usage_in_bytes of cgroup A to cgroup.event_control
of cgroup B, then we won't get memory usage notification from A but B!

What's worse, if A and B are in different mount hierarchy, we'll end up
accessing NULL pointer!

Disallow this kind of invalid usage.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-18 09:31:35 -08:00
Li Zefan
810cbee4fa cgroup: fix cgroup_rmdir() vs close(eventfd) race
commit 205a872bd6 ("cgroup: fix lockdep
warning for event_control") solved a deadlock by introducing a new
bug.

Move cgrp->event_list to a temporary list doesn't mean you can traverse
this list locklessly, because at the same time cgroup_event_wake() can
be called and remove the event from the list. The result of this race
is disastrous.

We adopt the way how kvm irqfd code implements race-free event removal,
which is now described in the comments in cgroup_event_wake().

v3:
- call eventfd_signal() no matter it's eventfd close or cgroup removal
that removes the cgroup event.

Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-18 09:17:24 -08:00
Li Zefan
63f43f55c9 cpuset: fix cpuset_print_task_mems_allowed() vs rename() race
rename() will change dentry->d_name. The result of this race can
be worse than seeing partially rewritten name, but we might access
a stale pointer because rename() will re-allocate memory to hold
a longer name.

It's safe in the protection of dentry->d_lock.

v2: check NULL dentry before acquiring dentry lock.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
2013-02-18 09:08:15 -08:00
Li Zefan
71b5707e11 cgroup: fix exit() vs rmdir() race
In cgroup_exit() put_css_set_taskexit() is called without any lock,
which might lead to accessing a freed cgroup:

thread1                           thread2
---------------------------------------------
exit()
  cgroup_exit()
    put_css_set_taskexit()
      atomic_dec(cgrp->count);
                                   rmdir();
      /* not safe !! */
      check_for_release(cgrp);

rcu_read_lock() can be used to make sure the cgroup is alive.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
2013-02-18 09:08:10 -08:00
H. Peter Anvin
70730bca13 kernel: Replace timeconst.pl with a bc script
bc is the standard tool for multi-precision arithmetic.  We switched
to Perl because akpm reported a hard-to-reproduce build hang, which
was very odd because affected and unaffected machines were all running
the same version of GNU bc.

Unfortunately switching to Perl required a really ugly "canning"
mechanism to support Perl < 5.8 installations lacking the Math::BigInt
module.

It was recently pointed out to me that some very old versions of GNU
make had problems with pipes in subshells, which was indeed the
construct used in the Makefile rules in that version of the patch;
Perl didn't need it so switching to Perl fixed the problem for
unrelated reasons.  With the problem (hopefully) root-caused, we can
switch back to bc and do the arbitrary-precision arithmetic naturally.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-02-16 23:17:25 +01:00
Vineet Gupta
bf14e3b979 sysctl: Enable PARISC "unaligned-trap" to be used cross-arch
PARISC defines /proc/sys/kernel/unaligned-trap to runtime toggle
unaligned access emulation.

The exact mechanics of enablig/disabling are still arch specific, we can
make the sysctl usable by other arches.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
2013-02-15 23:16:05 +05:30
Vladimir Davydov
686855f5d8 sched: add wait_for_completion_io[_timeout]
The only difference between wait_for_completion[_timeout]() and
wait_for_completion_io[_timeout]() is that the latter calls
io_schedule_timeout() instead of schedule_timeout() so that the caller
is accounted as waiting for IO, not just sleeping.

These functions can be used for correct iowait time accounting when the
completion struct is actually used for waiting for IO (e.g. completion
of a bio request in the block layer).

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-02-15 16:45:06 +01:00
Rafael J. Wysocki
7113fe74c1 Merge branch 'pm-assorted'
* pm-assorted:
  suspend: enable freeze timeout configuration through sys
  ACPI: enable ACPI SCI during suspend
  PM: Introduce suspend state PM_SUSPEND_FREEZE
  PM / Runtime: Add new helper function: pm_runtime_active()
  PM / tracing: remove deprecated power trace API
  PM: don't use [delayed_]work_pending()
  PM / Domains: don't use [delayed_]work_pending()
2013-02-15 13:58:54 +01:00
Stanislaw Gruszka
e6c42c295e posix-cpu-timers: Fix nanosleep task_struct leak
The trinity fuzzer triggered a task_struct reference leak via
clock_nanosleep with CPU_TIMERs. do_cpu_nanosleep() calls
posic_cpu_timer_create(), but misses a corresponding
posix_cpu_timer_del() which leads to the task_struct reference leak.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20130215100810.GF4392@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-15 11:41:56 +01:00
Daniel Baluta
02e176af92 perf/hwbp: Fix cleanup in case of kzalloc failure
Obviously this is a typo and could result in memory leaks if kzalloc
fails on a given cpu.

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1360186160-7566-1-git-send-email-dbaluta@ixiacom.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-02-14 17:06:39 -03:00
Thomas Gleixner
9f4646d283 Merge branch 'fortglx/3.9/time' of git://git.linaro.org/people/jstultz/linux into timers/core 2013-02-14 19:46:10 +01:00
Thomas Gleixner
14e568e78f stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread
selfparking and park it after it has finished the take_cpu_down()
work.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.686315164@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-14 15:29:38 +01:00
Thomas Gleixner
860a0ffaa3 stop_machine: Store task reference in a separate per cpu variable
To allow the stopper thread being managed by the smpboot thread
infrastructure separate out the task storage from the stopper data
structure.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.626690384@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-14 15:29:37 +01:00
Thomas Gleixner
7d7e499f73 smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The
reason is that the thread is used to bring the cpu down, so it can't
be parked along with the other per cpu threads.

Allow a per cpu thread to be excluded from automatic parking, so it
can park itself once it's done

Add a create callback function as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Arjan van de Veen <arjan@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Weinberger <rw@linutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130131120741.553993267@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-14 15:29:37 +01:00
Al Viro
d64008a8f3 burying unused conditionals
__ARCH_WANT_SYS_RT_SIGACTION,
__ARCH_WANT_SYS_RT_SIGSUSPEND,
__ARCH_WANT_COMPAT_SYS_RT_SIGSUSPEND,
__ARCH_WANT_COMPAT_SYS_SCHED_RR_GET_INTERVAL - not used anymore
CONFIG_GENERIC_{SIGALTSTACK,COMPAT_RT_SIG{ACTION,QUEUEINFO,PENDING,PROCMASK}} -
can be assumed always set.
2013-02-14 09:21:15 -05:00
Al Viro
e9b04b5b67 make do_sigaltstack() static
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-14 09:21:15 -05:00
Tejun Heo
112202d909 workqueue: rename cpu_workqueue to pool_workqueue
workqueue has moved away from global_cwqs to worker_pools and with the
scheduled custom worker pools, wforkqueues will be associated with
pools which don't have anything to do with CPUs.  The workqueue code
went through significant amount of changes recently and mass renaming
isn't likely to hurt much additionally.  Let's replace 'cpu' with
'pool' so that it reflects the current design.

* s/struct cpu_workqueue_struct/struct pool_workqueue/
* s/cpu_wq/pool_wq/
* s/cwq/pwq/

This patch is purely cosmetic.

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-13 19:29:12 -08:00
Tejun Heo
8d03ecfe47 workqueue: reimplement is_chained_work() using current_wq_worker()
is_chained_work() was added before current_wq_worker() and implemented
its own ham-fisted way of finding out whether %current is a workqueue
worker - it iterates through all possible workers.

Drop the custom implementation and reimplement using
current_wq_worker().

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-13 19:29:10 -08:00
Tejun Heo
1dd638149f workqueue: fix is_chained_work() regression
c9e7cf273f ("workqueue: move busy_hash from global_cwq to
worker_pool") incorrectly converted is_chained_work() to use
get_gcwq() inside for_each_gcwq_cpu() while removing get_gcwq().

As cwq might not exist for all possible workqueue CPUs, @cwq can be
NULL and the following cwq deferences can lead to oops.

Fix it by using for_each_cwq_cpu() instead, which is the better one to
use anyway as we only need to check pools that the wq is associated
with.

Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-13 19:29:07 -08:00
Steven Rostedt
f431b634f2 tracing/syscalls: Allow archs to ignore tracing compat syscalls
The tracing of ia32 compat system calls has been a bit of a pain as they
use different system call numbers than the 64bit equivalents.

I wrote a simple 'lls' program that lists files. I compiled it as a i686
ELF binary and ran it under a x86_64 box. This is the result:

echo 0 > /debug/tracing/tracing_on
echo 1 > /debug/tracing/events/syscalls/enable
echo 1 > /debug/tracing/tracing_on ; ./lls ; echo 0 > /debug/tracing/tracing_on

grep lls /debug/tracing/trace

[.. skipping calls before TS_COMPAT is set ...]

             lls-1127  [005] d...   936.409188: sys_recvfrom(fd: 0, ubuf: 4d560fc4, size: 0, flags: 8048034, addr: 8, addr_len: f7700420)
             lls-1127  [005] d...   936.409190: sys_recvfrom -> 0x8a77000
             lls-1127  [005] d...   936.409211: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
             lls-1127  [005] d...   936.409215: sys_lgetxattr -> 0xf76ff000
             lls-1127  [005] d...   936.409223: sys_dup2(oldfd: 4d55ae9b, newfd: 4)
             lls-1127  [005] d...   936.409228: sys_dup2 -> 0xfffffffffffffffe
             lls-1127  [005] d...   936.409236: sys_newfstat(fd: 4d55b085, statbuf: 80000)
             lls-1127  [005] d...   936.409242: sys_newfstat -> 0x3
             lls-1127  [005] d...   936.409243: sys_removexattr(pathname: 3, name: ffcd0060)
             lls-1127  [005] d...   936.409244: sys_removexattr -> 0x0
             lls-1127  [005] d...   936.409245: sys_lgetxattr(pathname: 0, name: 19614, value: 1, size: 2)
             lls-1127  [005] d...   936.409248: sys_lgetxattr -> 0xf76e5000
             lls-1127  [005] d...   936.409248: sys_newlstat(filename: 3, statbuf: 19614)
             lls-1127  [005] d...   936.409249: sys_newlstat -> 0x0
             lls-1127  [005] d...   936.409262: sys_newfstat(fd: f76fb588, statbuf: 80000)
             lls-1127  [005] d...   936.409279: sys_newfstat -> 0x3
             lls-1127  [005] d...   936.409279: sys_close(fd: 3)
             lls-1127  [005] d...   936.421550: sys_close -> 0x200
             lls-1127  [005] d...   936.421558: sys_removexattr(pathname: 3, name: ffcd00d0)
             lls-1127  [005] d...   936.421560: sys_removexattr -> 0x0
             lls-1127  [005] d...   936.421569: sys_lgetxattr(pathname: 4d564000, name: 1b1abc, value: 5, size: 802)
             lls-1127  [005] d...   936.421574: sys_lgetxattr -> 0x4d564000
             lls-1127  [005] d...   936.421575: sys_capget(header: 4d70f000, dataptr: 1000)
             lls-1127  [005] d...   936.421580: sys_capget -> 0x0
             lls-1127  [005] d...   936.421580: sys_lgetxattr(pathname: 4d710000, name: 3000, value: 3, size: 812)
             lls-1127  [005] d...   936.421589: sys_lgetxattr -> 0x4d710000
             lls-1127  [005] d...   936.426130: sys_lgetxattr(pathname: 4d713000, name: 2abc, value: 3, size: 32)
             lls-1127  [005] d...   936.426141: sys_lgetxattr -> 0x4d713000
             lls-1127  [005] d...   936.426145: sys_newlstat(filename: 3, statbuf: f76ff3f0)
             lls-1127  [005] d...   936.426146: sys_newlstat -> 0x0
             lls-1127  [005] d...   936.431748: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)

Obviously I'm not calling newfstat with a fd of 4d55b085. The calls are
obviously incorrect, and confusing.

Other efforts have been made to fix this:

https://lkml.org/lkml/2012/3/26/367

But the real solution is to rewrite the syscall internals and come up
with a fixed solution. One that doesn't require all the kluge that the
current solution has.

Thus for now, instead of outputting incorrect data, simply ignore them.
With this patch the changes now have:

 #> grep lls /debug/tracing/trace
 #>

Compat system calls simply are not traced. If users need compat
syscalls, then they should just use the raw syscall tracepoints.

For an architecture to make their compat syscalls ignored, it must
define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS (done in asm/ftrace.h) and also
define an arch_trace_is_compat_syscall() function that will return true
if the current task should ignore tracing the syscall.

I want to stress that this change does not affect actual syscalls in any
way, shape or form. It is only used within the tracing system and
doesn't interfere with the syscall logic at all. The changes are
consolidated nicely into trace_syscalls.c and asm/ftrace.h.

I had to make one small modification to asm/thread_info.h and that was
to remove the include of asm/ftrace.h. As asm/ftrace.h required the
current_thread_info() it was causing include hell. That include was
added back in 2008 when the function graph tracer was added:

 commit caf4b323 "tracing, x86: add low level support for ftrace return tracing"

It does not need to be included there.

Link: http://lkml.kernel.org/r/1360703939.21867.99.camel@gandalf.local.home

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-02-12 17:46:28 -05:00
Eric W. Biederman
6e6668845f kernel/pid.c: reenable interrupts when alloc_pid() fails because init has exited
We're forgetting to reenable local interrupts on an error path.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reported-by: Josh Boyer <jwboyer@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-12 14:34:00 -08:00
Thomas Gleixner
86c8ead593 Merge branch 'timers/for-arm' into timers/core 2013-02-12 20:22:56 +01:00
Mark Rutland
5d1d9a29bc clockevents: Fix generic broadcast for FEAT_C3STOP
Commit 12ad100046: "clockevents: Add generic timer broadcast function"
made tick_device_uses_broadcast set up the generic broadcast function
for dummy devices (where !tick_device_is_functional(dev)), but neglected
to set up the broadcast function for devices that stop in low power
states (with the CLOCK_EVT_FEAT_C3STOP flag).

When these devices enter low power states they will not have the generic
broadcast function assigned, and will bring down the system when an
attempt is made to broadcast to them.

This patch ensures that the broadcast function is also assigned for
devices which require broadcast in low power states.

Reported-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: nico@linaro.org
Cc: Marc.Zyngier@arm.com
Cc: Will.Deacon@arm.com
Cc: santosh.shilimkar@ti.com
Cc: john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-12 20:22:28 +01:00