1. 26 Jan, 2015 40 commits
    • Sreekanth Reddy's avatar
      Revert "[SCSI] mpt3sas: Remove phys on topology change" · 95bc632b
      Sreekanth Reddy authored
      commit 2311ce4d upstream.
      
      This reverts commit 963ba22b
      ("mpt3sas: Remove phys on topology change")
      
      Reverting the previous mpt3sas drives patch changes,
      since we will observe below issue
      
      Issue:
      Drives connected Enclosure/Expander will unregister with
      SCSI Transport Layer, if any one remove and add expander
      cable with in DMD (Device Missing Delay) time period or
      even any one power-off and power-on the Enclosure with in
      the DMD period.
      Signed-off-by: default avatarSreekanth Reddy <Sreekanth.Reddy@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      95bc632b
    • Sreekanth Reddy's avatar
      Revert "[SCSI] mpt2sas: Remove phys on topology change." · ba77da8b
      Sreekanth Reddy authored
      commit 81a89c2d upstream.
      
      This reverts commit 3520f9c7
      ("mpt2sas: Remove phys on topology change")
      
      Reverting the previous mpt2sas drives patch changes,
      since we will observe below issue
      
      Issue:
      Drives connected Enclosure/Expander will unregister with
      SCSI Transport Layer, if any one remove and add expander
      cable with in DMD (Device Missing Delay) time period or
      even any one power-off and power-on the Enclosure with in
      the DMD period.
      Signed-off-by: default avatarSreekanth Reddy <Sreekanth.Reddy@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ba77da8b
    • Nicholas Bellinger's avatar
      iscsi-target: Fail connection on short sendmsg writes · 32f04f1f
      Nicholas Bellinger authored
      commit 6bf6ca75 upstream.
      
      This patch changes iscsit_do_tx_data() to fail on short writes
      when kernel_sendmsg() returns a value different than requested
      transfer length, returning -EPIPE and thus causing a connection
      reset to occur.
      
      This avoids a potential bug in the original code where a short
      write would result in kernel_sendmsg() being called again with
      the original iovec base + length.
      
      In practice this has not been an issue because iscsit_do_tx_data()
      is only used for transferring 48 byte headers + 4 byte digests,
      along with seldom used control payloads from NOPIN + TEXT_RSP +
      REJECT with less than 32k of data.
      
      So following Al's audit of iovec consumers, go ahead and fail
      the connection on short writes for now, and remove the bogus
      logic ahead of his proper upstream fix.
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      32f04f1f
    • Thomas Gleixner's avatar
      genirq: Prevent proc race against freeing of irq descriptors · b3bc17d5
      Thomas Gleixner authored
      commit c291ee62 upstream.
      
      Since the rework of the sparse interrupt code to actually free the
      unused interrupt descriptors there exists a race between the /proc
      interfaces to the irq subsystem and the code which frees the interrupt
      descriptor.
      
      CPU0				CPU1
      				show_interrupts()
      				  desc = irq_to_desc(X);
      free_desc(desc)
        remove_from_radix_tree();
        kfree(desc);
      				  raw_spinlock_irq(&desc->lock);
      
      /proc/interrupts is the only interface which can actively corrupt
      kernel memory via the lock access. /proc/stat can only read from freed
      memory. Extremly hard to trigger, but possible.
      
      The interfaces in /proc/irq/N/ are not affected by this because the
      removal of the proc file is serialized in procfs against concurrent
      readers/writers. The removal happens before the descriptor is freed.
      
      For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
      as the descriptor is never freed. It's merely cleared out with the irq
      descriptor lock held. So any concurrent proc access will either see
      the old correct value or the cleared out ones.
      
      Protect the lookup and access to the irq descriptor in
      show_interrupts() with the sparse_irq_lock.
      
      Provide kstat_irqs_usr() which is protecting the lookup and access
      with sparse_irq_lock and switch /proc/stat to use it.
      
      Document the existing kstat_irqs interfaces so it's clear that the
      caller needs to take care about protection. The users of these
      interfaces are either not affected due to SPARSE_IRQ=n or already
      protected against removal.
      
      Fixes: 1f5a5b87 "genirq: Implement a sane sparse_irq allocator"
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b3bc17d5
    • Thomas Gleixner's avatar
      tick/powerclamp: Remove tick_nohz_idle abuse · 1ff5b9fb
      Thomas Gleixner authored
      commit a5fd9733 upstream.
      
      commit 4dbd2771 "tick: export nohz tick idle symbols for module
      use" was merged via the thermal tree without an explicit ack from the
      relevant maintainers.
      
      The exports are abused by the intel powerclamp driver which implements
      a fake idle state from a sched FIFO task. This causes all kinds of
      wreckage in the NOHZ core code which rightfully assumes that
      tick_nohz_idle_enter/exit() are only called from the idle task itself.
      
      Recent changes in the NOHZ core lead to a failure of the powerclamp
      driver and now people try to hack completely broken and backwards
      workarounds into the NOHZ core code. This is completely unacceptable
      and just papers over the real problem. There are way more subtle
      issues lurking around the corner.
      
      The real solution is to fix the powerclamp driver by rewriting it with
      a sane concept, but that's beyond the scope of this.
      
      So the only solution for now is to remove the calls into the core NOHZ
      code from the powerclamp trainwreck along with the exports.
      
      Fixes: d6d71ee4 "PM: Introduce Intel PowerClamp Driver"
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Pan Jacob jun <jacob.jun.pan@intel.com>
      Cc: LKP <lkp@01.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412181110110.17382@nanosSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1ff5b9fb
    • Dominique Leuenberger's avatar
      hp_accel: Add support for HP ZBook 15 · 666f1afd
      Dominique Leuenberger authored
      commit 6583659e upstream.
      
      HP ZBook 15 laptop needs a non-standard mapping (x_inverted).
      
      BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=905329Signed-off-by: default avatarDominique Leuenberger <dimstar@opensuse.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      666f1afd
    • Arik Nemtsov's avatar
      cfg80211: avoid mem leak on driver hint set · dd16b23e
      Arik Nemtsov authored
      commit 34f05f54 upstream.
      
      In the already-set and intersect case of a driver-hint, the previous
      wiphy regdomain was not freed before being reset with a copy of the
      cfg80211 regdomain.
      
      [js: backport to 3.12]
      Signed-off-by: default avatarArik Nemtsov <arikx.nemtsov@intel.com>
      Acked-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dd16b23e
    • Chris Wilson's avatar
      drm/i915: Force the CS stall for invalidate flushes · dbe21f1d
      Chris Wilson authored
      commit add284a3 upstream.
      
      In order to act as a full command barrier by itself, we need to tell the
      pipecontrol to actually stall the command streamer while the flush runs.
      We require the full command barrier before operations like
      MI_SET_CONTEXT, which currently rely on a prior invalidate flush.
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dbe21f1d
    • Chris Wilson's avatar
      drm/i915: Invalidate media caches on gen7 · 41d3d4b1
      Chris Wilson authored
      commit 148b83d0 upstream.
      
      In the gen7 pipe control there is an extra bit to flush the media
      caches, so let's set it during cache invalidation flushes.
      
      v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec.
      
      Cc: Simon Farnsworth <simon@farnz.org.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      41d3d4b1
    • Daniel Vetter's avatar
      drm/i915: Don't complain about stolen conflicts on gen3 · c5fb31cc
      Daniel Vetter authored
      commit 0b6d24c0 upstream.
      
      Apparently stuff works that way on those machines.
      
      I agree with Chris' concern that this is a bit risky but imo worth a
      shot in -next just for fun. Afaics all these machines have the pci
      resources allocated like that by the BIOS, so I suspect that it's all
      ok.
      
      This regression goes back to
      
      commit eaba1b8f
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Jul 4 12:28:35 2013 +0100
      
          drm/i915: Verify that our stolen memory doesn't conflict
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76983
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71031Tested-by: default avatarlu hua <huax.lu@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      Tested-by: default avatarPaul Menzel <paulepanter@users.sourceforge.net>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c5fb31cc
    • Alex Deucher's avatar
      drm/radeon: properly filter DP1.2 4k modes on non-DP1.2 hw · a866f33e
      Alex Deucher authored
      commit 410cce2a upstream.
      
      The check was already in place in the dp mode_valid check, but
      radeon_dp_get_dp_link_clock() never returned the high clock
      mode_valid was checking for because that function clipped the
      clock based on the hw capabilities.  Add an explicit check
      in the mode_valid function.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=87172Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a866f33e
    • Alex Deucher's avatar
      drm/radeon: check the right ring in radeon_evict_flags() · c02ca53c
      Alex Deucher authored
      commit 5e5c21ca upstream.
      
      Check the that ring we are using for copies is functional
      rather than the GFX ring.  On newer asics we use the DMA
      ring for bo moves.
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c02ca53c
    • Alex Deucher's avatar
      drm/radeon: work around a hw bug in MGCG on CIK · 5d531ee5
      Alex Deucher authored
      commit 4bb62c95 upstream.
      
      Always need to set bit 0 of RLC_CGTT_MGCG_OVERRIDE
      to avoid unreliable doorbell updates in some cases.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5d531ee5
    • Alex Deucher's avatar
      drm/radeon: fix typo in CI dpm disable · 2ae5f6c0
      Alex Deucher authored
      commit 129acb7c upstream.
      
      Need to disable DS, not enable it when disabling dpm.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2ae5f6c0
    • Tetsuo Handa's avatar
      drm/ttm: Avoid memory allocation from shrinker functions. · fa13c23e
      Tetsuo Handa authored
      commit 881fdaa5 upstream.
      
      Andrew Morton wrote:
      > On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
      >
      > > Andrew Morton wrote:
      > > > Poor ttm guys - this is a bit of a trap we set for them.
      > >
      > > Commit a91576d7 ("drm/ttm: Pass GFP flags in order to avoid deadlock.")
      > > changed to use sc->gfp_mask rather than GFP_KERNEL.
      > >
      > > -       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
      > > -                       GFP_KERNEL);
      > > +       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
      > >
      > > But this bug is caused by sc->gfp_mask containing some flags which are not
      > > in GFP_KERNEL, right? Then, I think
      > >
      > > -       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
      > > +       pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL);
      > >
      > > would hide this bug.
      > >
      > > But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag)
      >
      > Well no - ttm_page_pool_free() should stop calling kmalloc altogether.
      > Just do
      >
      > 	struct page *pages_to_free[16];
      >
      > and rework the code to free 16 pages at a time.  Easy.
      
      Well, ttm code wants to process 512 pages at a time for performance.
      Memory footprint increased by 512 * sizeof(struct page *) buffer is
      only 4096 bytes. What about using static buffer like below?
      ----------
      >From d3cb5393c9c8099d6b37e769f78c31af1541fe8c Mon Sep 17 00:00:00 2001
      From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Date: Thu, 13 Nov 2014 22:21:54 +0900
      Subject: drm/ttm: Avoid memory allocation from shrinker functions.
      
      Commit a91576d7 ("drm/ttm: Pass GFP flags in order to avoid
      deadlock.") caused BUG_ON() due to sc->gfp_mask containing flags
      which are not in GFP_KERNEL.
      
        https://bugzilla.kernel.org/show_bug.cgi?id=87891
      
      Changing from sc->gfp_mask to (sc->gfp_mask & GFP_KERNEL) would
      avoid the BUG_ON(), but avoiding memory allocation from shrinker
      function is better and reliable fix.
      
      Shrinker function is already serialized by global lock, and
      clean up function is called after shrinker function is unregistered.
      Thus, we can use static buffer when called from shrinker function
      and clean up function.
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fa13c23e
    • Thomas Hellstrom's avatar
      drm/vmwgfx: Fix fence event code · fc350ec8
      Thomas Hellstrom authored
      commit 89669e7a upstream.
      
      The commit "vmwgfx: Rework fence event action" introduced a number of bugs
      that are fixed with this commit:
      
      a) A forgotten return stateemnt.
      b) An if statement with identical branches.
      Reported-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: default avatarJakob Bornecrantz <jakob@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fc350ec8
    • Akash Goel's avatar
      drm/i915: Resolving the memory region conflict for Stolen area · b743bb06
      Akash Goel authored
      commit 3617dc96 upstream.
      
      There is a conflict seen when requesting the kernel to reserve
      the physical space used for the stolen area. This is because
      some BIOS are wrapping the stolen area in the root PCI bus, but have
      an off-by-one error. As a workaround we retry the reservation with an
      offset of 1 instead of 0.
      
      v2: updated commit message & the comment in source file (Daniel)
      Signed-off-by: default avatarAkash Goel <akash.goel@intel.com>
      Reviewed-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      Tested-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b743bb06
    • Konstantin Khlebnikov's avatar
      ACPI / osl: speedup grace period in acpi_os_map_cleanup · b3959c77
      Konstantin Khlebnikov authored
      commit 74b51ee1 upstream.
      
      ACPI maintains cache of ioremap regions to speed up operations and
      access to them from irq context where ioremap() calls aren't allowed.
      This code abuses synchronize_rcu() on unmap path for synchronization
      with fast-path in acpi_os_read/write_memory which uses this cache.
      
      Since v3.10 CPUs are allowed to enter idle state even if they have RCU
      callbacks queued, see commit c0f4dfd4
      ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
      That change caused problems with nvidia proprietary driver which calls
      acpi_os_map/unmap_generic_address several times during initialization.
      Each unmap calls synchronize_rcu and adds significant delay. Totally
      initialization is slowed for a couple of seconds and that is enough to
      trigger timeout in hardware, gpu decides to "fell off the bus". Widely
      spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
      
      This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
      which is much faster.
      
      Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-and-tested-by: default avatarAlexander Monakov <amonakov@gmail.com>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b3959c77
    • Dan Carpenter's avatar
      netfilter: ipset: small potential read beyond the end of buffer · 4d871c60
      Dan Carpenter authored
      commit 2196937e upstream.
      
      We could be reading 8 bytes into a 4 byte buffer here.  It seems
      harmless but adding a check is the right thing to do and it silences a
      static checker warning.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4d871c60
    • Jay Vosburgh's avatar
      net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding · e5924c95
      Jay Vosburgh authored
      [ Upstream commit 2c26d34b ]
      
      When using VXLAN tunnels and a sky2 device, I have experienced
      checksum failures of the following type:
      
      [ 4297.761899] eth0: hw csum failure
      [...]
      [ 4297.765223] Call Trace:
      [ 4297.765224]  <IRQ>  [<ffffffff8172f026>] dump_stack+0x46/0x58
      [ 4297.765235]  [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50
      [ 4297.765238]  [<ffffffff8161c1a0>] ? skb_push+0x40/0x40
      [ 4297.765240]  [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0
      [ 4297.765243]  [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950
      [ 4297.765246]  [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360
      
      	These are reliably reproduced in a network topology of:
      
      container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch
      
      	When VXLAN encapsulated traffic is received from a similarly
      configured peer, the above warning is generated in the receive
      processing of the encapsulated packet.  Note that the warning is
      associated with the container eth0.
      
              The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
      because the packet is an encapsulated Ethernet frame, the checksum
      generated by the hardware includes the inner protocol and Ethernet
      headers.
      
      	The receive code is careful to update the skb->csum, except in
      __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
      calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
      to skip over the Ethernet header, but does not update skb->csum when
      doing so.
      
      	This patch resolves the problem by adding a call to
      skb_postpull_rcsum to update the skb->csum after the call to
      eth_type_trans.
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e5924c95
    • Govindarajulu Varadarajan's avatar
      enic: fix rx skb checksum · 73c0b665
      Govindarajulu Varadarajan authored
      [ Upstream commit 17e96834 ]
      
      Hardware always provides compliment of IP pseudo checksum. Stack expects
      whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.
      
      This causes checksum error in nf & ovs.
      
      kernel: qg-19546f09-f2: hw csum failure
      kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF          O--------------   3.10.0-123.8.1.el7.x86_64 #1
      kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
      kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
      kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
      kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
      kernel: Call Trace:
      kernel: <IRQ>  [<ffffffff815e237b>] dump_stack+0x19/0x1b
      kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
      kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
      kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
      kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
      kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
      kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
      kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
      kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
      kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
      
      Hardware verifies IP & tcp/udp header checksum but does not provide payload
      checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.
      
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Stefan Assmann <sassmann@redhat.com>
      Reported-by: default avatarSunil Choudhary <schoudha@redhat.com>
      Signed-off-by: default avatarGovindarajulu Varadarajan <_govind@gmx.com>
      Reviewed-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      73c0b665
    • Jiri Pirko's avatar
      team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin · c18e1d2c
      Jiri Pirko authored
      [ Upstream commit b0d11b42 ]
      
      This patch is fixing a race condition that may cause setting
      count_pending to -1, which results in unwanted big bulk of arp messages
      (in case of "notify peers").
      
      Consider following scenario:
      
      count_pending == 2
         CPU0                                           CPU1
      					team_notify_peers_work
      					  atomic_dec_and_test (dec count_pending to 1)
      					  schedule_delayed_work
       team_notify_peers
         atomic_add (adding 1 to count_pending)
      					team_notify_peers_work
      					  atomic_dec_and_test (dec count_pending to 1)
      					  schedule_delayed_work
      					team_notify_peers_work
      					  atomic_dec_and_test (dec count_pending to 0)
         schedule_delayed_work
      					team_notify_peers_work
      					  atomic_dec_and_test (dec count_pending to -1)
      
      Fix this race by using atomic_dec_if_positive - that will prevent
      count_pending running under 0.
      
      Fixes: fc423ff0 ("team: add peer notification")
      Fixes: 492b200e  ("team: add support for sending multicast rejoins")
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c18e1d2c
    • Eric Dumazet's avatar
      alx: fix alx_poll() · cbac9213
      Eric Dumazet authored
      [ Upstream commit 7a05dc64 ]
      
      Commit d75b1ade ("net: less interrupt masking in NAPI") uncovered
      wrong alx_poll() behavior.
      
      A NAPI poll() handler is supposed to return exactly the budget when/if
      napi_complete() has not been called.
      
      It is also supposed to return number of frames that were received, so
      that netdev_budget can have a meaning.
      
      Also, in case of TX pressure, we still have to dequeue received
      packets : alx_clean_rx_irq() has to be called even if
      alx_clean_tx_irq(alx) returns false, otherwise device is half duplex.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Fixes: d75b1ade ("net: less interrupt masking in NAPI")
      Reported-by: default avatarOded Gabbay <oded.gabbay@amd.com>
      Bisected-by: default avatarOded Gabbay <oded.gabbay@amd.com>
      Tested-by: default avatarOded Gabbay <oded.gabbay@amd.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      cbac9213
    • Herbert Xu's avatar
      tcp: Do not apply TSO segment limit to non-TSO packets · dc5e8861
      Herbert Xu authored
      [ Upstream commit 843925f3 ]
      
      Thomas Jarosch reported IPsec TCP stalls when a PMTU event occurs.
      
      In fact the problem was completely unrelated to IPsec.  The bug is
      also reproducible if you just disable TSO/GSO.
      
      The problem is that when the MSS goes down, existing queued packet
      on the TX queue that have not been transmitted yet all look like
      TSO packets and get treated as such.
      
      This then triggers a bug where tcp_mss_split_point tells us to
      generate a zero-sized packet on the TX queue.  Once that happens
      we're screwed because the zero-sized packet can never be removed
      by ACKs.
      
      Fixes: 1485348d ("tcp: Apply device TSO segment limit earlier")
      Reported-by: default avatarThomas Jarosch <thomas.jarosch@intra2net.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      
      Cheers,
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dc5e8861
    • Thomas Graf's avatar
      net: Reset secmark when scrubbing packet · 00c47292
      Thomas Graf authored
      [ Upstream commit b8fb4e06 ]
      
      skb_scrub_packet() is called when a packet switches between a context
      such as between underlay and overlay, between namespaces, or between
      L3 subnets.
      
      While we already scrub the packet mark, connection tracking entry,
      and cached destination, the security mark/context is left intact.
      
      It seems wrong to inherit the security context of a packet when going
      from overlay to underlay or across forwarding paths.
      Signed-off-by: default avatarThomas Graf <tgraf@suug.ch>
      Acked-by: default avatarFlavio Leitner <fbl@sysclose.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      00c47292
    • Toshiaki Makita's avatar
      net: Fix stacked vlan offload features computation · 6e1a8eb4
      Toshiaki Makita authored
      [ Upstream commit 796f2da8 ]
      
      When vlan tags are stacked, it is very likely that the outer tag is stored
      in skb->vlan_tci and skb->protocol shows the inner tag's vlan_proto.
      Currently netif_skb_features() first looks at skb->protocol even if there
      is the outer tag in vlan_tci, thus it incorrectly retrieves the protocol
      encapsulated by the inner vlan instead of the inner vlan protocol.
      This allows GSO packets to be passed to HW and they end up being
      corrupted.
      
      Fixes: 58e998c6 ("offloading: Force software GSO for multiple vlan tags.")
      Signed-off-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6e1a8eb4
    • Prashant Sreedharan's avatar
      tg3: tg3_disable_ints using uninitialized mailbox value to disable interrupts · e6f7e1dc
      Prashant Sreedharan authored
      [ Upstream commit 05b0aa57 ]
      
      During driver load in tg3_init_one, if the driver detects DMA activity before
      intializing the chip tg3_halt is called. As part of tg3_halt interrupts are
      disabled using routine tg3_disable_ints. This routine was using mailbox value
      which was not initialized (default value is 0). As a result driver was writing
      0x00000001 to pci config space register 0, which is the vendor id / device id.
      
      This driver bug was exposed because of the commit a7877b17a667 (PCI: Check only
      the Vendor ID to identify Configuration Request Retry). Also this issue is only
      seen in older generation chipsets like 5722 because config space write to offset
      0 from driver is possible. The newer generation chips ignore writes to offset 0.
      Also without commit a7877b17a667, for these older chips when a GRC reset is
      issued the Bootcode would reprogram the vendor id/device id, which is the reason
      this bug was masked earlier.
      
      Fixed by initializing the interrupt mailbox registers before calling tg3_halt.
      
      Please queue for -stable.
      Reported-by: default avatarNils Holland <nholland@tisys.org>
      Reported-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarPrashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e6f7e1dc
    • stephen hemminger's avatar
      in6: fix conflict with glibc · ef260858
      stephen hemminger authored
      [ Upstream commit 6d08acd2 ]
      
      Resolve conflicts between glibc definition of IPV6 socket options
      and those defined in Linux headers. Looks like earlier efforts to
      solve this did not cover all the definitions.
      
      It resolves warnings during iproute2 build.
      Please consider for stable as well.
      Signed-off-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ef260858
    • Thomas Graf's avatar
      netlink: Don't reorder loads/stores before marking mmap netlink frame as available · d7ab02d6
      Thomas Graf authored
      [ Upstream commit a18e6a18 ]
      
      Each mmap Netlink frame contains a status field which indicates
      whether the frame is unused, reserved, contains data or needs to
      be skipped. Both loads and stores may not be reordeded and must
      complete before the status field is changed and another CPU might
      pick up the frame for use. Use an smp_mb() to cover needs of both
      types of callers to netlink_set_status(), callers which have been
      reading data frame from the frame, and callers which have been
      filling or releasing and thus writing to the frame.
      
      - Example code path requiring a smp_rmb():
        memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len);
        netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED);
      
      - Example code path requiring a smp_wmb():
        hdr->nm_uid	= from_kuid(sk_user_ns(sk), NETLINK_CB(skb).creds.uid);
        hdr->nm_gid	= from_kgid(sk_user_ns(sk), NETLINK_CB(skb).creds.gid);
        netlink_frame_flush_dcache(hdr);
        netlink_set_status(hdr, NL_MMAP_STATUS_VALID);
      
      Fixes: f9c228 ("netlink: implement memory mapped recvmsg()")
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d7ab02d6
    • David Miller's avatar
      netlink: Always copy on mmap TX. · 837f719b
      David Miller authored
      [ Upstream commit 4682a035 ]
      
      Checking the file f_count and the nlk->mapped count is not completely
      sufficient to prevent the mmap'd area contents from changing from
      under us during netlink mmap sendmsg() operations.
      
      Be careful to sample the header's length field only once, because this
      could change from under us as well.
      
      Fixes: 5fd96123 ("netlink: implement memory mapped sendmsg()")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      837f719b
    • Linus Torvalds's avatar
      mm: Don't count the stack guard page towards RLIMIT_STACK · 3e89bd36
      Linus Torvalds authored
      commit 690eac53 upstream.
      
      Commit fee7e49d ("mm: propagate error from stack expansion even for
      guard page") made sure that we return the error properly for stack
      growth conditions.  It also theorized that counting the guard page
      towards the stack limit might break something, but also said "Let's see
      if anybody notices".
      
      Somebody did notice.  Apparently android-x86 sets the stack limit very
      close to the limit indeed, and including the guard page in the rlimit
      check causes the android 'zygote' process problems.
      
      So this adds the (fairly trivial) code to make the stack rlimit check be
      against the actual real stack size, rather than the size of the vma that
      includes the guard page.
      Reported-and-tested-by: default avatarChih-Wei Huang <cwhuang@android-x86.org>
      Cc: Jay Foad <jay.foad@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3e89bd36
    • Linus Torvalds's avatar
      mm: propagate error from stack expansion even for guard page · 88d43e17
      Linus Torvalds authored
      commit fee7e49d upstream.
      
      Jay Foad reports that the address sanitizer test (asan) sometimes gets
      confused by a stack pointer that ends up being outside the stack vma
      that is reported by /proc/maps.
      
      This happens due to an interaction between RLIMIT_STACK and the guard
      page: when we do the guard page check, we ignore the potential error
      from the stack expansion, which effectively results in a missing guard
      page, since the expected stack expansion won't have been done.
      
      And since /proc/maps explicitly ignores the guard page (commit
      d7824370: "mm: fix up some user-visible effects of the stack guard
      page"), the stack pointer ends up being outside the reported stack area.
      
      This is the minimal patch: it just propagates the error.  It also
      effectively makes the guard page part of the stack limit, which in turn
      measn that the actual real stack is one page less than the stack limit.
      
      Let's see if anybody notices.  We could teach acct_stack_growth() to
      allow an extra page for a grow-up/grow-down stack in the rlimit test,
      but I don't want to add more complexity if it isn't needed.
      Reported-and-tested-by: default avatarJay Foad <jay.foad@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      88d43e17
    • Vlastimil Babka's avatar
      mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed · 72af0d9a
      Vlastimil Babka authored
      commit 9e5e3661 upstream.
      
      Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
      stuck in a busy loop with nothing left to balance, but
      kswapd_try_to_sleep() failing to sleep.  Their analysis found the cause
      to be a combination of several factors:
      
      1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait
      
      2. The process has been killed (by OOM in this case), but has not yet been
         scheduled to remove itself from the waitqueue and die.
      
      3. kswapd checks for throttled processes in prepare_kswapd_sleep():
      
              if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
                      wake_up(&pgdat->pfmemalloc_wait);
      		return false; // kswapd will not go to sleep
      	}
      
         However, for a process that was already killed, wake_up() does not remove
         the process from the waitqueue, since try_to_wake_up() checks its state
         first and returns false when the process is no longer waiting.
      
      4. kswapd is running on the same CPU as the only CPU that the process is
         allowed to run on (through cpus_allowed, or possibly single-cpu system).
      
      5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
         encounters no voluntary preemption points and repeatedly fails
         prepare_kswapd_sleep(), blocking the process from running and removing
         itself from the waitqueue, which would let kswapd sleep.
      
      So, the source of the problem is that we prevent kswapd from going to
      sleep until there are processes waiting on the pfmemalloc_wait queue,
      and a process waiting on a queue is guaranteed to be removed from the
      queue only when it gets scheduled.  This was done to make sure that no
      process is left sleeping on pfmemalloc_wait when kswapd itself goes to
      sleep.
      
      However, it isn't necessary to postpone kswapd sleep until the
      pfmemalloc_wait queue actually empties.  To prevent processes from being
      left sleeping, it's actually enough to guarantee that all processes
      waiting on pfmemalloc_wait queue have been woken up by the time we put
      kswapd to sleep.
      
      This patch therefore fixes this issue by substituting 'wake_up' with
      'wake_up_all' and removing 'return false' in the code snippet from
      prepare_kswapd_sleep() above.  Note that if any process puts itself in
      the queue after this waitqueue_active() check, or after the wake up
      itself, it means that the process will also wake up kswapd - and since
      we are under prepare_to_wait(), the wake up won't be missed.  Also we
      update the comment prepare_kswapd_sleep() to hopefully more clearly
      describe the races it is preventing.
      
      Fixes: 5515061d ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarVladimir Davydov <vdavydov@parallels.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      72af0d9a
    • Krzysztof Kozlowski's avatar
      mmc: sdhci: Fix sleep in atomic after inserting SD card · 0cd8eb43
      Krzysztof Kozlowski authored
      commit 2836766a upstream.
      
      Sleep in atomic context happened on Trats2 board after inserting or
      removing SD card because mmc_gpio_get_cd() was called under spin lock.
      
      Fix this by moving card detection earlier, before acquiring spin lock.
      The mmc_gpio_get_cd() call does not have to be protected by spin lock
      because it does not access any sdhci internal data.
      The sdhci_do_get_cd() call access host flags (SDHCI_DEVICE_DEAD). After
      moving it out side of spin lock it could theoretically race with driver
      removal but still there is no actual protection against manual card
      eject.
      
      Dmesg after inserting SD card:
      [   41.663414] BUG: sleeping function called from invalid context at drivers/gpio/gpiolib.c:1511
      [   41.670469] in_atomic(): 1, irqs_disabled(): 128, pid: 30, name: kworker/u8:1
      [   41.677580] INFO: lockdep is turned off.
      [   41.681486] irq event stamp: 61972
      [   41.684872] hardirqs last  enabled at (61971): [<c0490ee0>] _raw_spin_unlock_irq+0x24/0x5c
      [   41.693118] hardirqs last disabled at (61972): [<c04907ac>] _raw_spin_lock_irq+0x18/0x54
      [   41.701190] softirqs last  enabled at (61648): [<c0026fd4>] __do_softirq+0x234/0x2c8
      [   41.708914] softirqs last disabled at (61631): [<c00273a0>] irq_exit+0xd0/0x114
      [   41.716206] Preemption disabled at:[<  (null)>]   (null)
      [   41.721500]
      [   41.722985] CPU: 3 PID: 30 Comm: kworker/u8:1 Tainted: G        W      3.18.0-rc5-next-20141121 #883
      [   41.732111] Workqueue: kmmcd mmc_rescan
      [   41.735945] [<c0014d2c>] (unwind_backtrace) from [<c0011c80>] (show_stack+0x10/0x14)
      [   41.743661] [<c0011c80>] (show_stack) from [<c0489d14>] (dump_stack+0x70/0xbc)
      [   41.750867] [<c0489d14>] (dump_stack) from [<c0228b74>] (gpiod_get_raw_value_cansleep+0x18/0x30)
      [   41.759628] [<c0228b74>] (gpiod_get_raw_value_cansleep) from [<c03646e8>] (mmc_gpio_get_cd+0x38/0x58)
      [   41.768821] [<c03646e8>] (mmc_gpio_get_cd) from [<c036d378>] (sdhci_request+0x50/0x1a4)
      [   41.776808] [<c036d378>] (sdhci_request) from [<c0357934>] (mmc_start_request+0x138/0x268)
      [   41.785051] [<c0357934>] (mmc_start_request) from [<c0357cc8>] (mmc_wait_for_req+0x58/0x1a0)
      [   41.793469] [<c0357cc8>] (mmc_wait_for_req) from [<c0357e68>] (mmc_wait_for_cmd+0x58/0x78)
      [   41.801714] [<c0357e68>] (mmc_wait_for_cmd) from [<c0361c00>] (mmc_io_rw_direct_host+0x98/0x124)
      [   41.810480] [<c0361c00>] (mmc_io_rw_direct_host) from [<c03620f8>] (sdio_reset+0x2c/0x64)
      [   41.818641] [<c03620f8>] (sdio_reset) from [<c035a3d8>] (mmc_rescan+0x254/0x2e4)
      [   41.826028] [<c035a3d8>] (mmc_rescan) from [<c003a0e0>] (process_one_work+0x180/0x3f4)
      [   41.833920] [<c003a0e0>] (process_one_work) from [<c003a3bc>] (worker_thread+0x34/0x4b0)
      [   41.841991] [<c003a3bc>] (worker_thread) from [<c003fed8>] (kthread+0xe4/0x104)
      [   41.849285] [<c003fed8>] (kthread) from [<c000f268>] (ret_from_fork+0x14/0x2c)
      [   42.038276] mmc0: new high speed SDHC card at address 1234
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 94144a46 ("mmc: sdhci: add get_cd() implementation")
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0cd8eb43
    • Stefan Roese's avatar
      spi: fsl: Fix problem with multi message transfers · 44189ec1
      Stefan Roese authored
      commit 4302a596 upstream.
      
      When used via spidev with more than one messages to tranfer via
      SPI_IOC_MESSAGE the current implementation would return with
      -EINVAL, since bits_per_word and speed_hz are set in all
      transfer structs. And in the 2nd loop status will stay at
      -EINVAL as its not overwritten again via fsl_spi_setup_transfer().
      
      This patch changes this behavious by first checking if one of
      the messages uses different settings. If this is the case
      the function will return with -EINVAL. If not, the messages
      are transferred correctly.
      Signed-off-by: default avatarStefan Roese <sr@denx.de>
      Signed-off-by: default avatarMark Brown <broonie@linaro.org>
      Cc: Esben Haabendal <esbenhaabendal@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      44189ec1
    • Jiri Olsa's avatar
      perf session: Do not fail on processing out of order event · 0f069ee1
      Jiri Olsa authored
      commit f61ff6c0 upstream.
      
      Linus reported perf report command being interrupted due to processing
      of 'out of order' event, with following error:
      
        Timestamp below last timeslice flush
        0x5733a8 [0x28]: failed to process type: 3
      
      I could reproduce the issue and in my case it was caused by one CPU
      (mmap) being behind during record and userspace mmap reader seeing the
      data after other CPUs data were already stored.
      
      This is expected under some circumstances because we need to limit the
      number of events that we queue for reordering when we receive a
      PERF_RECORD_FINISHED_ROUND or when we force flush due to memory
      pressure.
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1417016371-30249-1-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      [zhangzhiqiang: backport to 3.10:
       - adjust context
       - commit f61ff6c0 struct events_stats was defined in tools/perf/util/event.h
         while 3.10 stable defined in tools/perf/util/hist.h.
       - 3.10 stable there is no pr_oe_time() which used for debug.
       - After the above adjustments, becomes same to the original patch:
         https://github.com/torvalds/linux/commit/f61ff6c06dc8f32c7036013ad802c899ec590607
      ]
      Signed-off-by: default avatarZhiqiang Zhang <zhangzhiqiang.zhang@huawei.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0f069ee1
    • Jiri Olsa's avatar
      perf: Fix events installation during moving group · a1f8e247
      Jiri Olsa authored
      commit 9fc81d87 upstream.
      
      We allow PMU driver to change the cpu on which the event
      should be installed to. This happened in patch:
      
        e2d37cd2 ("perf: Allow the PMU driver to choose the CPU on which to install events")
      
      This patch also forces all the group members to follow
      the currently opened events cpu if the group happened
      to be moved.
      
      This and the change of event->cpu in perf_install_in_context()
      function introduced in:
      
        0cda4c02 ("perf: Introduce perf_pmu_migrate_context()")
      
      forces group members to change their event->cpu,
      if the currently-opened-event's PMU changed the cpu
      and there is a group move.
      
      Above behaviour causes problem for breakpoint events,
      which uses event->cpu to touch cpu specific data for
      breakpoints accounting. By changing event->cpu, some
      breakpoints slots were wrongly accounted for given
      cpu.
      
      Vinces's perf fuzzer hit this issue and caused following
      WARN on my setup:
      
         WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
         Can't find any breakpoint slot
         [...]
      
      This patch changes the group moving code to keep the event's
      original cpu.
      Reported-by: default avatarVince Weaver <vince@deater.net>
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a1f8e247
    • Jiri Olsa's avatar
      perf/x86/intel/uncore: Make sure only uncore events are collected · 5f34dcba
      Jiri Olsa authored
      commit af91568e upstream.
      
      The uncore_collect_events functions assumes that event group
      might contain only uncore events which is wrong, because it
      might contain any type of events.
      
      This bug leads to uncore framework touching 'not' uncore events,
      which could end up all sorts of bugs.
      
      One was triggered by Vince's perf fuzzer, when the uncore code
      touched breakpoint event private event space as if it was uncore
      event and caused BUG:
      
         BUG: unable to handle kernel paging request at ffffffff82822068
         IP: [<ffffffff81020338>] uncore_assign_events+0x188/0x250
         ...
      
      The code in uncore_assign_events() function was looking for
      event->hw.idx data while the event was initialized as a
      breakpoint with different members in event->hw union.
      
      This patch forces uncore_collect_events() to collect only uncore
      events.
      Reported-by: default avatarVince Weaver <vince@deater.net>
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yan, Zheng <zheng.z.yan@intel.com>
      Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5f34dcba
    • Thomas Petazzoni's avatar
      ARM: mvebu: disable I/O coherency on non-SMP situations on Armada 370/375/38x/XP · 39ca76b2
      Thomas Petazzoni authored
      commit e5535545 upstream.
      
      Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
      38x and Armada XP requires a certain number of conditions:
      
       - On Armada 370, the cache policy must be set to write-allocate.
      
       - On Armada 375, 38x and XP, the cache policy must be set to
         write-allocate, the pages must be mapped with the shareable
         attribute, and the SMP bit must be set
      
      Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
      are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
      of these conditions are met. With Armada 370, the situation is worse:
      since the processor is single core, regardless of whether CONFIG_SMP
      or !CONFIG_SMP is used, the cache policy will be set to write-back by
      the kernel and not write-allocate.
      
      Since solving this problem turns out to be quite complicated, and we
      don't want to let users with a mainline kernel known to have
      infrequent but existing data corruptions, this commit proposes to
      simply disable hardware I/O coherency in situations where it is known
      not to work.
      
      And basically, the is_smp() function of the kernel tells us whether it
      is OK to enable hardware I/O coherency or not, so this commit slightly
      refactors the coherency_type() function to return
      COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
      type of the coherency fabric in the other case.
      
      Thanks to this, the I/O coherency fabric will no longer be used at all
      in !CONFIG_SMP configurations. It will continue to be used in
      CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
      (which are multiple cores processors), but will no longer be used on
      Armada 370 (which is a single core processor).
      
      In the process, it simplifies the implementation of the
      coherency_type() function, and adds a missing call to of_node_put().
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Fixes: e60304f8 ("arm: mvebu: Add hardware I/O Coherency support")
      Acked-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.comSigned-off-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      39ca76b2
    • Pavel Machek's avatar
      Revert "ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo" · f7ab6b67
      Pavel Machek authored
      commit 4bf9636c upstream.
      
      Commit 9fc2105a ("ARM: 7830/1: delay: don't bother reporting
      bogomips in /proc/cpuinfo") breaks audio in python, and probably
      elsewhere, with message
      
        FATAL: cannot locate cpu MHz in /proc/cpuinfo
      
      I'm not the first one to hit it, see for example
      
        https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/
        https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1
      
      Reading original changelog, I have to say "Stop breaking working setups.
      You know who you are!".
      Signed-off-by: default avatarPavel Machek <pavel@ucw.cz>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f7ab6b67