1. 02 Sep, 2014 4 commits
    • percpu: move region iterations out of pcpu_[de]populate_chunk() · a93ace48
      Tejun Heo authored
      Previously, pcpu_[de]populate_chunk() were called with the range which
      may contain multiple target regions in it and
      pcpu_[de]populate_chunk() iterated over the regions.  This has the
      benefit of batching up cache flushes for all the regions; however,
      we're planning to add more bookkeeping logic around [de]population to
      support atomic allocations and this delegation of iterations gets in
      the way.
      
      This patch moves the region iterations out of
      pcpu_[de]populate_chunk() into its callers - pcpu_alloc() and
      pcpu_reclaim() - so that we can later add logic to track more states
      around them.  This change may make cache and tlb flushes more frequent
      but multi-region [de]populations are rare anyway and if this actually
      becomes a problem, it's not difficult to factor out cache flushes as
      separate callbacks which are directly invoked from percpu.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: move common parts out of pcpu_[de]populate_chunk() · dca49645
      Tejun Heo authored
      percpu-vm and percpu-km implement separate versions of
      pcpu_[de]populate_chunk() and some parts which are or should be common
      are currently in the specific implementations.  Make the following
      changes.
      
      * Allocated area clearing is moved from the pcpu_populate_chunk()
        implementations to pcpu_alloc().  This makes percpu-km's version
        a noop.

      * Quick exit tests in pcpu_[de]populate_chunk() of percpu-vm are moved
        to their respective callers so that they are applied to percpu-km
        too.  This doesn't make any meaningful difference as both functions
        are noops for percpu-km; however, this is more consistent and will
        help in implementing atomic allocation support.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: remove @may_alloc from pcpu_get_pages() · cdb4cba5
      Tejun Heo authored
      pcpu_get_pages() creates the temp pages array if not already allocated
      and returns the pointer to it.  As the function is called from both
      [de]population paths and depopulation can only happen after at least
      one successful population, the param doesn't make any difference - the
      allocation will always happen on the population path anyway.
      
      Remove @may_alloc from pcpu_get_pages().  Also, add a lockdep
      assertion on pcpu_alloc_mutex instead of vaguely stating that the
      exclusion is the caller's responsibility.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: remove the usage of separate populated bitmap in percpu-vm · fbbb7f4e
      Tejun Heo authored
      percpu-vm uses pcpu_get_pages_and_bitmap() to acquire temp pages array
      and populated bitmap and uses the two during [de]population.  The temp
      bitmap is used only to build the new bitmap that is copied to
      chunk->populated after the operation succeeds; however, the new bitmap
      can be trivially set after success without using the temp bitmap.
      
      This patch removes the temp populated bitmap usage from percpu-vm.c.
      
      * pcpu_get_pages_and_bitmap() is renamed to pcpu_get_pages() and no
        longer hands out the temp bitmap.
      
      * @populated argument is dropped from all the related functions.
        @populated updates in pcpu_[un]map_pages() are dropped.
      
      * Two loops in pcpu_map_pages() are merged.
      
      * pcpu_[de]populate_chunk() modify chunk->populated bitmap directly
        from @page_start and @page_end after success.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
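The idea of updating the populated bitmap only after success can be illustrated with a small user-space sketch. This is not the percpu-vm code: `demo_bitmap_set()`, `demo_populate()` and the `fail` flag are hypothetical stand-ins, and a single 64-bit word plays the role of chunk->populated.

```c
#include <assert.h>
#include <stdint.h>

/* Set bits [start, end) in a 64-bit map; a toy stand-in for a kernel bitmap. */
static void demo_bitmap_set(uint64_t *map, int start, int end)
{
    assert(start >= 0 && start <= end && end <= 64);
    for (int i = start; i < end; i++)
        *map |= UINT64_C(1) << i;
}

/*
 * Hypothetical populate step. The real allocation and mapping work is
 * simulated by the fail flag. The populated map is written only after the
 * operation succeeded, so a failure leaves it exactly as it was and no
 * temporary copy of the bitmap is needed.
 */
static int demo_populate(uint64_t *populated, int page_start, int page_end,
                         int fail)
{
    if (fail)
        return -1;          /* bitmap untouched on failure */
    demo_bitmap_set(populated, page_start, page_end);
    return 0;
}
```

Because nothing touches the map before the success point, the error path has nothing to roll back, which is presumably what makes the temporary bitmap unnecessary.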
  2. 16 Aug, 2014 1 commit
  3. 15 Aug, 2014 4 commits
    • percpu: perform tlb flush after pcpu_map_pages() failure · 849f5169
      Tejun Heo authored
      If pcpu_map_pages() fails midway, it unmaps the already mapped pages.
      Currently, it doesn't flush tlb after the partial unmapping.  This may
      be okay in most cases as the established mapping hasn't been used at
      that point but it can go wrong and when it goes wrong it'd be
      extremely difficult to track down.
      
      Flush tlb after the partial unmapping.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
    • percpu: fix pcpu_alloc_pages() failure path · f0d27965
      Tejun Heo authored
      When pcpu_alloc_pages() fails midway, pcpu_free_pages() is invoked to
      free what has already been allocated.  The invocation is across the
      whole requested range and pcpu_free_pages() will try to free all
      non-NULL pages; unfortunately, this is incorrect as
      pcpu_get_pages_and_bitmap(), unlike what its comment suggests, doesn't
      clear the pages array and thus the array may have entries from the
      previous invocations making the partial failure path free incorrect
      pages.
      
      Fix it by open-coding the partial freeing of the already allocated
      pages.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
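The failure mode described in this commit, and the open-coded partial free that avoids it, can be sketched in user-space C. This is a simplified single-array illustration, not the kernel function: `demo_alloc_page()`, `demo_alloc_pages()` and the `fail_at` parameter are hypothetical, and `malloc()`/`free()` stand in for page allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-page allocation; fails when told to. */
static void *demo_alloc_page(int should_fail)
{
    return should_fail ? NULL : malloc(64);
}

/*
 * Fill pages[start..end). On failure, free only the entries allocated by
 * this call. Entries outside [start, i) may be stale pointers left over
 * from an earlier invocation and must not be freed.
 * fail_at is the index at which allocation is forced to fail (-1: never).
 */
static int demo_alloc_pages(void **pages, int start, int end, int fail_at)
{
    int i;

    assert(start <= end);
    for (i = start; i < end; i++) {
        pages[i] = demo_alloc_page(i == fail_at);
        if (!pages[i])
            goto err;
    }
    return 0;

err:
    /* Unwind in reverse over exactly what this call allocated. */
    while (--i >= start) {
        free(pages[i]);
        pages[i] = NULL;
    }
    return -1;
}
```

Freeing "every non-NULL entry in the requested range" would be wrong here for the same reason as in the commit message: the array is not cleared between calls, so entries past the failure point can still hold old pointers.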
    • Merge tag 'pm+acpi-3.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · c9d26423
      Linus Torvalds authored
      Pull more ACPI and power management updates from Rafael Wysocki:
       "These are a couple of regression fixes, cpuidle menu governor
        optimizations, fixes for ACPI processor and battery drivers,
        hibernation fix to avoid problems related to the e820 memory map,
        fixes for a few cpufreq drivers and a new version of the suspend
        profiling tool analyze_suspend.py.
      
        Specifics:
      
         - Fix for an ACPI-based device hotplug regression introduced in 3.14
           that causes a kernel panic to trigger when memory hot-remove is
           attempted with CONFIG_ACPI_HOTPLUG_MEMORY unset from Tang Chen
      
         - Fix for a cpufreq regression introduced in 3.16 that triggers a
           "sleeping function called from invalid context" bug in
           dev_pm_opp_init_cpufreq_table() from Stephen Boyd
      
         - ACPI battery driver fix for a warning message added in 3.16 that
           prints silly stuff sometimes from Mariusz Ceier
      
         - Hibernation fix for safer handling of mismatches in the e820 memory
           map between the configurations during image creation and during the
           subsequent restore from Chun-Yi Lee
      
         - ACPI processor driver fix to handle CPU hotplug notifications
           correctly during system suspend/resume from Lan Tianyu
      
         - Series of four cpuidle menu governor cleanups that also should
           speed it up a bit from Mel Gorman
      
         - Fixes for the speedstep-smi, integrator, cpu0 and arm_big_little
           cpufreq drivers from Hans Wennborg, Himangi Saraogi, Markus
           Pargmann and Uwe Kleine-König
      
         - Version 3.0 of the analyze_suspend.py suspend profiling tool from
           Todd E Brandt"
      
      * tag 'pm+acpi-3.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / battery: Fix warning message in acpi_battery_get_state()
        PM / tools: analyze_suspend.py: update to v3.0
        cpufreq: arm_big_little: fix module license spec
        cpufreq: speedstep-smi: fix decimal printf specifiers
        ACPI / hotplug: Check scan handlers in acpi_scan_hot_remove()
        cpufreq: OPP: Avoid sleeping while atomic
        cpufreq: cpu0: Do not print error message when deferring
        cpufreq: integrator: Use set_cpus_allowed_ptr
        PM / hibernate: avoid unsafe pages in e820 reserved regions
        ACPI / processor: Make acpi_cpu_soft_notify() process CPU FROZEN events
        cpuidle: menu: Lookup CPU runqueues less
        cpuidle: menu: Call nr_iowait_cpu less times
        cpuidle: menu: Use ktime_to_us instead of reinventing the wheel
        cpuidle: menu: Use shifts when calculating averages where possible
    • Merge tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · a11c5c9e
      Linus Torvalds authored
      Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas:
       "Part two of the PCI changes for v3.17:
      
          - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)
      
        It's a mechanical change that removes uses of the
        DEFINE_PCI_DEVICE_TABLE macro.  I waited until later in the merge
        window to reduce conflicts, but it's possible you'll still see a few"
      
      * tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
  4. 14 Aug, 2014 31 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 179c0ac6
      Linus Torvalds authored
      Pull Sparc fixes from David Miller:
       "Hook up the memfd syscall, and properly claim all PCI resources
        discovered when building the PCI device tree"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Hook up memfd_create system call.
        sparc64: Properly claim resources as each PCI bus is probed.
        sparc64: Skip bogus PCI bridge ranges.
        sparc64: Expand PCI bridge probing debug logging.
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ad15afb8
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "I'm sending this out, in particular, to get the iwlwifi fix
        propagated:
      
         1) Fix build due to missing include in i40e driver, from Lucas
            Tanure.
      
         2) Memory leak in openvswitch port allocation, from Christoph Jaeger.
      
         3) Check DMA mapping errors in myri10ge, from Stanislaw Gruszka.
      
         4) Fix various deadlock scenarios in sunvnet driver, from Sowmini
            Varadhan.
      
         5) Fix cxgb4i build failures with incompatible Kconfig settings of
            the driver vs ipv6, from Anish Bhatt.
      
         6) Fix generation of ACK packet timestamps in the presence of TSO
            which will be split up, from Willem de Bruijn.
      
         7) Don't enable sched scan in iwlwifi driver, it causes firmware
            crashes in some revisions.  From Emmanuel Grumbach.
      
         8) Revert a macvlan simplification that causes crashes.
      
         9) Handle RTT calculations properly in the presence of repair'd SKBs,
            from Andrey Vagin.
      
        10) SIT tunnel lookup uses wrong device index in compares, from
            Shmulik Ladkani.
      
        11) Handle MTU reductions in TCP properly for ipv4 mapped ipv6
            sockets, from Neal Cardwell.
      
        12) Add missing annotations in rhashtable code, from Thomas Graf.
      
        13) Fix false interpretation of two RTOs as being from the same TCP
            loss event in the FRTO code, from Neal Cardwell"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
        netlink: Annotate RCU locking for seq_file walker
        rhashtable: fix annotations for rht_for_each_entry_rcu()
        rhashtable: unexport and make rht_obj() static
        rhashtable: RCU annotations for next pointers
        tcp: fix ssthresh and undo for consecutive short FRTO episodes
        tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic
        tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced()
        sit: Fix ipip6_tunnel_lookup device matching criteria
        net: ethernet: ibm: ehea: Remove duplicate object from Makefile
        net: xgene: Check negative return value of xgene_enet_get_ring_size()
        tcp: don't use timestamp from repaired skb-s to calculate RTT (v2)
        net: xilinx: Remove .owner field for driver
        Revert "macvlan: simplify the structure port"
        iwlwifi: mvm: disable scheduled scan to prevent firmware crash
        xen-netback: remove loop waiting function
        xen-netback: don't stop dealloc kthread too early
        xen-netback: move NAPI add/remove calls
        xen-netback: fix debugfs entry creation
        xen-netback: fix debugfs write length check
        net-timestamp: fix missing tcp fragmentation cases
        ...
    • Merge tag 'master-2014-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · a61ebdfd
      David S. Miller authored
      John W. Linville says:
      
      ====================
      pull request: wireless 2014-08-14
      
      Please pull this batch of fixes intended for the 3.17 stream...
      
      Arend van Spriel brings two brcmfmac fixes, one which fixes a memory
      leak and one which corrects some merge damage.
      
      Emmanuel Grumbach fixes Linus's iwlwifi firmware-related log spam.
      
      Rickard Strandqvist does some proper NULL termination after a call
      to strncpy.
      
      Ronald Wahl corrects a carl9170 problem with sending URBs with the
      wrong endpoint type (resulting in log spam).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
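The pull request above mentions adding proper termination after a call to strncpy. As a general illustration of why that matters (not the actual wireless patch), here is a minimal sketch; `demo_copy_name()` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy src into dst, truncating if needed. strncpy() does not write a
 * terminating NUL when src is at least as long as the count, so the last
 * byte of the buffer is set explicitly.
 */
static void demo_copy_name(char *dst, size_t dst_len, const char *src)
{
    assert(dst_len > 0);
    strncpy(dst, src, dst_len - 1);
    dst[dst_len - 1] = '\0';
}
```

Without the explicit terminator, a later `strlen()` or `printf("%s")` on the destination could read past the end of the buffer whenever the source was too long.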
    • netlink: Annotate RCU locking for seq_file walker · 9ce12eb1
      Thomas Graf authored
      Silences the following sparse warnings:
      net/netlink/af_netlink.c:2926:21: warning: context imbalance in 'netlink_seq_start' - wrong count at exit
      net/netlink/af_netlink.c:2972:13: warning: context imbalance in 'netlink_seq_stop' - unexpected unlock
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: fix annotations for rht_for_each_entry_rcu() · 93f56081
      Thomas Graf authored
      Call rcu_dereference_raw() directly from within rht_for_each_entry_rcu()
      as list_for_each_entry_rcu() does.
      
      Fixes the following sparse warnings:
      net/netlink/af_netlink.c:2906:25:    expected struct rhash_head const *__mptr
      net/netlink/af_netlink.c:2906:25:    got struct rhash_head [noderef] <asn:4>*<noident>
      
      Fixes: e341694e ("netlink: Convert netlink_lookup() to use RCU protected hash table")
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: unexport and make rht_obj() static · c91eee56
      Thomas Graf authored
      No need to export rht_obj(), all inner to outer object translations
      occur internally. It was intended to be used with rht_for_each() which
      now primarily serves as the iterator for rhashtable_remove_pprev() to
      effectively flush and free the full table.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: RCU annotations for next pointers · 5300fdcb
      Thomas Graf authored
      Properly annotate next pointers as access is RCU protected in
      the lookup path.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix ssthresh and undo for consecutive short FRTO episodes · 0c9ab092
      Neal Cardwell authored
      Fix TCP FRTO logic so that it always notices when snd_una advances,
      indicating that any RTO after that point will be a new and distinct
      loss episode.
      
      Previously there was a very specific sequence that could cause FRTO to
      fail to notice a new loss episode had started:
      
      (1) RTO timer fires, enter FRTO and retransmit packet 1 in write queue
      (2) receiver ACKs packet 1
      (3) FRTO sends 2 more packets
      (4) RTO timer fires again (should start a new loss episode)
      
      The problem was in step (3) above, where tcp_process_loss() returned
      early (in the spot marked "Step 2.b"), so that it never got to the
      logic to clear icsk_retransmits. Thus icsk_retransmits stayed
      non-zero. Thus in step (4) tcp_enter_loss() would see the non-zero
      icsk_retransmits, decide that this RTO is not a new episode, and
      therefore neither cut ssthresh nor remember the current cwnd and
      ssthresh for undo.
      
      There were two main consequences to the bug that we have
      observed. First, ssthresh was not decreased in step (4). Second, when
      there was a series of such FRTO (1-4) sequences that happened to be
      followed by an FRTO undo, we would restore the cwnd and ssthresh from
      before the entire series started (instead of the cwnd and ssthresh
      from before the most recent RTO). This could result in cwnd and
      ssthresh being restored to values much bigger than the proper values.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Fixes: e33099f9 ("tcp: implement RFC5682 F-RTO")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic · a26552af
      Hannes Frederic Sowa authored
      tcp_tw_recycle heavily relies on tcp timestamps to build a per-host
      ordering of incoming connections and teardowns without the need to
      hold state on a specific quadruple for TCP_TIMEWAIT_LEN, but only for
      the last measured RTO. To do so, we keep the last seen timestamp in a
      per-host indexed data structure and verify if the incoming timestamp
      in a connection request is strictly greater than the saved one during
      last connection teardown. Thus we can verify later on that no old data
      packets will be accepted by the new connection.
      
      When moving a socket to the time-wait state, we already verify whether
      timestamps were seen on the connection. Only if that was the case do we
      let the time-wait socket expire after the RTO; otherwise the normal
      TCP_TIMEWAIT_LEN is used. But we don't verify this on incoming SYN
      packets. If a connection teardown happened less than TCP_PAWS_MSL seconds
      ago, we cannot guarantee that data packets from an old connection will
      not be accepted if no timestamps are present. We should drop such a SYN
      packet. This patch closes this loophole.
      
      Please note, this patch does not make tcp_tw_recycle in any way more
      usable but only adds another safety check:
      Sporadic drops of SYN packets because of reordering in the network or
      in the socket backlog queues can happen. Users behind NAT trying to
      connect to a tcp_tw_recycle enabled server can get caught in blackholes
      and their connection requests may regularly get dropped because hosts
      behind an address translator don't have synchronized tcp timestamp clocks.
      tcp_tw_recycle cannot work if peers don't have tcp timestamps enabled.

      In general, use of tcp_tw_recycle is discouraged.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced() · 4fab9071
      Neal Cardwell authored
      Make sure we use the correct address-family-specific function for
      handling MTU reductions from within tcp_release_cb().
      
      Previously AF_INET6 sockets were incorrectly always using the IPv6
      code path when sometimes they were handling IPv4 traffic and thus had
      an IPv4 dst.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Willem de Bruijn <willemb@google.com>
      Fixes: 563d34d0 ("tcp: dont drop MTU reduction indications")
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sit: Fix ipip6_tunnel_lookup device matching criteria · bc8fc7b8
      Shmulik Ladkani authored
      As of 4fddbf5d ("sit: strictly restrict incoming traffic to tunnel link device"),
      when looking up a tunnel, tunnel's underlying interface (t->parms.link)
      is verified to match incoming traffic's ingress device.
      
      However the comparison was incorrectly based on skb->dev->iflink.
      
      Instead, dev->ifindex should be used, which correctly represents the
      interface from which the IP stack hands the ipip6 packets.
      
      This allows setting up sit tunnels bound to vlan interfaces (otherwise
      incoming ipip6 traffic on the vlan interface was dropped due to
      ipip6_tunnel_lookup match failure).
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ibm: ehea: Remove duplicate object from Makefile · 3b3e0ea8
      Andreas Ruprecht authored
      In the Makefile, ehea_phyp.o is included twice in the list of
      object files compiled into ehea.o.
      
      This change removes one instance.
      Signed-off-by: Andreas Ruprecht <rupran@einserver.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xgene: Check negative return value of xgene_enet_get_ring_size() · 9b9ba821
      Tobias Klauser authored
      xgene_enet_get_ring_size() returns a negative value in case of an error,
      but its only caller in xgene_enet_create_desc_ring() currently uses the
      return value directly as u32. Instead, check for a negative value first and
      error out in that case. Also move the call to xgene_enet_get_ring_size() before
      devm_kzalloc() so we don't need to free anything in the error path.
      
      This fixes the following issue reported by the Coverity Scanner:
      
      ** CID 1231336:  Improper use of negative value  (NEGATIVE_RETURNS)
      /drivers/net/ethernet/apm/xgene/xgene_enet_main.c: 596 in xgene_enet_create_desc_ring()
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
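The pattern this commit describes, checking a signed return value for an error before storing it in an unsigned variable, can be sketched as follows. The functions, the configuration ids and the sizes are hypothetical and only illustrate the pattern; they are not taken from the xgene driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical lookup: returns a size in bytes, or -EINVAL for an unknown id. */
static int demo_get_ring_size(int cfg)
{
    switch (cfg) {
    case 0:
        return 512;
    case 1:
        return 2048;
    default:
        return -EINVAL;
    }
}

/*
 * Keep the result in a signed int until it has been checked. Assigning a
 * negative errno directly to a uint32_t would turn it into a huge size.
 * The check also happens before any allocation, so the error path has
 * nothing to free.
 */
static int demo_create_ring(int cfg, uint32_t *size_out)
{
    int size = demo_get_ring_size(cfg);

    if (size < 0)
        return size;
    *size_out = (uint32_t)size;
    return 0;
}
```

For instance, -EINVAL stored straight into a 32-bit unsigned variable would read back as a value close to 4 GiB, which is the kind of misuse the Coverity report flags.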
    • tcp: don't use timestamp from repaired skb-s to calculate RTT (v2) · 9d186cac
      Andrey Vagin authored
      We don't know the right timestamp for repaired skb-s. Wrong RTT estimations
      aren't good, because some congestion modules heavily depend on them.
      
      This patch adds the TCPCB_REPAIRED flag, which is included in
      TCPCB_RETRANS.
      
      Thanks to Eric for the advice on how to fix this issue.
      
      This patch fixes the warning:
      [  879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
      [  879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
      [  879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  879.568177]  0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
      [  879.568776]  0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
      [  879.569386]  ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
      [  879.569982] Call Trace:
      [  879.570264]  [<ffffffff817aa2d2>] dump_stack+0x4d/0x66
      [  879.570599]  [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0
      [  879.570935]  [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20
      [  879.571292]  [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380
      [  879.571614]  [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710
      [  879.571958]  [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370
      [  879.572315]  [<ffffffff81657459>] release_sock+0x89/0x1d0
      [  879.572642]  [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860
      [  879.573000]  [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80
      [  879.573352]  [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40
      [  879.573678]  [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20
      [  879.574031]  [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0
      [  879.574393]  [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b
      [  879.574730] ---[ end trace a17cbc38eb8c5c00 ]---
      
      v2: moving setting of skb->when for repaired skb-s in tcp_write_xmit,
          where it's set for other skb-s.
      
      Fixes: 431a9124 ("tcp: timestamp SYN+DATA messages")
      Fixes: 740b0f18 ("tcp: switch rtt estimations to usec resolution")
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xilinx: Remove .owner field for driver · fdd42e44
      Michal Simek authored
      There is no need to init .owner field.
      
      Based on the patch from Peter Griffin <peter.griffin@linaro.org>
      "mmc: remove .owner field for drivers using module_platform_driver"
      
      "This patch removes the superfluous .owner field for drivers which
      use the module_platform_driver API, as this is overridden in
      platform_driver_register anyway."
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "macvlan: simplify the structure port" · 5e3c516b
      David S. Miller authored
      This reverts commit a188a54d.
      
      It causes crashes
      
      ====================
      [   80.643286] BUG: unable to handle kernel NULL pointer dereference at 0000000000000878
      [   80.670103] IP: [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.691289] PGD 22c102067 PUD 235bf0067 PMD 0
      [   80.706611] Oops: 0002 [#1] SMP
      [   80.717836] Modules linked in: macvlan nfsd lockd nfs_acl exportfs auth_rpcgss sunrpc oid_registry ioatdma ixgbe(-) mdio igb dca
      [   80.757935] CPU: 37 PID: 6724 Comm: rmmod Not tainted 3.16.0-net-next-08-12-2014-FCoE+ #1
      [   80.785688] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
      [   80.820310] task: ffff880235a9eae0 ti: ffff88022e844000 task.ti: ffff88022e844000
      [   80.845770] RIP: 0010:[<ffffffff810832e4>]  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.875326] RSP: 0018:ffff88022e847b28  EFLAGS: 00010046
      [   80.893251] RAX: 0000000000037a6a RBX: 0000000000000878 RCX: 0000000000000000
      [   80.917187] RDX: ffff880235a9eae0 RSI: 0000000000000001 RDI: ffffffff810832db
      [   80.941125] RBP: ffff88022e847b58 R08: 0000000000000000 R09: 0000000000000000
      [   80.965056] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022e847b70
      [   80.988994] R13: 0000000000000000 R14: ffff88022e847be8 R15: ffffffff81ebe440
      [   81.012929] FS:  00007fab90b07700(0000) GS:ffff88043f7a0000(0000) knlGS:0000000000000000
      [   81.040400] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   81.059757] CR2: 0000000000000878 CR3: 0000000235a42000 CR4: 00000000001407e0
      [   81.083689] Stack:
      [   81.090739]  ffff880235a9eae0 0000000000000878 ffff88022e847b70 0000000000000000
      [   81.116253]  ffff88022e847be8 ffffffff81ebe440 ffff88022e847b98 ffffffff810847f1
      [   81.141766]  ffff88022e847b78 0000000000000286 ffff880234200000 0000000000000000
      [   81.167282] Call Trace:
      [   81.175768]  [<ffffffff810847f1>] __cancel_work_timer+0x31/0x170
      [   81.195985]  [<ffffffff8108494b>] cancel_work_sync+0xb/0x10
      [   81.214769]  [<ffffffffa015ae68>] macvlan_port_destroy+0x28/0x60 [macvlan]
      [   81.237844]  [<ffffffffa015b930>] macvlan_uninit+0x40/0x50 [macvlan]
      [   81.259209]  [<ffffffff816bf6e2>] rollback_registered_many+0x1a2/0x2c0
      [   81.281140]  [<ffffffff816bf81a>] unregister_netdevice_many+0x1a/0xb0
      [   81.302786]  [<ffffffffa015a4ff>] macvlan_device_event+0x1ef/0x240 [macvlan]
      [   81.326439]  [<ffffffff8108a13d>] notifier_call_chain+0x4d/0x70
      [   81.346366]  [<ffffffff8108a201>] raw_notifier_call_chain+0x11/0x20
      [   81.367439]  [<ffffffff816bf25b>] call_netdevice_notifiers_info+0x3b/0x70
      [   81.390228]  [<ffffffff816bf2a1>] call_netdevice_notifiers+0x11/0x20
      [   81.411587]  [<ffffffff816bf6bd>] rollback_registered_many+0x17d/0x2c0
      [   81.433518]  [<ffffffff816bf925>] unregister_netdevice_queue+0x75/0x110
      [   81.455735]  [<ffffffff816bfb2b>] unregister_netdev+0x1b/0x30
      [   81.475094]  [<ffffffffa0039b50>] ixgbe_remove+0x170/0x1d0 [ixgbe]
      [   81.495886]  [<ffffffff813512a2>] pci_device_remove+0x32/0x60
      [   81.515246]  [<ffffffff814c75c4>] __device_release_driver+0x64/0xd0
      [   81.536321]  [<ffffffff814c76f8>] driver_detach+0xc8/0xd0
      [   81.554530]  [<ffffffff814c656e>] bus_remove_driver+0x4e/0xa0
      [   81.573888]  [<ffffffff814c828b>] driver_unregister+0x2b/0x60
      [   81.593246]  [<ffffffff8135143e>] pci_unregister_driver+0x1e/0xa0
      [   81.613749]  [<ffffffffa005db18>] ixgbe_exit_module+0x1c/0x2e [ixgbe]
      [   81.635401]  [<ffffffff810e738b>] SyS_delete_module+0x15b/0x1e0
      [   81.655334]  [<ffffffff8187a395>] ? sysret_check+0x22/0x5d
      [   81.673833]  [<ffffffff810abd2d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
      [   81.696339]  [<ffffffff8132bfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [   81.717985]  [<ffffffff8187a369>] system_call_fastpath+0x16/0x1b
      [   81.738199] Code: 00 48 83 3d 6e bb da 00 00 48 89 c2 0f 84 67 01 00 00 fa 66 0f 1f 44 00 00 49 89 14 24 e8 b5 4b 02 00 45 84 ed 0f 85 ac 00 00 00 <f0> 0f ba 2b 00 72 1d 31 c0 48 8b 5d d8 4c 8b 65 e0 4c 8b 6d e8
      [   81.807026] RIP  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   81.828468]  RSP <ffff88022e847b28>
      [   81.840384] CR2: 0000000000000878
      [   81.851731] ---[ end trace 9f6c7232e3464e11 ]---
      ====================
      
      This bug could be triggered by these steps:
      
      modprobe ixgbe ; modprobe macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:00 macvlan0 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:01 macvlan1 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:02 macvlan2 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:03 macvlan3 type macvlan
      rmmod ixgbe
      Reported-by: "Keller, Jacob E" <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 899552d6
      Linus Torvalds authored
      Pull misc kbuild updates from Michal Marek:
       "This is the non-critical part of kbuild for 3.17-rc1:
      
         - make help hint to use make -s with make kernelrelease et al.
         - moved a kbuild document to Documentation/kbuild where it belongs
         - four new Coccinelle scripts, one dropped and one fixed
         - new make kselftest target to run various tests on the kernel"
      
      * 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kbuild: kselftest - new make target to build and run kernel selftests
        Coccinelle: Script to replace if and BUG with BUG_ON
        Coccinelle: Script to detect incorrect argument to sizeof
        Coccinelle: Script to use ARRAY_SIZE instead of division of two sizeofs
        Coccinelle: Script to detect cast after memory allocation
        coccinelle/null: solve parse error
        Documentation: headers_install.txt is part of kbuild
        kbuild: make -s should be used with kernelrelease/kernelversion/image_name
      899552d6
    • Linus Torvalds's avatar
      Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 3b7b3e6e
      Linus Torvalds authored
      Pull kbuild updates from Michal Marek:
       - make clean also considers $(extra-m) and $(extra-) to be consistent
       - cleanup and fixes in scripts/Makefile.host
       - allow to override the name of the Python 2 executable with make
         PYTHON=... (only needed for ia64 in practice)
       - option to split debuginfo into *.dwo files to save disk space if the
         compiler supports it (CONFIG_DEBUG_INFO_SPLIT)
       - option to use dwarf4 debuginfo if the compiler supports it
         (CONFIG_DEBUG_INFO_DWARF4)
       - fix for disabling certain warnings with clang
       - fix for unneeded rebuild with dash when a command contains
         backslashes
      
      * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kbuild: Fix handling of backslashes in *.cmd files
        kbuild, LLVMLinux: Supress warnings unless W=1-3
        Kbuild: Add a option to enable dwarf4 v2
        kbuild: Support split debug info v4
        kbuild: allow to override Python command name
        kbuild: clean-up and bug fix of scripts/Makefile.host
        kbuild: clean up scripts/Makefile.host
        kbuild: drop shared library support from Makefile.host
        kbuild: fix a bug of C++ host program handling
        kbuild: fix a typo in scripts/Makefile.host
        scripts/Makefile.clean: clean also $(extra-m) and $(extra-)
      3b7b3e6e
    • Linus Torvalds's avatar
      Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · e3b1fd56
      Linus Torvalds authored
      Pull infiniband/rdma updates from Roland Dreier:
       "Main set of InfiniBand/RDMA updates for 3.17 merge window:
      
         - MR reregistration support
         - MAD support for RMPP in userspace
         - iSER and SRP initiator updates
         - ocrdma hardware driver updates
         - other fixes..."
      
      * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
        IB/srp: Fix return value check in srp_init_module()
        RDMA/ocrdma: report asic-id in query device
        RDMA/ocrdma: Update sli data structure for endianness
        RDMA/ocrdma: Obtain SL from device structure
        RDMA/uapi: Include socket.h in rdma_user_cm.h
        IB/srpt: Handle GID change events
        IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0]
        IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0]
        RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf()
        IPoIB: Remove unnecessary test for NULL before debugfs_remove()
        IB/mad: Add user space RMPP support
        IB/mad: add new ioctl to ABI to support new registration options
        IB/mad: Add dev_notice messages for various umad/mad registration failures
        IB/mad: Update module to [pr|dev]_* style print messages
        IB/ipoib: Avoid multicast join attempts with invalid P_key
        IB/umad: Update module to [pr|dev]_* style print messages
        IB/ipoib: Avoid flushing the workqueue from worker context
        IB/ipoib: Use P_Key change event instead of P_Key polling mechanism
        IB/ipath: Add P_Key change event support
        mlx4_core: Add support for secure-host and SMP firewall
        ...
      e3b1fd56
    • John Stultz's avatar
      timekeeping: Another fix to the VSYSCALL_OLD update_vsyscall · 0680eb1f
      John Stultz authored
      Benjamin Herrenschmidt pointed out that I further missed modifying
      update_vsyscall after the wall_to_mono value was changed to a
      timespec64.  This causes issues on powerpc32, which expects a 32bit
      timespec.
      
      This patch fixes the problem by properly converting from a timespec64 to
      a timespec before passing the value on to the arch-specific vsyscall
      logic.
      
      [ Thomas is currently on vacation, but reviewed it and wanted me to send
        this fix on to you directly. ]
      
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0680eb1f
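The conversion described in the timekeeping fix above can be illustrated with a small userspace sketch. The struct and function names here (`ts64`, `ts32`, `ts64_to_ts32`) are hypothetical stand-ins for the kernel's `timespec64` and a 32-bit `timespec`; this is not the actual kernel code, only an illustration of narrowing the seconds field before handing the value to code that expects the smaller layout.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit-seconds timespec. */
struct ts64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* Hypothetical stand-in for the timespec seen by a 32-bit architecture. */
struct ts32 {
    int32_t tv_sec;
    long    tv_nsec;
};

/*
 * Convert field by field instead of passing the 64-bit structure to code
 * that expects the 32-bit layout. The sketch assumes tv_sec fits in 32
 * bits; narrowing an out-of-range value is implementation-defined in C.
 */
static struct ts32 ts64_to_ts32(struct ts64 in)
{
    struct ts32 out;

    out.tv_sec  = (int32_t)in.tv_sec;
    out.tv_nsec = in.tv_nsec;
    return out;
}
```

Converting explicitly keeps the consumer independent of the wider structure's size and field offsets, which is the kind of mismatch the commit message attributes to powerpc32.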
    • Linus Torvalds's avatar
      Merge branch 'akpm' (fixes from Andrew Morton) · f2937e45
      Linus Torvalds authored
      Merge leftovers from Andrew Morton:
       "A few leftovers.
      
        I have a bunch of OCFS2 patches which are still out for review and
        which I might sneak along after -rc1.  Partly my fault - I should send
        my review pokes out earlier"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: fix CROSS_MEMORY_ATTACH help text grammar
        drivers/mfd/rtsx_usb.c: export device table
        mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size
      f2937e45
    • Jeff Mahoney's avatar
      drivers/mfd/rtsx_usb.c: export device table · 18139089
      Jeff Mahoney authored
      The rtsx_usb driver contains the table for the devices it supports but
      doesn't export it.  As a result, no alias is generated and it doesn't
      get loaded automatically.
      
      Via https://bugzilla.novell.com/show_bug.cgi?id=890096

      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reported-by: Marcel Witte <wittemar@googlemail.com>
      Cc: Roger Tseng <rogerable@realtek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      18139089
    • David Rientjes's avatar
      mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size · 24d7cd20
      David Rientjes authored
      Memcg aligns memory.limit_in_bytes to PAGE_SIZE as part of the resource
      counter since it makes no sense to allow a partial page to be charged.
      
      As a result of the hugetlb cgroup using the resource counter, it is also
      aligned to PAGE_SIZE but makes no sense unless aligned to the size of
      the hugepage being limited.
      
      Align hugetlb cgroup limit to hugepage size.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      24d7cd20
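The alignment described in the hugetlb cgroup commit above can be sketched in userspace C. This is a minimal illustration, assuming round-up semantics like the kernel's `ALIGN()` macro and a huge page size expressed as a power-of-two shift; `align_up` and `hugetlb_align_limit` are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Round val up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t val, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (val + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: align a byte limit to a huge page of
 * (1 << huge_page_shift) bytes, e.g. shift 21 for 2 MiB pages.
 */
static uint64_t hugetlb_align_limit(uint64_t limit, unsigned int huge_page_shift)
{
    return align_up(limit, UINT64_C(1) << huge_page_shift);
}
```

With a shift of 21, a requested limit of 3 MiB becomes 4 MiB, while a limit that is already a multiple of 2 MiB is left unchanged. Aligning only to a 4 KiB base page would instead leave a limit that can never be fully charged in whole huge pages.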
    • Emmanuel Grumbach's avatar
      iwlwifi: mvm: disable scheduled scan to prevent firmware crash · 77b2f286
      Emmanuel Grumbach authored
      There are firmwares which don't support scheduled scan.
      Disable it for now.
      Linus's system encountered this issue.
      Thanks to David Spinadel for his help.
      Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      77b2f286
    • Linus Torvalds's avatar
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 1d508f8a
      Linus Torvalds authored
      Pull more powerpc updates from Ben Herrenschmidt:
       "Here are some more powerpc bits for 3.17, essentially fixes.
      
        The biggest series, also aimed at -stable, is from Aneesh and is the
        result of weeks and weeks of debugging to find out why the heck our THP
        implementation was occasionally triggering multi-hit errors in our
        level 1 TLB.  It ended up being a combination of issues including
        subtleties as to how we should invalidate those special 'MPSS' pages
        we use to allow the use of 16M pages inside 4K/64K "base page size"
        segments (you really have to love our MMU !)
      
        Another interesting one in the "OMG" category is the series from
        Michael adding memory barriers to spin_is_locked().  That's also the
        result of many days of debugging to figure out why the semaphore code
        would occasionally crash in ways that made no sense.  It ended up
        being some creative lock stacking that was defeated by the fact that
        our locks allow a load inside the locked section to be re-ordered with
        the load of the lock value itself (I'm still of two minds about whether
        to kill that once and for all by putting a heavier barrier back into
        our lock implementation...).  The fixes come with a long explanation
        in the cset comments, feel free to read it if you feel like having a
        headache today"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
        powerpc/thp: Add tracepoints to track hugepage invalidate
        powerpc/mm: Use read barrier when creating real_pte
        powerpc/thp: Use ACCESS_ONCE when loading pmdp
        powerpc/thp: Invalidate with vpn in loop
        powerpc/thp: Handle combo pages in invalidate
        powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
        powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
        powerpc/thp: Add write barrier after updating the valid bit
        powerpc: reorder per-cpu NUMA information's initialization
        powerpc/perf/hv-24x7: Use kmem_cache_free
        powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
        powerpc: Hard disable interrupts in xmon
        powerpc: remove duplicate definition of TEXASR_FS
        powerpc/pseries: Avoid deadlock on removing ddw
        powerpc/pseries: Failure on removing device node
        powerpc/boot: Use correct zlib types for comparison
        powerpc/powernv: Interface to register/unregister opal dump region
        printk: Add function to return log buffer address and size
        powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS
        powerpc/ppc476: Disable BTAC
        ...
      1d508f8a
    • Linus Torvalds's avatar
      Merge tag 'hwspinlock-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock · 2d0c05e1
      Linus Torvalds authored
      Pull hwspinlock updates from Ohad Ben-Cohen:
       "Two small hwspinlock changes for better OMAP support, coming from
        Suman Anna"
      
      * tag 'hwspinlock-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
        hwspinlock: enable OMAP build for AM33xx, AM43xx & DRA7xx
        hwspinlock/omap: enable module before reading SYSSTATUS register
      2d0c05e1
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 311bf6d1
      Linus Torvalds authored
      Pull seccomp fix from James Morris.
      
      BUG_ON(!spin_is_locked()) really doesn't work very well in UP
      configurations without any actual spinlock state, which is very much
      why we have the assert_spin_locked() function for this.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock
      311bf6d1
    • Roland Dreier's avatar
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc',... · d087f6ad
      Roland Dreier authored
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc', 'mlx4', 'mlx5', 'ocrdma' and 'srp' into for-next
      d087f6ad
    • Wei Yongjun's avatar
      IB/srp: Fix return value check in srp_init_module() · da05be29
      Wei Yongjun authored
      In case of error, the function create_workqueue() returns NULL pointer
      not ERR_PTR().  The IS_ERR() test in the return value check should be
      replaced with NULL test.
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Acked-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      da05be29
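The return-value issue in the IB/srp commit above can be reproduced in a userspace sketch. Everything here is an imitation for illustration: `is_err` mimics the kernel's `IS_ERR()` range test, and `fake_create_workqueue`, `init_buggy` and `init_fixed` are hypothetical names that do not appear in the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/*
 * Userspace imitation of IS_ERR(): true only for pointer values in the
 * last MAX_ERRNO bytes of the address space, which is where ERR_PTR()
 * encodes negative errno values. NULL is not in that range.
 */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that, like create_workqueue(), fails with NULL. */
static void *fake_create_workqueue(int fail)
{
    static int dummy;

    return fail ? NULL : &dummy;
}

/* Buggy check: a NULL failure slips through and 0 (success) is returned. */
static int init_buggy(int fail)
{
    void *wq = fake_create_workqueue(fail);

    if (is_err(wq))
        return -1;
    return 0;
}

/* Fixed check: test for NULL and report -ENOMEM (-12 on Linux). */
static int init_fixed(int fail)
{
    void *wq = fake_create_workqueue(fail);

    if (!wq)
        return -12;
    return 0;
}
```

When the allocation fails, `init_buggy` still reports success because NULL does not fall in the error-pointer range, whereas `init_fixed` returns an error. The check has to match the failure convention of the function being called.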
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 82f05a08
      Linus Torvalds authored
      Pull hwmon fixes from Guenter Roeck:
       "Several bug fixes in various drivers, plus a minor cleanup in the
        tmp103 driver"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (tmp103) Remove duplicate test for I2C_FUNC_SMBUS_BYTE_DATA functionality
        hwmon: (w83793) Fix vrm write operation
        hwmon: (w83791d) Fix vrm write operation
        hwmon: (w83627hf) Fix vrm write operation
        hwmon: (vt1211) Fix vrm write operation
        hwmon: (pc87360) Fix vrm write operation
        hwmon: (lm87) Fix vrm write operation
        hwmon: (asb100) Fix vrm write operation
        hwmon: (adm1026) Fix vrm write operation
        hwmon: (adm1025) Fix vrm write operation
        hwmon: (hih6130) Fix missing hih6130->write_length setting
        hwmon: (dme1737) Prevent overflow problem when writing large limits
        hwmon: (emc6w201) Fix temperature limit range
        hwmon: (ads1015) Fix out-of-bounds array access
        hwmon: (lm92) Prevent overflow problem when writing large limits
      82f05a08