1. 28 Nov, 2013 31 commits
    • Hannes Frederic Sowa's avatar
      inet: fix possible memory corruption with UDP_CORK and UFO · 5124ae99
      Hannes Frederic Sowa authored
      [ This is a simplified -stable version of a set of upstream commits. ]
      
      This is a replacement patch only for stable which does fix the problems
      handled by the following two commits in -net:
      
      "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d74)
      "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf5)
      
      Three frames are written on a corked udp socket for which the output
      netdevice has UFO enabled.  If the first and third frame are smaller than
      the mtu and the second one is bigger, we enqueue the second frame with
      skb_append_datato_frags without initializing the gso fields. This leads
      to the third frame appended regulary and thus constructing an invalid skb.
      
      This fixes the problem by always using skb_append_datato_frags as soon
      as the first frag got enqueued to the skb without marking the packet
      as SKB_GSO_UDP.
      
      The problem with only two frames for ipv6 was fixed by "ipv6: udp
      packets following an UFO enqueued packet need also be handled by UFO"
      (2811ebac).
      
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5124ae99
    • Markus Trippelsdorf's avatar
      perf tools: Fix getrusage() related build failure on glibc trunk · a39639b4
      Markus Trippelsdorf authored
      commit 7b78f136 upstream.
      
      On a system running glibc trunk perf doesn't build:
      
          CC builtin-sched.o
      builtin-sched.c: In function ‘get_cpu_usage_nsec_parent’: builtin-sched.c:399:16: error: storage size of ‘ru’ isn’t known builtin-sched.c:403:2: error: implicit declaration of function ‘getrusage’ [-Werror=implicit-function-declaration]
          [...]
      
      Fix it by including sys/resource.h.
      Signed-off-by: default avatarMarkus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20120404084527.GA294@x4Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a39639b4
    • Wei Liu's avatar
      xen-netback: use jiffies_64 value to calculate credit timeout · 5e14dc95
      Wei Liu authored
      [ Upstream commit 059dfa6a ]
      
      time_after_eq() only works if the delta is < MAX_ULONG/2.
      
      For a 32bit Dom0, if netfront sends packets at a very low rate, the time
      between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2
      and the test for timer_after_eq() will be incorrect. Credit will not be
      replenished and the guest may become unable to send packets (e.g., if
      prior to the long gap, all credit was exhausted).
      
      Use jiffies_64 variant to mitigate this problem for 32bit Dom0.
      Suggested-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarWei Liu <wei.liu2@citrix.com>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Jason Luan <jianhai.luan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5e14dc95
    • Peter Zijlstra's avatar
      perf: Fix perf ring buffer memory ordering · d08b0a55
      Peter Zijlstra authored
      commit bf378d34 upstream.
      
      The PPC64 people noticed a missing memory barrier and crufty old
      comments in the perf ring buffer code. So update all the comments and
      add the missing barrier.
      
      When the architecture implements local_t using atomic_long_t there
      will be double barriers issued; but short of introducing more
      conditional barrier primitives this is the best we can do.
      Reported-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Tested-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: michael@ellerman.id.au
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: anton@samba.org
      Cc: benh@kernel.crashing.org
      Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lanSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d08b0a55
    • Sergey Senozhatsky's avatar
      zram: allow request end to coincide with disksize · 498a727b
      Sergey Senozhatsky authored
      commit 75c7caf5 upstream.
      
      Pass valid_io_request() checks if request end coincides with disksize
      (end equals bound), only fail if we attempt to read beyond the bound.
      
      mkfs.ext2 produces numerous errors:
      [ 2164.632747] quiet_error: 1 callbacks suppressed
      [ 2164.633260] Buffer I/O error on device zram0, logical block 153599
      [ 2164.633265] lost page write due to I/O error on zram0
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      498a727b
    • Eric Sandeen's avatar
      ext3: return 32/64-bit dir name hash according to usage type · 3b712f13
      Eric Sandeen authored
      commit d7dab39b upstream.
      
      This is based on commit d1f5273e
      ext4: return 32/64-bit dir name hash according to usage type
      by Fan Yong <yong.fan@whamcloud.com>
      
      Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
      to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
      and telldir().  However, this causes problems if there are 32-bit hash
      collisions, since the NFSv2 server can get stuck resending the same
      entries from the directory repeatedly.
      
      Allow ext3 to return a full 64-bit hash (both major and minor) for
      telldir to decrease the chance of hash collisions.
      
      This patch does implement a new ext3_dir_llseek op, because with 64-bit
      hashes, nfs will attempt to seek to a hash "offset" which is much
      larger than ext3's s_maxbytes.  So for dx dirs, we call
      generic_file_llseek_size() with the appropriate max hash value as the
      maximum seekable size.  Otherwise we just pass through to
      generic_file_llseek().
      Patch-updated-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Patch-updated-by: default avatarEric Sandeen <sandeen@redhat.com>
      (blame us if something is not correct)
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3b712f13
    • Bernd Schubert's avatar
      nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) · 7ddeebd9
      Bernd Schubert authored
      commit 06effdbb upstream.
      
      Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
      the NFS version. NFSv2 gets 32-bit hashes only.
      
      NOTE: This patch got rather complex as Christoph asked to set the
      filp->f_mode flag in the open call or immediatly after dentry_open()
      in nfsd_open() to avoid races.
      Personally I still do not see a reason for that and in my opinion
      FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
      follows directly after nfsd_open() without a chance of races.
      Signed-off-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Acked-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7ddeebd9
    • Bernd Schubert's avatar
      nfsd: rename 'int access' to 'int may_flags' in nfsd_open() · d1ccc87a
      Bernd Schubert authored
      commit 999448a8 upstream.
      
      Just rename this variable, as the next patch will add a flag and
      'access' as variable name would not be correct any more.
      Signed-off-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Acked-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d1ccc87a
    • Fan Yong's avatar
      ext4: return 32/64-bit dir name hash according to usage type · 72b749f6
      Fan Yong authored
      commit d1f5273e upstream.
      
      Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
      to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
      and telldir().  However, this causes problems if there are 32-bit hash
      collisions, since the NFSv2 server can get stuck resending the same
      entries from the directory repeatedly.
      
      Allow ext4 to return a full 64-bit hash (both major and minor) for
      telldir to decrease the chance of hash collisions.  This still needs
      integration on the NFS side.
      Patch-updated-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      (blame me if something is not correct)
      Signed-off-by: default avatarFan Yong <yong.fan@whamcloud.com>
      Signed-off-by: default avatarAndreas Dilger <adilger@whamcloud.com>
      Signed-off-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      72b749f6
    • Bernd Schubert's avatar
      fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash · f3576bd5
      Bernd Schubert authored
      commit 6a8a13e0 upstream.
      
      Those flags are supposed to be set by NFS readdir() to tell ext3/ext4
      to 32bit (NFSv2) or 64bit hash values (offsets) in seekdir().
      Signed-off-by: default avatarBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f3576bd5
    • Nikhil P Rao's avatar
      PCI: fix truncation of resource size to 32 bits · b538dfee
      Nikhil P Rao authored
      commit d6776e6d upstream.
      
      _pci_assign_resource() took an int "size" argument, which meant that
      sizes larger than 4GB were truncated.  Change type to resource_size_t.
      
      [bhelgaas: changelog]
      Signed-off-by: default avatarNikhil P Rao <nikhil.rao@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b538dfee
    • Mariusz Ceier's avatar
      davinci_emac.c: Fix IFF_ALLMULTI setup · d4bd9140
      Mariusz Ceier authored
      [ Upstream commit d69e0f7e ]
      
      When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
      emac_dev_mcast_set should only enable RX of multicasts and reset
      MACHASH registers.
      
      It does this, but afterwards it either sets up multicast MACs
      filtering or disables RX of multicasts and resets MACHASH registers
      again, rendering IFF_ALLMULTI flag useless.
      
      This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
      disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.
      
      Tested with kernel 2.6.37.
      Signed-off-by: default avatarMariusz Ceier <mceier+kernel@gmail.com>
      Acked-by: default avatarMugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d4bd9140
    • Seif Mazareeb's avatar
      net: fix cipso packet validation when !NETLABEL · 55bf9001
      Seif Mazareeb authored
      [ Upstream commit f2e5ddcc ]
      
      When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop
      forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel
      crash in an SMP system, since the CPU executing this function will
      stall /not respond to IPIs.
      
      This problem can be reproduced by running the IP Stack Integrity Checker
      (http://isic.sourceforge.net) using the following command on a Linux machine
      connected to DUT:
      
      "icmpsic -s rand -d <DUT IP address> -r 123456"
      wait (1-2 min)
      Signed-off-by: default avatarSeif Mazareeb <seif@marvell.com>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      55bf9001
    • Daniel Borkmann's avatar
      net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race · c310512e
      Daniel Borkmann authored
      [ Upstream commit 90c6bd34 ]
      
      In the case of credentials passing in unix stream sockets (dgram
      sockets seem not affected), we get a rather sparse race after
      commit 16e57262 ("af_unix: dont send SCM_CREDENTIALS by default").
      
      We have a stream server on receiver side that requests credential
      passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
      on each spawned/accepted socket on server side to 1 first (as it's
      not inherited), it can happen that in the time between accept() and
      setsockopt() we get interrupted, the sender is being scheduled and
      continues with passing data to our receiver. At that time SO_PASSCRED
      is neither set on sender nor receiver side, hence in cmsg's
      SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
      (== overflow{u,g}id) instead of what we actually would like to see.
      
      On the sender side, here nc -U, the tests in maybe_add_creds()
      invoked through unix_stream_sendmsg() would fail, as at that exact
      time, as mentioned, the sender has neither SO_PASSCRED on his side
      nor sees it on the server side, and we have a valid 'other' socket
      in place. Thus, sender believes it would just look like a normal
      connection, not needing/requesting SO_PASSCRED at that time.
      
      As reverting 16e57262 would not be an option due to the significant
      performance regression reported when having creds always passed,
      one way/trade-off to prevent that would be to set SO_PASSCRED on
      the listener socket and allow inheriting these flags to the spawned
      socket on server side in accept(). It seems also logical to do so
      if we'd tell the listener socket to pass those flags onwards, and
      would fix the race.
      
      Before, strace:
      
      recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
              msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
              cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
              msg_flags=0}, 0) = 5
      
      After, strace:
      
      recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
              msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
              cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
              msg_flags=0}, 0) = 5
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c310512e
    • Salva Peiró's avatar
      wanxl: fix info leak in ioctl · e6c24ff2
      Salva Peiró authored
      [ Upstream commit 2b13d06c ]
      
      The wanxl_ioctl() code fails to initialize the two padding bytes of
      struct sync_serial_settings after the ->loopback member. Add an explicit
      memset(0) before filling the structure to avoid the info leak.
      Signed-off-by: default avatarSalva Peiró <speiro@ai2.upv.es>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e6c24ff2
    • Vlad Yasevich's avatar
      sctp: Perform software checksum if packet has to be fragmented. · 8d082949
      Vlad Yasevich authored
      [ Upstream commit d2dbbba7 ]
      
      IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
      This causes problems if SCTP packets has to be fragmented and
      ipsummed has been set to PARTIAL due to checksum offload support.
      This condition can happen when retransmitting after MTU discover,
      or when INIT or other control chunks are larger then MTU.
      Check for the rare fragmentation condition in SCTP and use software
      checksum calculation in this case.
      
      CC: Fan Du <fan.du@windriver.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8d082949
    • Fan Du's avatar
      sctp: Use software crc32 checksum when xfrm transform will happen. · aa4797fc
      Fan Du authored
      [ Upstream commit 27127a82 ]
      
      igb/ixgbe have hardware sctp checksum support, when this feature is enabled
      and also IPsec is armed to protect sctp traffic, ugly things happened as
      xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
      up and pack the 16bits result in the checksum field). The result is fail
      establishment of sctp communication.
      
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarFan Du <fan.du@windriver.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      aa4797fc
    • Vlad Yasevich's avatar
      net: dst: provide accessor function to dst->xfrm · 69ef6988
      Vlad Yasevich authored
      [ Upstream commit e87b3998 ]
      
      dst->xfrm is conditionally defined.  Provide accessor funtion that
      is always available.
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      69ef6988
    • Eric Dumazet's avatar
      bnx2x: record rx queue for LRO packets · d6e066a9
      Eric Dumazet authored
      [ Upstream commit 60e66fee ]
      
      RPS support is kind of broken on bnx2x, because only non LRO packets
      get proper rx queue information. This triggers reorders, as it seems
      bnx2x like to generate a non LRO packet for segment including TCP PUSH
      flag : (this might be pure coincidence, but all the reorders I've
      seen involve segments with a PUSH)
      
      11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
      11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
      11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797>
      11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
      
      We must call skb_record_rx_queue() to properly give to RPS (and more
      generally for TX queue selection on forward path) the receive queue
      information.
      
      Similar fix is needed for skb_mark_napi_id(), but will be handled
      in a separate patch to ease stable backports.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Acked-by: default avatarDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d6e066a9
    • Mathias Krause's avatar
      connector: use nlmsg_len() to check message length · c92d60f0
      Mathias Krause authored
      [ Upstream commit 162b2bed ]
      
      The current code tests the length of the whole netlink message to be
      at least as long to fit a cn_msg. This is wrong as nlmsg_len includes
      the length of the netlink message header. Use nlmsg_len() instead to
      fix this "off-by-NLMSG_HDRLEN" size check.
      
      Cc: stable@vger.kernel.org  # v2.6.14+
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c92d60f0
    • Salva Peiró's avatar
      farsync: fix info leak in ioctl · 5bf019eb
      Salva Peiró authored
      [ Upstream commit 96b34040 ]
      
      The fst_get_iface() code fails to initialize the two padding bytes of
      struct sync_serial_settings after the ->loopback member. Add an explicit
      memset(0) before filling the structure to avoid the info leak.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5bf019eb
    • Eric Dumazet's avatar
      l2tp: must disable bh before calling l2tp_xmit_skb() · d9aab1cf
      Eric Dumazet authored
      [ Upstream commit 455cc32b ]
      
      François Cachereul made a very nice bug report and suspected
      the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
      process context was not good.
      
      This problem was added by commit 6af88da1
      ("l2tp: Fix locking in l2tp_core.c").
      
      l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
      from other l2tp_xmit_skb() users.
      
      [  452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
      [  452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
      ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
      virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
      [  452.064012] CPU 1
      [  452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
      [  452.080015] CPU 2
      [  452.080015]
      [  452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
      [  452.080015] RIP: 0010:[<ffffffff81059f6c>]  [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
      [  452.080015] RSP: 0018:ffff88007125fc18  EFLAGS: 00000293
      [  452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
      [  452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
      [  452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
      [  452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
      [  452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
      [  452.080015] FS:  00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
      [  452.080015] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
      [  452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
      [  452.080015] Stack:
      [  452.080015]  ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
      [  452.080015]  ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
      [  452.080015]  000000000000005c 000000080000000e 0000000000000000 ffff880071170600
      [  452.080015] Call Trace:
      [  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
      [  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
      [  452.080015] Call Trace:
      [  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
      [  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.064012]
      [  452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
      [  452.064012] RIP: 0010:[<ffffffff81059f6e>]  [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
      [  452.064012] RSP: 0018:ffff8800b6e83ba0  EFLAGS: 00000297
      [  452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
      [  452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
      [  452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
      [  452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
      [  452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
      [  452.064012] FS:  00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
      [  452.064012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
      [  452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
      [  452.064012] Stack:
      [  452.064012]  ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
      [  452.064012]  ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
      [  452.064012]  0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
      [  452.064012] Call Trace:
      [  452.064012]  <IRQ>
      [  452.064012]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
      [  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
      [  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
      [  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
      [  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
      [  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
      [  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
      [  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
      [  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
      [  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
      [  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
      [  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
      [  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
      [  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
      [  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
      [  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
      [  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
      [  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
      [  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
      [  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
      [  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
      [  452.064012]  <EOI>
      [  452.064012]  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      [  452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
      [  452.064012] Call Trace:
      [  452.064012]  <IRQ>  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
      [  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
      [  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
      [  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
      [  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
      [  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
      [  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
      [  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
      [  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
      [  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
      [  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
      [  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
      [  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
      [  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
      [  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
      [  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
      [  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
      [  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
      [  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
      [  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
      [  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
      [  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
      [  452.064012]  <EOI>  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
      [  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
      [  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
      [  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
      [  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
      [  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
      [  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
      [  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
      [  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
      [  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
      [  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
      Reported-by: default avatarFrançois Cachereul <f.cachereul@alphalink.fr>
      Tested-by: default avatarFrançois Cachereul <f.cachereul@alphalink.fr>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d9aab1cf
    • Marc Kleine-Budde's avatar
      net: vlan: fix nlmsg size calculation in vlan_get_size() · 98459c1e
      Marc Kleine-Budde authored
      [ Upstream commit c33a39c5 ]
      
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      98459c1e
    • Marcelo Ricardo Leitner's avatar
      ipv6: restrict neighbor entry creation to output flow · 75770c94
      Marcelo Ricardo Leitner authored
      This patch is based on 3.2.y branch, the one used by reporter. Please let me
      know if it should be different. Thanks.
      
      The patch which introduced the regression was applied on stables:
      3.0.64 3.4.31 3.7.8 3.2.39
      
      The patch which introduced the regression was for stable trees only.
      
      ---8<---
      
      Commit 0d6a7707 "ipv6: do not create
      neighbor entries for local delivery" introduced a regression on
      which routes to local delivery would not work anymore. Like this:
      
          $ ip -6 route add local 2001::/64 dev lo
          $ ping6 -c1 2001::9
          PING 2001::9(2001::9) 56 data bytes
          ping: sendmsg: Invalid argument
      
      As this is a local delivery, that commit would not allow the creation of a
      neighbor entry and thus the packet cannot be sent.
      
      But as TPROXY scenario actually needs to avoid the neighbor entry creation only
      for input flow, this patch now limits previous patch to input flow, keeping
      output as before that patch.
      Reported-by: default avatarDebabrata Banerjee <dbavatar@gmail.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      75770c94
    • Marc Kleine-Budde's avatar
      can: dev: fix nlmsg size calculation in can_get_size() · 69c1c491
      Marc Kleine-Budde authored
      [ Upstream commit fe119a05 ]
      
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      69c1c491
    • Jiri Benc's avatar
      ipv4: fix ineffective source address selection · 4ba76ac2
      Jiri Benc authored
      [ Upstream commit 0a7e2260 ]
      
      When sending out multicast messages, the source address in inet->mc_addr is
      ignored and rewritten by an autoselected one. This is caused by a typo in
      commit 813b3b5d ("ipv4: Use caller's on-stack flowi as-is in output
      route lookups").
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4ba76ac2
    • Mathias Krause's avatar
      proc connector: fix info leaks · d60eefc0
      Mathias Krause authored
      [ Upstream commit e727ca82 ]
      
      Initialize event_data for all possible message types to prevent leaking
      kernel stack contents to userland (up to 20 bytes). Also set the flags
      member of the connector message to 0 to prevent leaking two more stack
      bytes this way.
      
      Cc: stable@vger.kernel.org  # v2.6.15+
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d60eefc0
    • Dan Carpenter's avatar
      net: heap overflow in __audit_sockaddr() · f1d515ce
      Dan Carpenter authored
      [ Upstream commit 1661bf36 ]
      
      We need to cap ->msg_namelen or it leads to a buffer overflow when we
      to the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
      exploit this bug.
      
      The call tree is:
      ___sys_recvmsg()
        move_addr_to_user()
          audit_sockaddr()
            __audit_sockaddr()
      Reported-by: default avatarJüri Aedla <juri.aedla@gmail.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f1d515ce
    • Eric Dumazet's avatar
      net: do not call sock_put() on TIMEWAIT sockets · ea54bc74
      Eric Dumazet authored
      [ Upstream commit 80ad1d61 ]
      
      commit 3ab5aee7 ("net: Convert TCP & DCCP hash tables to use RCU /
      hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.
      
      We should instead use inet_twsk_put()
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ea54bc74
    • Eric Dumazet's avatar
      tcp: do not forget FIN in tcp_shifted_skb() · 7e369408
      Eric Dumazet authored
      [ Upstream commit 5e8a402f ]
      
      Yuchung found following problem :
      
       There are bugs in the SACK processing code, merging part in
       tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
       skbs FIN flag. When a receiver first SACK the FIN sequence, and later
       throw away ofo queue (e.g., sack-reneging), the sender will stop
       retransmitting the FIN flag, and hangs forever.
      
      Following packetdrill test can be used to reproduce the bug.
      
      $ cat sack-merge-bug.pkt
      `sysctl -q net.ipv4.tcp_fack=0`
      
      // Establish a connection and send 10 MSS.
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +.000 bind(3, ..., ...) = 0
      +.000 listen(3, 1) = 0
      
      +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.001 < . 1:1(0) ack 1 win 1024
      +.000 accept(3, ..., ...) = 4
      
      +.100 write(4, ..., 12000) = 12000
      +.000 shutdown(4, SHUT_WR) = 0
      +.000 > . 1:10001(10000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257
      +.000 > FP. 10001:12001(2000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
      // SACK reneg
      +.050 < . 1:1(0) ack 12001 win 257
      +0 %{ print "unacked: ",tcpi_unacked }%
      +5 %{ print "" }%
      
      First, a typo inverted left/right of one OR operation, then
      code forgot to advance end_seq if the merged skb carried FIN.
      
      Bug was added in 2.6.29 by commit 832d11c5
      ("tcp: Try to restore large SKBs while SACK processing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: default avatarIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7e369408
    • Eric Dumazet's avatar
      tcp: must unclone packets before mangling them · e5704e27
      Eric Dumazet authored
      [ Upstream commit c52e2421 ]
      
      TCP stack should make sure it owns skbs before mangling them.
      
      We had various crashes using bnx2x, and it turned out gso_size
      was cleared right before bnx2x driver was populating TC descriptor
      of the _previous_ packet send. TCP stack can sometime retransmit
      packets that are still in Qdisc.
      
      Of course we could make bnx2x driver more robust (using
      ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.
      
      We have identified two points where skb_unclone() was needed.
      
      This patch adds a WARN_ON_ONCE() to warn us if we missed another
      fix of this kind.
      
      Kudos to Neal for finding the root cause of this bug. Its visible
      using small MSS.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e5704e27
  2. 26 Oct, 2013 9 commits
    • Ben Hutchings's avatar
      Linux 3.2.52 · 8b5ed99a
      Ben Hutchings authored
      8b5ed99a
    • Marc Kleine-Budde's avatar
      can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX · 9203ceb6
      Marc Kleine-Budde authored
      commit d5a7b406 upstream.
      
      In patch
      
          0d1862ea can: flexcan: fix flexcan_chip_start() on imx6
      
      the loop in flexcan_chip_start() that iterates over all mailboxes after the
      soft reset of the CAN core was removed. This loop put all mailboxes (even the
      ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63,
      this aborts any pending TX messages.
      
      After a cold boot there is random garbage in the mailboxes, which leads to
      spontaneous transmit of CAN frames during first activation. Further if the
      interface was disabled with a pending message (usually due to an error
      condition on the CAN bus), this message is retransmitted after enabling the
      interface again.
      
      This patch fixes the regression by:
      1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX
      FIFO, 8 is used by TX.
      2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that
      mailbox is aborted.
      
      Cc: Lothar Waßmann <LW@KARO-electronics.de>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      [bwh: Backported to 3.2:
       - Adjust context
       - Hardware local echo is still enabled]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9203ceb6
    • Claudiu Manoil's avatar
      gianfar: Change default HW Tx queue scheduling mode · c23e05d6
      Claudiu Manoil authored
      commit b98b8bab upstream.
      
      This is primarily to address transmission timeout occurrences, when
      multiple H/W Tx queues are being used concurrently. Because in
      the priority scheduling mode the controller does not service the
      Tx queues equally (but in ascending index order), Tx timeouts are
      being triggered rightaway for a basic test with multiple simultaneous
      connections like:
      iperf -c <server_ip> -n 100M -P 8
      
      resulting in kernel trace:
      NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue <X> timed out
      ------------[ cut here ]------------
      WARNING: at net/sched/sch_generic.c:255
      ...
      and controller reset during intense traffic, and possibly further
      complications.
      
      This patch changes the default H/W Tx scheduling setting (TXSCHED)
      for multi-queue devices, from priority scheduling mode to a weighted
      round robin mode with equal weights for all H/W Tx queues, and
      addresses the issue above.
      Signed-off-by: default avatarClaudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c23e05d6
    • David Rientjes's avatar
      mm, show_mem: suppress page counts in non-blockable contexts · 7c617fda
      David Rientjes authored
      commit 4b59e6c4 upstream.
      
      On large systems with a lot of memory, walking all RAM to determine page
      types may take a half second or even more.
      
      In non-blockable contexts, the page allocator will emit a page allocation
      failure warning unless __GFP_NOWARN is specified.  In such contexts, irqs
      are typically disabled and such a lengthy delay may even result in NMI
      watchdog timeouts.
      
      To fix this, suppress the page walk in such contexts when printing the
      page allocation failure warning.
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7c617fda
    • Lv Zheng's avatar
      ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() · dffcc9dc
      Lv Zheng authored
      commit 06a8566b upstream.
      
      This patch fixes the issues indicated by the test results that
      ipmi_msg_handler() is invoked in atomic context.
      
      BUG: scheduling while atomic: kipmi0/18933/0x10000100
      Modules linked in: ipmi_si acpi_ipmi ...
      CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
      Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
       ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
       ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
       0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
      Call Trace:
       <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
       [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
       [<ffffffff814c73e0>] __schedule+0x83/0x59c
       [<ffffffff81058853>] __cond_resched+0x22/0x2d
       [<ffffffff814c794b>] _cond_resched+0x14/0x1d
       [<ffffffff814c6d82>] mutex_lock+0x11/0x32
       [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
       [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
       [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
       [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
       [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
       [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
       [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
       ...
      
      Also Tony Camuso says:
      
       We were getting occasional "Scheduling while atomic" call traces
       during boot on some systems. Problem was first seen on a Cisco C210
       but we were able to reproduce it on a Cisco c220m3. Setting
       CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
       tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.
      
       =================================
       [ INFO: inconsistent lock state ]
       2.6.32-415.el6.x86_64-debug-splck #1
       ---------------------------------
       inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
       ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
        (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
       {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
         [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
         [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
         [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
         [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
         [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be
      
      The fix implemented by this change has been tested by Tony:
      
       Tested the patch in a boot loop with lockdep debug enabled and never
       saw the problem in over 400 reboots.
      Reported-and-tested-by: default avatarTony Camuso <tcamuso@redhat.com>
      Signed-off-by: default avatarLv Zheng <lv.zheng@intel.com>
      Reviewed-by: default avatarHuang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      dffcc9dc
    • Ian Abbott's avatar
      staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice · 7b376352
      Ian Abbott authored
      commit 677a3156 upstream.
      
      The `insn_bits` handler `ni_65xx_dio_insn_bits()` has a `for` loop that
      currently writes (optionally) and reads back up to 5 "ports" consisting
      of 8 channels each.  It reads up to 32 1-bit channels but can only read
      and write a whole port at once - it needs to handle up to 5 ports as the
      first channel it reads might not be aligned on a port boundary.  It
      breaks out of the loop early if the next port it handles is beyond the
      final port on the card.  It also breaks out early on the 5th port in the
      loop if the first channel was aligned.  Unfortunately, it doesn't check
      that the current port it is dealing with belongs to the comedi subdevice
      the `insn_bits` handler is acting on.  That's a bug.
      
      Redo the `for` loop to terminate after the final port belonging to the
      subdevice, changing the loop variable in the process to simplify things
      a bit.  The `for` loop could now try and handle more than 5 ports if the
      subdevice has more than 40 channels, but the test `if (bitshift >= 32)`
      ensures it will break out early after 4 or 5 ports (depending on whether
      the first channel is aligned on a port boundary).  (`bitshift` will be
      between -7 and 7 inclusive on the first iteration, increasing by 8 for
      each subsequent operation.)
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [Ian Abbott: This patch applies to kernels 2.6.34.y through to 3.5.y
       inclusive.]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7b376352
    • Theodore Ts'o's avatar
      ext4: avoid hang when mounting non-journal filesystems with orphan list · 0cf99861
      Theodore Ts'o authored
      commit 0e9a9a1a upstream.
      
      When trying to mount a file system which does not contain a journal,
      but which does have a orphan list containing an inode which needs to
      be truncated, the mount call with hang forever in
      ext4_orphan_cleanup() because ext4_orphan_del() will return
      immediately without removing the inode from the orphan list, leading
      to an uninterruptible loop in kernel code which will busy out one of
      the CPU's on the system.
      
      This can be trivially reproduced by trying to mount the file system
      found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
      source tree.  If a malicious user were to put this on a USB stick, and
      mount it on a Linux desktop which has automatic mounts enabled, this
      could be considered a potential denial of service attack.  (Not a big
      deal in practice, but professional paranoids worry about such things,
      and have even been known to allocate CVE numbers for such problems.)
      
      -js: This is a fix for CVE-2013-2015.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: default avatarZheng Liu <wenqing.lz@taobao.com>
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0cf99861
    • Henrik Rydberg's avatar
      hwmon: (applesmc) Silence uninitialized warnings · c60ce91c
      Henrik Rydberg authored
      commit 0fc86eca upstream.
      
      Some error paths do not set a result, leading to the (false)
      assumption that the value may be used uninitialized. Set results for
      those paths as well.
      Signed-off-by: default avatarHenrik Rydberg <rydberg@euromail.se>
      Signed-off-by: default avatarGuenter Roeck <guenter.roeck@ericsson.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c60ce91c
    • Florian Wolter's avatar
      xhci: Fix race between ep halt and URB cancellation · 47fa3c9b
      Florian Wolter authored
      commit 526867c3 upstream.
      
      The halted state of a endpoint cannot be cleared over CLEAR_HALT from a
      user process, because the stopped_td variable was overwritten in the
      handle_stopped_endpoint() function. So the xhci_endpoint_reset() function will
      refuse the reset and communication with device can not run over this endpoint.
      https://bugzilla.kernel.org/show_bug.cgi?id=60699Signed-off-by: default avatarFlorian Wolter <wolly84@web.de>
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      47fa3c9b