1. 18 Sep, 2015 15 commits
  2. 03 Jun, 2015 3 commits
    • Willy Tarreau's avatar
      Linux 2.6.32.67 · 00b90e79
      Willy Tarreau authored
      00b90e79
    • Junling Zheng's avatar
      net: socket: Fix the wrong returns for recvmsg and sendmsg · a8226a63
      Junling Zheng authored
      Based on 08adb7da upstream.
      
      We found that after v3.10.73, recvmsg might return -EFAULT while -EINVAL
      was expected.
      
      We tested it through the recvmsg01 testcase come from LTP testsuit. It set
      msg->msg_namelen to -1 and the recvmsg syscall returned errno 14, which is
      unexpected (errno 22 is expected):
      
      recvmsg01    4  TFAIL  :  invalid socket length ; returned -1 (expected -1),
      errno 14 (expected 22)
      
      Linux mainline has no this bug for commit 08adb7da fixes it accidentally.
      However, it is too large and complex to be backported to LTS 3.10.
      
      Commit 281c9c36 (net: compat: Update get_compat_msghdr() to match
      copy_msghdr_from_user() behaviour) made get_compat_msghdr() return
      error if msg_sys->msg_namelen was negative, which changed the behaviors
      of recvmsg and sendmsg syscall in a lib32 system:
      
      Before commit 281c9c36, get_compat_msghdr() wouldn't fail and it would
      return -EINVAL in move_addr_to_user() or somewhere if msg_sys->msg_namelen
      was invalid and then syscall returned -EINVAL, which is correct.
      
      And now, when msg_sys->msg_namelen is negative, get_compat_msghdr() will
      fail and wants to return -EINVAL, however, the outer syscall will return
      -EFAULT directly, which is unexpected.
      
      This patch gets the return value of get_compat_msghdr() as well as
      copy_msghdr_from_user(), then returns this expected value if
      get_compat_msghdr() fails.
      
      Fixes: 281c9c36 (net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour)
      Signed-off-by: default avatarJunling Zheng <zhengjunling@huawei.com>
      Signed-off-by: default avatarHanbing Xu <xuhanbing@huawei.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a8226a63
    • Willy Tarreau's avatar
      net: fix incorrect backport of tcp_send_fin in 2.6.32.66 · cb162b71
      Willy Tarreau authored
      Eric forwarded this bug report happening since 2.6.32.66, found that the
      backport of commit 845704a5 ("tcp: avoid looping in tcp_send_fin()") was
      incorrect and proposed this patch to fix it. The bug was also reported by
      starlight.2015q2@binnacle.cx who confirmed the fix.
      
      > Date: Fri, 29 May 2015 09:12:45 +0000
      > From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
      > To: "shemminger@linux-foundation.org" <shemminger@linux-foundation.org>
      > Subject: [Bug 99161] New: 2.6.32.66 PPC Oops in tcp_send_fin
      >
      >
      > https://bugzilla.kernel.org/show_bug.cgi?id=99161
      >
      >             Bug ID: 99161
      >            Summary: 2.6.32.66 PPC Oops in tcp_send_fin
      >            Product: Networking
      >            Version: 2.5
      >     Kernel Version: 2.6.32.66
      >           Hardware: PPC-32
      >                 OS: Linux
      >               Tree: Mainline
      >             Status: NEW
      >           Severity: normal
      >           Priority: P1
      >          Component: IPV4
      >           Assignee: shemminger@linux-foundation.org
      >           Reporter: varenet@parisc-linux.org
      >         Regression: No
      >
      > I just updated my trusty old PPC box to longterm 2.6.32.66 (was running .65
      > before that with zero issue) and it started spewing oopses at me like hell
      > broke loose. This machine is primarily used as a DNS and MX (albeit under low
      > pressure).
      (...)
      
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      cb162b71
  3. 24 May, 2015 22 commits
    • Willy Tarreau's avatar
      Linux 2.6.32.66 · b4736065
      Willy Tarreau authored
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b4736065
    • Catalin Marinas's avatar
      net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · db91e176
      Catalin Marinas authored
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit (6a2a2b3a net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings the get_compat_msghdr() in line with
      copy_msghdr_from_user().
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 91edd096)
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      db91e176
    • Alexey Khoroshilov's avatar
      sound/oss: fix deadlock in sequencer_ioctl(SNDCTL_SEQ_OUTOFBAND) · c8412c6c
      Alexey Khoroshilov authored
      A deadlock can be initiated by userspace via ioctl(SNDCTL_SEQ_OUTOFBAND)
      on /dev/sequencer with TMR_ECHO midi event.
      
      In this case the control flow is:
      sound_ioctl()
      -> case SND_DEV_SEQ:
         case SND_DEV_SEQ2:
           sequencer_ioctl()
           -> case SNDCTL_SEQ_OUTOFBAND:
                spin_lock_irqsave(&lock,flags);
                play_event();
                -> case EV_TIMING:
                     seq_timing_event()
                     -> case TMR_ECHO:
                          seq_copy_to_input()
                          -> spin_lock_irqsave(&lock,flags);
      
      It seems that spin_lock_irqsave() around play_event() is not necessary,
      because the only other call location in seq_startplay() makes the call
      without acquiring spinlock.
      
      So, the patch just removes spinlocks around play_event().
      By the way, it removes unreachable code in seq_timing_event(),
      since (seq_mode == SEQ_2) case is handled in the beginning.
      
      Compile tested only.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      (cherry picked from commit bc26d4d0)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c8412c6c
    • Sergei Antonov's avatar
      hfsplus: fix B-tree corruption after insertion at position 0 · 533c1137
      Sergei Antonov authored
      commit 98cf21c6 upstream.
      
      Fix B-tree corruption when a new record is inserted at position 0 in the node
      in hfs_brec_insert(). In this case a hfs_brec_update_parent() is called to
      update the parent index node (if exists) and it is passed hfs_find_data with
      a search_key containing a newly inserted key instead of the key to be updated.
      This results in an inconsistent index node. The bug reproduces on my machine
      after an extents overflow record for the catalog file (CNID=4) is inserted into
      the extents overflow B-tree. Because of a low (reserved) value of CNID=4, it
      has to become the first record in the first leaf node.
      The resulting first leaf node is correct:
      ----------------------------------------------------
      | key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
      ----------------------------------------------------
      But the parent index key0 still contains the previous key CNID=123:
      -----------------------
      | key0.CNID=123 | ... |
      -----------------------
      
      A change in hfs_brec_insert() makes hfs_brec_update_parent() work correctly
      by preventing it from getting fd->record=-1 value from __hfs_brec_find().
      
      Along the way, I removed duplicate code with unification of the if condition.
      The resulting code is equivalent to the original code because node is never 0.
      
      Also hfs_brec_update_parent() will now return an error after getting a negative
      fd->record value. However, the return value of hfs_brec_update_parent() is not
      checked anywhere in the file and I'm leaving it unchanged by this patch.
      brec.c lacks error checking after some other calls too, but this issue is of
      less importance than the one being fixed by this patch.
      
      Cc: stable@vger.kernel.org
      Cc: Joe Perches <joe@perches.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
      Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
      Cc: Anton Altaparmakov <aia21@cam.ac.uk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarSergei Antonov <saproj@gmail.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      533c1137
    • Mathias Krause's avatar
      posix-timers: Fix stack info leak in timer_create() · d7733a75
      Mathias Krause authored
      commit 6891c450 upstream.
      
      If userland creates a timer without specifying a sigevent info, we'll
      create one ourself, using a stack local variable. Particularly will we
      use the timer ID as sival_int. But as sigev_value is a union containing
      a pointer and an int, that assignment will only partially initialize
      sigev_value on systems where the size of a pointer is bigger than the
      size of an int. On such systems we'll copy the uninitialized stack bytes
      from the timer_create() call to userland when the timer actually fires
      and we're going to deliver the signal.
      
      Initialize sigev_value with 0 to plug the stack info leak.
      
      Found in the PaX patch, written by the PaX Team.
      
      Fixes: 5a9fa730 ("posix-timers: kill ->it_sigev_signo and...")
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: PaX Team <pageexec@freemail.hu>
      Link: http://lkml.kernel.org/r/1412456799-32339-1-git-send-email-minipli@googlemail.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 3cd3a349)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d7733a75
    • Jan Kara's avatar
      scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND · b5f10e98
      Jan Kara authored
      commit 84ce0f0e upstream.
      
      When sg_scsi_ioctl() fails to prepare request to submit in
      blk_rq_map_kern() we jump to a label where we just end up copying
      (luckily zeroed-out) kernel buffer to userspace instead of reporting
      error. Fix the problem by jumping to the right label.
      
      CC: Jens Axboe <axboe@kernel.dk>
      CC: linux-scsi@vger.kernel.org
      Coverity-id: 1226871
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      
      Fixed up the, now unused, out label.
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit d73b032b)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b5f10e98
    • Benjamin Coddington's avatar
      lockd: Try to reconnect if statd has moved · a3ea9448
      Benjamin Coddington authored
      commit 173b3afc upstream.
      
      If rpc.statd is restarted, upcalls to monitor hosts can fail with
      ECONNREFUSED.  In that case force a lookup of statd's new port and retry the
      upcall.
      Signed-off-by: default avatarBenjamin Coddington <bcodding@redhat.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      [bwh: Backported to 3.2: not using RPC_TASK_SOFTCONN]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 3aabe891)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a3ea9448
    • Kirill A. Shutemov's avatar
      pagemap: do not leak physical addresses to non-privileged userspace · 036783b2
      Kirill A. Shutemov authored
      commit ab676b7d upstream.
      
      As pointed by recent post[1] on exploiting DRAM physical imperfection,
      /proc/PID/pagemap exposes sensitive information which can be used to do
      attacks.
      
      This disallows anybody without CAP_SYS_ADMIN to read the pagemap.
      
      [1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
      
      [ Eventually we might want to do anything more finegrained, but for now
        this is the simple model.   - Linus ]
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarKonstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Seaborn <mseaborn@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [mancha security: Backported to 3.10]
      Signed-off-by: default avatarmancha security <mancha1@zoho.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 1ffc3cd9)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      036783b2
    • Jiri Pirko's avatar
      ipv4: fix nexthop attlen check in fib_nh_match · 66dff16a
      Jiri Pirko authored
      commit f76936d0 upstream.
      
      fib_nh_match does not match nexthops correctly. Example:
      
      ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
                                nexthop via 192.168.122.13 dev eth0
      ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
                                nexthop via 192.168.122.15 dev eth0
      
      Del command is successful and route is removed. After this patch
      applied, the route is correctly matched and result is:
      RTNETLINK answers: No such process
      
      Please consider this for stable trees as well.
      
      Fixes: 4e902c57 ("[IPv4]: FIB configuration using struct fib_config")
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 0aba46ad)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      66dff16a
    • Dan Carpenter's avatar
      ipvs: uninitialized data with IP_VS_IPV6 · 1d37a6b8
      Dan Carpenter authored
      commit 3b05ac38 upstream.
      
      The app_tcp_pkt_out() function expects "*diff" to be set and ends up
      using uninitialized data if CONFIG_IP_VS_IPV6 is turned on.
      
      The same issue is there in app_tcp_pkt_in().  Thanks to Julian Anastasov
      for noticing that.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      (cherry picked from commit 0ce625ba)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1d37a6b8
    • Eli Cohen's avatar
      IB/core: Avoid leakage from kernel to user space · 43318c2e
      Eli Cohen authored
      commit 377b5134 upstream.
      
      Clear the reserved field of struct ib_uverbs_async_event_desc which is
      copied to user space.
      Signed-off-by: default avatarEli Cohen <eli@mellanox.com>
      Reviewed-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      (cherry picked from commit 852acc01)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      43318c2e
    • Ian Abbott's avatar
      spi: spidev: fix possible arithmetic overflow for multi-transfer message · abe59a6a
      Ian Abbott authored
      commit f20fbaad upstream.
      
      `spidev_message()` sums the lengths of the individual SPI transfers to
      determine the overall SPI message length.  It restricts the total
      length, returning an error if too long, but it does not check for
      arithmetic overflow.  For example, if the SPI message consisted of two
      transfers and the first has a length of 10 and the second has a length
      of (__u32)(-1), the total length would be seen as 9, even though the
      second transfer is actually very long.  If the second transfer specifies
      a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could
      overrun the spidev's pre-allocated tx buffer before it reaches an
      invalid user memory address.  Fix it by checking that neither the total
      nor the individual transfer lengths exceed the maximum allowed value.
      
      Thanks to Dan Carpenter for reporting the potential integer overflow.
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      [Ian Abbott: Note: original commit compares the lengths to INT_MAX
       instead of bufsiz due to changes in earlier commits.]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 7499401e)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      abe59a6a
    • Eric Dumazet's avatar
      tcp: avoid looping in tcp_send_fin() · f944afb2
      Eric Dumazet authored
      [ Upstream commit 845704a5 ]
      
      Presence of an unbound loop in tcp_send_fin() had always been hard
      to explain when analyzing crash dumps involving gigantic dying processes
      with millions of sockets.
      
      Lets try a different strategy :
      
      In case of memory pressure, try to add the FIN flag to last packet
      in write queue, even if packet was already sent. TCP stack will
      be able to deliver this FIN after a timeout event. Note that this
      FIN being delivered by a retransmit, it also carries a Push flag
      given our current implementation.
      
      By checking sk_under_memory_pressure(), we anticipate that cooking
      many FIN packets might deplete tcp memory.
      
      In the case we could not allocate a packet, even with __GFP_WAIT
      allocation, then not sending a FIN seems quite reasonable if it allows
      to get rid of this socket, free memory, and not block the process from
      eventually doing other useful work.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - Drop inapplicable change to sk_forced_wmem_schedule()
       - s/sk_under_memory_pressure(sk)/tcp_memory_pressure/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 82241580)
      [wt: backported to 2.6.32: s/TCPHDR_FIN/TCPCB_FLAG_FIN/]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f944afb2
    • Sebastian Phn's avatar
      ip_forward: Drop frames with attached skb->sk · b19feb6e
      Sebastian Phn authored
      [ Upstream commit 2ab95749 ]
      
      Initial discussion was:
      [FYI] xfrm: Don't lookup sk_policy for timewait sockets
      
      Forwarded frames should not have a socket attached. Especially
      tw sockets will lead to panics later-on in the stack.
      
      This was observed with TPROXY assigning a tw socket and broken
      policy routing (misconfigured). As a result frame enters
      forwarding path instead of input. We cannot solve this in
      TPROXY as it cannot know that policy routing is broken.
      
      v2:
      Remove useless comment
      Signed-off-by: default avatarSebastian Poehn <sebastian.poehn@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit fccb908d)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b19feb6e
    • Eric Dumazet's avatar
      tcp: make connect() mem charging friendly · b49fbe0a
      Eric Dumazet authored
      [ Upstream commit 355a901e ]
      
      While working on sk_forward_alloc problems reported by Denys
      Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
      sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
      sk_forward_alloc is negative while connect is in progress.
      
      We can fix this by calling regular sk_stream_alloc_skb() both for the
      SYN packet (in tcp_connect()) and the syn_data packet in
      tcp_send_syn_data()
      
      Then, tcp_send_syn_data() can avoid copying syn_data as we simply
      can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
      
      Instead of open coding memcpy_fromiovecend(), simply use this helper.
      
      This leaves in socket write queue clean fast clone skbs.
      
      This was tested against our fastopen packetdrill tests.
      Reported-by: default avatarDenys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - Drop the Fast Open changes
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 3e2eb894)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b49fbe0a
    • Al Viro's avatar
      rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() · 876846f7
      Al Viro authored
      [ Upstream commit 7d985ed1 ]
      
      [I would really like an ACK on that one from dhowells; it appears to be
      quite straightforward, but...]
      
      MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
      fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
      in there.  It gets passed via flags; in fact, another such check in the same
      function is done correctly - as flags & MSG_PEEK.
      
      It had been that way (effectively disabled) for 8 years, though, so the patch
      needs beating up - that case had never been tested.  If it is correct, it's
      -stable fodder.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 10c82cd7)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      876846f7
    • Arnd Bergmann's avatar
      rds: avoid potential stack overflow · 71372d0e
      Arnd Bergmann authored
      [ Upstream commit f862e07c ]
      
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit withint the 1024 byte stack size warning limit on x86, but just
      exceed that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 3fe2d645)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      71372d0e
    • Alexey Kodanev's avatar
      net: sysctl_net_core: check SNDBUF and RCVBUF for min length · c5424cf0
      Alexey Kodanev authored
      [ Upstream commit b1cb59cf ]
      
      sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
      set to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, incorrectly set buffer length could result to memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.wmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This could result to the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: delete now-unused 'one' variable]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 2d6dfb10)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c5424cf0
    • bingtian.ly@taobao.com's avatar
      net: avoid to hang up on sending due to sysctl configuration overflow. · 0f8a4ca1
      bingtian.ly@taobao.com authored
      commit cdda8891 upstream.
      
          I found if we write a larger than 4GB value to some sysctl
      variables, the sending syscall will hang up forever, because these
      variables are 32 bits, such large values make them overflow to 0 or
      negative.
      
          This patch try to fix overflow or prevent from zero value setup
      of below sysctl variables:
      
      net.core.wmem_default
      net.core.rmem_default
      
      net.core.rmem_max
      net.core.wmem_max
      
      net.ipv4.udp_rmem_min
      net.ipv4.udp_wmem_min
      
      net.ipv4.tcp_wmem
      net.ipv4.tcp_rmem
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarLi Yu <raise.sail@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - Adjust context
       - Delete now-unused 'zero' variable]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 98eee187)
      [wt: backported to 2.6.32: set strategy to sysctl_intvec where relevant]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0f8a4ca1
    • Michal Kubeček's avatar
      udp: only allow UFO for packets from SOCK_DGRAM sockets · 65f26669
      Michal Kubeček authored
      [ Upstream commit acf8dd0a ]
      
      If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a
      UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to
      CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer
      checksum is to be computed on segmentation. However, in this case,
      skb->csum_start and skb->csum_offset are never set as raw socket
      transmit path bypasses udp_send_skb() where they are usually set. As a
      result, driver may access invalid memory when trying to calculate the
      checksum and store the result (as observed in virtio_net driver).
      
      Moreover, the very idea of modifying the userspace provided UDP header
      is IMHO against raw socket semantics (I wasn't able to find a document
      clearly stating this or the opposite, though). And while allowing
      CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit
      too intrusive change just to handle a corner case like this. Therefore
      disallowing UFO for packets from SOCK_DGRAM seems to be the best option.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 332640b2)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      65f26669
    • Steffen Klassert's avatar
      ipv4: Don't use ufo handling on later transformed packets · 417b2efd
      Steffen Klassert authored
      We might call ip_ufo_append_data() for packets that will be IPsec
      transformed later. This function should be used just for real
      udp packets. So we check for rt->dst.header_len which is only
      nonzero on IPsec handling and call ip_ufo_append_data() just
      if rt->dst.header_len is zero.
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit c146066a)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      417b2efd
    • Matthew Thode's avatar
      net: reject creation of netdev names with colons · a7357ca5
      Matthew Thode authored
      [ Upstream commit a4176a93 ]
      
      colons are used as a separator in netdev device lookup in dev_ioctl.c
      
      Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME
      Signed-off-by: default avatarMatthew Thode <mthode@mthode.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit d501ebeb)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a7357ca5