1. 18 Apr, 2014 35 commits
    • net: socket: error on a negative msg_namelen · 21ddf0c0
      Matthew Leach authored
      [ Upstream commit dbb490b9 ]
      
      When copying in a struct msghdr from the user, if the user has set the
      msg_namelen parameter to a negative value it gets clamped to a valid
      size due to a comparison between signed and unsigned values.
      
      Ensure the syscall errors when the user passes in a negative value.
      Signed-off-by: Matthew Leach <matthew.leach@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
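The signed/unsigned pitfall described in this commit can be illustrated with a minimal userspace sketch. This is not the kernel patch: `MAX_NAMELEN` and both function names are hypothetical, and the explicit `(size_t)` cast stands in for the implicit conversion that happens when an `int` is compared against a `sizeof` result.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the size of the kernel's address buffer. */
#define MAX_NAMELEN ((size_t)128)

/* Buggy pattern: a negative length converts to a huge unsigned value,
 * so it is silently clamped to the maximum instead of being rejected. */
int clamp_namelen_buggy(int namelen)
{
	if ((size_t)namelen > MAX_NAMELEN)
		namelen = (int)MAX_NAMELEN;
	return namelen;
}

/* Fixed pattern: reject negative lengths before the unsigned comparison. */
int check_namelen_fixed(int namelen)
{
	if (namelen < 0)
		return -EINVAL;
	if ((size_t)namelen > MAX_NAMELEN)
		namelen = (int)MAX_NAMELEN;
	return namelen;
}
```

With these definitions, `clamp_namelen_buggy(-1)` yields the clamped maximum, while `check_namelen_fixed(-1)` returns `-EINVAL`.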
    • bridge: multicast: enable snooping on general queries only · 9ee68ddb
      Linus Lüssing authored
      [ Upstream commit 20a599be ]
      
      Without this check, someone could easily create a denial of service
      by injecting multicast-specific queries to enable the bridge
      snooping part when no real querier issuing periodic general queries
      is present on the link. The bridge would then wrongly shut down
      ports for multicast traffic, as it did not learn about the
      listeners behind them.
      
      With this patch the snooping code is enabled upon receiving valid,
      general queries only.
      Signed-off-by: Linus Lüssing <linus.luessing@web.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • bridge: multicast: add sanity check for general query destination · b8d3ded9
      Linus Lüssing authored
      [ Upstream commit 9ed973cc ]
      
      General IGMP and MLD queries are supposed to have the multicast
      link-local all-nodes address as their destination according to RFC2236
      section 9, RFC3376 section 4.1.12/9.1, RFC2710 section 8 and RFC3810
      section 5.1.15.
      
      Without this check, malformed IGMP/MLD queries with a different
      destination can result in a denial of service: most IGMP/MLD
      listeners ignore them and therefore do not respond with an IGMP/MLD
      report. However, without this patch these malformed queries would
      enable the snooping part in the bridge code, potentially shutting
      down the corresponding ports towards these hosts for multicast
      traffic, as the bridge did not learn about these listeners.
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Linus Lüssing <linus.luessing@web.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
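The destination check for the IPv4 (IGMP) case can be sketched as follows. This is an illustration under stated assumptions, not the bridge code: the function name is hypothetical, addresses are plain host-byte-order integers for simplicity, and a group field of zero is taken to mark a general query.

```c
#include <stdbool.h>
#include <stdint.h>

/* 224.0.0.1 (all-hosts group), written in host byte order for this sketch. */
#define ALLHOSTS_GROUP 0xE0000001u

/* Hypothetical helper: a general query (group == 0) is accepted only when
 * it is addressed to 224.0.0.1. Group-specific queries are not restricted
 * by this particular check. */
bool igmp_query_dst_ok(uint32_t dst, uint32_t group)
{
	if (group == 0 && dst != ALLHOSTS_GROUP)
		return false;
	return true;
}
```

A general query sent to any other destination is rejected before it can influence the snooping state.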
    • tcp: tcp_release_cb() should release socket ownership · 3e52e9e5
      Eric Dumazet authored
      [ Upstream commit c3f9b018 ]
      
      Lars Persson reported the following deadlock:
      
      -000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
      -001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
      -002 |ip_local_deliver_finish(skb = 0x8BD527A0)
      -003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
      -004 |netif_receive_skb(skb = 0x8BD527A0)
      -005 |elk_poll(napi = 0x8C770500, budget = 64)
      -006 |net_rx_action(?)
      -007 |__do_softirq()
      -008 |do_softirq()
      -009 |local_bh_enable()
      -010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
      -011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
      -012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
      -013 |tcp_release_cb(sk = 0x8BE6B2A0)
      -014 |release_sock(sk = 0x8BE6B2A0)
      -015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
      -016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
      -017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
      -018 |smb_send_kvec()
      -019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
      -020 |cifs_call_async()
      -021 |cifs_async_writev(wdata = 0x87FD6580)
      -022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
      -023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
      -024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
      -025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
      -026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
      -027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
      -028 |bdi_writeback_workfn(work = 0x87E4A9CC)
      -029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
      -030 |worker_thread(__worker = 0x8B045880)
      -031 |kthread(_create = 0x87CADD90)
      -032 |ret_from_kernel_thread(asm)
      
      The bug occurs because __tcp_checksum_complete_user() enables BH, assuming
      it is running from softirq context.

      Lars' trace involved a NIC without RX checksum support, but other points
      are problematic as well, like the prequeue stuff.

      The problem is triggered by a timer that found the socket owned by the user.

      tcp_release_cb() should call tcp_write_timer_handler() or
      tcp_delack_timer_handler() in the appropriate context:

      BH disabled and socket lock held, but 'owned' field cleared,
      as if they were running from timer handlers.
      
      Fixes: 6f458dfb ("tcp: improve latencies of timer triggered events")
      Reported-by: Lars Persson <lars.persson@axis.com>
      Tested-by: Lars Persson <lars.persson@axis.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • vlan: Set correct source MAC address with TX VLAN offload enabled · bf63917e
      Peter Boström authored
      [ Upstream commit dd38743b ]
      
      With TX VLAN offload enabled the source MAC address for frames sent using the
      VLAN interface is currently set to the address of the real interface. This is
      wrong since the VLAN interface may be configured with a different address.
      
      The bug was introduced in commit 2205369a
      ("vlan: Fix header ops passthru when doing TX VLAN offload.").
      
      This patch sets the source address before calling the create function of the
      real interface.
      Signed-off-by: Peter Boström <peter.bostrom@netrounds.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • pkt_sched: fq: do not hold qdisc lock while allocating memory · 36d8aca1
      Eric Dumazet authored
      [ Upstream commit 2d8d40af ]
      
      Resizing the fq hash table allocates memory while holding the qdisc
      spinlock, with BH disabled.

      This is definitely not good, as the allocation might sleep.

      We can drop the lock and take it only when needed; we hold RTNL, so no
      other changes can happen at the same time.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Fixes: afe4fd06 ("pkt_sched: fq: Fair Queue packet scheduler")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • bnx2: Fix shutdown sequence · 1780772e
      Michael Chan authored
      [ Upstream commit a8d9bc2e ]
      
      The pci shutdown handler added in:
      
          bnx2: Add pci shutdown handler
          commit 25bfb1dd
      
      created a shutdown sequence without chip reset if the device was
      never brought up.  This can cause the firmware to shut down the PHY
      prematurely and cause MMIO read cycles to be unresponsive.  On some
      systems, it may generate an NMI in the bnx2's pci shutdown handler.

      The fix is to tell the firmware not to shut down the PHY if there was
      no prior chip reset.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ipv6: don't set DST_NOCOUNT for remotely added routes · 50fb0faf
      Sabrina Dubroca authored
      [ Upstream commit c88507fb ]
      
      DST_NOCOUNT should only be used if an authorized user adds routes
      locally. For routes added on behalf of router advertisements, this
      flag must not be used, as it would allow an unlimited number of
      routes to be added remotely.
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ipv6: Fix exthdrs offload registration. · 38de8fd0
      Anton Nayshtut authored
      [ Upstream commit d2d273ff ]
      
      Without this fix, ipv6_exthdrs_offload_init doesn't register IPPROTO_DSTOPTS
      offload, but returns 0 (as the IPPROTO_ROUTING registration actually succeeds).
      
      This then causes ipv6_gso_segment to drop IPv6 packets with an
      IPPROTO_DSTOPTS header.

      The issue was detected and the fix verified by running the MS HCK Offload
      LSO test on top of QEMU Windows guests, as this test sends IPv6 packets
      with IPPROTO_DSTOPTS.
      Signed-off-by: Anton Nayshtut <anton@swortex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • net: unix: non blocking recvmsg() should not return -EINTR · e3749adb
      Eric Dumazet authored
      [ Upstream commit de144391 ]
      
      Some applications did not expect that recvmsg() on a non-blocking
      socket could return -EINTR. This possibility was added as a side effect
      of commit b3ca9b02 ("net: fix multithreaded signal handling in
      unix recv routines").
      
      To hit this bug, you need to be a bit unlucky, as the u->readlock
      mutex is usually held for very small periods.
      
      Fixes: b3ca9b02 ("net: fix multithreaded signal handling in unix recv routines")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
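One way to picture the intended behaviour is a small error-mapping helper. This is a sketch under assumptions, not the kernel change: the function name and the idea of mapping a failed interruptible lock attempt to `-EAGAIN` for non-blocking callers are illustrative, and `-EINTR` is used here for the blocking case because this is plain userspace C.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: translate the result of an interruptible lock
 * attempt into the error a recvmsg()-like call should report.
 * lock_err is 0 on success or a negative errno value on interruption. */
int readlock_result(int lock_err, bool nonblocking)
{
	if (lock_err == 0)
		return 0;
	/* Non-blocking callers expect "try again", never "interrupted". */
	return nonblocking ? -EAGAIN : lock_err;
}
```

A non-blocking caller interrupted while waiting for the lock then sees `-EAGAIN`, which such applications already handle.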
    • inet: frag: make sure forced eviction removes all frags · d291aa32
      Florian Westphal authored
      [ Upstream commit e588e2f2 ]
      
      Quoting Alexander Aring:
        While fragmentation and unloading of 6lowpan module I got this kernel Oops
        after few seconds:
      
        BUG: unable to handle kernel paging request at f88bbc30
        [..]
        Modules linked in: ipv6 [last unloaded: 6lowpan]
        Call Trace:
         [<c012af4c>] ? call_timer_fn+0x54/0xb3
         [<c012aef8>] ? process_timeout+0xa/0xa
         [<c012b66b>] run_timer_softirq+0x140/0x15f
      
      The problem is that incomplete frags are still around after unload; when
      their frag expire timer fires, we get a crash.
      
      When a netns is removed (also done when unloading module), inet_frag
      calls the evictor with 'force' argument to purge remaining frags.
      
      The evictor loop terminates when accounted memory ('work') drops to 0
      or the lru-list becomes empty.  However, the mem accounting is done
      via percpu counters and may not be accurate, i.e. loop may terminate
      prematurely.
      
      Alter the evictor to stop only once the lru list is empty when force is
      requested.
      Reported-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
      Reported-by: Alexander Aring <alex.aring@gmail.com>
      Tested-by: Alexander Aring <alex.aring@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
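The loop-termination change described in this commit can be sketched in userspace C. The types and the function name below are hypothetical simplifications: `work` models the possibly inaccurate memory estimate, and the list is a plain singly linked LRU with no locking or freeing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified fragment queue on an LRU list. */
struct frag_queue {
	struct frag_queue *next;
	int mem; /* memory accounted to this queue */
};

/* Evict queues from the head of the LRU list. Without force, stop once
 * the estimated work is used up. With force, ignore the estimate and
 * stop only when the list is empty. Returns the number of evictions. */
int evict(struct frag_queue **lru_head, int work, bool force)
{
	int evicted = 0;

	while (work > 0 || force) {
		struct frag_queue *q = *lru_head;

		if (q == NULL)
			break;
		*lru_head = q->next;
		work -= q->mem;
		evicted++;
	}
	return evicted;
}
```

If the estimate undercounts, a non-forced call can leave queues on the list, whereas a forced call always drains it.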
    • tipc: don't log disabled tasklet handler errors · d80441c0
      Erik Hugne authored
      [ Upstream commit 2892505e ]
      
      Failure to schedule a TIPC tasklet with tipc_k_signal because the
      tasklet handler is disabled is not an error. It means TIPC is
      currently in the process of shutting down. We remove the error
      logging in this case.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • tipc: fix memory leak during module removal · 140490bc
      Erik Hugne authored
      [ Upstream commit 1bb8dce5 ]
      
      When the TIPC module is removed, the tasklet handler is disabled
      before all other subsystems. This will cause lingering publications
      in the name table because the node_down tasklets responsible for
      cleaning up publications from an unreachable node will never run.
      When the name table is shut down, these publications are detected
      and an error message is logged:
      tipc: nametbl_stop(): orphaned hash chain detected
      This is actually a memory leak, introduced with commit
      993b858e ("tipc: correct the order
      of stopping services at rmmod")
      
      Instead of just logging an error and leaking memory, we free
      the orphaned entries during nametable shutdown.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • tipc: drop subscriber connection id invalidation · fa7a24ee
      Erik Hugne authored
      [ Upstream commit edcc0511 ]
      
      When a topology server subscriber is disconnected, the associated
      connection id is set to zero. A check vs zero is then done in the
      subscription timeout function to see if the subscriber has been
      shut down. This is unnecessary, because all subscription timers
      will be cancelled when a subscriber terminates. Setting the
      connection id to zero is actually harmful because id zero is the
      identity of the topology server listening socket, and can cause a
      race that leads to this socket being closed instead.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • tipc: fix connection refcount leak · a9d79658
      Ying Xue authored
      [ Upstream commit 4652edb7 ]
      
      When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
      connection instance, its reference count value is increased if
      it's found. But subsequently if it's found that the connection is
      closed, the work of sending message is not queued into its server
      send workqueue, and the connection reference count is not decreased.
      This will cause a reference count leak. To reproduce this problem,
      an application would need to open and close topology server
      connections with high intensity.
      
      We fix this by immediately decrementing the connection reference
      count if a send fails due to the connection being closed.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
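The lookup/put imbalance described in this commit can be illustrated with a minimal sketch. Everything here is hypothetical and simplified: `struct conn`, the helper names, and the plain integer reference count stand in for the real TIPC structures, and `-EPIPE` is just an example error value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified connection object. */
struct conn {
	int refcnt;     /* reference count */
	bool connected; /* false once the connection is closed */
	int queued;     /* messages handed to the send path */
};

/* A lookup takes a reference on the connection it returns. */
struct conn *conn_lookup(struct conn *c)
{
	if (c != NULL)
		c->refcnt++;
	return c;
}

void conn_put(struct conn *c)
{
	c->refcnt--;
}

int conn_sendmsg(struct conn *entry)
{
	struct conn *c = conn_lookup(entry);

	if (c == NULL)
		return -EINVAL;
	if (!c->connected) {
		/* Drop the lookup reference on the failure path;
		 * omitting this is the kind of leak described above. */
		conn_put(c);
		return -EPIPE;
	}
	/* On success the reference stays with the queued work, which is
	 * assumed to drop it once the message has been sent. */
	c->queued++;
	return 0;
}
```

The point of the pattern is that every exit path after a successful lookup either hands the reference on or releases it.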
    • tipc: allow connection shutdown callback to be invoked in advance · 1f92d32f
      Ying Xue authored
      [ Upstream commit 6d4ebeb4 ]
      
      Currently connection shutdown callback function is called when
      connection instance is released in tipc_conn_kref_release(), and
      receiving packets and sending packets are running in different
      threads. Even if connection is closed by the thread of receiving
      packets, its shutdown callback may not be called immediately as
      the connection reference count is non-zero at that moment. So,
      although the connection is shut down by the thread of receiving
      packets, the thread of sending packets doesn't know it. Before
      its shutdown callback is invoked to tell the sending thread its
      connection has been closed, the sending thread may deliver
      messages by tipc_conn_sendmsg(). This is why the following error
      message appears:
      
      "Sending subscription event failed, no memory"
      
      To eliminate it, allow connection shutdown callback function to
      be called before connection id is removed in tipc_close_conn(),
      which makes the sending thread know the truth in time that its
      socket is closed so that it doesn't send message to it. We also
      remove the "Sending XXX failed..." error reporting for topology
      and config services.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • bridge: multicast: add sanity check for query source addresses · f8adfb64
      Linus Lüssing authored
      [ Upstream commit 6565b9ee ]
      
      MLD queries are supposed to have an IPv6 link-local source address
      according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch
      adds a sanity check to ignore MLD queries whose source address is not
      link-local.
      
      Without this check, such malformed MLD queries can result in a
      denial of service: the queries are ignored by any MLD listener,
      which therefore will not respond with an MLD report. However,
      without this patch these malformed MLD queries would enable the
      snooping part in the bridge code, potentially shutting down the
      corresponding ports towards these hosts for multicast traffic, as the
      bridge did not learn about these listeners.
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Linus Lüssing <linus.luessing@web.de>
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
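A link-local source check of this kind can be sketched in userspace with the standard `IN6_IS_ADDR_LINKLOCAL` macro. This is only an illustration: the function name is hypothetical, and the address is parsed from text for convenience rather than taken from a packet header as the bridge code would do.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Hypothetical helper: accept an MLD query only if its source address
 * (given here in text form) is an IPv6 link-local address (fe80::/10). */
bool mld_query_src_ok(const char *src_text)
{
	struct in6_addr src;

	if (inet_pton(AF_INET6, src_text, &src) != 1)
		return false; /* not a valid IPv6 address */
	return IN6_IS_ADDR_LINKLOCAL(&src) != 0;
}
```

A query from `fe80::1` passes this check, while one from a global address such as `2001:db8::1` does not.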
    • net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk · 607e4255
      Daniel Borkmann authored
      [ Upstream commit c485658b ]
      
      While working on ec0223ec ("net: sctp: fix sctp_sf_do_5_1D_ce to
      verify if we/peer is AUTH capable"), we noticed that there's a skb
      memory leakage in the error path.
      
      Running the same reproducer as in ec0223ec and by unconditionally
      jumping to the error label (to simulate an error condition) in
      sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
      the unfreed chunk->auth_chunk skb clone:
      
      Unreferenced object 0xffff8800b8f3a000 (size 256):
        comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
        backtrace:
          [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
          [<ffffffff81566929>] skb_clone+0x49/0xb0
          [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
          [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
          [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
          [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
          [<ffffffff815a64af>] nf_reinject+0xbf/0x180
          [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
          [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
          [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
          [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
          [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
          [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
          [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
          [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380
      
      What happens is that commit bbd0d598 clones the skb containing
      the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
      that an endpoint requires COOKIE-ECHO chunks to be authenticated:
      
        ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
        <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
        ------------------ AUTH; COOKIE-ECHO ---------------->
        <-------------------- COOKIE-ACK ---------------------
      
      When we enter sctp_sf_do_5_1D_ce() and before we actually get to
      the point where we process (and subsequently free) a non-NULL
      chunk->auth_chunk, we could hit the "goto nomem_init" path from
      an error condition and thus leave the cloned skb around w/o
      freeing it.
      
      The fix is to centrally free such clones in sctp_chunk_destroy()
      handler that is invoked from sctp_chunk_free() after all refs have
      dropped; and also move both kfree_skb(chunk->auth_chunk) there,
      so that chunk->auth_chunk is either NULL (since sctp_chunkify()
      allocs new chunks through kmem_cache_zalloc()) or non-NULL with
      a valid skb pointer. chunk->skb and chunk->auth_chunk are the
      only skbs in the sctp_chunk structure that need to be handled.
      
      While at it, we should use consume_skb() for both. It is the same
      as dev_kfree_skb() but more appropriately named, as we are not
      a device but a protocol. This effectively replaces both
      kfree_skb() invocations with consume_skb(). The two functions
      are the same, except that kfree_skb() assumes the frame was
      dropped after a failure (e.g. for tools like drop monitor), so
      consume_skb() seems more appropriate in sctp_chunk_destroy().
      
      Fixes: bbd0d598 ("[SCTP]: Implement the receive and verification of AUTH chunk")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <yasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • net: fix for a race condition in the inet frag code · e8443124
      Nikolay Aleksandrov authored
      [ Upstream commit 24b9bf43 ]
      
      I stumbled upon this very serious bug while hunting for another one,
      it's a very subtle race condition between inet_frag_evictor,
      inet_frag_intern and the IPv4/6 frag_queue and expire functions
      (basically the users of inet_frag_kill/inet_frag_put).
      
      What happens is that after a fragment has been added to the hash chain
      but before it's been added to the lru_list (inet_frag_lru_add) in
      inet_frag_intern, it may get deleted (either by an expired timer if
      the system load is high or the timer sufficiently low, or by the
      frag_queue function for different reasons) before it's added to the
      lru_list, then after it gets added it's a matter of time for the
      evictor to get to a piece of memory which has been freed leading to a
      number of different bugs depending on what's left there.
      
      I've been able to trigger this on both IPv4 and IPv6 (which is normal
      as the frag code is the same), but it's been much more difficult to
      trigger on IPv4 due to the protocol differences about how fragments
      are treated.
      
      The setup I used to reproduce this is: 2 machines with 4 x 10G bonded
      in a RR bond, so the same flow can be seen on multiple cards at the
      same time. Then I used multiple instances of ping/ping6 to generate
      fragmented packets and flood the machines with them while running
      other processes to load the attacked machine.
      
      *It is very important to have the _same flow_ coming in on multiple CPUs
      concurrently. Usually the attacked machine would die in less than 30
      minutes, if configured properly to have many evictor calls and timeouts
      it could happen in 10 minutes or so.
      
      An important point to make is that any caller (frag_queue or timer) of
      inet_frag_kill will remove both the timer refcount and the
      original/guarding refcount thus removing everything that's keeping the
      frag from being freed at the next inet_frag_put.  All of this could
      happen before the frag was ever added to the LRU list, then it gets
      added and the evictor uses a freed fragment.
      
      An example for IPv6 would be if a fragment is being added and is at
      the stage of being inserted in the hash after the hash lock is
      released, but before inet_frag_lru_add executes (or is able to obtain
      the lru lock) another overlapping fragment for the same flow arrives
      at a different CPU which finds it in the hash, but since it's
      overlapping it drops it invoking inet_frag_kill and thus removing all
      guarding refcounts, and afterwards freeing it by invoking
      inet_frag_put which removes the last refcount added previously by
      inet_frag_find, then inet_frag_lru_add gets executed by
      inet_frag_intern and we have a freed fragment in the lru_list.
      
      The fix is simple, just move the lru_add under the hash chain locked
      region so when a removing function is called it'll have to wait for
      the fragment to be added to the lru_list, and then it'll remove it (it
      works because the hash chain removal is done before the lru_list one
      and there's no window between the two list adds when the frag can get
      dropped). With this fix applied I couldn't kill the same machine in 24
      hours with the same setup.
      
      Fixes: 3ef0eb0d ("net: frag, move LRU list maintenance outside of
      rwlock")
      
      CC: Florian Westphal <fw@strlen.de>
      CC: Jesper Dangaard Brouer <brouer@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/cirrus: use drm_set_preferred_mode · 7837b5ae
      Gerd Hoffmann authored
      commit 121a6a17 upstream.
      
      Explicitly set 1024x768 as default mode, so the display doesn't come up
      with the largest supported mode.
      
      While at it, drop the first three drm_add_modes_noedid calls.  As
      drm_add_modes_noedid fills the mode list with modes from the database
      *up to* the specified size, it is pretty pointless to call it multiple
      times with different sizes.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm: add drm_set_preferred_mode · 1d2fa7e7
      Gerd Hoffmann authored
      commit 3cf70daf upstream.
      
      New helper function to set the preferred video mode.  Can be called
      after drm_add_modes_noedid if you don't want the largest supported
      video mode to be used by default.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
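The idea of a "set preferred mode" helper can be sketched as follows. This is not the DRM implementation: `struct mode`, `MODE_TYPE_PREFERRED`, and `set_preferred_mode()` are hypothetical stand-ins, and a plain array replaces the connector's mode list.

```c
#include <stddef.h>

/* Hypothetical flag marking a mode as the preferred one. */
#define MODE_TYPE_PREFERRED 0x1u

/* Hypothetical, simplified video mode entry. */
struct mode {
	int hdisplay;      /* width in pixels */
	int vdisplay;      /* height in pixels */
	unsigned int type; /* flag bits */
};

/* Mark every mode matching the requested size as preferred.
 * Returns the number of modes that were marked. */
int set_preferred_mode(struct mode *modes, size_t n, int hpref, int vpref)
{
	int marked = 0;

	for (size_t i = 0; i < n; i++) {
		if (modes[i].hdisplay == hpref && modes[i].vdisplay == vpref) {
			modes[i].type |= MODE_TYPE_PREFERRED;
			marked++;
		}
	}
	return marked;
}
```

A driver that wants 1024x768 by default would, in this sketch, first populate the list up to its maximum size and then call `set_preferred_mode(modes, n, 1024, 768)`.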
    • fbdev: Make the switch from generic to native driver less alarming · aa66ee97
      Adam Jackson authored
      commit 13ba0ad4 upstream.
      
      Calling this "conflicting" just makes people think there's a problem
      when there's not.
      Signed-off-by: Adam Jackson <ajax@redhat.com>
      Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • video/fb: Propagate error code from failing to unregister conflicting fb · 9419c62c
      Chris Wilson authored
      commit 46eeb2c1 upstream.
      
      If we fail to remove a conflicting fb driver, we need to abort the
      loading of the second driver to avoid likely kernel panics.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: linux-fbdev@vger.kernel.org
      Cc: dri-devel@lists.freedesktop.org
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • fb: reorder the lock sequence to fix potential dead lock · be528908
      Gu Zheng authored
      commit 3a41c5db upstream.
      
      The following commits:
      
      50e244cc fb: rework locking to fix lock ordering on takeover
      e93a9a86 fb: Yet another band-aid for fixing lockdep mess
      054430e7 fbcon: fix locking harder
      
      reworked locking to fix related lock ordering on takeover and introduced console_lock
      into fbmem, but the new lock sequence (fb_info->lock ---> console_lock) seems to
      conflict with the one in console_callback (console_lock ---> fb_info->lock), which leads to
      a potential deadlock as follows:
      
      [  601.079000] ======================================================
      [  601.079000] [ INFO: possible circular locking dependency detected ]
      [  601.079000] 3.11.0 #189 Not tainted
      [  601.079000] -------------------------------------------------------
      [  601.079000] kworker/0:3/619 is trying to acquire lock:
      [  601.079000]  (&fb_info->lock){+.+.+.}, at: [<ffffffff81397566>] lock_fb_info+0x26/0x60
      [  601.079000]
      but task is already holding lock:
      [  601.079000]  (console_lock){+.+.+.}, at: [<ffffffff8141aae3>] console_callback+0x13/0x160
      [  601.079000]
      which lock already depends on the new lock.
      
      [  601.079000]
      the existing dependency chain (in reverse order) is:
      [  601.079000]
      -> #1 (console_lock){+.+.+.}:
      [  601.079000]        [<ffffffff810dc971>] lock_acquire+0xa1/0x140
      [  601.079000]        [<ffffffff810c6267>] console_lock+0x77/0x80
      [  601.079000]        [<ffffffff81399448>] register_framebuffer+0x1d8/0x320
      [  601.079000]        [<ffffffff81cfb4c8>] efifb_probe+0x408/0x48f
      [  601.079000]        [<ffffffff8144a963>] platform_drv_probe+0x43/0x80
      [  601.079000]        [<ffffffff8144853b>] driver_probe_device+0x8b/0x390
      [  601.079000]        [<ffffffff814488eb>] __driver_attach+0xab/0xb0
      [  601.079000]        [<ffffffff814463bd>] bus_for_each_dev+0x5d/0xa0
      [  601.079000]        [<ffffffff81447e6e>] driver_attach+0x1e/0x20
      [  601.079000]        [<ffffffff81447a07>] bus_add_driver+0x117/0x290
      [  601.079000]        [<ffffffff81448fea>] driver_register+0x7a/0x170
      [  601.079000]        [<ffffffff8144a10a>] __platform_driver_register+0x4a/0x50
      [  601.079000]        [<ffffffff8144a12d>] platform_driver_probe+0x1d/0xb0
      [  601.079000]        [<ffffffff81cfb0a1>] efifb_init+0x273/0x292
      [  601.079000]        [<ffffffff81002132>] do_one_initcall+0x102/0x1c0
      [  601.079000]        [<ffffffff81cb80a6>] kernel_init_freeable+0x15d/0x1ef
      [  601.079000]        [<ffffffff8166d2de>] kernel_init+0xe/0xf0
      [  601.079000]        [<ffffffff816914ec>] ret_from_fork+0x7c/0xb0
      [  601.079000]
      -> #0 (&fb_info->lock){+.+.+.}:
      [  601.079000]        [<ffffffff810dc1d8>] __lock_acquire+0x1e18/0x1f10
      [  601.079000]        [<ffffffff810dc971>] lock_acquire+0xa1/0x140
      [  601.079000]        [<ffffffff816835ca>] mutex_lock_nested+0x7a/0x3b0
      [  601.079000]        [<ffffffff81397566>] lock_fb_info+0x26/0x60
      [  601.079000]        [<ffffffff813a4aeb>] fbcon_blank+0x29b/0x2e0
      [  601.079000]        [<ffffffff81418658>] do_blank_screen+0x1d8/0x280
      [  601.079000]        [<ffffffff8141ab34>] console_callback+0x64/0x160
      [  601.079000]        [<ffffffff8108d855>] process_one_work+0x1f5/0x540
      [  601.079000]        [<ffffffff8108e04c>] worker_thread+0x11c/0x370
      [  601.079000]        [<ffffffff81095fbd>] kthread+0xed/0x100
      [  601.079000]        [<ffffffff816914ec>] ret_from_fork+0x7c/0xb0
      [  601.079000]
      other info that might help us debug this:
      
      [  601.079000]  Possible unsafe locking scenario:
      
      [  601.079000]        CPU0                    CPU1
      [  601.079000]        ----                    ----
      [  601.079000]   lock(console_lock);
      [  601.079000]                                lock(&fb_info->lock);
      [  601.079000]                                lock(console_lock);
      [  601.079000]   lock(&fb_info->lock);
      [  601.079000]
       *** DEADLOCK ***
      
      So we reorder the lock sequence to match the one in console_callback() to
      avoid this issue. Following Tomi's suggestion, we also fix the similar
      issues throughout the fb subsystem.
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      be528908
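      The inversion reported in "fb: reorder the lock sequence to fix potential dead lock"
      can be illustrated with a very small lock-order tracker. This is only a sketch of the
      idea behind such a report, not how lockdep is implemented; the type order_graph and
      the functions order_graph_init() and order_graph_acquire() are hypothetical names
      invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/* The two locks named in the commit message. */
enum { LOCK_CONSOLE, LOCK_FB_INFO, NUM_LOCKS };

struct order_graph {
    /* seen[a][b] != 0: some code path acquired lock b while holding lock a. */
    unsigned char seen[NUM_LOCKS][NUM_LOCKS];
};

void order_graph_init(struct order_graph *g)
{
    assert(g != NULL);
    memset(g, 0, sizeof(*g));
}

/*
 * Record that `acquiring` is taken while `held` is already held.
 * Returns 1 if the opposite order was recorded earlier (a possible
 * ABBA deadlock), 0 otherwise.
 */
int order_graph_acquire(struct order_graph *g, int held, int acquiring)
{
    assert(g != NULL);
    assert(held >= 0 && held < NUM_LOCKS);
    assert(acquiring >= 0 && acquiring < NUM_LOCKS);
    assert(held != acquiring);

    g->seen[held][acquiring] = 1;
    return g->seen[acquiring][held] != 0;
}
```

      Recording fb_info->lock ---> console_lock first and console_lock ---> fb_info->lock
      second makes the second call return 1, which corresponds to the circular dependency
      in the trace. When every path uses console_lock ---> fb_info->lock, no call reports
      an inversion.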
    • drm: Prefer noninterlace cmdline mode unless explicitly specified · c9c5b01e
      Takashi Iwai authored
      commit c683f427 upstream.
      
      Currently drm_pick_cmdline_mode() ignores interlacing when the given
      mode line has no "i" suffix.  That is, when there are multiple entries
      for the same resolution, an interlaced mode might be picked purely
      because of the order of the entries, and there is no way to exclude
      it.
      
      This patch changes the logic for the mode selection, to prefer the
      noninterlace mode unless the interlace mode is explicitly given.
      When no matching mode is found, it still tries the interlace mode as
      fallback.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      c9c5b01e
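      The selection rule described in "drm: Prefer noninterlace cmdline mode unless
      explicitly specified" can be sketched as below. This is a simplified illustration
      and not the kernel's drm_pick_cmdline_mode(): struct mode and pick_mode() are
      hypothetical, and refresh-rate matching is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced description of a display mode. */
struct mode {
    int hdisplay;
    int vdisplay;
    bool interlaced;
};

/*
 * Return the index of the chosen mode, or -1 if no mode has the
 * requested resolution.
 *
 * want_interlace == true  (an "i" suffix was given): only interlaced
 *                         modes are accepted.
 * want_interlace == false: the first noninterlaced match wins; the
 *                         first interlaced match is kept as a fallback.
 */
int pick_mode(const struct mode *modes, size_t n, int width, int height,
              bool want_interlace)
{
    int fallback = -1;
    size_t i;

    assert(modes != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (modes[i].hdisplay != width || modes[i].vdisplay != height)
            continue;
        if (modes[i].interlaced == want_interlace)
            return (int)i;
        if (!want_interlace && fallback < 0)
            fallback = (int)i;
    }
    return fallback;
}
```

      With an interlaced 1920x1080 entry listed before a noninterlaced one, a request
      without the "i" suffix now selects the noninterlaced entry regardless of order.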
    • drm/radeon: enable speaker allocation setup on dce3.2 · 21a4209d
      Alex Deucher authored
      commit 3803c8e5 upstream.
      
      Now that we disable audio while setting up the audio
      hw, we should be able to set this up without hangs.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      21a4209d
    • drm/radeon: change audio enable logic · d6bb21f4
      Alex Deucher authored
      commit 832eafaf upstream.
      
      Disable audio around audio hw setup.  This may avoid
      hangs on certain asics.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      d6bb21f4
    • drm/cirrus: Fix cirrus drm driver for fbdev + qemu · 929476d0
      Martin Koegler authored
      commit 99d4a8ae upstream.
      
      Xorg fbdev driver requires smem_start/smem_len, otherwise
      it tries to map 0 bytes as video memory.
      
      Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=856760
      Signed-off-by: Martin Koegler <martin.koegler@chello.at>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      929476d0
    • drm/i915: Undo the PIPEA quirk for i845 · 14fa4b54
      Chris Wilson authored
      commit a4945f95 upstream.
      
      The PIPEA quirk is specifically for the issue with the PIPEB PLL on
      830gm being slaved to the PIPEA PLL, and so to use PIPEB requires PIPEA
      running. i845 doesn't even have the second PLL or pipe, and enabling
      the quirk results in a blank DVO LVDS.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      14fa4b54
    • floppy: bail out in open() if drive is not responding to block0 read · e6732e00
      Jiri Kosina authored
      commit 7b7b68bb upstream.
      
      In case reading of block 0 during open() fails, it is not the right thing
      to let open() succeed.
      
      Fix this by introducing FD_OPEN_SHOULD_FAIL_BIT flag, and setting it in
      case the bio callback encounters an error while trying to read block 0.
      
      As a bonus, this works around certain broken userspace (blkid), which is
      not able to properly handle read()s returning IO errors. Hence be nice to
      those, and bail out during open() already; if block 0 is not readable,
      read()s are not going to provide any meaningful data anyway.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      e6732e00
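      The flag-based approach described in "floppy: bail out in open() if drive is not
      responding to block0 read" might look roughly like the sketch below. Everything here
      is an assumption made for illustration: struct drive_state, block0_read_done(),
      drive_open(), the bit index, and the -ENXIO return value are hypothetical and are not
      taken from the floppy driver. The read of block 0 is simulated by passing its result
      as a parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit index, modeled on the FD_OPEN_SHOULD_FAIL_BIT idea. */
#define OPEN_SHOULD_FAIL_BIT 0

struct drive_state {
    unsigned long flags;
};

/* Completion callback for the block 0 read: remember a failure. */
void block0_read_done(struct drive_state *d, int error)
{
    assert(d != NULL);
    if (error)
        d->flags |= 1UL << OPEN_SHOULD_FAIL_BIT;
}

/*
 * open(): clear the flag, read block 0, then fail the open if the
 * callback recorded an error. block0_read_error stands in for the
 * outcome of the real I/O (0 on success, negative errno on failure).
 */
int drive_open(struct drive_state *d, int block0_read_error)
{
    assert(d != NULL);
    d->flags &= ~(1UL << OPEN_SHOULD_FAIL_BIT);

    /* Stands in for submitting the read and waiting for completion. */
    block0_read_done(d, block0_read_error);

    if (d->flags & (1UL << OPEN_SHOULD_FAIL_BIT))
        return -ENXIO; /* assumed error code for this sketch */
    return 0;
}
```

      Clearing the flag at the start of each open() keeps a failure from one attempt from
      affecting a later attempt after the media becomes readable.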
    • ext4: Speedup WB_SYNC_ALL pass called from sync(2) · 3042adcc
      Jan Kara authored
      commit 10542c22 upstream.
      
      When doing a filesystem-wide sync, there's no need to force a transaction
      commit (or synchronously write the inode buffer) separately for each inode,
      because ext4_sync_fs() takes care of forcing a commit at the end (and the
      VFS takes care of flushing the buffer cache, respectively). Most of the
      time this slowness doesn't manifest because the previous WB_SYNC_NONE
      writeback doesn't leave much to write, but when there are processes
      aggressively creating new files and several filesystems to sync, the sync
      slowness can be noticeable. With the following test script, sync(1) takes
      around 6 minutes when there are two ext4 filesystems mounted on a standard
      SATA drive. After this patch sync takes a couple of seconds, so we have
      about two orders of magnitude improvement.
      
            function run_writers
            {
              for (( i = 0; i < 10; i++ )); do
                mkdir $1/dir$i
                for (( j = 0; j < 40000; j++ )); do
                  dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
                done &
              done
            }
      
            for dir in "$@"; do
              run_writers $dir
            done
      
            sleep 40
            time sync
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      3042adcc
    • SUNRPC: Fix potential memory scribble in xprt_free_bc_request() · 8eab668e
      Trond Myklebust authored
      commit 62835679 upstream.
      
      The call to xprt_free_allocation() will call list_del() on
      req->rq_bc_pa_list, which is not attached to a list.
      This patch moves the list_del() out of xprt_free_allocation()
      and into those callers that need it.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      8eab668e
    • NFSv3: Fix return value of nfs3_proc_setacls · 173d76aa
      Trond Myklebust authored
      commit 8f493b9c upstream.
      
      nfs3_proc_setacls is used internally by the NFSv3 create operations
      to set the acl after the file has been created. If the operation
      fails because the server doesn't support acls, then it must return '0',
      not -EOPNOTSUPP.
      Reported-by: Russell King <linux@arm.linux.org.uk>
      Link: http://lkml.kernel.org/r/20140201010328.GI15937@n2100.arm.linux.org.uk
      Cc: Christoph Hellwig <hch@lst.de>
      Tested-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      173d76aa
    • nfs: initialize the ACL support bits to zero. · 0eec4308
      Malahal Naineni authored
      commit a1800aca upstream.
      
      Avoid returning incorrect acl mask attributes when the server doesn't
      support ACLs.
      Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      0eec4308
    • Char: ipmi_bt_sm, fix infinite loop · 1070f326
      Jiri Slaby authored
      commit a94cdd1f upstream.
      
      In read_all_bytes, we do
      
        unsigned char i;
        ...
        bt->read_data[0] = BMC2HOST;
        bt->read_count = bt->read_data[0];
        ...
        for (i = 1; i <= bt->read_count; i++)
          bt->read_data[i] = BMC2HOST;
      
      If bt->read_data[0] == bt->read_count == 255, we loop infinitely in the
      'for' loop.  Make 'i' an 'int' instead of 'char' to get rid of the
      overflow and finish the loop after 255 iterations every time.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Reported-and-debugged-by: Rui Hui Dian <rhdian@novell.com>
      Cc: Tomas Cech <tcech@suse.cz>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: <openipmi-developer@lists.sourceforge.net>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      1070f326
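      The counter overflow fixed in "Char: ipmi_bt_sm, fix infinite loop" can be shown
      outside the driver with a short sketch. The names fill_read_data(),
      next_counter_value() and READ_BUF_LEN are hypothetical and exist only for this
      illustration; the sketch assumes an 8-bit unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer size: large enough that index 255 is valid. */
#define READ_BUF_LEN 256

/*
 * Fill buf[1..read_count] using an int counter, as the fix does.
 * Returns the number of loop iterations executed.
 */
int fill_read_data(unsigned char *buf, unsigned char read_count,
                   unsigned char value)
{
    int i;              /* int can reach 256, so the loop terminates */
    int iterations = 0;

    assert(buf != NULL);
    for (i = 1; i <= read_count; i++) {
        buf[i] = value;
        iterations++;
    }
    return iterations;
}

/*
 * What the old counter did at the boundary: an 8-bit unsigned char
 * holding 255 wraps to 0 on increment, so "i <= 255" never becomes
 * false.
 */
unsigned char next_counter_value(unsigned char i)
{
    i++;
    return i;
}
```

      With read_count equal to 255, the int counter runs 255 iterations and stops at 256,
      whereas an unsigned char counter would wrap from 255 back to 0 and keep satisfying
      the loop condition.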
  2. 13 Apr, 2014 5 commits