1. 17 Jul, 2012 1 commit
    • Neil Horman's avatar
      sctp: Fix list corruption resulting from freeing an association on a list · 2eebc1e1
      Neil Horman authored
      A few days ago Dave Jones reported this oops:
      
      [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
      [22766.295376] CPU 0
      [22766.295384] Modules linked in:
      [22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
      ffff880147c03a74
      [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
      [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
      [22766.387137] Stack:
      [22766.387140]  ffff880147c03a10
      [22766.387140]  ffffffffa169f2b6
      [22766.387140]  ffff88013ed95728
      [22766.387143]  0000000000000002
      [22766.387143]  0000000000000000
      [22766.387143]  ffff880003fad062
      [22766.387144]  ffff88013c120000
      [22766.387144]
      [22766.387145] Call Trace:
      [22766.387145]  <IRQ>
      [22766.387150]  [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0
      [sctp]
      [22766.387154]  [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
      [22766.387157]  [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
      [22766.387161]  [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [22766.387163]  [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
      [22766.387166]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387168]  [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
      [22766.387169]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387171]  [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
      [22766.387172]  [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
      [22766.387174]  [<ffffffff81590c54>] ip_rcv+0x214/0x320
      [22766.387176]  [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
      [22766.387178]  [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
      [22766.387180]  [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
      [22766.387182]  [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
      [22766.387183]  [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
      [22766.387185]  [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
      [22766.387187]  [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
      [22766.387218]  [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
      [22766.387242]  [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
      [22766.387266]  [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
      [22766.387268]  [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
      [22766.387270]  [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
      [22766.387273]  [<ffffffff810734d0>] __do_softirq+0xe0/0x420
      [22766.387275]  [<ffffffff8169826c>] call_softirq+0x1c/0x30
      [22766.387278]  [<ffffffff8101db15>] do_softirq+0xd5/0x110
      [22766.387279]  [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
      [22766.387281]  [<ffffffff81698b03>] do_IRQ+0x63/0xd0
      [22766.387283]  [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
      [22766.387283]  <EOI>
      [22766.387284]
      [22766.387285]  [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
      [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
      89 e5 48 83
      ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00
      48 89 fb
      49 89 f5 66 c1 c0 08 66 39 46 02
      [22766.387307]
      [22766.387307] RIP
      [22766.387311]  [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
      [22766.387311]  RSP <ffff880147c039b0>
      [22766.387142]  ffffffffa16ab120
      [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
      [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt
      
      It appears from his analysis and some staring at the code that this is likely
      occuring because an association is getting freed while still on the
      sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
      while a freed node corrupts part of the list.
      
      Nominally I would think that an mibalanced refcount was responsible for this,
      but I can't seem to find any obvious imbalance.  What I did note however was
      that the two places where we create an association using
      sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
      which free a newly created association after calling sctp_primitive_ASSOCIATE.
      sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
      issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
      the aforementioned hash table.  the sctp command interpreter that process side
      effects has not way to unwind previously processed commands, so freeing the
      association from the __sctp_connect or sctp_sendmsg error path would lead to a
      freed association remaining on this hash table.
      
      I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
      which allows us to proerly use hlist_unhashed to check if the node is on a
      hashlist safely during a delete.  That in turn alows us to safely call
      sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
      before freeing them, regardles of what the associations state is on the hash
      list.
      
      I noted, while I was doing this, that the __sctp_unhash_endpoint was using
      hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
      pointers to make that function work properly, so I fixed that up in a simmilar
      fashion.
      
      I attempted to test this using a virtual guest running the SCTP_RR test from
      netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
      able to recreate the problem prior to this fix, nor was I able to trigger the
      failure after (neither of which I suppose is suprising).  Given the trace above
      however, I think its likely that this is what we hit.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: davej@redhat.com
      CC: davej@redhat.com
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Sridhar Samudrala <sri@us.ibm.com>
      CC: linux-sctp@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2eebc1e1
  2. 16 Jul, 2012 1 commit
  3. 12 Jul, 2012 1 commit
  4. 11 Jul, 2012 4 commits
  5. 09 Jul, 2012 29 commits
  6. 05 Jul, 2012 1 commit
  7. 04 Jul, 2012 3 commits
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux · 9e85a6f9
      Linus Torvalds authored
      Pull fix to common clk framework from Michael Turquette:
       "The previous set of common clk fixes for -rc5 left an uninitialized
        int which could lead to bad array indexing when switching clock
        parents.  The issue is fixed with a trivial change to the code flow in
        __clk_set_parent."
      
      * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
        clk: fix parent validation in __clk_set_parent()
      9e85a6f9
    • Linus Torvalds's avatar
      Merge tag 'md-3.5-fixes' of git://neil.brown.name/md · 6c8addcb
      Linus Torvalds authored
      Pull raid10 build failure fix from NeilBrown:
       "I really shouldn't do important things late in the day.  It seems that
        I get careless."
      
      * tag 'md-3.5-fixes' of git://neil.brown.name/md:
        md/raid10: fix careless build error
      6c8addcb
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 56728748
      Linus Torvalds authored
      Pull networking update from David Miller:
      
       1) Fix RX sequence number handling in mwifiex, from Stone Piao.
      
       2) Netfilter ipset mis-compares device names, fix from Florian
          Westphal.
      
       3) Fix route leak in ipv6 IPVS, from Eric Dumazet.
      
       4) NFS fixes.  Several buffer overflows in NCI layer from Dan
          Rosenberg, and release sock OOPS'er fix from Eric Dumazet.
      
       5) Fix WEP handling ath9k, we started using a bit the chip provides to
          indicate undecrypted packets but that bit turns out to be unreliable
          in certain configurations.  Fix from Felix Fietkau.
      
       6) Fix Kconfig dependency bug in wlcore, from Randy Dunlap.
      
       7) New USB IDs for rtlwifi driver from Larry Finger.
      
       8) Fix crashes in qmi_wwan usbnet driver when disconnecting, from Bjørn
          Mork.
      
       9) Gianfar driver programs coalescing settings properly in single queue
          mode, but does not do so in multi-queue mode.  Fix from Claudiu
          Manoil.
      
      10) Missing module.h include in davinci_cpdma.c, from Daniel Mack.
      
      11) Need dummy handler for IPSET_CMD_NONE otherwise we crash in ipset if
          we get this via nfnetlink, fix from Tomasz Bursztyka.
      
      12) Missing RCU unlock in nfnetlink error path, also from Tomasz.
      
      13) Fix divide by zero in igbvf when the user tries to set an RX
          coalescing value of 0 usecs, from Mitch A Williams.
      
      14) We can process SCTP sacks for the wrong transport, oops.  Fix from
          Neil Horman.
      
      15) Remove hw IP payload checksumming from e1000e driver.  This has zery
          value in our stack, and turning it on creates a very unintuitive
          restriction for users when using jumbo MTUs.
      
          Specifically, when IP payload checksums are on you cannot use both
          receive hashing offload and jumbo MTU.  Fix from Bruce Allan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
        e1000e: remove use of IP payload checksum
        sctp: be more restrictive in transport selection on bundled sacks
        igbvf: fix divide by zero
        netfilter: nfnetlink: fix missing rcu_read_unlock in nfnetlink_rcv_msg
        netfilter: ipset: fix crash if IPSET_CMD_NONE command is sent
        davinci_cpdma: include linux/module.h
        gianfar: Fix RXICr/TXICr programming for multi-queue mode
        net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.
        net: qmi_wwan: fix Oops while disconnecting
        mwifiex: fix memory leak associated with IE manamgement
        ath9k: fix panic caused by returning a descriptor we have queued for reuse
        mac80211: correct behaviour on unrecognised action frames
        ath9k: enable serialize_regmode for non-PCIE AR9287
        rtlwifi: rtl8192cu: New USB IDs
        NFC: Return from rawsock_release when sk is NULL
        iwlwifi: fix activating inactive stations
        wlcore: drop INET dependency
        ath9k: fix dynamic WEP related regression
        NFC: Prevent multiple buffer overflows in NCI
        netfilter: update location of my trees
        ...
      56728748