1. 31 Jul, 2012 12 commits
  2. 19 Jul, 2012 6 commits
  3. 18 Jul, 2012 2 commits
  4. 17 Jul, 2012 15 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a0185401
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) IPVS oops'ers:
         a) Should not reset skb->nf_bridge in forwarding hook (Lin Ming)
         b) 3.4 commit can cause ip_vs_control_cleanup to be invoked after
            the ipvs_core_ops are unregistered during rmmod (Julian ANastasov)
      
       2) ixgbevf bringup failure can crash in TX descriptor cleanup
          (Alexander Duyck)
      
       3) AX25 switch missing break statement hoses ROSE sockets (Alan Cox)
      
       4) CAIF accesses freed per-net memory (Sjur Brandeland)
      
       5) Network cgroup code has out-or-bounds accesses (Eric DUmazet), and
          accesses freed memory (Gao Feng)
      
       6) Fix a crash in SCTP reported by Dave Jones caused by freeing an
          association still on a list (Neil HOrman)
      
       7) __netdev_alloc_skb() regresses on GFP_DMA using drivers because that
          GFP flag is not being retained for the allocation (Eric Dumazet).
      
       8) Missing NULL hceck in sch_sfb netlink message parsing (Alan Cox)
      
       9) bnx2 crashes because TX index iteration is not bounded correctly
          (Michael Chan)
      
      10) IPoIB generates warnings in TCP queue collapsing (via
          skb_try_coalesce) because it does not set skb->truesize correctly
          (Eric Dumazet)
      
      11) vlan_info objects leak for the implicit vlan with ID 0 (Amir
          Hanania)
      
      12) A fix for TX time stamp handling in gianfar does not transfer socket
          ownership from one packet to another correctly, resulting in a
          socket write space imbalance (Eric Dumazet)
      
      13) Julia Lawall found several cases where we do a list iteration, and
          then at the loop termination unconditionally assume we ended up with
          real list object, rather than the list head itself (CNIC, RXRPC,
          mISDN).
      
      14) The bonding driver handles procfs moving incorrectly when a device
          it manages is moved from one namespace to another (Eric Biederman)
      
      15) Missing memory barriers in stmmac descriptor accesses result in
          various crashes (Deepak Sikri)
      
      16) Fix handling of broadcast packets in batman-adv (Simon Wunderlich)
      
      17) Properly check the sanity of sendmsg() lengths in ieee802154's
          dgram_sendmsg().  Dave Jones and others have hit and reported this
          bug (Sasha Levin)
      
      18) Some drivers (b44 and b43legacy) on 64-bit machines stopped working
          because of how netdev_alloc_skb() was adjusted.  Such drivers should
          now use alloc_skb() for obtaining bounce buffers.  (Eric Dumazet)
      
      19) atl1c mis-managed it's link state in that it stops the queue by hand
          on link down.  The generic networking takes care of that and this
          double stop locks the queue down.  So simply removing the driver's
          queue stop call fixes the problem (Cloud Ren)
      
      20) Fix out-of-memory due to mis-accounting in net_em packet scheduler
          (Eric Dumazet)
      
      21) If DCB and SR-IOV are configured at the same time in IXGBE the chip
          will hang because this is not supported (Alexander Duyck)
      
      22) A commit to stop drivers using netdev->base_addr broke the CNIC
          driver (Michael Chan)
      
      23) Timeout regression in ipset caused by an attempt to fix an overflow
          bug (Jozsef Kadlecsik).
      
      24) mac80211 minstrel code allocates memory using incorrect size
          (Thomas Huehn)
      
      25) llcp_sock_getname() needs to check for a NULL device otherwise we
          OOPS (Sasha Levin)
      
      26) mwifiex leaks memory (Bing Zhao)
      
      27) Propagate iwlwifi fix to iwlegacy, even when we're not associated
          we need to monitor for stuck queues in the watchdog handler
          (Stanislaw Geuszka)
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        ipvs: fix oops in ip_vs_dst_event on rmmod
        ipvs: fix oops on NAT reply in br_nf context
        ixgbevf: Fix panic when loading driver
        ax25: Fix missing break
        MAINTAINERS: reflect actual changes in IEEE 802.15.4 maintainership
        caif: Fix access to freed pernet memory
        net: cgroup: fix access the unallocated memory in netprio cgroup
        ixgbevf: Prevent RX/TX statistics getting reset to zero
        sctp: Fix list corruption resulting from freeing an association on a list
        net: respect GFP_DMA in __netdev_alloc_skb()
        e1000e: fix test for PHY being accessible on 82577/8/9 and I217
        e1000e: Correct link check logic for 82571 serdes
        sch_sfb: Fix missing NULL check
        bnx2: Fix bug in bnx2_free_tx_skbs().
        IPoIB: fix skb truesize underestimatiom
        net: Fix memory leak - vlan_info struct
        gianfar: fix potential sk_wmem_alloc imbalance
        drivers/net/ethernet/broadcom/cnic.c: remove invalid reference to list iterator variable
        net/rxrpc/ar-peer.c: remove invalid reference to list iterator variable
        drivers/isdn/mISDN/stack.c: remove invalid reference to list iterator variable
        ...
      a0185401
    • Linus Torvalds's avatar
      Merge tag 'single-rpmsg-3.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg · 635ac119
      Linus Torvalds authored
      Pull rpmsg fix from Ohad Ben-Cohen:
       "A single rpmsg fix for 3.5, coming from Federico Fuga, which
        eliminates the dependency on arbitrary initialization orders."
      
      * tag 'single-rpmsg-3.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
        rpmsg: fix dependency on initialization order
      635ac119
    • Linus Torvalds's avatar
      Merge branch 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · 5bb93f1a
      Linus Torvalds authored
      Pull CMA and DMA-mapping fixes from Marek Szyprowski:
       "Another set of minor fixups for recently merged Contiguous Memory
        Allocator and ARM DMA-mapping changes.  Those patches fix mysterious
        crashes on systems with CMA and Himem enabled as well as some corner
        cases caused by typical off-by-one bug."
      
      * 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
        ARM: dma-mapping: modify condition check while freeing pages
        mm: cma: fix condition check when setting global cma area
        mm: cma: don't replace lowmem pages with highmem
      5bb93f1a
    • David S. Miller's avatar
      Merge branch 'master' of git://1984.lsi.us.es/nf · 602e65a3
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      I know that we're in fairly late stage to request pulls, but the IPVS people
      pinged me with little patches with oops fixes last week.
      
      One of them was recently introduced (during the 3.4 development cycle) while
      cleaning up the IPVS netns support. They are:
      
      * Fix one regression introduced in 3.4 while cleaning up the
        netns support for IPVS, from Julian Anastasov.
      
      * Fix one oops triggered due to resetting the conntrack attached to the skb
        instead of just putting it in the forward hook, from Lin Ming. This problem
        seems to be there since 2.6.37 according to Simon Horman.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      602e65a3
    • Federico Fuga's avatar
      rpmsg: fix dependency on initialization order · 96342526
      Federico Fuga authored
      When rpmsg drivers are built into the kernel, they must not initialize
      before the rpmsg bus does, otherwise they'd trigger a BUG() in
      drivers/base/driver.c line 169 (driver_register()).
      
      To fix that, and to stop depending on arbitrary linkage ordering of
      those built-in rpmsg drivers, we make the rpmsg bus initialize at
      subsys_initcall.
      
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarFederico Fuga <fuga@studiofuga.com>
      [ohad: rewrite the commit log]
      Signed-off-by: default avatarOhad Ben-Cohen <ohad@wizery.com>
      96342526
    • Julian Anastasov's avatar
      ipvs: fix oops in ip_vs_dst_event on rmmod · 283283c4
      Julian Anastasov authored
      	After commit 39f618b4 (3.4)
      "ipvs: reset ipvs pointer in netns" we can oops in
      ip_vs_dst_event on rmmod ip_vs because ip_vs_control_cleanup
      is called after the ipvs_core_ops subsys is unregistered and
      net->ipvs is NULL. Fix it by exiting early from ip_vs_dst_event
      if ipvs is NULL. It is safe because all services and dests
      for the net are already freed.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      283283c4
    • Lin Ming's avatar
      ipvs: fix oops on NAT reply in br_nf context · 9e33ce45
      Lin Ming authored
      IPVS should not reset skb->nf_bridge in FORWARD hook
      by calling nf_reset for NAT replies. It triggers oops in
      br_nf_forward_finish.
      
      [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
      [  579.781792] PGD 218f9067 PUD 0
      [  579.781865] Oops: 0000 [#1] SMP
      [  579.781945] CPU 0
      [  579.781983] Modules linked in:
      [  579.782047]
      [  579.782080]
      [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
      [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
      [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
      [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
      [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
      [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
      [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
      [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
      [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
      [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
      [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
      [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
      [  579.783919] Stack:
      [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
      [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
      [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
      [  579.784477] Call Trace:
      [  579.784523]  <IRQ>
      [  579.784562]
      [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
      [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
      [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
      [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
      [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
      [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
      [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
      [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
      [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
      [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
      [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
      [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
      [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
      [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
      [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
      [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
      [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
      [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
      [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
      [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
      [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
      [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
      [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
      
      The steps to reproduce as follow,
      
      1. On Host1, setup brige br0(192.168.1.106)
      2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
      3. Start IPVS service on Host1
         ipvsadm -A -t 192.168.1.106:80 -s rr
         ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
      4. Run apache benchmark on Host2(192.168.1.101)
         ab -n 1000 http://192.168.1.106/
      
      ip_vs_reply4
        ip_vs_out
          handle_response
            ip_vs_notrack
              nf_reset()
              {
                skb->nf_bridge = NULL;
              }
      
      Actually, IPVS wants in this case just to replace nfct
      with untracked version. So replace the nf_reset(skb) call
      in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
      Signed-off-by: default avatarLin Ming <mlin@ss.pku.edu.cn>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      9e33ce45
    • Alexander Duyck's avatar
      ixgbevf: Fix panic when loading driver · 10cc1bdd
      Alexander Duyck authored
      This patch addresses a kernel panic seen when setting up the interface.
      Specifically we see a NULL pointer dereference on the Tx descriptor cleanup
      path when enabling interrupts.  This change corrects that so it cannot
      occur.
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: default avatarGreg Rose <gregory.v.rose@intel.com>
      Tested-by: default avatarSibai Li <sibai.li@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10cc1bdd
    • Alan Cox's avatar
      ax25: Fix missing break · ef764a13
      Alan Cox authored
      At least there seems to be no reason to disallow ROSE sockets when
      NETROM is loaded.
      Signed-off-by: default avatarAlan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef764a13
    • Dmitry Eremin-Solenikov's avatar
      MAINTAINERS: reflect actual changes in IEEE 802.15.4 maintainership · 68653359
      Dmitry Eremin-Solenikov authored
      As the life flows, developers priorities shifts a bit. Reflect actual
      changes in the maintainership of IEEE 802.15.4 code: Sergey mostly
      stopped cared about this piece of code. Most of the work recently was
      done by Alexander, so put him to the MAINTAINERS file to reflect his
      status and to ease the life of respective patches.
      
      Also add new net/mac802154/ directory to the list of maintained files.
      Signed-off-by: default avatarDmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      68653359
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net · 5dcaba7e
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      This series contains fixes to e1000e.
       ...
      Bruce Allan (1):
        e1000e: fix test for PHY being accessible on 82577/8/9 and I217
      
      Tushar Dave (1):
        e1000e: Correct link check logic for 82571 serdes
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5dcaba7e
    • Sjur Brændeland's avatar
      caif: Fix access to freed pernet memory · 96f80d12
      Sjur Brændeland authored
      unregister_netdevice_notifier() must be called before
      unregister_pernet_subsys() to avoid accessing already freed
      pernet memory. This fixes the following oops when doing rmmod:
      
      Call Trace:
       [<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif]
       [<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100
       [<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif]
       [<ffffffff810e7734>] sys_delete_module+0x1a4/0x300
       [<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0
       [<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3
       [<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f
      
      RIP
       [<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif]
      Signed-off-by: default avatarSjur Brændeland <sjur.brandeland@stericsson.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      96f80d12
    • Gao feng's avatar
      net: cgroup: fix access the unallocated memory in netprio cgroup · ef209f15
      Gao feng authored
      there are some out of bound accesses in netprio cgroup.
      
      now before accessing the dev->priomap.priomap array,we only check
      if the dev->priomap exist.and because we don't want to see
      additional bound checkings in fast path, so we should make sure
      that dev->priomap is null or array size of dev->priomap.priomap
      is equal to max_prioidx + 1;
      
      so in write_priomap logic,we should call extend_netdev_table when
      dev->priomap is null and dev->priomap.priomap_len < max_len.
      and in cgrp_create->update_netdev_tables logic,we should call
      extend_netdev_table only when dev->priomap exist and
      dev->priomap.priomap_len < max_len.
      
      and it's not needed to call update_netdev_tables in write_priomap,
      we can only allocate the net device's priomap which we change through
      net_prio.ifpriomap.
      
      this patch also add a return value for update_netdev_tables &
      extend_netdev_table, so when new_priomap is allocated failed,
      write_priomap will stop to access the priomap,and return -ENOMEM
      back to the userspace to tell the user what happend.
      
      Change From v3:
      1. add rtnl protect when reading max_prioidx in write_priomap.
      
      2. only call extend_netdev_table when map->priomap_len < max_len,
         this will make sure array size of dev->map->priomap always
         bigger than any prioidx.
      
      3. add a function write_update_netdev_table to make codes clear.
      
      Change From v2:
      1. protect extend_netdev_table by RTNL.
      2. when extend_netdev_table failed,call dev_put to reduce device's refcount.
      Signed-off-by: default avatarGao feng <gaofeng@cn.fujitsu.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef209f15
    • Narendra K's avatar
      ixgbevf: Prevent RX/TX statistics getting reset to zero · 93659763
      Narendra K authored
      The commit 4197aa7b implements 64 bit
      per ring statistics. But the driver resets the 'total_bytes' and
      'total_packets' from RX and TX rings in the RX and TX interrupt
      handlers to zero. This results in statistics being lost and user space
      reporting RX and TX statistics as zero. This patch addresses the
      issue by preventing the resetting of RX and TX ring statistics to
      zero.
      Signed-off-by: default avatarNarendra K <narendra_k@dell.com>
      Tested-by: default avatarSibai Li <sibai.li@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      93659763
    • Neil Horman's avatar
      sctp: Fix list corruption resulting from freeing an association on a list · 2eebc1e1
      Neil Horman authored
      A few days ago Dave Jones reported this oops:
      
      [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
      [22766.295376] CPU 0
      [22766.295384] Modules linked in:
      [22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
      ffff880147c03a74
      [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
      [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
      [22766.387137] Stack:
      [22766.387140]  ffff880147c03a10
      [22766.387140]  ffffffffa169f2b6
      [22766.387140]  ffff88013ed95728
      [22766.387143]  0000000000000002
      [22766.387143]  0000000000000000
      [22766.387143]  ffff880003fad062
      [22766.387144]  ffff88013c120000
      [22766.387144]
      [22766.387145] Call Trace:
      [22766.387145]  <IRQ>
      [22766.387150]  [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0
      [sctp]
      [22766.387154]  [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
      [22766.387157]  [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
      [22766.387161]  [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [22766.387163]  [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
      [22766.387166]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387168]  [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
      [22766.387169]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
      [22766.387171]  [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
      [22766.387172]  [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
      [22766.387174]  [<ffffffff81590c54>] ip_rcv+0x214/0x320
      [22766.387176]  [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
      [22766.387178]  [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
      [22766.387180]  [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
      [22766.387182]  [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
      [22766.387183]  [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
      [22766.387185]  [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
      [22766.387187]  [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
      [22766.387218]  [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
      [22766.387242]  [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
      [22766.387266]  [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
      [22766.387268]  [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
      [22766.387270]  [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
      [22766.387273]  [<ffffffff810734d0>] __do_softirq+0xe0/0x420
      [22766.387275]  [<ffffffff8169826c>] call_softirq+0x1c/0x30
      [22766.387278]  [<ffffffff8101db15>] do_softirq+0xd5/0x110
      [22766.387279]  [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
      [22766.387281]  [<ffffffff81698b03>] do_IRQ+0x63/0xd0
      [22766.387283]  [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
      [22766.387283]  <EOI>
      [22766.387284]
      [22766.387285]  [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
      [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
      89 e5 48 83
      ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00
      48 89 fb
      49 89 f5 66 c1 c0 08 66 39 46 02
      [22766.387307]
      [22766.387307] RIP
      [22766.387311]  [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
      [22766.387311]  RSP <ffff880147c039b0>
      [22766.387142]  ffffffffa16ab120
      [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
      [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt
      
      It appears from his analysis and some staring at the code that this is likely
      occuring because an association is getting freed while still on the
      sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
      while a freed node corrupts part of the list.
      
      Nominally I would think that an mibalanced refcount was responsible for this,
      but I can't seem to find any obvious imbalance.  What I did note however was
      that the two places where we create an association using
      sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
      which free a newly created association after calling sctp_primitive_ASSOCIATE.
      sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
      issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
      the aforementioned hash table.  the sctp command interpreter that process side
      effects has not way to unwind previously processed commands, so freeing the
      association from the __sctp_connect or sctp_sendmsg error path would lead to a
      freed association remaining on this hash table.
      
      I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
      which allows us to proerly use hlist_unhashed to check if the node is on a
      hashlist safely during a delete.  That in turn alows us to safely call
      sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
      before freeing them, regardles of what the associations state is on the hash
      list.
      
      I noted, while I was doing this, that the __sctp_unhash_endpoint was using
      hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
      pointers to make that function work properly, so I fixed that up in a simmilar
      fashion.
      
      I attempted to test this using a virtual guest running the SCTP_RR test from
      netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
      able to recreate the problem prior to this fix, nor was I able to trigger the
      failure after (neither of which I suppose is suprising).  Given the trace above
      however, I think its likely that this is what we hit.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: davej@redhat.com
      CC: davej@redhat.com
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Sridhar Samudrala <sri@us.ibm.com>
      CC: linux-sctp@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2eebc1e1
  5. 16 Jul, 2012 5 commits