1. 01 May, 2012 11 commits
  2. 30 Apr, 2012 8 commits
    • Yuchung Cheng's avatar
      tcp: fix infinite cwnd in tcp_complete_cwr() · 1cebce36
      Yuchung Cheng authored
      When the cwnd reduction is done, ssthresh may be infinite
      if TCP enters CWR via ECN or F-RTO. If cwnd is not undone, i.e.,
      undo_marker is set, tcp_complete_cwr() falsely set cwnd to the
      infinite ssthresh value. The correct operation is to keep cwnd
      intact because it has been updated in ECN or F-RTO.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1cebce36
    • Jan Seiffert's avatar
      bpf jit: Let the powerpc jit handle negative offsets · 05be1824
      Jan Seiffert authored
      Now the helper function from filter.c for negative offsets is exported,
      it can be used it in the jit to handle negative offsets.
      
      First modify the asm load helper functions to handle:
      - know positive offsets
      - know negative offsets
      - any offset
      
      then the compiler can be modified to explicitly use these helper
      when appropriate.
      
      This fixes the case of a negative X register and allows to lift
      the restriction that bpf programs with negative offsets can't
      be jited.
      Tested-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarJan Seiffert <kaffeemonster@googlemail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      05be1824
    • Eric Dumazet's avatar
      net: fix sk_sockets_allocated_read_positive · 518fbf9c
      Eric Dumazet authored
      Denys Fedoryshchenko reported frequent crashes on a proxy server and kindly
      provided a lockdep report that explains it all :
      
        [  762.903868]
        [  762.903880] =================================
        [  762.903890] [ INFO: inconsistent lock state ]
        [  762.903903] 3.3.4-build-0061 #8 Not tainted
        [  762.904133] ---------------------------------
        [  762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
        [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
        [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>]
      __percpu_counter_sum+0xd/0x58
        [  762.904542] {IN-SOFTIRQ-W} state was registered at:
        [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
        [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
        [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
        [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
        [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
        [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
        [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
        [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
        [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
        [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
        [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
        [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
        [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
        [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
        [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
        [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
        [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
        [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
        [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
        [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
        [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
        [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
        [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
        [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
        [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
        [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
        [  762.904542] irq event stamp: 156915469
        [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>]
      __slab_alloc.clone.58.clone.63+0xc4/0x2de
        [  762.904542] hardirqs last disabled at (156915468): [<c019b452>]
      __slab_alloc.clone.58.clone.63+0x22/0x2de
        [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>]
      lock_sock_nested+0x64/0x6c
        [  762.904542] softirqs last disabled at (156915464): [<c0349914>]
      _raw_spin_lock_bh+0xe/0x45
        [  762.904542]
        [  762.904542] other info that might help us debug this:
        [  762.904542]  Possible unsafe locking scenario:
        [  762.904542]
        [  762.904542]        CPU0
        [  762.904542]        ----
        [  762.904542]   lock(key#3);
        [  762.904542]   <Interrupt>
        [  762.904542]     lock(key#3);
        [  762.904542]
        [  762.904542]  *** DEADLOCK ***
        [  762.904542]
        [  762.904542] 1 lock held by squid/1603:
        [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>]
      lock_sock+0xa/0xc
        [  762.904542]
        [  762.904542] stack backtrace:
        [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
        [  762.904542] Call Trace:
        [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
        [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
        [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
        [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
        [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
        [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
        [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
        [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
        [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
        [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
        [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
        [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
        [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
        [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
        [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
        [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
        [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
        [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
        [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
        [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
        [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
        [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
        [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
        [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
        [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
        [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
        [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
        [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
      
      Bug is that sk_sockets_allocated_read_positive() calls
      percpu_counter_sum_positive() without BH being disabled.
      
      This bug was added in commit 180d8cd9
      (foundations of per-cgroup memory pressure controlling.), since previous
      code was using percpu_counter_read_positive() which is IRQ safe.
      
      In __sk_mem_schedule() we dont need the precise count of allocated
      sockets and can revert to previous behavior.
      Reported-by: default avatarDenys Fedoryshchenko <denys@visp.net.lb>
      Sined-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      518fbf9c
    • David S. Miller's avatar
      5414fc12
    • Pablo Neira Ayuso's avatar
      netfilter: xt_CT: fix wrong checking in the timeout assignment path · 6cf51852
      Pablo Neira Ayuso authored
      The current checking always succeeded. We have to check the first
      character of the string to check that it's empty, thus, skipping
      the timeout path.
      
      This fixes the use of the CT target without the timeout option.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      6cf51852
    • Hans Schillstrom's avatar
      ipvs: kernel oops - do_ip_vs_get_ctl · 8537de8a
      Hans Schillstrom authored
      Change order of init so netns init is ready
      when register ioctl and netlink.
      
      Ver2
      	Whitespace fixes and __init added.
      Reported-by: default avatar"Ryan O'Hara" <rohara@redhat.com>
      Signed-off-by: default avatarHans Schillstrom <hans.schillstrom@ericsson.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      8537de8a
    • Hans Schillstrom's avatar
      ipvs: take care of return value from protocol init_netns · 582b8e3e
      Hans Schillstrom authored
      ip_vs_create_timeout_table() can return NULL
      All functions protocol init_netns is affected of this patch.
      Signed-off-by: default avatarHans Schillstrom <hans.schillstrom@ericsson.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      582b8e3e
    • Hans Schillstrom's avatar
      ipvs: null check of net->ipvs in lblc(r) shedulers · 4b984cd5
      Hans Schillstrom authored
      Avoid crash when registering shedulers after
      the IPVS core initialization for netns fails. Do this by
      checking for present core (net->ipvs).
      Signed-off-by: default avatarHans Schillstrom <hans.schillstrom@ericsson.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      4b984cd5
  3. 28 Apr, 2012 2 commits
    • Neil Horman's avatar
      drop_monitor: Make updating data->skb smp safe · 3885ca78
      Neil Horman authored
      Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
      its smp protections.  Specifically, its possible to replace data->skb while its
      being written.  This patch corrects that by making data->skb an rcu protected
      variable.  That will prevent it from being overwritten while a tracepoint is
      modifying it.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: David Miller <davem@davemloft.net>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3885ca78
    • Neil Horman's avatar
      drop_monitor: fix sleeping in invalid context warning · cde2e9a6
      Neil Horman authored
      Eric Dumazet pointed out this warning in the drop_monitor protocol to me:
      
      [   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
      [   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
      [   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
      [   38.352582] Call Trace:
      [   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
      [   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
      [   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
      [   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
      [   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
      [   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
      [   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
      [   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
      [   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
      [   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
      [   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
      [   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
      [   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
      [   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
      [   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
      [   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
      [   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
      [   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
      [   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
      [   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
      [   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
      [   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
      [   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
      [   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
      [   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b
      
      It stems from holding a spinlock (trace_state_lock) while attempting to register
      or unregister tracepoint hooks, making in_atomic() true in this context, leading
      to the warning when the tracepoint calls might_sleep() while its taking a mutex.
      Since we only use the trace_state_lock to prevent trace protocol state races, as
      well as hardware stat list updates on an rcu write side, we can just convert the
      spinlock to a mutex to avoid this problem.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: David Miller <davem@davemloft.net>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cde2e9a6
  4. 27 Apr, 2012 1 commit
  5. 26 Apr, 2012 12 commits
  6. 25 Apr, 2012 6 commits
    • David S. Miller's avatar
    • Matt Carlson's avatar
      tg3: Avoid panic from reserved statblk field access · f891ea16
      Matt Carlson authored
      When RSS is enabled, interrupt vector 0 does not receive any rx traffic.
      The rx producer index fields for vector 0's status block should be
      considered reserved in this case.  This patch changes the code to
      respect these reserved fields, which avoids a kernel panic when these
      fields take on non-zero values.
      Signed-off-by: default avatarMatt Carlson <mcarlson@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f891ea16
    • Benjamin Poirier's avatar
      tlan: add cast needed for proper 64 bit operation · da3a9e9e
      Benjamin Poirier authored
      Changes this beauty into a statement that actually has an effect on amd64.
      Tested-by: default avatarPer Jessen <per@opensuse.org>
      Signed-off-by: default avatarBenjamin Poirier <bpoirier@suse.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      da3a9e9e
    • John W. Linville's avatar
      Merge branch 'master' of... · 39583628
      John W. Linville authored
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
      39583628
    • Julian Anastasov's avatar
      ipvs: fix crash in ip_vs_control_net_cleanup on unload · 8f9b9a2f
      Julian Anastasov authored
      	commit 14e40546 (2.6.39)
      ("Add __ip_vs_control_{init,cleanup}_sysctl()")
      introduced regression due to wrong __net_init for
      __ip_vs_control_cleanup_sysctl. This leads to crash when
      the ip_vs module is unloaded.
      
      	Fix it by changing __net_init to __net_exit for
      the function that is already renamed to ip_vs_control_net_cleanup_sysctl.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarHans Schillstrom <hans@schillstrom.com>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      8f9b9a2f
    • Sasha Levin's avatar
      ipvs: Verify that IP_VS protocol has been registered · 7118c07a
      Sasha Levin authored
      The registration of a protocol might fail, there were no checks
      and all registrations were assumed to be correct. This lead to
      NULL ptr dereferences when apps tried registering.
      
      For example:
      
      [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0
      [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP
      [ 1293.227038] CPU 1
      [ 1293.227038] Pid: 19609, comm: trinity Tainted: G        W    3.4.0-rc1-next-20120405-sasha-dirty #57
      [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>]  [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038] RSP: 0018:ffff880038c1dd18  EFLAGS: 00010286
      [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000
      [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282
      [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000
      [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668
      [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000
      [ 1293.227038] FS:  00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000
      [ 1293.227038] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0
      [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000)
      [ 1293.227038] Stack:
      [ 1293.227038]  ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580
      [ 1293.227038]  ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b
      [ 1293.227038]  0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580
      [ 1293.227038] Call Trace:
      [ 1293.227038]  [<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180
      [ 1293.227038]  [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70
      [ 1293.227038]  [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140
      [ 1293.227038]  [<ffffffff821c9060>] ops_init+0x80/0x90
      [ 1293.227038]  [<ffffffff821c90cb>] setup_net+0x5b/0xe0
      [ 1293.227038]  [<ffffffff821c9416>] copy_net_ns+0x76/0x100
      [ 1293.227038]  [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190
      [ 1293.227038]  [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80
      [ 1293.227038]  [<ffffffff810afd1f>] sys_unshare+0xff/0x290
      [ 1293.227038]  [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 1293.227038]  [<ffffffff82665539>] system_call_fastpath+0x16/0x1b
      [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d
      [ 1293.227038] RIP  [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0
      [ 1293.227038]  RSP <ffff880038c1dd18>
      [ 1293.227038] CR2: 0000000000000018
      [ 1293.379284] ---[ end trace 364ab40c7011a009 ]---
      [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt
      Signed-off-by: default avatarSasha Levin <levinsasha928@gmail.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      7118c07a