1. 18 Sep, 2015 34 commits
  2. 14 Sep, 2015 1 commit
    • ipv6: add check for blackhole or prohibited entry in rt6_redire · 9a6fbaeb
      Weilong Chen authored
      There's a check for ip6_null_entry, but it's not enough if the config
      CONFIG_IPV6_MULTIPLE_TABLES is selected. Blackhole or prohibited entries
      should also be ignored.
      
      This patch is for kernels before v3.6: commit b94f1c09 switched to
      icmpv6_notify() instead of rt6_redirect(), and rt6_redirect() has
      been deleted.
      
      The oops is as follows:
          [exception RIP: do_raw_write_lock+12]
          RIP: ffffffff8122c42c  RSP: ffff880666e45820  RFLAGS: 00010282
          RAX: ffff8801207bffd8  RBX: 0000000000000018  RCX: 0000000000000000
          RDX: 0000000000000000  RSI: ffff880666e45898  RDI: 0000000000000018
          RBP: ffff880666e45830   R8: 000000000000001e   R9: 0000000006000000
          R10: ffff88011796b8a0  R11: 0000000000000004  R12: ffff88010391ed00
          R13: 0000000000000000  R14: ffff880666e45898  R15: ffff88011796b890
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
          [ffff880666e45838] _raw_write_lock_bh at ffffffff81450b39
          [ffff880666e45858] __ip6_ins_rt at ffffffff813ed8c1
          [ffff880666e45888] ip6_ins_rt at ffffffff813eef58
          [ffff880666e458b8] rt6_redirect at ffffffff813f0b84
          [ffff880666e45958] ndisc_rcv at ffffffff813f95d8
          [ffff880666e45a08] icmpv6_rcv at ffffffff814000e8
          [ffff880666e45ae8] ip6_input_finish at ffffffff813e43bb
          [ffff880666e45b38] ip6_input at ffffffff813e4b08
          [ffff880666e45b68] ipv6_rcv at ffffffff813e4969
          [ffff880666e45bc8] __netif_receive_skb at ffffffff8135158a
          [ffff880666e45c38] dev_gro_receive at ffffffff81351cb0
          [ffff880666e45c78] napi_gro_receive at ffffffff81351fc5
          [ffff880666e45cb8] tg3_rx at ffffffffa0bfb354 [tg]
          [ffff880666e45d88] tg3_poll_work at ffffffffa0c07857 [tg]
          [ffff880666e45e18] tg3_poll_msix at ffffffffa0c07d1b [tg]
          [ffff880666e45e68] net_rx_action at ffffffff81352219
          [ffff880666e45ec8] __do_softirq at ffffffff8103e5a1
          [ffff880666e45f38] call_softirq at ffffffff81459c4c
          [ffff880666e45f50] do_softirq at ffffffff8100413d
          [ffff880666e45f80] do_IRQ at ffffffff81003cce
      This happened when ip6_route_redirect found a route that was set to
      blackhole. That route had a NULL rt6i_table, which __ip6_ins_rt()
      dereferences when trying to lock rt6i_table->tb6_lock, causing a BUG:
      "BUG: unable to handle kernel NULL pointer"
      Signed-off-by: Weilong Chen <chenweilong@huawei.com>
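The guard this commit describes can be modelled in user space as below. This is a minimal sketch, not the kernel patch: the structures are toy stand-ins, the sentinel objects are modelled on the entries named in the commit message, and `rt6_is_sentinel()` and `handle_redirect()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields named in the
 * commit message are modelled. */
struct fib6_table {
    int tb6_lock;                  /* stands in for the real lock */
};

struct rt6_info {
    struct fib6_table *rt6i_table; /* NULL for the sentinel routes below */
};

/* Sentinel routes: they live in no table, so rt6i_table stays NULL. */
static struct rt6_info ip6_null_entry;
static struct rt6_info ip6_prohibit_entry;
static struct rt6_info ip6_blk_hole_entry;

/* Hypothetical helper: true for any route that must not be inserted. */
static inline bool rt6_is_sentinel(const struct rt6_info *rt)
{
    return rt == &ip6_null_entry ||
           rt == &ip6_prohibit_entry ||
           rt == &ip6_blk_hole_entry;
}

/* Hypothetical redirect handler: returns -1 when the redirect is ignored,
 * 0 when the route's table was "locked" for insertion. */
static inline int handle_redirect(struct rt6_info *rt)
{
    if (rt6_is_sentinel(rt))
        return -1;                  /* ignore instead of dereferencing NULL */

    assert(rt->rt6i_table != NULL); /* holds for every non-sentinel route */
    rt->rt6i_table->tb6_lock++;     /* stands in for taking tb6_lock */
    return 0;
}
```

Checking only the null entry would let the blackhole and prohibit sentinels reach the lock step, which is the NULL dereference shown in the oops.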
  3. 19 Jun, 2015 5 commits
    • Linux 3.4.108 · cf1b3dad
      Zefan Li authored
    • xen: netback: read hotplug script once at start of day. · 366df578
      Ian Campbell authored
      commit 31a41898 upstream.
      
      When we come to tear things down in netback_remove() and generate the
      uevent it is possible that the xenstore directory has already been
      removed (details below).
      
      In such cases netback_uevent() won't be able to read the hotplug
      script and will write a xenstore error node.
      
      A recent change to the hypervisor exposed this race such that we now
      sometimes lose it (where apparently we didn't ever before).
      
      Instead read the hotplug script configuration during setup and use it
      for the lifetime of the backend device.
      
      The apparently more obvious fix of moving the transition to
      state=Closed in netback_remove() to after the uevent does not work
      because it is possible that we are already in state=Closed (in
      reaction to the guest having disconnected as it shut down). Being
      already in Closed means the toolstack is at liberty to start tearing
      down the xenstore directories. In principle it might be possible to
      arrange to unregister the device sooner (e.g. on transition to Closing)
      such that xenstore would still be there but this state machine is
      fragile and prone to anger...
      
      A modern Xen system relies on the hotplug uevent only for driver
      domains; when the backend is in the same domain as the toolstack, the
      toolstack runs the necessary setup/teardown directly, in the correct
      sequence with respect to xenstore changes.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
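The approach in this commit, reading the configuration once during setup and using the cached copy for the lifetime of the device, can be sketched in user space as follows. This is an illustration under assumptions, not the netback code: `struct backend` and the three helper functions are hypothetical names, and a plain string stands in for the xenstore read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the backend device state. */
struct backend {
    char *hotplug_script;   /* cached at setup, owned by the backend */
};

/* Setup: copy the script path while the configuration store still exists. */
static inline int backend_setup(struct backend *be, const char *script_from_store)
{
    size_t len;

    assert(be != NULL);
    if (script_from_store == NULL)
        return -1;

    len = strlen(script_from_store) + 1;
    be->hotplug_script = malloc(len);
    if (be->hotplug_script == NULL)
        return -1;
    memcpy(be->hotplug_script, script_from_store, len);
    return 0;
}

/* Event time: use the cached copy only; the store may already be gone. */
static inline const char *backend_uevent_script(const struct backend *be)
{
    return be->hotplug_script;
}

/* Teardown: release the cached copy. */
static inline void backend_remove(struct backend *be)
{
    free(be->hotplug_script);
    be->hotplug_script = NULL;
}
```

Because the event path never goes back to the store, it no longer matters whether the store's directories are torn down before or after the event is generated.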
    • slub: refactoring unfreeze_partials() · e0483eb8
      Joonsoo Kim authored
      commit 43d77867 upstream.
      
      The current implementation of unfreeze_partials() is complicated,
      but the benefit of that complexity is insignificant. In addition, the
      large amount of code in the do {} while loop increases the failure rate
      of cmpxchg_double_slab. Because the current implementation tests the
      status of the cpu partial slab and acquires list_lock inside the
      do {} while loop, we avoid acquiring list_lock, and gain a little, when
      the slab at the front of the cpu partial list is to be discarded, but
      this is a rare case. And when add_partial has been performed and
      cmpxchg_double_slab then fails, remove_partial has to be called case
      by case.
      
      I think these are disadvantages of the current implementation,
      so I refactor unfreeze_partials().
      
      Minimizing the code in the do {} while loop reduces the failure rate
      of cmpxchg_double_slab. Below is the output of 'slabinfo -r kmalloc-256'
      after running './perf stat -r 33 hackbench 50 process 4000 > /dev/null'.
      
      ** before **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   182685
      Unlocked Cmpxchg Double redos 0
      
      ** after **
      Cmpxchg_double Looping
      ------------------------
      Locked Cmpxchg Double redos   177995
      Unlocked Cmpxchg Double redos 1
      
      The cmpxchg_double_slab failure rate improves slightly.
      
      Below is the output of './perf stat -r 30 hackbench 50 process 4000 > /dev/null'.
      
      ** before **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           108517.190463 task-clock                #    7.926 CPUs utilized            ( +-  0.24% )
               2,919,550 context-switches          #    0.027 M/sec                    ( +-  3.07% )
                 100,774 CPU-migrations            #    0.929 K/sec                    ( +-  4.72% )
                 124,201 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         401,500,234,387 cycles                    #    3.700 GHz                      ( +-  0.24% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,576,913,354 instructions              #    0.62  insns per cycle          ( +-  0.13% )
          45,934,956,860 branches                  #  423.297 M/sec                    ( +-  0.14% )
             188,219,787 branch-misses             #    0.41% of all branches          ( +-  0.56% )
      
            13.691837307 seconds time elapsed                                          ( +-  0.24% )
      
      ** after **
       Performance counter stats for './hackbench 50 process 4000' (30 runs):
      
           107784.479767 task-clock                #    7.928 CPUs utilized            ( +-  0.22% )
               2,834,781 context-switches          #    0.026 M/sec                    ( +-  2.33% )
                  93,083 CPU-migrations            #    0.864 K/sec                    ( +-  3.45% )
                 123,967 page-faults               #    0.001 M/sec                    ( +-  0.15% )
         398,781,421,836 cycles                    #    3.700 GHz                      ( +-  0.22% )
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
         250,189,160,419 instructions              #    0.63  insns per cycle          ( +-  0.09% )
          45,855,370,128 branches                  #  425.436 M/sec                    ( +-  0.10% )
             169,881,248 branch-misses             #    0.37% of all branches          ( +-  0.43% )
      
            13.596272341 seconds time elapsed                                          ( +-  0.22% )
      
      No regression was found; the results are in fact slightly better.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
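The general principle behind this refactoring, keeping the body of a compare-and-swap retry loop as small as possible so that fewer attempts are invalidated by concurrent updates, can be illustrated with C11 atomics. This is only a sketch of the principle, not the slub code: `cas_add()` and `struct cas_stats` are hypothetical, and a single word stands in for the double-word slab state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical bookkeeping, loosely mirroring the "redos" counters above. */
struct cas_stats {
    unsigned long redos;
};

/*
 * Add delta to *word with a compare-and-swap retry loop. The loop body only
 * recomputes the new value; anything expensive (locking, list updates)
 * belongs before or after the loop, not inside it.
 */
static inline unsigned long cas_add(atomic_ulong *word, unsigned long delta,
                                    struct cas_stats *stats)
{
    unsigned long old_val, new_val;

    assert(word != NULL && stats != NULL);
    old_val = atomic_load(word);
    for (;;) {
        new_val = old_val + delta;
        /* On failure, old_val is refreshed with the current contents. */
        if (atomic_compare_exchange_weak(word, &old_val, new_val))
            break;
        stats->redos++;   /* one more retry, as counted by slabinfo */
    }
    return new_val;
}
```

The longer the window between reading the old value and attempting the exchange, the more likely another CPU changes the word in between, which is why trimming the loop body tends to lower the redo count.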
    • xen-pciback: Add name prefix to global 'permissive' variable · cb990484
      Ben Hutchings authored
      commit 8014bcc8 upstream.
      
      The variable for the 'permissive' module parameter used to be static
      but was recently changed to be extern.  This puts it in the kernel
      global namespace if the driver is built-in, so its name should begin
      with a prefix identifying the driver.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: af6fc858 ("xen-pciback: limit guest control of command register")
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
    • writeback: use |1 instead of +1 to protect against div by zero · c905f0af
      Tejun Heo authored
      commit 464d1387 upstream.
      
      mm/page-writeback.c has several places where 1 is added to the divisor
      to prevent division by zero exceptions; however, if the original
      divisor is equivalent to -1, adding 1 leads to division by zero.
      
      There are three places where +1 is used for this purpose - one in
      pos_ratio_polynom() and two in bdi_position_ratio().  The second one
      in bdi_position_ratio() actually triggered div-by-zero oops on a
      machine running a 3.10 kernel.  The divisor is
      
        x_intercept - bdi_setpoint + 1 == span + 1
      
      span is confirmed to be (u32)-1.  It isn't clear how it ended up that
      way, but it could be from the write bandwidth calculation underflow fixed by
      c72efb65 ("writeback: fix possible underflow in write bandwidth
      calculation").
      
      At any rate, +1 isn't a proper protection against div-by-zero.  This
      patch converts all +1 protections to |1.  Note that
      bdi_update_dirty_ratelimit() was already using |1 before this patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      [lizf: Backported to 3.4: drop other two changes as there's only one
       such statement in 3.4]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
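The arithmetic behind this commit can be checked with a small user-space sketch. The helper names below are hypothetical and the code is not taken from mm/page-writeback.c; it only shows why `+ 1` fails for a 32-bit divisor of (u32)-1 while `| 1` cannot produce zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the old guard. Wraps to 0 when span == UINT32_MAX. */
static inline uint32_t guard_plus_one(uint32_t span)
{
    return (uint32_t)(span + 1);
}

/* Hypothetical helper: the new guard. The low bit is always set, so the
 * result is never 0 for any input. */
static inline uint32_t guard_or_one(uint32_t span)
{
    return span | 1;
}

/* Division protected by the |1 guard. */
static inline uint32_t safe_div(uint32_t num, uint32_t span)
{
    uint32_t divisor = guard_or_one(span);

    assert(divisor != 0);   /* cannot fire: bit 0 is set */
    return num / divisor;
}
```

Note that `| 1` leaves odd divisors unchanged and bumps even ones by one, so results can differ slightly from the `+ 1` form; the commit accepts that in exchange for a guard that holds for every input.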