1. 24 Oct, 2013 4 commits
    • raid5: avoid finding "discard" stripe · d47648fc
      Shaohua Li authored
      SCSI discard will damage the discard stripe's bio settings, e.g. some fields
      are changed. If the stripe is reused very soon, we have wrong bio settings. We
      remove the discard stripe from the hash list, so next time the stripe will be
      fully initialized.
      
      Suitable for backport to 3.7+.
      
      Cc: <stable@vger.kernel.org> (3.7+)
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • raid5: set bio bi_vcnt 0 for discard request · 37c61ff3
      Shaohua Li authored
      The SCSI layer adds a new payload for a discard request. If two bios are
      merged into one, the second bio has bi_vcnt 1, which is set in raid5. This
      confuses SCSI and causes an oops.
      
      Suitable for backport to 3.7+
      
      Cc: stable@vger.kernel.org (v3.7+)
      Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
    • md: avoid deadlock when md_set_badblocks. · 905b0297
      Bian Yu authored
      When operating on a hard disk and hitting errors, md_set_badblocks is called
      after scsi_restart_operations, which has already disabled IRQs. But
      md_set_badblocks calls write_sequnlock_irq, which enables IRQs, so a softirq
      can preempt the current thread and that may cause a deadlock. I think this
      situation should use write_seqlock_irqsave/write_sequnlock_irqrestore instead.

      I hit this situation; the call trace is below:
      [  638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010
      [  638.921923]  lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0
      [  638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37
      [  638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013
      [  638.927816]  ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007
      [  638.929829]  ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8
      [  638.931848]  ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8
      [  638.933884] Call Trace:
      [  638.935867]  <IRQ>  [<ffffffff8172ff85>] dump_stack+0x55/0x76
      [  638.937878]  [<ffffffff81730030>] spin_dump+0x8a/0x8f
      [  638.939861]  [<ffffffff81730056>] spin_bug+0x21/0x26
      [  638.941836]  [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0
      [  638.943801]  [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80
      [  638.945747]  [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0
      [  638.947672]  [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50
      [  638.949595]  [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0
      [  638.951504]  [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0
      [  638.953388]  [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140
      [  638.955248]  [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90
      [  638.957116]  [<ffffffff8104fddd>] __do_softirq+0xfd/0x330
      [  638.958987]  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
      [  638.960861]  [<ffffffff8174a5cc>] call_softirq+0x1c/0x30
      [  638.962724]  [<ffffffff81004c7d>] do_softirq+0x8d/0xc0
      [  638.964565]  [<ffffffff8105024e>] irq_exit+0x10e/0x150
      [  638.966390]  [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60
      [  638.968223]  [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80
      [  638.970079]  <EOI>  [<ffffffff810b964f>] ? __lock_release+0x6f/0x100
      [  638.971899]  [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50
      [  638.973691]  [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50
      [  638.975475]  [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0
      [  638.977243]  [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80
      [  638.978988]  [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456]
      [  638.980723]  [<ffffffff811b5a1d>] bio_endio+0x1d/0x40
      [  638.982463]  [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0
      [  638.984214]  [<ffffffff81306b9f>] blk_update_request+0x7f/0x350
      [  638.985967]  [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90
      [  638.987710]  [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50
      [  638.989439]  [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30
      [  638.991149]  [<ffffffff81308746>] blk_peek_request+0x106/0x250
      [  638.992861]  [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130
      [  638.994561]  [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0
      [  638.996251]  [<ffffffff813040a7>] __blk_run_queue+0x37/0x50
      [  638.997900]  [<ffffffff813045af>] blk_run_queue+0x2f/0x50
      [  638.999553]  [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0
      [  639.001185]  [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40
      [  639.002798]  [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200
      [  639.004391]  [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0
      [  639.005996]  [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0
      [  639.007600]  [<ffffffff81072f6b>] kthread+0xdb/0xe0
      [  639.009205]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170
      [  639.010821]  [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0
      [  639.012437]  [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170
      
      This bug was introduced in commit 2e8ac303
      (the first time rdev_set_badblocks was called from interrupt context),
      so this patch is appropriate for 3.5 and subsequent kernels.
      
      Cc: <stable@vger.kernel.org> (3.5+)
      Signed-off-by: Bian Yu <bianyu@kedacom.com>
      Reviewed-by: Jianpeng Ma <majianpeng@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
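The difference between the two unlock styles in the md_set_badblocks commit above can be illustrated with a small user-space model. This is only a sketch: `irqs_enabled` stands in for the CPU's interrupt-enable state, and every name in the block (`sim_*`, `update_with_irq`, `update_with_irqsave`, `irqs_leaked_on`) is a hypothetical helper, not a kernel API. The kernel primitives being modelled are `write_seqlock_irq`/`write_sequnlock_irq` and `write_seqlock_irqsave`/`write_sequnlock_irqrestore`.

```c
#include <stdbool.h>

/* Stand-in for the CPU's interrupt-enable state (an assumption of this model). */
static bool irqs_enabled = true;

/* Model of the *_irq variants: lock disables IRQs, unlock always enables them. */
static void sim_lock_irq(void)   { irqs_enabled = false; }
static void sim_unlock_irq(void) { irqs_enabled = true; }

/* Model of the irqsave/irqrestore variants: unlock restores the saved state. */
static bool sim_lock_irqsave(void)
{
    bool saved = irqs_enabled;
    irqs_enabled = false;
    return saved;
}
static void sim_unlock_irqrestore(bool saved) { irqs_enabled = saved; }

/* Hypothetical critical section written with the plain _irq variants. */
void update_with_irq(void)
{
    sim_lock_irq();
    /* ... update the bad-block table ... */
    sim_unlock_irq();
}

/* The same critical section written with irqsave/irqrestore. */
void update_with_irqsave(void)
{
    bool flags = sim_lock_irqsave();
    /* ... update the bad-block table ... */
    sim_unlock_irqrestore(flags);
}

/*
 * Run `update` from a caller that has already disabled IRQs and report
 * whether IRQs were (wrongly) enabled when it returned.
 */
bool irqs_leaked_on(void (*update)(void))
{
    irqs_enabled = false;        /* caller context: IRQs already off */
    update();
    bool leaked = irqs_enabled;  /* true means the callee re-enabled IRQs */
    irqs_enabled = true;         /* reset the model for the next run */
    return leaked;
}
```

In this model `irqs_leaked_on(update_with_irq)` reports a leak while `irqs_leaked_on(update_with_irqsave)` does not, which is the property the commit relies on when the function can be reached with interrupts already disabled.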
    • md: Fix skipping recovery for read-only arrays. · 61e4947c
      Lukasz Dorau authored
      Since:
              commit 7ceb17e8
              md: Allow devices to be re-added to a read-only array.
      
      spares are activated on a read-only array. For the raid1 and raid10
      personalities, this causes not-in-sync devices to be marked in-sync
      without checking whether recovery has finished.
      
      If a read-only array is degraded and one of its devices is not in-sync
      (because the array has been only partially recovered) recovery will be skipped.
      
      This patch adds a check that recovery has finished before marking a device
      in-sync for the raid1 and raid10 personalities. For the raid5 personality
      such a condition is already present (at raid5.c:6029).
      
      Bug was introduced in 3.10 and causes data corruption.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
      Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
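The check described in the read-only recovery commit above can be sketched as follows. This is an illustrative model, not the md code: `struct member_dev`, `RECOVERY_DONE`, and `mark_in_sync_if_recovered` are hypothetical names, and treating a maximal `recovery_offset` as "recovery has finished" is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel meaning "recovery has covered the whole device". */
#define RECOVERY_DONE UINT64_MAX

/* Hypothetical, simplified view of one array member. */
struct member_dev {
    bool     faulty;           /* device has failed */
    bool     in_sync;          /* device holds up-to-date data */
    uint64_t recovery_offset;  /* how far recovery has progressed */
};

/*
 * Mark a device in-sync only when recovery has actually finished.
 * Returns true if the device was newly marked in-sync.
 */
bool mark_in_sync_if_recovered(struct member_dev *dev)
{
    if (dev->faulty || dev->in_sync)
        return false;
    if (dev->recovery_offset != RECOVERY_DONE)
        return false;          /* partially recovered: leave not-in-sync */
    dev->in_sync = true;
    return true;
}
```

Without the second test, a partially recovered device would be treated as holding valid data, which is the corruption scenario the commit message describes.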
  2. 23 Oct, 2013 7 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 320437af
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "Several last minute bug fixes.
      
        Two of them are on the larger side for rc7, the dasd format patch for
        older storage devices and the store-clock-fast patch where we have
        been too optimistic with an optimization"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/time: correct use of store clock fast
        s390/vmlogrdr: fix array access in vmlogrdr_open()
        s390/compat,signal: fix return value of copy_siginfo_(to|from)_user32()
        s390/dasd: check for availability of prefix command during format
        s390/mm,kvm: fix software dirty bits vs. kvm for old machines
    • Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 90338325
      Linus Torvalds authored
      Pull thermal management fixes from Zhang Rui:
       "These include several commits that are necessary to properly fix
        regression for TMU test MUX address setting after reset, for exynos
        thermal driver.
      
        Specifics:
      
         - fix a regression where removing the setting of a certain field in
           the TMU configuration results in immediate shutdown after reset
           on Exynos4412 SoC.
      
         - revert a patch which tries to link the thermal_zone device and its
           hwmon node but breaks libsensors.
      
         - fix a deadlock/lockdep warning issue in x86_pkg_temp thermal
           driver, which can be reproduced on a buggy platform only.
      
         - fix ti-soc-thermal driver to fall back on bandgap reading when
           reading from PCB temperature sensor fails"
      
      * 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        Revert "drivers: thermal: parent virtual hwmon with thermal zone"
        drivers: thermal: allow ti-soc-thermal run without pcb zone
        thermal: exynos: Provide initial setting for TMU's test MUX address at Exynos4412
        thermal: exynos: Provide separate TMU data for Exynos4412
        thermal: exynos: Remove check for thermal device pointer at exynos_report_trigger()
        Thermal: x86_pkg_temp: change spin lock
    • platform/x86: fix asus-wmi build error · ea89e1d3
      Randy Dunlap authored
      Fix build error in asus_wmi.c when ASUS_WMI=y and ACPI_VIDEO=m
      by preventing that combination.
      
        drivers/built-in.o: In function `asus_wmi_probe':
        asus-wmi.c:(.text+0x65ddb4): undefined reference to `acpi_video_unregister'
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bcache: Fixed incorrect order of arguments to bio_alloc_bioset() · d4eddd42
      Kent Overstreet authored
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · f4e5e14f
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       - Compilation fixes for GCC < 4.4.6
       - one Kbuild dependency select fix (selecting videobuf on msi3101)
       - driver fixes on tda10071, e4000, msi3101, soc_camera, s5p-jpeg,
         saa7134 and adv7511
       - some device quirks needed to make them work properly
       - some videobuf2 core regression fixes for some features used only on
         embedded drivers
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] saa7134: Fix crash when device is closed before streamoff
        [media] adv7511: fix error return code in adv7511_probe()
        [media] ths8200: fix compilation with GCC < 4.4.6
        [media] ad9389b: fix compilation with GCC < 4.4.6
        [media] adv7511: fix compilation with GCC < 4.4.6
        [media] adv7842: fix compilation with GCC < 4.4.6
        [media] s5p-jpeg: Initialize vfd_decoder->vfl_dir field
        [media] videobuf2-dc: Fix support for mappings without struct page in userptr mode
        [media] vb2: Allow queuing OUTPUT buffers with zeroed 'bytesused'
        [media] mx3-camera: locking cleanup in mx3_videobuf_queue()
        [media] sh_vou: almost forever loop in sh_vou_try_fmt_vid_out()
        [media] tda10071: change firmware download condition
        [media] msi3101: correct max videobuf2 alloc
        [media] Add HCL T12Rg-H to STK webcam upside-down table
        [media] msi3101: Kconfig select VIDEOBUF2_VMALLOC
        [media] msi3101: msi3101_ioctl_ops can be static
        [media] e4000: fix PLL calc bug on 32-bit arch
        [media] uvcvideo: quirk PROBE_DEF for Microsoft Lifecam NX-3000
        [media] uvcvideo: quirk PROBE_DEF for Dell SP2008WFP monitor
    • Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · 0d645a8b
      Linus Torvalds authored
      Pull infiniband bugfix from Roland Dreier:
       "Disable not-quite-ready userspace ABI for IB flow steering"
      
      * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
        IB/core: Temporarily disable create_flow/destroy_flow uverbs
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · db10accf
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Sorry I let so much accumulate, I was in Buffalo and wanted a few
        things to cook in my tree for a while before sending to you.  Anyways,
        it's a lot of little things as usual at this stage in the game"
      
       1) Make bonding MAINTAINERS entry reflect reality, from Andy
          Gospodarek.
      
       2) Fix accidental sock_put() on timewait mini sockets, from Eric
          Dumazet.
      
       3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6
          addresses, from François CACHEREUL.
      
       4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
          Carpenter.
      
       5) tcp_shifted_skb() doesn't handle FINs properly, from Eric
          Dumazet.
      
       6) SFC driver bug fixes from Ben Hutchings.
      
       7) Fix TX packet scheduling wedge after channel change in ath9k driver,
          from Felix Fietkau.
      
       8) Fix use after free in BPF JIT code, from Alexei Starovoitov.
      
       9) Source address selection test is reversed in
          __ip_route_output_key(), fix from Jiri Benc.
      
      10) VLAN and CAN layer mis-size netlink attributes, from Marc
          Kleine-Budde.
      
      11) Fix permission checks in sysctls to use current_euid() instead of
          current_uid().  From Eric W Biederman.
      
      12) IPSEC policies can go away while a timer is still pending for them,
          add appropriate ref-counting to fix, from Steffen Klassert.
      
      13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
          chips, from Nguyen Hong Ky and Simon Horman.
      
      14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.
      
      15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
          Ghorbel.
      
      16) Fix typo in fq_change(), we were assigning "initial quantum" to
          "quantum".  From Eric Dumazet.
      
      17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
          FQ packet scheduler does not pace those flows properly.  Also from
          Eric Dumazet.
      
      18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.
      
      19) l2tp_xmit_skb() can be called from process context, not just softirq
          context, so we must always make sure to BH disable around it.  From
          Eric Dumazet.
      
      20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
          packet scheduler.  From Stephen Hemminger.
      
      21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
          Carpenter and Salva Peiró.
      
      22) Fix PHY reset and other issues in dm9000 driver, from Nikita
          Kiryanov and Michael Abbott.
      
      23) When hardware can do SCTP crc32 checksums, we accidentally don't
          disable the csum offload when IPSEC transformations have been
          applied.  From Fan Du and Vlad Yasevich.
      
      24) Tail loss probing in TCP leaves the socket in the wrong congestion
          avoidance state.  From Yuchung Cheng.
      
      25) In CPSW driver, enable NAPI before interrupts are turned on, from
          Markus Pargmann.
      
      26) Integer underflow and dual-assignment in YAM hamradio driver, from
          Dan Carpenter.
      
      27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
          unclone it.  This fixes various hard to track down crashes in
          drivers where the SKB's ->gso_segs was changing right from underneath
          the driver during TX queueing.  From Eric Dumazet.
      
      28) Fix the handling of VLAN IDs, and in particular the special IDs 0
          and 4095, in the bridging layer.  From Toshiaki Makita.
      
      29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.
      
      30) Fix race in socket credential passing, from Daniel Borkmann.
      
      31) When NETLABEL is disabled, we don't validate CIPSO packets properly,
          from Seif Mazareeb.
      
      32) Fix identification of fragmented frames in ipv4/ipv6 UDP
          Fragmentation Offload output paths, from Jiri Pirko.
      
      33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
          Elior.
      
      34) When we removed the explicit neighbour pointer from ipv6 routes a
          slight regression was introduced for users such as IPVS, xt_TEE, and
          raw sockets.  We mix up the user's requested destination address with
          the route's assigned nexthop/gateway.  From Julian Anastasov and
          Simon Horman.
      
      35) Fix stack overruns in rt6_probe(), the issue is that we can end up
          doing two full packet xmit paths at the same time when emitting
          neighbour discovery messages.  From Hannes Frederic Sowa.
      
      36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
          Mariusz Ceier.
      
      37) Make sure to set TCP sk_pacing_rate after the first legitimate RTT
          sample, from Neal Cardwell.
      
      38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
          Steffen Klassert.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits)
        ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
        ax88179_178a: Correct the RX error definition in RX header
        Revert "bridge: only expire the mdb entry when query is received"
        tcp: initialize passive-side sk_pacing_rate after 3WHS
        davinci_emac.c: Fix IFF_ALLMULTI setup
        mac802154: correct a typo in ieee802154_alloc_device() prototype
        ipv6: probe routes asynchronous in rt6_probe
        netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
        ipv6: fill rt6i_gateway with nexthop address
        ipv6: always prefer rt6i_gateway if present
        bnx2x: Set NETIF_F_HIGHDMA unconditionally
        bnx2x: Don't pretend during register dump
        bnx2x: Lock DMAE when used by statistic flow
        bnx2x: Prevent null pointer dereference on error flow
        bnx2x: Fix config when SR-IOV and iSCSI are enabled
        bnx2x: Fix Coalescing configuration
        bnx2x: Unlock VF-PF channel on MAC/VLAN config error
        bnx2x: Prevent an illegal pointer dereference during panic
        bnx2x: Fix Maximum CoS estimation for VFs
        drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing
        ...
  3. 22 Oct, 2013 15 commits
  4. 21 Oct, 2013 14 commits
    • tcp: initialize passive-side sk_pacing_rate after 3WHS · 02cf4ebd
      Neal Cardwell authored
      For passive TCP connections, upon receiving the ACK that completes the
      3WHS, make sure we set our pacing rate after we get our first RTT
      sample.
      
      On passive TCP connections, when we receive the ACK completing the
      3WHS we do not take an RTT sample in tcp_ack(), but rather in
      tcp_synack_rtt_meas(). So upon receiving the ACK that completes the
      3WHS, tcp_ack() leaves sk_pacing_rate at its initial value.
      
      Originally the initial sk_pacing_rate value was 0, so passive-side
      connections defaulted to sysctl_tcp_min_tso_segs (2 segs) in skbuffs
      made in the first RTT. With a default initial cwnd of 10 packets, this
      happened to be correct for RTTs 5ms or bigger, so it was hard to
      see problems in WAN or emulated WAN testing.
      
      Since 7eec4174 ("pkt_sched: fq: fix non TCP flows pacing"), the
      initial sk_pacing_rate is 0xffffffff. So after that change, passive
      TCP connections were keeping this value (and using large numbers of
      segments per skbuff) until receiving an ACK for data.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • davinci_emac.c: Fix IFF_ALLMULTI setup · d69e0f7e
      Mariusz Ceier authored
      When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
      emac_dev_mcast_set should only enable RX of multicasts and reset
      MACHASH registers.
      
      It does this, but afterwards it either sets up multicast MACs
      filtering or disables RX of multicasts and resets MACHASH registers
      again, rendering IFF_ALLMULTI flag useless.
      
      This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
      disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.
      
      Tested with kernel 2.6.37.
      Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
      Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
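The control flow that the davinci_emac commit above describes can be sketched as a mutually exclusive decision. This is not the driver's code: the flag constants, the `enum mcast_action` values, and `choose_mcast_action` are hypothetical names chosen for the example, and the promiscuous branch is included only because the commit text says the all-multicast case applies when promiscuous mode is not set.

```c
/* Hypothetical flag values for this sketch only (not the kernel's IFF_* values). */
#define SKETCH_IFF_PROMISC  0x1u
#define SKETCH_IFF_ALLMULTI 0x2u

/* Hypothetical outcomes of the multicast setup routine. */
enum mcast_action {
    MCAST_PROMISC,      /* promiscuous handling, outside the commit's scope */
    MCAST_ACCEPT_ALL,   /* enable RX of all multicasts, reset hash registers */
    MCAST_FILTER_LIST,  /* program filtering for the listed multicast MACs */
    MCAST_DISABLED      /* disable RX of multicasts, reset hash registers */
};

/*
 * Pick exactly one action. The bug described above amounts to running the
 * filter/disable step after the accept-all step; returning early from the
 * all-multicast branch makes the later steps unreachable in that case.
 */
enum mcast_action choose_mcast_action(unsigned int flags, int mc_count)
{
    if (flags & SKETCH_IFF_PROMISC)
        return MCAST_PROMISC;
    if (flags & SKETCH_IFF_ALLMULTI)
        return MCAST_ACCEPT_ALL;
    if (mc_count > 0)
        return MCAST_FILTER_LIST;
    return MCAST_DISABLED;
}
```

With the all-multicast flag set, the result is `MCAST_ACCEPT_ALL` whether or not the multicast list is empty, so the flag is no longer overridden by the later steps.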
    • mac802154: correct a typo in ieee802154_alloc_device() prototype · 7e4d8a19
      Alexandre Belloni authored
      This has no other impact than a cosmetic one.
      Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: probe routes asynchronous in rt6_probe · c2f17e82
      Hannes Frederic Sowa authored
      Routes need to be probed asynchronously, otherwise the call stack gets
      exhausted when the kernel attempts to deliver another skb inline, as
      e.g. xt_TEE does, while we probe at the same time.

      We still update neigh->updated at once, otherwise we would send too
      many probes.
      
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rt6i_gateway' · 3a70417c
      David S. Miller authored
      Julian Anastasov says:
      
      ====================
      ipv6: use rt6i_gateway as nexthop
      
      	The following patchset makes sure that rt6i_gateway
      contains valid nexthop information in all cases, so that
      we can use different nexthop for sending.
      
      	The first patch is a simple fix that makes IPVS, TEE,
      RAW(hdrincl) and RTF_DYNAMIC(without RTF_GATEWAY) work as
      before 3.9. There is a single corner case not solved by
      this patch: RAW(hdrincl) or TEE using local address for
      nexthop, a silly feature, I guess. In this case we
      see zeroes in rt6i_gateway because we get route that is not
      cloned. This is solved only with patch 2.
      
      	The second patch is an optimization that makes sure
      all resulting routes have rt6i_gateway filled, so that we
      can avoid the complex ipv6_addr_any() call added to rt6_nexthop()
      by patch 1. And it sets rt6i_gateway for local routes, a case
      not handled by patch 1.
      
      	The third patch uses the new rt6_nexthop() function to fix
      the matching of gateways in the same way as commit bbb5823c
      ("netfilter: nf_conntrack: fix rt_gateway checks for H.323 helper")
      fixes nf_conntrack_h323_main.c for IPv4. Currently, it depends on
      the new definition of rt6_nexthop() in patch 2. Actually, if
      patch 2 is applied, patch 3 becomes a cosmetic change.
      
      	I see the following two alternatives for applying these
      patches:
      
      1. Linger patch 2 in net-next to avoid surprises in the upcoming
      release. In this case patch 3 can be reworked not to depend on
      the new rt6_nexthop() definition in patch 2. I guess this is a
      better option, so that patch 2 can be reviewed and tested for
      longer time.
      
      2. Include all 3 patches in net tree - more risky because this
      is my first attempt to change IPv6.
      
      	Here is the situation as handled by patch 2:
      
      	In IPv6 the resolved routes are always host routes (/128
      with DST_HOST), mostly cloned ones. We allow routes in FIB
      to contain rt6i_gateway with zeroes (eg. for local subnets) but
      on cloning we can fill the rt6i_gateway field in result.
      This works even without this patchset.
      
      	There is a single special case where dst is provided as
      skb_dst directly without a routing call: icmp6_dst_alloc(). It is a
      private dst allocated just for the particular ICMP packet. Patch 2
      fills rt6i_gateway in this case, needed for the new rt6_nexthop()
      simplification.
      
      	The last case is addrconf_dst_alloc(), it can put in
      FIB local/anycast routes when addresses are added. Patch 2
      needs to fill rt6i_gateway in this case because such routes
      are returned without cloning.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper · 56e42441
      Julian Anastasov authored
      Now that rt6_nexthop() can return the nexthop address, we can use it
      for proper nexthop comparison of directly connected destinations.
      For more information refer to commit bbb5823c
      ("netfilter: nf_conntrack: fix rt_gateway checks for H.323 helper").
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fill rt6i_gateway with nexthop address · 550bab42
      Julian Anastasov authored
      Make sure rt6i_gateway contains nexthop information in
      all routes returned from lookup or when routes are directly
      attached to skb for generated ICMP packets.
      
      The effect of this patch should be a faster version of
      rt6_nexthop() and the consideration of local addresses as
      nexthop.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: always prefer rt6i_gateway if present · 96dc8095
      Julian Anastasov authored
      In v3.9, 6fd6ce20 ("ipv6: Do not depend on rt->n in
      ip6_finish_output2().") changed the behaviour of ip6_finish_output2()
      such that the recently introduced rt6_nexthop() is used
      instead of an assigned neighbor.
      
      As rt6_nexthop() prefers rt6i_gateway only for gatewayed
      routes this causes a problem for users like IPVS, xt_TEE and
      RAW(hdrincl) if they want to use different address for routing
      compared to the destination address.
      
      Another case is when redirect can create RTF_DYNAMIC
      route without RTF_GATEWAY flag, we ignore the rt6i_gateway
      in rt6_nexthop().
      
      Fix the above problems by considering the rt6i_gateway if
      present, so that traffic routed to address on local subnet is
      not wrongly diverted to the destination address.
      
      Thanks to Simon Horman and Phil Oester for spotting the
      problematic commit.
      
      Thanks to Hannes Frederic Sowa for his review and help in testing.
      Reported-by: Phil Oester <kernel@linuxace.com>
      Reported-by: Mark Brooks <mark@loadbalancer.org>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
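The selection rule in the rt6i_gateway commit above ("consider the gateway if present") can be sketched in a few lines. This is a stand-alone model rather than the kernel's rt6_nexthop(): `struct addr6`, `struct route6`, `addr6_is_any`, and `pick_nexthop` are hypothetical names, and an all-zero gateway is assumed to mean "no gateway recorded", mirroring the ipv6_addr_any() test mentioned in the merge description.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address type for this sketch. */
struct addr6 {
    uint8_t b[16];
};

/* Hypothetical, simplified route: only the gateway field matters here. */
struct route6 {
    struct addr6 gateway;   /* all zeroes means "not set" (assumption) */
};

/* True when the address is the unspecified (all-zero) address. */
static bool addr6_is_any(const struct addr6 *a)
{
    static const struct addr6 zero;   /* zero-initialized by the language */
    return memcmp(a, &zero, sizeof(zero)) == 0;
}

/*
 * Prefer the route's gateway whenever one is recorded, regardless of
 * route flags; fall back to the packet's destination address otherwise.
 */
const struct addr6 *pick_nexthop(const struct route6 *rt,
                                 const struct addr6 *daddr)
{
    return addr6_is_any(&rt->gateway) ? daddr : &rt->gateway;
}
```

Users such as IPVS or xt_TEE that route via an address different from the packet's destination depend on the first branch: with a gateway recorded, the destination address is never chosen as the nexthop.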
    • Merge branch 'bnx2x' · 4440c6f7
      David S. Miller authored
      Yuval Mintz says:
      
      ====================
      bnx2x: Bug fixes patch series
      
      This patch series contains fixes for various flows - several SR-IOV issues
      are fixed, ethtool callbacks (coalescing and register dump) are corrected,
      null pointer dereference on error flows is prevented, etc.
      
      Changes from V1
      ---------------
       - Patch 2  "bnx2x: Prevent an illegal pointer dereference during panic"
         is revised, with improved handling of edge cases.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Set NETIF_F_HIGHDMA unconditionally · edd31476
      Merav Sicron authored
      Current driver implementation incorrectly sets the flag only if 64-bit
      DMA mask succeeded.
      Signed-off-by: Merav Sicron <meravs@broadcom.com>
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Don't pretend during register dump · 4293b9f5
      Dmitry Kravkov authored
      As part of a register dump, the interface pretends to have the identity
      of other interfaces of the same physical device in order to perform
      HW configuration for them - specifically, it needs to prevent attentions
      from generating on those functions as the register dump accesses registers
      in common blocks whose reading might generate an attention.
      
      However, such pretension is unsafe - unlike other flows in which the driver
      uses pretend, during register dump there is no guarantee that no other HW
      access will take place (by other flows). If such an access takes place, the
      HW will be accessed by the wrong interface, leaving both functions in an
      incorrect state.
      
      This patch removes all pretensions from the register dump flow. Instead, it
      changes initial configuration of attentions such that no fatal attention will
      be generated for other functions as a result of the register dump
      (notice however, a debug print claiming an attention from other functions IS
      possible during the register dump)
      Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Lock DMAE when used by statistic flow · 32316a46
      Ariel Elior authored
      bnx2x has several clients to its DMAE machines - all of them with the exception
      of the statistics flow used the same locking mechanisms to synchronize the DMAE
      machines' usage.
      
      Since statistics (which are periodically entered) use DMAE without taking the
      locks, they may erase the commands which were previously set -
      e.g., it may cause a VF to timeout while waiting for a PF answer on the VF-PF
      channel as that command header would have been overwritten by the statistics'
      header.
      
      This patch makes certain that all flows utilizing DMAE will use the same
      API, assuring that the locking scheme will be kept by all said flows.
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
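The overwrite hazard in the DMAE commit above can be modelled without threads. This is a deliberately simplified sketch: `struct dmae_slot`, `dmae_post`, `dmae_done`, and `dmae_post_unlocked` are hypothetical names, and the lock is modelled as a try-lock flag so the example stays single-threaded, whereas a real driver would block on a proper lock.

```c
#include <stdbool.h>

/* Hypothetical model of one DMAE command slot. */
struct dmae_slot {
    bool locked;      /* models the lock guarding the slot */
    int  cmd_header;  /* command currently programmed into the slot */
};

/* Unsynchronized post: writes the header no matter who owns the slot. */
void dmae_post_unlocked(struct dmae_slot *slot, int header)
{
    slot->cmd_header = header;
}

/*
 * Shared entry point: refuses to program the slot while another flow
 * still owns it. Returns true if the command was accepted.
 */
bool dmae_post(struct dmae_slot *slot, int header)
{
    if (slot->locked)
        return false;
    slot->locked = true;
    slot->cmd_header = header;
    return true;
}

/* Release the slot once the command has completed. */
void dmae_done(struct dmae_slot *slot)
{
    slot->locked = false;
}
```

If one flow calls `dmae_post()` and a second flow then calls `dmae_post_unlocked()`, the first command's header is silently replaced, which corresponds to the VF timeout scenario in the commit message. Routing every flow through `dmae_post()` makes the second request wait (here: fail and retry) until `dmae_done()` has run.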
    • bnx2x: Prevent null pointer dereference on error flow · 6b991c37
      Yuval Mintz authored
      If debug messages are enabled and bnx2x_vfop_qdtor_cmd() were to fail,
      the resulting print would cause a null pointer dereference.
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Fix config when SR-IOV and iSCSI are enabled · 0907f34c
      Ariel Elior authored
      Starting with commit b9871bcf "bnx2x: VF RSS support - PF side", if a PF
      has SR-IOV supported in its PCI configuration space, storage drivers will not
      work for that interface.
      
      This patch fixes the resource calculation to allow such a configuration to
      properly work.
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>