1. 13 Sep, 2014 40 commits
    • iovec: make sure the caller actually wants anything in memcpy_fromiovecend · 12ef6094
      Sasha Levin authored
      [ Upstream commit 06ebb06d ]
      
      Check for cases when the caller requests 0 bytes instead of running off
      and dereferencing potentially invalid iovecs.
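
The guard described above can be sketched in user space as follows. This is a minimal model, not the kernel's memcpy_fromiovecend(): the helper name `copy_from_iovec_end`, its signature and its return values are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical user-space model: copy `len` bytes, starting `offset`
 * bytes into the iovec array, to `dst`.
 * Returns 0 on success, -1 if the iovecs run out first. */
int copy_from_iovec_end(unsigned char *dst, const struct iovec *iov,
                        size_t iovcnt, size_t offset, size_t len)
{
    size_t i = 0;

    /* Nothing requested: return before touching iov at all, since it
     * may be NULL or have no valid entries. */
    if (len == 0)
        return 0;

    /* Skip whole iovecs that lie before the offset. */
    while (i < iovcnt && offset >= iov[i].iov_len) {
        offset -= iov[i].iov_len;
        i++;
    }

    while (len > 0) {
        size_t chunk;

        if (i >= iovcnt)
            return -1;
        chunk = iov[i].iov_len - offset;
        if (chunk > len)
            chunk = len;
        memcpy(dst, (const unsigned char *)iov[i].iov_base + offset, chunk);
        dst += chunk;
        len -= chunk;
        offset = 0;
        i++;
    }
    return 0;
}
```

With the early return in place, a zero-length request succeeds even when the caller passes no iovecs.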
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • macvlan: Initialize vlan_features to turn on offload support. · a57d246b
      Vlad Yasevich authored
      [ Upstream commit 081e83a7 ]
      
      Macvlan devices do not initialize vlan_features.  As a result,
      any vlan devices configured on top of macvlans perform very poorly.
      Initialize vlan_features based on the vlan features of the lower-level
      device.
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • net: sctp: inherit auth_capable on INIT collisions · 38710dd1
      Daniel Borkmann authored
      [ Upstream commit 1be9a950 ]
      
      Jason reported an oops caused by SCTP on his ARM machine with
      SCTP authentication enabled:
      
      Internal error: Oops: 17 [#1] ARM
      CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
      task: c6eefa40 ti: c6f52000 task.ti: c6f52000
      PC is at sctp_auth_calculate_hmac+0xc4/0x10c
      LR is at sg_init_table+0x20/0x38
      pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
      sp : c6f538e8  ip : 00000000  fp : c6f53924
      r10: c6f50d80  r9 : 00000000  r8 : 00010000
      r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
      r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
      Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      Control: 0005397f  Table: 06f28000  DAC: 00000015
      Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
      Stack: (0xc6f538e8 to 0xc6f54000)
      [...]
      Backtrace:
      [<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
      [<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
      [<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
      [<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
      [<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
      [<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
      [<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
      [<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)
      
      While we already had various kinds of bugs in that area
      ec0223ec ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
      we/peer is AUTH capable") and b14878cc ("net: sctp: cache
      auth_enable per endpoint"), this one is a bit of a different
      kind.
      
      A bit more background on why SCTP authentication is
      needed can be found in RFC4895:
      
        SCTP uses 32-bit verification tags to protect itself against
        blind attackers. These values are not changed during the
        lifetime of an SCTP association.
      
        Looking at new SCTP extensions, there is the need to have a
        method of proving that an SCTP chunk(s) was really sent by
        the original peer that started the association and not by a
        malicious attacker.
      
      To cause this bug, we're triggering an INIT collision between
      peers; a normal SCTP handshake where both sides intend to
      authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
      parameters that are being negotiated among peers:
      
        ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
        <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
      
      RFC4895 says that each endpoint therefore knows its own random
      number and the peer's random number *after* the association
      has been established. The local and peer's random number along
      with the shared key are then part of the secret used for
      calculating the HMAC in the AUTH chunk.
      
      Now, in our scenario, we have 2 threads with 1 non-blocking
      SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
      and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
      sctp_bindx(3), listen(2) and connect(2) against each other,
      thus the handshake looks similar to this, e.g.:
      
        ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
        <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
        <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
        -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
        ...
      
      Since such collisions can also happen with verification tags,
      the RFC4895 for AUTH rather vaguely says under section 6.1:
      
        In case of INIT collision, the rules governing the handling
        of this Random Number follow the same pattern as those for
        the Verification Tag, as explained in Section 5.2.4 of
        RFC 2960 [5]. Therefore, each endpoint knows its own Random
        Number and the peer's Random Number after the association
        has been established.
      
      In RFC2960, section 5.2.4, we're eventually hitting Action B:
      
        B) In this case, both sides may be attempting to start an
           association at about the same time but the peer endpoint
           started its INIT after responding to the local endpoint's
           INIT. Thus it may have picked a new Verification Tag not
           being aware of the previous Tag it had sent this endpoint.
           The endpoint should stay in or enter the ESTABLISHED
           state but it MUST update its peer's Verification Tag from
           the State Cookie, stop any init or cookie timers that may
           running and send a COOKIE ACK.
      
      In other words, the handling of the Random parameter is the
      same as behavior for the Verification Tag as described in
      Action B of section 5.2.4.
      
      Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
      case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
      side effect interpreter, and in fact it properly copies over
      peer_{random, hmacs, chunks} parameters from the newly created
      association to update the existing one.
      
      Also, the old asoc_shared_key is being released and, based on
      the new params, updated via sctp_auth_asoc_init_active_key().
      However, the issue observed in this case is that the previous
      asoc->peer.auth_capable was 0, and has *not* been updated, so
      that instead of creating a new secret, we're doing an early
      return from the function sctp_auth_asoc_init_active_key()
      leaving asoc->asoc_shared_key as NULL. However, we now have to
      authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).
      
      That in fact causes the server side when responding with ...
      
        <------------------ AUTH; COOKIE-ACK -----------------
      
      ... to trigger a NULL pointer dereference, since in
      sctp_packet_transmit(), it discovers that an AUTH chunk is
      being queued for xmit, and thus it calls sctp_auth_calculate_hmac().
      
      Since the asoc->active_key_id is still inherited from the
      endpoint, and the same as encoded into the chunk, it uses
      asoc->asoc_shared_key, which is still NULL, as an asoc_key
      and dereferences it in ...
      
        crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)
      
      ... causing an oops. All this happens because sctp_make_cookie_ack()
      called with the *new* association has the peer.auth_capable=1
      and therefore marks the chunk with auth=1 after checking
      sctp_auth_send_cid(), but it is *actually* sent later on over
      the then *updated* association's transport that didn't initialize
      its shared key due to peer.auth_capable=0. Since control chunks
      in that case are not sent by the temporary association, which
      is scheduled for deletion, they are issued for xmit via
      SCTP_CMD_REPLY in the interpreter with the context of the
      *updated* association. peer.auth_capable was 0 in the updated
      association (which went from COOKIE_WAIT into ESTABLISHED state),
      since all previous processing that performed sctp_process_init()
      was being done on temporary associations, that we eventually
      throw away each time.
      
      The correct fix is to update to the new peer.auth_capable
      value as well in the collision case via sctp_assoc_update(),
      so that in case the collision migrated from 0 -> 1,
      sctp_auth_asoc_init_active_key() can properly recalculate
      the secret. This therefore fixes the observed server panic.
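
The failure mode and the fix can be illustrated with a toy model. The sketch below is not the kernel code: `struct assoc_model`, `init_active_key()` and `assoc_update()` are hypothetical stand-ins, and `has_shared_key` merely represents asoc_shared_key being non-NULL.

```c
/* Toy model of an association; field names are illustrative only. */
struct assoc_model {
    int peer_auth_capable;  /* stands in for asoc->peer.auth_capable */
    int has_shared_key;     /* stands in for asoc->asoc_shared_key != NULL */
};

/* Models the early return: no secret is derived unless the peer is
 * marked as AUTH capable. */
void init_active_key(struct assoc_model *asoc)
{
    if (!asoc->peer_auth_capable)
        return;
    asoc->has_shared_key = 1;
}

/* Models the collision update of an existing association from a
 * temporary one. `inherit_auth_capable` toggles the behaviour the
 * fix adds, so both outcomes can be compared. */
void assoc_update(struct assoc_model *asoc,
                  const struct assoc_model *new_asoc,
                  int inherit_auth_capable)
{
    if (inherit_auth_capable)
        asoc->peer_auth_capable = new_asoc->peer_auth_capable;

    asoc->has_shared_key = 0;   /* the old key is released */
    init_active_key(asoc);      /* and recomputed from the new params */
}
```

Without inheriting the flag, the updated association ends up with no key although chunks are already marked for authentication; with it, the key is recomputed.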
      
      Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
      Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • tcp: Fix integer-overflow in TCP vegas · 4cdcdfdb
      Christoph Paasch authored
      [ Upstream commit 1f74e613 ]
      
      In vegas we do a multiplication of the cwnd and the rtt. This
      may overflow and thus their result is stored in a u64. However, we first
      need to cast the cwnd so that actually 64-bit arithmetic is done.
      
      Then, we need to do do_div to allow this to be used on 32-bit arches.
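
The effect of the missing cast can be shown in user space. This is a sketch under assumptions: the function names are hypothetical, a 32-bit `unsigned int` is assumed, and plain `/` stands in for do_div(), which the kernel needs because 64-bit division is not directly available on 32-bit arches.

```c
#include <stdint.h>

/* Without a cast the multiply is done in 32 bits and wraps before the
 * result is widened for storage in the 64-bit return value. */
uint64_t cwnd_rtt_product_narrow(uint32_t cwnd, uint32_t base_rtt)
{
    return cwnd * base_rtt;
}

/* Casting one operand first makes the multiply itself 64-bit. */
uint64_t cwnd_rtt_product_wide(uint32_t cwnd, uint32_t base_rtt)
{
    return (uint64_t)cwnd * base_rtt;
}

/* Target window as product divided by the current rtt.
 * rtt must be non-zero. */
uint64_t target_cwnd(uint32_t cwnd, uint32_t base_rtt, uint32_t rtt)
{
    return cwnd_rtt_product_wide(cwnd, base_rtt) / rtt;
}
```

Storing into a u64 does not help on its own: the width of the arithmetic is decided by the operand types, not by the destination.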
      
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Doug Leith <doug.leith@nuim.ie>
      Fixes: 8d3a564d (tcp: tcp_vegas cong avoid fix)
      Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • tcp: Fix integer-overflows in TCP veno · a16f7f29
      Christoph Paasch authored
      [ Upstream commit 45a07695 ]
      
      In veno we do a multiplication of the cwnd and the rtt. This
      may overflow and thus their result is stored in a u64. However, we first
      need to cast the cwnd so that actually 64-bit arithmetic is done.
      
      A first attempt at fixing 76f10177 ([TCP]: TCP Veno congestion
      control) was made by 15913114 (tcp: Overflow bug in Vegas), but it
      failed to add the required cast in tcp_veno_cong_avoid().
      
      Fixes: 76f10177 ([TCP]: TCP Veno congestion control)
      Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • ip: make IP identifiers less predictable · bf63acfd
      Eric Dumazet authored
      [ Upstream commit 04ca6973 ]
      
      In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
      Jedidiah describe ways of exploiting Linux IP identifier generation to
      infer whether two machines are exchanging packets.
      
      With commit 73f156a6 ("inetpeer: get rid of ip_id_count"), we
      changed IP id generation, but this does not really prevent this
      side-channel technique.
      
      This patch adds a random amount of perturbation so that IP identifiers
      for a given destination [1] are no longer monotonically increasing after
      an idle period.
      
      Note that prandom_u32_max(1) returns 0, so if the generator is used at most
      once per jiffy, this patch inserts no hole in the ID sequence and does not
      increase collision probability.
      
      This is jiffies based, so in the worst case (HZ=1000), the id can
      rollover after ~65 seconds of idle time, which should be fine.
      
      We also change the hash used in __ip_select_ident() to not only hash
      on daddr, but also saddr and protocol, so that ICMP probes can not be
      used to infer information for other protocols.
      
      For IPv6, this adds saddr into the hash as well, but not nexthdr.
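
The idle-time perturbation can be modelled as below. This is a single-threaded sketch, not the kernel implementation: `struct id_gen`, `next_ip_id()` and the xorshift generator standing in for prandom_u32_max() are all assumptions, and the real code has to update its state atomically.

```c
#include <stdint.h>

/* Hypothetical per-bucket generator state. */
struct id_gen {
    uint32_t stamp;    /* tick (jiffy) of the last use */
    uint32_t counter;  /* next ID to hand out */
    uint32_t rng;      /* xorshift32 state, must be non-zero */
};

/* Stand-in for prandom_u32_max(): returns a value in [0, n), n >= 1.
 * For n == 1 the result is always 0. */
static uint32_t rand_below(struct id_gen *g, uint32_t n)
{
    g->rng ^= g->rng << 13;
    g->rng ^= g->rng >> 17;
    g->rng ^= g->rng << 5;
    return g->rng % n;
}

/* Hand out an ID covering `segs` segments at tick `now`. After an idle
 * period the counter first skips ahead by a random amount bounded by
 * the number of idle ticks. */
uint16_t next_ip_id(struct id_gen *g, uint32_t now, uint32_t segs)
{
    uint32_t delta = 0;
    uint32_t id;

    if (now != g->stamp) {
        delta = rand_below(g, now - g->stamp);
        g->stamp = now;
    }
    id = g->counter + delta;
    g->counter = id + segs;
    return (uint16_t)id;
}
```

When the generator is used within the same tick or one tick later, the skip is 0 and IDs stay consecutive; only longer idle periods introduce a gap.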
      
      If I ping the patched target, we can see the IDs are now hard to predict.
      
      21:57:11.008086 IP (...)
          A > target: ICMP echo request, seq 1, length 64
      21:57:11.010752 IP (... id 2081 ...)
          target > A: ICMP echo reply, seq 1, length 64
      
      21:57:12.013133 IP (...)
          A > target: ICMP echo request, seq 2, length 64
      21:57:12.015737 IP (... id 3039 ...)
          target > A: ICMP echo reply, seq 2, length 64
      
      21:57:13.016580 IP (...)
          A > target: ICMP echo request, seq 3, length 64
      21:57:13.019251 IP (... id 3437 ...)
          target > A: ICMP echo reply, seq 3, length 64
      
      [1] TCP sessions use a per flow ID generator not changed by this patch.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
      Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Hannes Frederic Sowa <hannes@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • inetpeer: get rid of ip_id_count · 64b5c251
      Eric Dumazet authored
      [ Upstream commit 73f156a6 ]
      
      Ideally, we would need to generate IP ID using a per destination IP
      generator.
      
      Linux kernels used the inet_peer cache for this purpose, but this had a huge
      cost on servers disabling MTU discovery.
      
      1) each inet_peer struct consumes 192 bytes
      
      2) inetpeer cache uses a binary tree of inet_peer structs,
         with a nominal size of ~66000 elements under load.
      
      3) lookups in this tree are hitting a lot of cache lines, as tree depth
         is about 20.
      
      4) If server deals with many tcp flows, we have a high probability of
         not finding the inet_peer, allocating a fresh one, inserting it in
         the tree with same initial ip_id_count, (cf secure_ip_id())
      
      5) We garbage collect inet_peer aggressively.
      
      IP ID generation does not have to be 'perfect'
      
      Goal is trying to avoid duplicates in a short period of time,
      so that reassembly units have a chance to complete reassembly of
      fragments belonging to one message before receiving other fragments
      with a recycled ID.
      
      We simply use an array of generators, and a Jenkins hash using the dst IP
      as a key.
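
A minimal sketch of the array-of-generators idea follows. The table size, the seed parameter and the function names are assumptions for illustration, and Bob Jenkins' one-at-a-time hash is used here as a stand-in for whichever Jenkins hash variant the kernel applies; the model is not thread safe.

```c
#include <stdint.h>

#define ID_SLOTS 2048u  /* illustrative size; power of two so masking works */

static uint32_t id_slots[ID_SLOTS];

/* One-at-a-time hash over the four bytes of the destination address,
 * mixed with a seed. */
uint32_t hash_daddr(uint32_t daddr, uint32_t seed)
{
    uint32_t h = seed;
    int i;

    for (i = 0; i < 4; i++) {
        h += (daddr >> (8 * i)) & 0xff;
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Pick the generator for this destination and reserve `segs` IDs. */
uint16_t select_ident(uint32_t daddr, uint32_t seed, uint32_t segs)
{
    uint32_t *slot = &id_slots[hash_daddr(daddr, seed) & (ID_SLOTS - 1)];
    uint32_t id = *slot;

    *slot += segs;
    return (uint16_t)id;
}
```

Destinations that hash to the same slot share a counter, which is acceptable given the stated goal of only avoiding duplicates over a short period.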
      
      ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
      belongs (it is only used from this file)
      
      secure_ip_id() and secure_ipv6_id() no longer are needed.
      
      Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
      unnecessary decrement/increment of the number of segments.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • openrisc: include export.h for EXPORT_SYMBOL · 04619b6c
      Jonas Bonn authored
      commit abdf8b5e upstream.
      
      Use of EXPORT_SYMBOL requires inclusion of export.h
      Signed-off-by: Jonas Bonn <jonas@southpole.se>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • MIPS: Fix accessing to per-cpu data when flushing the cache · 2ce27762
      Ralf Baechle authored
      commit ff522058 upstream.
      
      This fixes the following issue
      
      BUG: using smp_processor_id() in preemptible [00000000] code: kjournald/1761
      caller is blast_dcache32+0x30/0x254
      Call Trace:
      [<8047f02c>] dump_stack+0x8/0x34
      [<802e7e40>] debug_smp_processor_id+0xe0/0xf0
      [<80114d94>] blast_dcache32+0x30/0x254
      [<80118484>] r4k_dma_cache_wback_inv+0x200/0x288
      [<80110ff0>] mips_dma_map_sg+0x108/0x180
      [<80355098>] ide_dma_prepare+0xf0/0x1b8
      [<8034eaa4>] do_rw_taskfile+0x1e8/0x33c
      [<8035951c>] ide_do_rw_disk+0x298/0x3e4
      [<8034a3c4>] do_ide_request+0x2e0/0x704
      [<802bb0dc>] __blk_run_queue+0x44/0x64
      [<802be000>] queue_unplugged.isra.36+0x1c/0x54
      [<802beb94>] blk_flush_plug_list+0x18c/0x24c
      [<802bec6c>] blk_finish_plug+0x18/0x48
      [<8026554c>] journal_commit_transaction+0x3b8/0x151c
      [<80269648>] kjournald+0xec/0x238
      [<8014ac00>] kthread+0xb8/0xc0
      [<8010268c>] ret_from_kernel_thread+0x14/0x1c
      
      Caches in most systems are identical - but not always, so we can't avoid
      the use of smp_call_function() by just looking at the boot CPU's data,
      and have to fiddle with preemption instead.
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/5835
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • MIPS: perf: Fix build error caused by unused counters_per_cpu_to_total() · 588ca81b
      Florian Fainelli authored
      commit 6c37c958 upstream.
      
      cc1: warnings being treated as errors
      arch/mips/kernel/perf_event_mipsxx.c:166: error: 'counters_per_cpu_to_total' defined but not used
      make[2]: *** [arch/mips/kernel/perf_event_mipsxx.o] Error 1
      make[2]: *** Waiting for unfinished jobs....
      
      It was first introduced by 82091564 [MIPS:
      perf: Add support for 64-bit perf counters.] in 3.2.
      Signed-off-by: Florian Fainelli <florian@openwrt.org>
      Cc: linux-mips@linux-mips.org
      Cc: david.daney@cavium.com
      Patchwork: https://patchwork.linux-mips.org/patch/3357/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • openrisc: add missing header inclusion · fda9662d
      Stefan Kristiansson authored
      commit 160d8378 upstream.
      
      Prevents build issue with updated toolchain
      Reported-by: Jack Thomasson <jkt@moonlitsw.com>
      Tested-by: Christian Svensson <blue@cmd.nu>
      Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Signed-off-by: Jonas Bonn <jonas@southpole.se>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • USB: serial: fix potential heap buffer overflow · b067dfbd
      Johan Hovold authored
      commit 5654699f upstream.
      
      Make sure to verify the number of ports requested by subdriver to avoid
      writing beyond the end of fixed-size array in interface data.
      
      The current usb-serial implementation is limited to eight ports per
      interface but failed to verify that the number of ports requested by a
      subdriver (which could have been determined from device descriptors) did
      not exceed this limit.
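
The kind of check described can be sketched as follows. The eight-port limit comes from the message above; `struct serial_model`, `serial_init_ports()` and the choice to clamp rather than reject are assumptions of this sketch.

```c
#include <stddef.h>

#define MAX_NUM_PORTS 8  /* per-interface limit stated in the commit message */

/* Hypothetical model of the per-interface data with its fixed-size array. */
struct serial_model {
    int num_ports;
    void *port[MAX_NUM_PORTS];
};

/* Apply the port count requested by a subdriver, never exceeding the
 * capacity of the fixed-size array. */
void serial_init_ports(struct serial_model *s, int requested)
{
    int i;

    if (requested > MAX_NUM_PORTS)
        requested = MAX_NUM_PORTS;  /* too many ports requested */
    if (requested < 0)
        requested = 0;

    s->num_ports = requested;
    for (i = 0; i < requested; i++)
        s->port[i] = NULL;          /* always within the array */
}
```

Because the requested count may be derived from device descriptors, it has to be treated as untrusted input before it is used as a loop bound.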
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: s/ddev/\&interface->dev/]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • USB: serial: fix potential stack buffer overflow · 51140f5c
      Johan Hovold authored
      commit d979e9f9 upstream.
      
      Make sure to verify the maximum number of endpoints per type to avoid
      writing beyond the end of a stack-allocated array.
      
      The current usb-serial implementation is limited to eight ports per
      interface but failed to verify that the number of endpoints of a certain
      type reported by a device did not exceed this limit.
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • ARM: 8129/1: errata: work around Cortex-A15 erratum 830321 using dummy strex · bbd4080b
      Mark Rutland authored
      commit 2c32c65e upstream.
      
      On revisions of Cortex-A15 prior to r3p3, a CLREX instruction at PL1 may
      falsely trigger a watchpoint exception, leading to potential data aborts
      during exception return and/or livelock.
      
      This patch resolves the issue in the following ways:
      
        - Replacing our uses of CLREX with a dummy STREX sequence instead (as
          we did for v6 CPUs).
      
        - Removing the clrex code from v7_exit_coherency_flush and derivatives,
          since this only exists as a minor performance improvement when
          non-cached exclusives are in use (Linux doesn't use these).
      
      Benchmarking on a variety of ARM cores revealed no measurable
      performance difference with this change applied, so the change is
      performed unconditionally and no new Kconfig entry is added.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      [bwh: Backported to 3.2:
       - Drop inapplicable changes to arch/arm/include/asm/cacheflush.h and
         arch/arm/mach-exynos/mcpm-exynos.c]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • ARM: 8128/1: abort: don't clear the exclusive monitors · 8630bac3
      Mark Rutland authored
      commit 85868313 upstream.
      
      The ARMv6 and ARMv7 early abort handlers clear the exclusive monitors
      upon entry to the kernel, but this is redundant:
      
        - We clear the monitors on every exception return since commit
          200b812d ("Clear the exclusive monitor when returning from an
          exception"), so this is not necessary to ensure the monitors are
          cleared before returning from a fault handler.
      
        - Any dummy STREX will target a temporary scratch area in memory, and
          may succeed or fail without corrupting useful data. Its status value
          will not be used.
      
        - Any other STREX in the kernel must be preceded by an LDREX, which
          will initialise the monitors consistently and will not depend on the
          earlier state of the monitors.
      
      Therefore we have no reason to care about the initial state of the
      exclusive monitors when a data abort is taken, and clearing the monitors
      prior to exception return (as we already do) is sufficient.
      
      This patch removes the redundant clearing of the exclusive monitors from
      the early abort handlers.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • HID: picolcd: sanity check report size in raw_event() callback · b23ea023
      Jiri Kosina authored
      commit 844817e4 upstream.
      
      The report passed to us from transport driver could potentially be
      arbitrarily large, therefore we better sanity-check it so that raw_data
      that we hold in picolcd_pending structure are always kept within proper
      bounds.
      Reported-by: Steven Vittitoe <scvitti@google.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • HID: magicmouse: sanity check report size in raw_event() callback · e3ead924
      Jiri Kosina authored
      commit c54def7b upstream.
      
      The report passed to us from transport driver could potentially be
      arbitrarily large, therefore we better sanity-check it so that
      magicmouse_emit_touch() gets only valid values of raw_id.
      Reported-by: Steven Vittitoe <scvitti@google.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • NFSv4: Fix problems with close in the presence of a delegation · 74efedad
      Trond Myklebust authored
      commit aee7af35 upstream.
      
      In the presence of delegations, we can no longer assume that the
      state->n_rdwr, state->n_rdonly, state->n_wronly reflect the open
      stateid share mode, and so we need to calculate the initial value
      for calldata->arg.fmode using the state->flags.
      Reported-by: James Drews <drews@engr.wisc.edu>
      Fixes: 88069f77 (NFSv41: Fix a potential state leakage when...)
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • USB: sisusb: add device id for Magic Control USB video · ec5afb05
      Stephen Hemminger authored
      commit 5b6b80ae upstream.
      
      I have a j5 create (JUA210) USB 2 video device and adding its device id
      to SIS USB video gets it to work.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set · e3f7925d
      Lv Zheng authored
      commit 3afcf2ec upstream.
      
      There is a platform refusing to respond to QR_EC when SCI_EVT isn't set
      (Acer Aspire V5-573G).
      
      Currently, we rely on the behaviour that the EC firmware can respond
      with something (for example, 0x00 to indicate "no outstanding events") to
      QR_EC even when SCI_EVT is not set, but the reporter has complained
      about AC/battery plugging/unplugging and video brightness change delay
      on that platform.
      
      This is because the work item that has issued QR_EC has to wait until
      timeout in this case, and the _Qxx method evaluation work item queued
      after QR_EC one is delayed.
      
      It sounds reasonable to fix this issue by:
       1. Implementing SCI_EVT sanity check before issuing QR_EC in the EC
          driver's main state machine.
       2. Moving QR_EC issuing out of the work queue used by _Qxx evaluation
          to a separate IRQ handling thread.
      
      This patch fixes this issue using solution 1.
      
      By disallowing QR_EC to be issued when SCI_EVT isn't set, we are able to
      handle such platform in the EC driver's main state machine. This patch
      enhances the state machine in this way to survive with such malfunctioning
      EC firmware.
      
      Note that this patch can also fix CLEAR_ON_RESUME quirk which also relies
      on the assumption that the platforms are able to respond even when SCI_EVT
      isn't set.
      
      Fixes: c0d65341 ACPI / EC: Fix race condition in ec_transaction_completed()
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=82611
      Reported-and-tested-by: Alexander Mezin <mezin.alexander@gmail.com>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • HID: logitech-dj: prevent false errors to be shown · 74182f6b
      Benjamin Tissoires authored
      commit 5abfe85c upstream.
      
      Commit "HID: logitech: perform bounds checking on device_id early
      enough" unfortunately leaks some errors to dmesg which are not real
      ones:
      - if the report is not a DJ one, then there is no point in checking
        the device_id
      - the receiver (index 0) can also receive some notifications which
        can be safely ignored given the current implementation
      
      Move out the test regarding the report_id and also discard
      printing errors when the receiver got notified.
      
      Fixes: ad3e14d7
      Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • USB: whiteheat: Added bounds checking for bulk command response · f92c5bd2
      James Forshaw authored
      commit 6817ae22 upstream.
      
      This patch fixes a potential security issue in the whiteheat USB driver
      which might allow a local attacker to cause kernel memory corruption. This
      is due to an unchecked memcpy into a fixed size buffer (of 64 bytes). On
      EHCI and XHCI busses it's possible to craft responses greater than 64
      bytes leading to a buffer overflow.
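
A length check of this kind can be sketched as below. Only the 64-byte buffer size comes from the message above; `struct cmd_result`, `store_response()` and the decision to reject oversized responses outright are assumptions of this sketch, not the driver's actual code.

```c
#include <stddef.h>
#include <string.h>

#define RESULT_BUF_SIZE 64  /* fixed buffer size given in the commit message */

/* Hypothetical holder for a command response. */
struct cmd_result {
    unsigned char buf[RESULT_BUF_SIZE];
    size_t len;
};

/* Copy a device response into the fixed-size buffer.
 * Returns 0 on success, -1 if the response does not fit. */
int store_response(struct cmd_result *r, const unsigned char *data, size_t len)
{
    /* The length is controlled by the device, so check it against the
     * destination size before copying. */
    if (len > sizeof(r->buf))
        return -1;

    memcpy(r->buf, data, len);
    r->len = len;
    return 0;
}
```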
      Signed-off-by: James Forshaw <forshaw@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • HID: fix a couple of off-by-ones · 328538d7
      Jiri Kosina authored
      commit 4ab25786 upstream.
      
      There are a few very theoretical off-by-one bugs in report descriptor size
      checking when performing a pre-parsing fixup. Fix those.
      Reported-by: Ben Hawkes <hawkes@google.com>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • HID: logitech: perform bounds checking on device_id early enough · e6bc6f66
      Jiri Kosina authored
      commit ad3e14d7 upstream.
      
      device_index is a char type and the size of paired_dj_devices is 7
      elements, therefore proper bounds checking has to be applied to
      device_index before it is used.
      
      We are currently performing the bounds checking in
      logi_dj_recv_add_djhid_device(), which is too late, as a malicious device
      could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the
      problem in one of the report forwarding functions called from
      logi_dj_raw_event().
      
      Fix this by performing the check at the earliest possible occasion in
      logi_dj_raw_event().
      Reported-by: Ben Hawkes <hawkes@google.com>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      e6bc6f66
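      A rough sketch of validating the index at the entry point, before
      any forwarding function can use it. The names are hypothetical;
      the array size of 7 is from the commit message, and treating
      slot 0 as invalid is an assumption made for illustration.

```c
#include <stddef.h>

#define PAIRED_DEVICES_LEN 7  /* array size given in the commit message */
#define DEVICE_INDEX_MIN   1  /* assumption: slot 0 is not a valid device */

/* Hypothetical receiver state holding the paired device slots. */
struct receiver {
    void *paired[PAIRED_DEVICES_LEN];
};

/* Return 1 if the index taken from an incoming report is safe to use. */
int device_index_valid(unsigned char device_index)
{
    return device_index >= DEVICE_INDEX_MIN &&
           device_index < PAIRED_DEVICES_LEN;
}

/*
 * Entry point for every raw report: validate first, then look up.
 * Returns the paired device slot, or NULL for an out-of-range index.
 */
void *lookup_paired(struct receiver *r, unsigned char device_index)
{
    if (!device_index_valid(device_index))
        return NULL;
    return r->paired[device_index];
}
```

      Doing the check once at the top means no later code path can
      reach the array with an unchecked index.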
    • Jan Kara's avatar
      isofs: Fix unbounded recursion when processing relocated directories · d6621d0d
      Jan Kara authored
      commit 410dd3cf upstream.
      
      We did not check the relocated directory in any way when processing the
      Rock Ridge 'CL' tag. Thus a corrupted isofs image can possibly have a CL
      entry pointing to another CL entry leading to possibly unbounded
      recursion in kernel code and thus stack overflow or deadlocks (if there
      is a loop created from CL entries).
      
      Fix the problem by not allowing CL entry to point to a directory entry
      with CL entry (such use makes no good sense anyway) and by checking
      whether CL entry doesn't point to itself.
      Reported-by: Chris Evans <cevans@google.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      d6621d0d
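      The two checks from the isofs fix above (a CL entry must not point
      to itself, and must not point to a directory that itself has a CL
      entry) can be sketched on a simplified model. The record layout
      and function name are hypothetical and stand in for the real
      on-disk structures.

```c
#define NO_CL (-1)

/* Simplified directory record: only the relocation target is modelled. */
struct dir_record {
    int cl_target;  /* index of the relocated directory, or NO_CL */
};

/*
 * Resolve a possibly relocated directory. Returns the index to use,
 * or -1 if the image is corrupted. At most one hop is followed, so
 * there is no recursion and no way to loop.
 */
int resolve_relocation(const struct dir_record *recs, int nrecs, int idx)
{
    int target;

    if (idx < 0 || idx >= nrecs)
        return -1;
    target = recs[idx].cl_target;
    if (target == NO_CL)
        return idx;                 /* ordinary directory */
    if (target < 0 || target >= nrecs)
        return -1;                  /* target out of range */
    if (target == idx)
        return -1;                  /* CL points to itself */
    if (recs[target].cl_target != NO_CL)
        return -1;                  /* CL points to another CL entry */
    return target;
}
```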
    • Mathias Nyman's avatar
      xhci: rework cycle bit checking for new dequeue pointers · 416b0d26
      Mathias Nyman authored
      commit 365038d8 upstream.
      
      When we manually need to move the TR dequeue pointer we need to set the
      correct cycle bit as well. Previously we used the trb pointer from the
      last event received as a base, but this was changed in
      commit 1f81b6d2 ("usb: xhci: Prefer endpoint context dequeue pointer")
      to use the dequeue pointer from the endpoint context instead.
      
      It turns out some Asmedia controllers advance the dequeue pointer
      stored in the endpoint context past the event triggering TRB, and
      this messed up the way the cycle bit was calculated.
      
      Instead of adding a quirk or complicating the already hard to follow cycle bit
      code, the whole cycle bit calculation is now simplified and adapted to handle
      event and endpoint context dequeue pointer differences.
      
      Fixes: 1f81b6d2 ("usb: xhci: Prefer endpoint context dequeue pointer")
      Reported-by: Maciej Puzio <mx34567@gmail.com>
      Reported-by: Evan Langlois <uudruid74@gmail.com>
      Reviewed-by: Julius Werner <jwerner@chromium.org>
      Tested-by: Maciej Puzio <mx34567@gmail.com>
      Tested-by: Evan Langlois <uudruid74@gmail.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2:
       - Debug logging in xhci_find_new_dequeue_state() is slightly different
       - Don't delete find_trb_seg(); it's still needed by xhci_cmd_to_noop()]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      416b0d26
    • Aaro Koskinen's avatar
      MIPS: OCTEON: make get_system_type() thread-safe · a1724533
      Aaro Koskinen authored
      commit 60830868 upstream.
      
      get_system_type() is not thread-safe on OCTEON. It uses static data,
      and a more dangerous issue is that it calls cvmx_fuse_read_byte()
      every time without any synchronization. Currently it's possible to get
      processes stuck looping forever in kernel simply by launching multiple
      readers of /proc/cpuinfo:
      
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	...
      
      Fix by initializing the system type string only once during the early
      boot.
      Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
      Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7437/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      a1724533
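      The "initialize once during early boot" approach from the OCTEON
      fix above can be sketched as follows. init_system_type() and its
      parameters are hypothetical; the point is only that the getter no
      longer rebuilds the string or touches hardware on each call.

```c
#include <stdio.h>

/* Written once at boot, read-only afterwards. */
static char system_type[80];

/* Hypothetical boot-time hook; must run before any reader exists. */
void init_system_type(const char *board, const char *model)
{
    snprintf(system_type, sizeof(system_type), "%s (%s)", board, model);
}

/* Safe to call concurrently: it only returns a pointer to stable data. */
const char *get_system_type(void)
{
    return system_type;
}
```

      Because readers never write, concurrent readers of /proc/cpuinfo
      need no locking in this model.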
    • Huang Rui's avatar
      usb: xhci: amd chipset also needs short TX quirk · e0d0f5bb
      Huang Rui authored
      commit 2597fe99 upstream.
      
      After testing on most chipset generations, the AMD xHC also needs the
      short TX quirk, because it shows the same incorrect behavior as the
      Fresco Logic host. See the messages below, logged with a USB webcam
      attached to the xHC host:
      
      [  139.262944] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.266934] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.270913] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.274937] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.278914] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.282936] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.286915] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.290938] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.294913] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      [  139.298917] xhci_hcd 0000:00:10.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
      Reported-by: Arindam Nath <arindam.nath@amd.com>
      Tested-by: Shriraj-Rai P <shriraj-rai.p@amd.com>
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      e0d0f5bb
    • Hans de Goede's avatar
      xhci: Treat not finding the event_seg on COMP_STOP the same as COMP_STOP_INVAL · 498f060a
      Hans de Goede authored
      commit 9a548863 upstream.
      
      When using a Renesas uPD720231 chipset usb-3 uas to sata bridge with a 120G
      Crucial M500 ssd, model string: Crucial_ CT120M500SSD1, together with
      the integrated Intel xhci controller on a Haswell laptop:
      
      00:14.0 USB controller [0c03]: Intel Corporation 8 Series USB xHCI HC [8086:9c31] (rev 04)
      
      The following error gets logged to dmesg:
      
      xhci error: Transfer event TRB DMA ptr not part of current TD
      
      Treating COMP_STOP the same as COMP_STOP_INVAL when no event_seg gets found
      fixes this.
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      498f060a
    • Michael S. Tsirkin's avatar
      kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601) · 1bc64854
      Michael S. Tsirkin authored
      commit 350b8bdd upstream.
      
      The third parameter of kvm_iommu_put_pages is wrong.
      It should be 'gfn - slot->base_gfn'.
      
      By making gfn very large, malicious guest or userspace can cause kvm to
      go to this error path, and subsequently to pass a huge value as size.
      Alternatively if gfn is small, then pages would be pinned but never
      unpinned, causing host memory leak and local DOS.
      
      Passing a reasonable but large value could be the most dangerous case,
      because it would unpin a page that should have stayed pinned, and thus
      allow the device to DMA into arbitrary memory.  However, this cannot
      happen because of the condition that can trigger the error:
      
      - out of memory (where you can't allocate even a single page)
        should not be possible for the attacker to trigger
      
      - when exceeding the iommu's address space, guest pages after gfn
        will also exceed the iommu's address space, and inside
        kvm_iommu_put_pages() the iommu_iova_to_phys() will fail.  The
        page thus would not be unpinned at all.
      Reported-by: Jack Morgenstein <jackm@mellanox.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      1bc64854
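      Why the count passed on the error path has to be relative to the
      slot base can be seen in a toy model. Everything here (struct
      slot, put_pages, map_slot, the fail_at parameter) is hypothetical
      and much simpler than the KVM code; only the expression
      'gfn - slot->base_gfn' is taken from the commit message.

```c
#include <stdint.h>

#define SLOT_PAGES 8

/* Toy memory slot: tracks which page offsets are currently pinned. */
struct slot {
    uint64_t base_gfn;       /* first guest frame number of the slot */
    int pinned[SLOT_PAGES];  /* 1 if the page at that offset is pinned */
};

/* Unpin npages pages starting at frame base_gfn. */
void put_pages(struct slot *s, uint64_t base_gfn, uint64_t npages)
{
    uint64_t off = base_gfn - s->base_gfn;

    for (uint64_t i = 0; i < npages && off + i < SLOT_PAGES; i++)
        s->pinned[off + i] = 0;
}

/*
 * Pin pages one by one; fail_at simulates a mapping failure at that
 * offset. On failure, unwind exactly the pages pinned so far.
 */
int map_slot(struct slot *s, uint64_t fail_at)
{
    uint64_t end = s->base_gfn + SLOT_PAGES;

    for (uint64_t gfn = s->base_gfn; gfn < end; gfn++) {
        if (gfn - s->base_gfn == fail_at) {
            /* count is relative to the slot base, not the absolute gfn */
            put_pages(s, s->base_gfn, gfn - s->base_gfn);
            return -1;
        }
        s->pinned[gfn - s->base_gfn] = 1;
    }
    return 0;
}

/* Count the pages still pinned in the slot. */
int pinned_count(const struct slot *s)
{
    int n = 0;

    for (int i = 0; i < SLOT_PAGES; i++)
        n += s->pinned[i];
    return n;
}
```

      Passing the absolute gfn as the count would request far more
      pages than were ever pinned, which is the mistake the fix removes.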
    • Arjun Sreedharan's avatar
      pata_scc: propagate return value of scc_wait_after_reset · 0e886058
      Arjun Sreedharan authored
      commit 4dc7c76c upstream.
      
      scc_bus_softreset should not always return zero.
      Propagate the error code.
      Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      0e886058
    • Joerg Roedel's avatar
      iommu/amd: Fix cleanup_domain for mass device removal · 36d724c9
      Joerg Roedel authored
      commit 9b29d3c6 upstream.
      
      When multiple devices are detached in __detach_device, they
      are also removed from the domain's dev_list. This makes it
      unsafe to use list_for_each_entry_safe, as the next pointer
      might also not be in the list anymore after __detach_device
      returns. So just repeatedly remove the first element of the
      list until it is empty.
      Tested-by: Marti Raudsepp <marti@juffo.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      36d724c9
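      The "remove the first element until the list is empty" pattern
      from the iommu/amd fix above, on a hand-rolled circular list. The
      types and the alias field are hypothetical; the alias only
      simulates a detach that removes more than one entry.

```c
#include <stddef.h>

struct dev {
    struct dev *prev, *next;  /* circular list links */
    struct dev *alias;        /* detached together with this one, may be NULL */
    int attached;
};

void list_init(struct dev *head)
{
    head->prev = head->next = head;
}

void list_add_tail(struct dev *head, struct dev *d)
{
    d->prev = head->prev;
    d->next = head;
    head->prev->next = d;
    head->prev = d;
    d->attached = 1;
}

static void unlink_dev(struct dev *d)
{
    d->prev->next = d->next;
    d->next->prev = d->prev;
    d->prev = d->next = NULL;
    d->attached = 0;
}

/* May remove more than one entry from the list in a single call. */
void detach_device(struct dev *d)
{
    if (d->alias && d->alias->attached)
        unlink_dev(d->alias);
    unlink_dev(d);
}

/* Pop the first entry until the list is empty; no saved next pointer. */
int cleanup_domain(struct dev *head)
{
    int calls = 0;

    while (head->next != head) {
        detach_device(head->next);
        calls++;
    }
    return calls;
}
```

      An iterator that saved the next pointer before the detach would
      follow a stale link whenever that next entry is the alias.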
    • Jaša Bartelj's avatar
      USB: ftdi_sio: Added PID for new ekey device · 4c39c216
      Jaša Bartelj authored
      commit 646907f5 upstream.
      
      Added support to the ftdi_sio driver for ekey Converter USB which
      uses an FT232BM chip.
      Signed-off-by: Jaša Bartelj <jasa.bartelj@gmail.com>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      4c39c216
    • Greg KH's avatar
      USB: serial: pl2303: add device id for ztek device · 03288882
      Greg KH authored
      commit 91fcb1ce upstream.
      
      This adds a new device id to the pl2303 driver for the ZTEK device.
      Reported-by: Mike Chu <Mike-Chu@prolific.com.tw>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      03288882
    • Johan Hovold's avatar
      USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID · 6e7015cc
      Johan Hovold authored
      commit 6552cc7f upstream.
      
      Add device id for Basic Micro ATOM Nano USB2Serial adapters.
      Reported-by: Nicolas Alt <n.alt@mytum.de>
      Tested-by: Nicolas Alt <n.alt@mytum.de>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      6e7015cc
    • Brennan Ashton's avatar
      USB: option: add VIA Telecom CDS7 chipset device id · c6902e93
      Brennan Ashton authored
      commit d7730273 upstream.
      
      This VIA Telecom baseband processor is used by u-blox in both the
      FW2770 and FW2760 products and may be used in others as well.
      
      This patch has been tested on both of these modem versions.
      Signed-off-by: Brennan Ashton <bashton@brennanashton.com>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      c6902e93
    • NeilBrown's avatar
      md/raid6: avoid data corruption during recovery of double-degraded RAID6 · 1417f897
      NeilBrown authored
      commit 9c4bdf69 upstream.
      
      During recovery of a double-degraded RAID6 it is possible for
      some blocks not to be recovered properly, leading to corruption.
      
      If a write happens to one block in a stripe that would be written to a
      missing device, and at the same time that stripe is recovering data
      to the other missing device, then that recovered data may not be written.
      
      This patch skips, in the double-degraded case, an optimisation that is
      only safe for single-degraded arrays.
      
      Bug was introduced in 2.6.32 and fix is suitable for any kernel since
      then.  In an older kernel with separate handle_stripe5() and
      handle_stripe6() functions the patch must change handle_stripe6().
      
      Fixes: 6c0069c0
      Cc: Yuri Tikhonov <yur@emcraft.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Reported-by: "Manibalan P" <pmanibalan@amiindia.co.in>
      Tested-by: "Manibalan P" <pmanibalan@amiindia.co.in>
      Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
      Signed-off-by: NeilBrown <neilb@suse.de>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      1417f897
    • Pavel Shilovsky's avatar
      CIFS: Fix wrong directory attributes after rename · 9c50d4fd
      Pavel Shilovsky authored
      commit b46799a8 upstream.
      
      When we request a rename we also need to update the attributes
      of both source and target parent directories. Not doing it
      causes generic/309 xfstest to fail on SMB2 mounts. Fix this
      by marking these directories for force revalidating.
      Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      9c50d4fd
    • Takashi Iwai's avatar
      ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co · d26e2c02
      Takashi Iwai authored
      commit f3ee07d8 upstream.
      
      ALC269 & co have many vendor-specific setups with COEF verbs.
      However, some verbs seem specific to some codec versions and they
      result in the codec stalling.  Typically, such a case can be avoided
      by checking the return value from reading a COEF.  If the return value
      is -1, it implies that the COEF is invalid, thus it shouldn't be
      written.
      
      This patch adds the invalid COEF checks in appropriate places
      accessing ALC269 and its variants.  The patch actually fixes the
      resume problem on Acer AO725 laptop.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=52181
      Tested-by: Francesco Muzio <muziofg@gmail.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      d26e2c02
    • Filipe Manana's avatar
      Btrfs: fix csum tree corruption, duplicate and outdated checksums · b055da33
      Filipe Manana authored
      commit 27b9a812 upstream.
      
      Under rare circumstances we can end up leaving 2 versions of a checksum
      for the same file extent range.
      
      The reason for this is that after calling btrfs_next_leaf we process
      slot 0 of the leaf it returns, instead of processing the slot set in
      path->slots[0]. Most of the time (by far) path->slots[0] is 0, but after
      btrfs_next_leaf() releases the path and before it searches for the next
      leaf, another task might cause a split of the next leaf, which migrates
      some of its keys to the leaf we were processing before calling
      btrfs_next_leaf(). In this case btrfs_next_leaf() returns again the
      same leaf but with path->slots[0] having a slot number corresponding
      to the first new key it got, that is, a slot number that didn't exist
      before calling btrfs_next_leaf(), as the leaf now has more keys than
      it had before. So we must really process the returned leaf starting at
      path->slots[0] always, as it isn't always 0, and the key at slot 0 can
      have an offset much lower than our search offset/bytenr.
      
      For example, consider the following scenario, where we have:
      
      sums->bytenr: 40157184, sums->len: 16384, sums end: 40173568
      four 4kb file data blocks with offsets 40157184, 40161280, 40165376, 40169472
      
        Leaf N:
      
          slot = 0                           slot = btrfs_header_nritems() - 1
        |-------------------------------------------------------------------|
        | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4] |
        |-------------------------------------------------------------------|
      
        Leaf N + 1:
      
            slot = 0                          slot = btrfs_header_nritems() - 1
        |--------------------------------------------------------------------|
        | [(CSUM CSUM 40161280), size 32] ... [(CSUM CSUM 40615936), size 8] |
        |--------------------------------------------------------------------|
      
      Because we are at the last slot of leaf N, we call btrfs_next_leaf() to
      find the next highest key, which releases the current path and then searches
      for that next key. However after releasing the path and before finding that
      next key, the item at slot 0 of leaf N + 1 gets moved to leaf N, due to a call
      to ctree.c:push_leaf_left() (via ctree.c:split_leaf()), and therefore
      btrfs_next_leaf() will return us a path again with leaf N but with the slot
      pointing to its new last key (CSUM CSUM 40161280). This new version of leaf N
      is then:
      
          slot = 0                        slot = btrfs_header_nritems() - 2  slot = btrfs_header_nritems() - 1
        |----------------------------------------------------------------------------------------------------|
        | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4]  [(CSUM CSUM 40161280), size 32] |
        |----------------------------------------------------------------------------------------------------|
      
      And incorrectly using slot 0 makes us set next_offset to 39239680 and we jump
      into the "insert:" label, which will set tmp to:
      
          tmp = min((sums->len - total_bytes) >> blocksize_bits,
              (next_offset - file_key.offset) >> blocksize_bits) =
          min((16384 - 0) >> 12, (39239680 - 40157184) >> 12) =
          min(4, (u64)-917504 = 18446744073708634112 >> 12) = 4
      
      and
      
         ins_size = csum_size * tmp = 4 * 4 = 16 bytes.
      
      In other words, we insert a new csum item in the tree with key
      (CSUM_OBJECTID CSUM_KEY 40157184 = sums->bytenr) that contains the checksums
      for all the data (4 blocks of 4096 bytes each = sums->len). Which is wrong,
      because the item with key (CSUM CSUM 40161280) (the one that was moved from
      leaf N + 1 to the end of leaf N) contains the old checksums of the last 12288
      bytes of our data and won't get those old checksums removed.
      
      So this leaves us 2 different checksums for 3 4kb blocks of data in the tree,
      and breaks the logical rule:
      
         Key_N+1.offset >= Key_N.offset + length_of_data_its_checksums_cover
      
      An obvious bad effect of this is that a subsequent csum tree lookup to get
      the checksum of any of the blocks with logical offset of 40161280, 40165376
      or 40169472 (the last 3 4kb blocks of file data), will get the old checksums.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      b055da33
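      The tmp computation quoted in the Btrfs message above can be
      reproduced with plain unsigned 64-bit arithmetic. The helper
      names are hypothetical; the expression and the input values are
      the ones given in the message.

```c
#include <stdint.h>

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Number of blocks the new csum item will cover, as in the message. */
uint64_t csum_blocks(uint64_t sums_len, uint64_t total_bytes,
                     uint64_t next_offset, uint64_t key_offset,
                     unsigned blocksize_bits)
{
    /* next_offset - key_offset wraps around when next_offset is smaller */
    return min_u64((sums_len - total_bytes) >> blocksize_bits,
                   (next_offset - key_offset) >> blocksize_bits);
}
```

      With next_offset taken from slot 0 (39239680) the subtraction
      wraps to a huge unsigned value, so the result is 4 blocks and the
      new item covers all of sums->len. If next_offset were instead
      40161280, the key of the moved item, the same expression would
      give 1 block.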