1. 27 Jul, 2017 40 commits
    • Balbir Singh's avatar
      powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp() · f349607b
      Balbir Singh authored
      commit e71ff982 upstream.
      
      Once upon a time there were only two PP (page protection) bits. In ISA
      2.03 an additional PP bit was added, but because of the layout of the
      HPTE it could not be made contiguous with the existing PP bits.
      
      The result is that we now have three PP bits, named pp0, pp1, pp2,
      where pp0 occupies bit 63 of dword 1 of the HPTE and pp1 and pp2
      occupy bits 1 and 0 respectively. Until recently Linux hasn't used
      pp0, however with the addition of _PAGE_KERNEL_RO we started using it.
      
      The problem arises in the LPAR code, where we need to translate the PP
      bits into the argument for the H_PROTECT hypercall. Currently the code
      only passes bits 0-2 of newpp, which covers pp1, pp2 and N (no
      execute), meaning pp0 is not passed to the hypervisor at all.
      
      We can't simply pass it through in bit 63, as that would collide with a
      different field in the flags argument, as defined in PAPR. Instead we
      have to shift it down to bit 8 (IBM bit 55).
      
      Fixes: e58e87ad ("powerpc/mm: Update _PAGE_KERNEL_RO")
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      [mpe: Simplify the test, rework change log]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f349607b
    • Michael Ellerman's avatar
      powerpc/mm/radix: Only add X for pages overlapping kernel text · cab0b22b
      Michael Ellerman authored
      commit 9abcc981 upstream.
      
      Currently we map the whole linear mapping with PAGE_KERNEL_X. Instead we
      should check if the page overlaps the kernel text and only then add
      PAGE_KERNEL_X.
      
      Note that we still use 1G pages if they're available, so this will
      typically still result in a 1G executable page at KERNELBASE. So this fix is
      primarily useful for catching stray branches to high linear mapping addresses.
      
      Without this patch, we can execute at 1G in xmon using:
      
        0:mon> m c000000040000000
        c000000040000000  00 l
        c000000040000000  00000000 01006038
        c000000040000004  00000000 2000804e
        c000000040000008  00000000 x
        0:mon> di c000000040000000
        c000000040000000  38600001      li      r3,1
        c000000040000004  4e800020      blr
        0:mon> p c000000040000000
        return value is 0x1
      
      After we get a 400 as expected:
      
        0:mon> p c000000040000000
        *** 400 exception occurred
      
      Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cab0b22b
    • Paolo Bonzini's avatar
      scsi: virtio_scsi: always read VPD pages for multiqueue too · db367439
      Paolo Bonzini authored
      commit a680f1d4 upstream.
      
      Multi-queue virtio-scsi uses a different scsi_host_template struct.  Add
      the .device_alloc field there, too.
      
      Fixes: 25d1d50e
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarFam Zheng <famz@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db367439
    • Bart Van Assche's avatar
      xen/scsiback: Fix a TMR related use-after-free · 26e21de7
      Bart Van Assche authored
      commit 9f4ab18a upstream.
      
      scsiback_release_cmd() must not dereference se_cmd->se_tmr_req
      because that memory is freed by target_free_cmd_mem() before
      scsiback_release_cmd() is called. Fix this use-after-free by
      inlining struct scsiback_tmr into struct vscsibk_pend.
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: David Disseldorp <ddiss@suse.de>
      Cc: xen-devel@lists.xenproject.org
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      26e21de7
    • Nicholas Bellinger's avatar
      iscsi-target: Add login_keys_workaround attribute for non RFC initiators · 430c78c7
      Nicholas Bellinger authored
      commit 138d351e upstream.
      
      This patch re-introduces part of a long standing login workaround that
      was recently dropped by:
      
        commit 1c99de98
        Author: Nicholas Bellinger <nab@linux-iscsi.org>
        Date:   Sun Apr 2 13:36:44 2017 -0700
      
            iscsi-target: Drop work-around for legacy GlobalSAN initiator
      
      Namely, the workaround for FirstBurstLength ended up being required by
      Mellanox Flexboot PXE boot ROMs as reported by Robert.
      
      So this patch re-adds the work-around for FirstBurstLength within
      iscsi_check_proposer_for_optional_reply(), and makes the key optional
      to respond when the initiator does not propose, nor respond to it.
      
      Also as requested by Arun, this patch introduces a new TPG attribute
      named 'login_keys_workaround' that controls the use of both the
      FirstBurstLength workaround, as well as the two other existing
      workarounds for gPXE iSCSI boot client.
      
      By default, the workaround is enabled with login_keys_workaround=1,
      since Mellanox FlexBoot requires it, and Arun has verified the Qlogic
      MSFT initiator already proposes FirstBurstLength, so it's uneffected
      by this re-adding this part of the original work-around.
      Reported-by: default avatarRobert LeBlanc <robert@leblancnet.us>
      Cc: Robert LeBlanc <robert@leblancnet.us>
      Reviewed-by: default avatarArun Easi <arun.easi@cavium.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      430c78c7
    • Bart Van Assche's avatar
      scsi: Avoid that scsi_exit_rq() triggers a use-after-free · bbfbcfa3
      Bart Van Assche authored
      commit 8e688254 upstream.
      
      Dereferencing shost from scsi_exit_rq() is not safe because the SCSI
      host may already have been freed when scsi_exit_rq() is called.
      Increasing the shost reference count in scsi_init_rq() and dropping that
      reference in scsi_exit_rq() is nontrivial since scsi_host_dev_release()
      may sleep and since scsi_exit_rq() may be called from interrupt
      context. Since scsi_exit_rq() only needs a single bit from shost, copy
      that bit into struct scsi_cmnd.
      Reported-by: default avatarScott Bauer <scott.bauer@intel.com>
      Fixes: e9c787e6 ("scsi: allocate scsi_cmnd structures as part of struct request")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Scott Bauer <scott.bauer@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbfbcfa3
    • Ewan D. Milne's avatar
      scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state · d5ec2793
      Ewan D. Milne authored
      commit f9279c96 upstream.
      
      The addition of the STARGET_REMOVE state had the side effect of
      introducing a race condition that can cause a crash.
      
      scsi_target_reap_ref_release() checks the starget->state to
      see if it still in STARGET_CREATED, and if so, skips calling
      transport_remove_device() and device_del(), because the starget->state
      is only set to STARGET_RUNNING after scsi_target_add() has called
      device_add() and transport_add_device().
      
      However, if an rport loss occurs while a target is being scanned,
      it can happen that scsi_remove_target() will be called while the
      starget is still in the STARGET_CREATED state.  In this case, the
      starget->state will be set to STARGET_REMOVE, and as a result,
      scsi_target_reap_ref_release() will take the wrong path.  The end
      result is a panic:
      
      [ 1255.356653] Oops: 0000 [#1] SMP
      [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i
      [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G        W       4.11.0+ #8
      [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013
      [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc]
      [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000
      [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0
      [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246
      [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000
      [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025
      [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659
      [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000
      [ 1255.474931] FS:  0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000
      [ 1255.483959] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0
      [ 1255.498332] Call Trace:
      [ 1255.501058]  kernfs_find_and_get_ns+0x31/0x60
      [ 1255.505916]  sysfs_unmerge_group+0x1d/0x60
      [ 1255.510498]  dpm_sysfs_remove+0x22/0x60
      [ 1255.514783]  device_del+0xf4/0x2e0
      [ 1255.518577]  ? device_remove_file+0x19/0x20
      [ 1255.523241]  attribute_container_class_device_del+0x1a/0x20
      [ 1255.529457]  transport_remove_classdev+0x4e/0x60
      [ 1255.534607]  ? transport_add_class_device+0x40/0x40
      [ 1255.540046]  attribute_container_device_trigger+0xb0/0xc0
      [ 1255.546069]  transport_remove_device+0x15/0x20
      [ 1255.551025]  scsi_target_reap_ref_release+0x25/0x40
      [ 1255.556467]  scsi_target_reap+0x2e/0x40
      [ 1255.560744]  __scsi_scan_target+0xaa/0x5b0
      [ 1255.565312]  scsi_scan_target+0xec/0x100
      [ 1255.569689]  fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc]
      [ 1255.576099]  process_one_work+0x14b/0x390
      [ 1255.580569]  worker_thread+0x4b/0x390
      [ 1255.584651]  kthread+0x109/0x140
      [ 1255.588251]  ? rescuer_thread+0x330/0x330
      [ 1255.592730]  ? kthread_park+0x60/0x60
      [ 1255.596815]  ret_from_fork+0x29/0x40
      [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
      [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0
      [ 1255.628479] CR2: 0000000000000068
      [ 1255.632756] ---[ end trace 34a69ba0477d036f ]---
      
      Fix this by adding another scsi_target state STARGET_CREATED_REMOVE
      to distinguish this case.
      
      Fixes: f05795d3 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state")
      Reported-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarLaurence Oberman <loberman@redhat.com>
      Tested-by: default avatarLaurence Oberman <loberman@redhat.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5ec2793
    • Quinn Tran's avatar
      scsi: qla2xxx: Allow ABTS, PURX, RIDA on ATIOQ for ISP83XX/27XX · 1dfabed3
      Quinn Tran authored
      commit 3c4810ff upstream.
      
      Driver added mechanism to move ABTS/PUREX/RIDA mailbox to
      ATIO queue as part of commit id 41dc529a
      ("qla2xxx: Improve RSCN handling in driver").
      
      This patch adds a check to only allow ABTS/PURX/RIDA
      to be moved to ATIO Queue for ISP83XX and ISP27XX.
      Signed-off-by: default avatarQuinn Tran <quinn.tran@cavium.com>
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Reviewed-by: default avatarBart Van Assche <Bart.VanAssche@sandisk.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1dfabed3
    • Paolo Bonzini's avatar
      scsi: virtio_scsi: let host do exception handling · 2ced4c4c
      Paolo Bonzini authored
      commit e72c9a2a upstream.
      
      virtio_scsi tries to do exception handling after the default 30 seconds
      timeout expires.  However, it's better to let the host control the
      timeout, otherwise with a heavy I/O load it is likely that an abort will
      also timeout.  This leads to fatal errors like filesystems going
      offline.
      
      Disable the 'sd' timeout and allow the host to do exception handling,
      following the precedent of the storvsc driver.
      
      Hannes has a proposal to introduce timeouts in virtio, but this provides
      an immediate solution for stable kernels too.
      
      [mkp: fixed typo]
      Reported-by: default avatarDouglas Miller <dougmill@linux.vnet.ibm.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: linux-scsi@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ced4c4c
    • Maurizio Lombardi's avatar
      scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails. · ad8e43cf
      Maurizio Lombardi authored
      commit 62e62ffd upstream.
      
      The enclosure_add_device() function should fail if it can't create the
      relevant sysfs links.
      Signed-off-by: default avatarMaurizio Lombardi <mlombard@redhat.com>
      Tested-by: default avatarDouglas Miller <dougmill@linux.vnet.ibm.com>
      Acked-by: default avatarJames Bottomley <jejb@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad8e43cf
    • Krzysztof Kozlowski's avatar
      PM / Domains: Fix unsafe iteration over modified list of domains · adc01577
      Krzysztof Kozlowski authored
      commit a7e2d1bc upstream.
      
      of_genpd_remove_last() iterates over list of domains and removes
      matching element thus it has to use safe version of list iteration.
      
      Fixes: 17926551 (PM / Domains: Add support for removing nested PM domains by provider)
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      adc01577
    • Krzysztof Kozlowski's avatar
      PM / Domains: Fix unsafe iteration over modified list of domain providers · 01ee0ee7
      Krzysztof Kozlowski authored
      commit b556b15d upstream.
      
      of_genpd_del_provider() iterates over list of domain provides and
      removes matching element thus it has to use safe version of list
      iteration.
      
      Fixes: aa42240a (PM / Domains: Add generic OF-based PM domain look-up)
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01ee0ee7
    • Krzysztof Kozlowski's avatar
      PM / Domains: Fix unsafe iteration over modified list of device links · 61461312
      Krzysztof Kozlowski authored
      commit c6e83cac upstream.
      
      pm_genpd_remove_subdomain() iterates over domain's master_links list and
      removes matching element thus it has to use safe version of list
      iteration.
      
      Fixes: f721889f ("PM / Domains: Support for generic I/O PM domains (v8)")
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61461312
    • Peter Rosin's avatar
      ASoC: atmel: tse850: fix off-by-one in the "ANA" enumeration count · 5a18ee01
      Peter Rosin authored
      commit a00cebf5 upstream.
      
      At some point I added the "Low" entry at the beginning of the array
      without bumping the enumeration count from 9 to 10. Fix this. While at
      it, fix the anti-pattern for the other enumeration (used by MUX{1,2}).
      
      Fixes: aa431124 ("ASoC: atmel: tse850: add ASoC driver for the Axentia TSE-850")
      Signed-off-by: default avatarPeter Rosin <peda@axentia.se>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a18ee01
    • Satish Babu Patakokila's avatar
      ASoC: compress: Derive substream from stream based on direction · 8024384e
      Satish Babu Patakokila authored
      commit 01b8cedf upstream.
      
      Currently compress driver hardcodes direction as playback to get
      substream from the stream. This results in getting the incorrect
      substream for compressed capture usecase.
      To fix this, remove the hardcoding and derive substream based on
      the stream direction.
      Signed-off-by: default avatarSatish Babu Patakokila <sbpata@codeaurora.org>
      Signed-off-by: default avatarBanajit Goswami <bgoswami@codeaurora.org>
      Acked-By: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8024384e
    • Shawn Guo's avatar
      ASoC: zx-i2s: flip I2S master/slave mode · 1f3fb267
      Shawn Guo authored
      commit a205c159 upstream.
      
      The SND_SOC_DAIFMT_MASTER bits are defined to specify the master/slave
      mode for Codec, not I2S.  So the I2S master/slave mode should be flipped
      according to SND_SOC_DAIFMT_MASTER bits.
      Signed-off-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f3fb267
    • Cyrille Pitchen's avatar
      spi: atmel: fix corrupted data issue on SAM9 family SoCs · 2e74a4f5
      Cyrille Pitchen authored
      commit 7094576c upstream.
      
      This patch disables the use of the DMA for data transfer and forces the
      use of PIO transfers instead as a quick fixup to solve the cache aliasing
      issue on ARM9 based cores, which embeds a VIVT data cache.
      
      Indeed in the case of VIVT data caches, it is not safe to call dma_map_*()
      functions to map buffers for DMA transfers when those buffers have been
      allocated by vmalloc() or from any DMA-unsafe area.
      
      Further patches may propose a better solution based on the use of a bounce
      buffer at the SPI sub-system level but such solution needs more time to be
      discussed. Then the use of DMA transfers could be enabled again to improve
      the performances but before that, this patch already solves the issue.
      Signed-off-by: default avatarCyrille Pitchen <cyrille.pitchen@microchip.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e74a4f5
    • Matwey V Kornilov's avatar
      igb: Explicitly select page 0 at initialization · 9317c1d0
      Matwey V Kornilov authored
      commit 440aeca4 upstream.
      
      The functions igb_read_phy_reg_gs40g/igb_write_phy_reg_gs40g (which were
      removed in 2a3cdead) explicitly selected the required page at every phy_reg
      access. Currently, igb_get_phy_id_82575 relays on the fact that page 0 is
      already selected. The assumption is not fulfilled for my Lex 3I380CW
      motherboard with integrated dual i211 based gigabit ethernet. This leads to igb
      initialization failure and network interfaces are not working:
      
          igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
          igb: Copyright (c) 2007-2014 Intel Corporation.
          igb: probe of 0000:01:00.0 failed with error -2
          igb: probe of 0000:02:00.0 failed with error -2
      
      In order to fix it, we explicitly select page 0 before first access to phy
      registers.
      
      See also: https://bugzilla.suse.com/show_bug.cgi?id=1009911
      See also: http://www.lex.com.tw/products/pdf/3I380A&3I380CW.pdf
      
      Fixes: 2a3cdead ("igb: Remove GS40G specific defines/functions")
      Signed-off-by: default avatarMatwey V Kornilov <matwey@sai.msu.ru>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9317c1d0
    • Filipe Manana's avatar
      Btrfs: incremental send, fix invalid memory access · 114571a8
      Filipe Manana authored
      commit 24e52b11 upstream.
      
      When doing an incremental send, while processing an extent that changed
      between the parent and send snapshots and that extent was an inline extent
      in the parent snapshot, it's possible to access a memory region beyond
      the end of leaf if the inline extent is very small and it is the first
      item in a leaf.
      
      An example scenario is described below.
      
      The send snapshot has the following leaf:
      
       leaf 33865728 items 33 free space 773 generation 46 owner 5
       fs uuid ab7090d8-dafd-4fb9-9246-723b6d2e2fb7
       chunk uuid 2d16478c-c704-4ab9-b574-68bff2281b1f
              (...)
              item 14 key (335 EXTENT_DATA 0) itemoff 3052 itemsize 53
                      generation 36 type 1 (regular)
                      extent data disk byte 12791808 nr 4096
                      extent data offset 0 nr 4096 ram 4096
                      extent compression 0 (none)
              item 15 key (335 EXTENT_DATA 8192) itemoff 2999 itemsize 53
                      generation 36 type 1 (regular)
                      extent data disk byte 138170368 nr 225280
                      extent data offset 0 nr 225280 ram 225280
                      extent compression 0 (none)
              (...)
      
      And the parent snapshot has the following leaf:
      
       leaf 31272960 items 17 free space 17 generation 31 owner 5
       fs uuid ab7090d8-dafd-4fb9-9246-723b6d2e2fb7
       chunk uuid 2d16478c-c704-4ab9-b574-68bff2281b1f
              item 0 key (335 EXTENT_DATA 0) itemoff 3951 itemsize 44
                      generation 31 type 0 (inline)
                      inline extent data size 23 ram_bytes 613 compression 1 (zlib)
              (...)
      
      When computing the send stream, it is detected that the extent of inode
      335, at file offset 0, and at fs/btrfs/send.c:is_extent_unchanged() we
      grab the leaf from the parent snapshot and access the inline extent item.
      However, before jumping to the 'out' label, we access the 'offset' and
      'disk_bytenr' fields of the extent item, which should not be done for
      inline extents since the inlined data starts at the offset of the
      'disk_bytenr' field and can be very small. For example accessing the
      'offset' field of the file extent item results in the following trace:
      
      [  599.705368] general protection fault: 0000 [#1] PREEMPT SMP
      [  599.706296] Modules linked in: btrfs psmouse i2c_piix4 ppdev acpi_cpufreq serio_raw parport_pc i2c_core evdev tpm_tis tpm_tis_core sg pcspkr parport tpm button su$
      [  599.709340] CPU: 7 PID: 5283 Comm: btrfs Not tainted 4.10.0-rc8-btrfs-next-46+ #1
      [  599.709340] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [  599.709340] task: ffff88023eedd040 task.stack: ffffc90006658000
      [  599.709340] RIP: 0010:read_extent_buffer+0xdb/0xf4 [btrfs]
      [  599.709340] RSP: 0018:ffffc9000665ba00 EFLAGS: 00010286
      [  599.709340] RAX: db73880000000000 RBX: 0000000000000000 RCX: 0000000000000001
      [  599.709340] RDX: ffffc9000665ba60 RSI: db73880000000000 RDI: ffffc9000665ba5f
      [  599.709340] RBP: ffffc9000665ba30 R08: 0000000000000001 R09: ffff88020dc5e098
      [  599.709340] R10: 0000000000001000 R11: 0000160000000000 R12: 6db6db6db6db6db7
      [  599.709340] R13: ffff880000000000 R14: 0000000000000000 R15: ffff88020dc5e088
      [  599.709340] FS:  00007f519555a8c0(0000) GS:ffff88023f3c0000(0000) knlGS:0000000000000000
      [  599.709340] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  599.709340] CR2: 00007f1411afd000 CR3: 0000000235f8e000 CR4: 00000000000006e0
      [  599.709340] Call Trace:
      [  599.709340]  btrfs_get_token_64+0x93/0xce [btrfs]
      [  599.709340]  ? printk+0x48/0x50
      [  599.709340]  btrfs_get_64+0xb/0xd [btrfs]
      [  599.709340]  process_extent+0x3a1/0x1106 [btrfs]
      [  599.709340]  ? btree_read_extent_buffer_pages+0x5/0xef [btrfs]
      [  599.709340]  changed_cb+0xb03/0xb3d [btrfs]
      [  599.709340]  ? btrfs_get_token_32+0x7a/0xcc [btrfs]
      [  599.709340]  btrfs_compare_trees+0x432/0x53d [btrfs]
      [  599.709340]  ? process_extent+0x1106/0x1106 [btrfs]
      [  599.709340]  btrfs_ioctl_send+0x960/0xe26 [btrfs]
      [  599.709340]  btrfs_ioctl+0x181b/0x1fed [btrfs]
      [  599.709340]  ? trace_hardirqs_on_caller+0x150/0x1ac
      [  599.709340]  vfs_ioctl+0x21/0x38
      [  599.709340]  ? vfs_ioctl+0x21/0x38
      [  599.709340]  do_vfs_ioctl+0x611/0x645
      [  599.709340]  ? rcu_read_unlock+0x5b/0x5d
      [  599.709340]  ? __fget+0x6d/0x79
      [  599.709340]  SyS_ioctl+0x57/0x7b
      [  599.709340]  entry_SYSCALL_64_fastpath+0x18/0xad
      [  599.709340] RIP: 0033:0x7f51945eec47
      [  599.709340] RSP: 002b:00007ffc21c13e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  599.709340] RAX: ffffffffffffffda RBX: ffffffff81096459 RCX: 00007f51945eec47
      [  599.709340] RDX: 00007ffc21c13f20 RSI: 0000000040489426 RDI: 0000000000000004
      [  599.709340] RBP: ffffc9000665bf98 R08: 00007f519450d700 R09: 00007f519450d700
      [  599.709340] R10: 00007f519450d9d0 R11: 0000000000000202 R12: 0000000000000046
      [  599.709340] R13: ffffc9000665bf78 R14: 0000000000000000 R15: 00007f5195574040
      [  599.709340]  ? trace_hardirqs_off_caller+0x43/0xb1
      [  599.709340] Code: 29 f0 49 39 d8 4c 0f 47 c3 49 03 81 58 01 00 00 44 89 c1 4c 01 c2 4c 29 c3 48 c1 f8 03 49 0f af c4 48 c1 e0 0c 4c 01 e8 48 01 c6 <f3> a4 31 f6 4$
      [  599.709340] RIP: read_extent_buffer+0xdb/0xf4 [btrfs] RSP: ffffc9000665ba00
      [  599.762057] ---[ end trace fe00d7af61b9f49e ]---
      
      This is because the 'offset' field starts at an offset of 37 bytes
      (offsetof(struct btrfs_file_extent_item, offset)), has a length of 8
      bytes and therefore attemping to read it causes a 1 byte access beyond
      the end of the leaf, as the first item's content in a leaf is located
      at the tail of the leaf, the item size is 44 bytes and the offset of
      that field plus its length (37 + 8 = 45) goes beyond the item's size
      by 1 byte.
      
      So fix this by accessing the 'offset' and 'disk_bytenr' fields after
      jumping to the 'out' label if we are processing an inline extent. We
      move the reading operation of the 'disk_bytenr' field too because we
      have the same problem as for the 'offset' field explained above when
      the inline data is less then 8 bytes. The access to the 'generation'
      field is also moved but just for the sake of grouping access to all
      the fields.
      
      Fixes: e1cbfd7b ("Btrfs: send, fix file hole not being preserved due to inline extent")
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      114571a8
    • Jan Kara's avatar
      btrfs: Don't clear SGID when inheriting ACLs · 88b4b815
      Jan Kara authored
      commit b7f8a09f upstream.
      
      When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
      set, DIR1 is expected to have SGID bit set (and owning group equal to
      the owning group of 'DIR0'). However when 'DIR0' also has some default
      ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
      'DIR1' to get cleared if user is not member of the owning group.
      
      Fix the problem by moving posix_acl_update_mode() out of
      __btrfs_set_acl() into btrfs_set_acl(). That way the function will not be
      called when inheriting ACLs which is what we want as it prevents SGID
      bit clearing and the mode has been properly set by posix_acl_create()
      anyway.
      
      Fixes: 07393101
      CC: linux-btrfs@vger.kernel.org
      CC: David Sterba <dsterba@suse.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      88b4b815
    • Filipe Manana's avatar
      Btrfs: fix invalid extent maps due to hole punching · dad8a6e1
      Filipe Manana authored
      commit 609805d8 upstream.
      
      While punching a hole in a range that is not aligned with the sector size
      (currently the same as the page size) we can end up leaving an extent map
      in memory with a length that is smaller then the sector size or with a
      start offset that is not aligned to the sector size. Both cases are not
      expected and can lead to problems. This issue is easily detected
      after the patch from commit a7e3b975 ("Btrfs: fix reported number of
      inode blocks"), introduced in kernel 4.12-rc1, in a scenario like the
      following for example:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
        $ xfs_io -c "pwrite -S 0xaa -b 100K 0 100K" /mnt/foo
        $ xfs_io -c "fpunch 60K 90K" /mnt/foo
        $ xfs_io -c "pwrite -S 0xbb -b 100K 50K 100K" /mnt/foo
        $ xfs_io -c "pwrite -S 0xcc -b 50K 100K 50K" /mnt/foo
        $ umount /mnt
      
      After the unmount operation we can see several warnings emmitted due to
      underflows related to space reservation counters:
      
      [ 2837.443299] ------------[ cut here ]------------
      [ 2837.447395] WARNING: CPU: 8 PID: 2474 at fs/btrfs/inode.c:9444 btrfs_destroy_inode+0xe8/0x27e [btrfs]
      [ 2837.452108] Modules linked in: dm_flakey dm_mod ppdev parport_pc psmouse parport sg pcspkr acpi_cpufreq tpm_tis tpm_tis_core i2c_piix4 i2c_core evdev tpm button se
      rio_raw sunrpc loop autofs4 ext4 crc16 jbd2 mbcache btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_gene
      ric raid1 raid0 multipath linear md_mod sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy
      [ 2837.458389] CPU: 8 PID: 2474 Comm: umount Tainted: G        W       4.10.0-rc8-btrfs-next-43+ #1
      [ 2837.459754] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [ 2837.462379] Call Trace:
      [ 2837.462379]  dump_stack+0x68/0x92
      [ 2837.462379]  __warn+0xc2/0xdd
      [ 2837.462379]  warn_slowpath_null+0x1d/0x1f
      [ 2837.462379]  btrfs_destroy_inode+0xe8/0x27e [btrfs]
      [ 2837.462379]  destroy_inode+0x3d/0x55
      [ 2837.462379]  evict+0x177/0x17e
      [ 2837.462379]  dispose_list+0x50/0x71
      [ 2837.462379]  evict_inodes+0x132/0x141
      [ 2837.462379]  generic_shutdown_super+0x3f/0xeb
      [ 2837.462379]  kill_anon_super+0x12/0x1c
      [ 2837.462379]  btrfs_kill_super+0x16/0x21 [btrfs]
      [ 2837.462379]  deactivate_locked_super+0x30/0x68
      [ 2837.462379]  deactivate_super+0x36/0x39
      [ 2837.462379]  cleanup_mnt+0x58/0x76
      [ 2837.462379]  __cleanup_mnt+0x12/0x14
      [ 2837.462379]  task_work_run+0x77/0x9b
      [ 2837.462379]  prepare_exit_to_usermode+0x9d/0xc5
      [ 2837.462379]  syscall_return_slowpath+0x196/0x1b9
      [ 2837.462379]  entry_SYSCALL_64_fastpath+0xab/0xad
      [ 2837.462379] RIP: 0033:0x7f3ef3e6b9a7
      [ 2837.462379] RSP: 002b:00007ffdd0d8de58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [ 2837.462379] RAX: 0000000000000000 RBX: 0000556f76a39060 RCX: 00007f3ef3e6b9a7
      [ 2837.462379] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556f76a3f910
      [ 2837.462379] RBP: 0000556f76a3f910 R08: 0000556f76a3e670 R09: 0000000000000015
      [ 2837.462379] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f3ef436ce64
      [ 2837.462379] R13: 0000000000000000 R14: 0000556f76a39240 R15: 00007ffdd0d8e0e0
      [ 2837.519355] ---[ end trace e79345fe24b30b8d ]---
      [ 2837.596256] ------------[ cut here ]------------
      [ 2837.597625] WARNING: CPU: 8 PID: 2474 at fs/btrfs/extent-tree.c:5699 btrfs_free_block_groups+0x246/0x3eb [btrfs]
      [ 2837.603547] Modules linked in: dm_flakey dm_mod ppdev parport_pc psmouse parport sg pcspkr acpi_cpufreq tpm_tis tpm_tis_core i2c_piix4 i2c_core evdev tpm button serio_raw sunrpc loop autofs4 ext4 crc16 jbd2 mbcache btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy
      [ 2837.659372] CPU: 8 PID: 2474 Comm: umount Tainted: G        W       4.10.0-rc8-btrfs-next-43+ #1
      [ 2837.663359] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [ 2837.663359] Call Trace:
      [ 2837.663359]  dump_stack+0x68/0x92
      [ 2837.663359]  __warn+0xc2/0xdd
      [ 2837.663359]  warn_slowpath_null+0x1d/0x1f
      [ 2837.663359]  btrfs_free_block_groups+0x246/0x3eb [btrfs]
      [ 2837.663359]  close_ctree+0x1dd/0x2e1 [btrfs]
      [ 2837.663359]  ? evict_inodes+0x132/0x141
      [ 2837.663359]  btrfs_put_super+0x15/0x17 [btrfs]
      [ 2837.663359]  generic_shutdown_super+0x6a/0xeb
      [ 2837.663359]  kill_anon_super+0x12/0x1c
      [ 2837.663359]  btrfs_kill_super+0x16/0x21 [btrfs]
      [ 2837.663359]  deactivate_locked_super+0x30/0x68
      [ 2837.663359]  deactivate_super+0x36/0x39
      [ 2837.663359]  cleanup_mnt+0x58/0x76
      [ 2837.663359]  __cleanup_mnt+0x12/0x14
      [ 2837.663359]  task_work_run+0x77/0x9b
      [ 2837.663359]  prepare_exit_to_usermode+0x9d/0xc5
      [ 2837.663359]  syscall_return_slowpath+0x196/0x1b9
      [ 2837.663359]  entry_SYSCALL_64_fastpath+0xab/0xad
      [ 2837.663359] RIP: 0033:0x7f3ef3e6b9a7
      [ 2837.663359] RSP: 002b:00007ffdd0d8de58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [ 2837.663359] RAX: 0000000000000000 RBX: 0000556f76a39060 RCX: 00007f3ef3e6b9a7
      [ 2837.663359] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556f76a3f910
      [ 2837.663359] RBP: 0000556f76a3f910 R08: 0000556f76a3e670 R09: 0000000000000015
      [ 2837.663359] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f3ef436ce64
      [ 2837.663359] R13: 0000000000000000 R14: 0000556f76a39240 R15: 00007ffdd0d8e0e0
      [ 2837.739445] ---[ end trace e79345fe24b30b8e ]---
      [ 2837.745595] ------------[ cut here ]------------
      [ 2837.746412] WARNING: CPU: 8 PID: 2474 at fs/btrfs/extent-tree.c:5700 btrfs_free_block_groups+0x261/0x3eb [btrfs]
      [ 2837.747955] Modules linked in: dm_flakey dm_mod ppdev parport_pc psmouse parport sg pcspkr acpi_cpufreq tpm_tis tpm_tis_core i2c_piix4 i2c_core evdev tpm button serio_raw sunrpc loop autofs4 ext4 crc16 jbd2 mbcache btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy
      [ 2837.755395] CPU: 8 PID: 2474 Comm: umount Tainted: G        W       4.10.0-rc8-btrfs-next-43+ #1
      [ 2837.756769] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [ 2837.758526] Call Trace:
      [ 2837.758925]  dump_stack+0x68/0x92
      [ 2837.759383]  __warn+0xc2/0xdd
      [ 2837.759383]  warn_slowpath_null+0x1d/0x1f
      [ 2837.759383]  btrfs_free_block_groups+0x261/0x3eb [btrfs]
      [ 2837.759383]  close_ctree+0x1dd/0x2e1 [btrfs]
      [ 2837.759383]  ? evict_inodes+0x132/0x141
      [ 2837.759383]  btrfs_put_super+0x15/0x17 [btrfs]
      [ 2837.759383]  generic_shutdown_super+0x6a/0xeb
      [ 2837.759383]  kill_anon_super+0x12/0x1c
      [ 2837.759383]  btrfs_kill_super+0x16/0x21 [btrfs]
      [ 2837.759383]  deactivate_locked_super+0x30/0x68
      [ 2837.759383]  deactivate_super+0x36/0x39
      [ 2837.759383]  cleanup_mnt+0x58/0x76
      [ 2837.759383]  __cleanup_mnt+0x12/0x14
      [ 2837.759383]  task_work_run+0x77/0x9b
      [ 2837.759383]  prepare_exit_to_usermode+0x9d/0xc5
      [ 2837.759383]  syscall_return_slowpath+0x196/0x1b9
      [ 2837.759383]  entry_SYSCALL_64_fastpath+0xab/0xad
      [ 2837.759383] RIP: 0033:0x7f3ef3e6b9a7
      [ 2837.759383] RSP: 002b:00007ffdd0d8de58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [ 2837.759383] RAX: 0000000000000000 RBX: 0000556f76a39060 RCX: 00007f3ef3e6b9a7
      [ 2837.759383] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556f76a3f910
      [ 2837.759383] RBP: 0000556f76a3f910 R08: 0000556f76a3e670 R09: 0000000000000015
      [ 2837.759383] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f3ef436ce64
      [ 2837.759383] R13: 0000000000000000 R14: 0000556f76a39240 R15: 00007ffdd0d8e0e0
      [ 2837.777063] ---[ end trace e79345fe24b30b8f ]---
      [ 2837.778235] ------------[ cut here ]------------
      [ 2837.778856] WARNING: CPU: 8 PID: 2474 at fs/btrfs/extent-tree.c:9825 btrfs_free_block_groups+0x348/0x3eb [btrfs]
      [ 2837.791385] Modules linked in: dm_flakey dm_mod ppdev parport_pc psmouse parport sg pcspkr acpi_cpufreq tpm_tis tpm_tis_core i2c_piix4 i2c_core evdev tpm button serio_raw sunrpc loop autofs4 ext4 crc16 jbd2 mbcache btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy
      [ 2837.797711] CPU: 8 PID: 2474 Comm: umount Tainted: G        W       4.10.0-rc8-btrfs-next-43+ #1
      [ 2837.798594] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [ 2837.800118] Call Trace:
      [ 2837.800515]  dump_stack+0x68/0x92
      [ 2837.801015]  __warn+0xc2/0xdd
      [ 2837.801471]  warn_slowpath_null+0x1d/0x1f
      [ 2837.801698]  btrfs_free_block_groups+0x348/0x3eb [btrfs]
      [ 2837.801698]  close_ctree+0x1dd/0x2e1 [btrfs]
      [ 2837.801698]  ? evict_inodes+0x132/0x141
      [ 2837.801698]  btrfs_put_super+0x15/0x17 [btrfs]
      [ 2837.801698]  generic_shutdown_super+0x6a/0xeb
      [ 2837.801698]  kill_anon_super+0x12/0x1c
      [ 2837.801698]  btrfs_kill_super+0x16/0x21 [btrfs]
      [ 2837.801698]  deactivate_locked_super+0x30/0x68
      [ 2837.801698]  deactivate_super+0x36/0x39
      [ 2837.801698]  cleanup_mnt+0x58/0x76
      [ 2837.801698]  __cleanup_mnt+0x12/0x14
      [ 2837.801698]  task_work_run+0x77/0x9b
      [ 2837.801698]  prepare_exit_to_usermode+0x9d/0xc5
      [ 2837.801698]  syscall_return_slowpath+0x196/0x1b9
      [ 2837.801698]  entry_SYSCALL_64_fastpath+0xab/0xad
      [ 2837.801698] RIP: 0033:0x7f3ef3e6b9a7
      [ 2837.801698] RSP: 002b:00007ffdd0d8de58 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
      [ 2837.801698] RAX: 0000000000000000 RBX: 0000556f76a39060 RCX: 00007f3ef3e6b9a7
      [ 2837.801698] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556f76a3f910
      [ 2837.801698] RBP: 0000556f76a3f910 R08: 0000556f76a3e670 R09: 0000000000000015
      [ 2837.801698] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f3ef436ce64
      [ 2837.801698] R13: 0000000000000000 R14: 0000556f76a39240 R15: 00007ffdd0d8e0e0
      [ 2837.818441] ---[ end trace e79345fe24b30b90 ]---
      [ 2837.818991] BTRFS info (device sdc): space_info 1 has 7974912 free, is not full
      [ 2837.819830] BTRFS info (device sdc): space_info total=8388608, used=417792, pinned=0, reserved=0, may_use=18446744073709547520, readonly=0
      
      What happens in the above example is the following:
      
      1) When punching the hole, at btrfs_punch_hole(), the variable tail_len
         is set to 2048 (as tail_start is 148Kb + 1 and offset + len is 150Kb).
         This results in the creation of an extent map with a length of 2Kb
         starting at file offset 148Kb, through find_first_non_hole() ->
         btrfs_get_extent().
      
      2) The second write (first write after the hole punch operation), sets
         the range [50Kb, 152Kb[ to delalloc.
      
      3) The third write, at btrfs_find_new_delalloc_bytes(), sees the extent
         map covering the range [148Kb, 150Kb[ and ends up calling
         set_extent_bit() for the same range, which results in splitting an
         existing extent state record, covering the range [148Kb, 152Kb[ into
         two 2Kb extent state records, covering the ranges [148Kb, 150Kb[ and
         [150Kb, 152Kb[.
      
      4) Finally at lock_and_cleanup_extent_if_need(), immediately after calling
         btrfs_find_new_delalloc_bytes() we clear the delalloc bit from the
         range [100Kb, 152Kb[ which results in the btrfs_clear_bit_hook()
         callback being invoked against the two 2Kb extent state records that
         cover the ranges [148Kb, 150Kb[ and [150Kb, 152Kb[. When called against
         the first 2Kb extent state, it calls btrfs_delalloc_release_metadata()
         with a length argument of 2048 bytes. That function rounds up the length
         to a sector size aligned length, so it ends up considering a length of
         4096 bytes, and then calls calc_csum_metadata_size() which results in
         decrementing the inode's csum_bytes counter by 4096 bytes, so after
         it stays a value of 0 bytes. Then the same happens when
         btrfs_clear_bit_hook() is called against the second extent state that
         has a length of 2Kb, covering the range [150Kb, 152Kb[, the length is
         rounded up to 4096 and calc_csum_metadata_size() ends up being called
         to decrement 4096 bytes from the inode's csum_bytes counter, which
         at that time has a value of 0, leading to an underflow, which is
         exactly what triggers the first warning, at btrfs_destroy_inode().
         All the other warnings relate to several space accounting counters
         that underflow as well due to similar reasons.
      
      A similar case but where the hole punching operation creates an extent map
      with a start offset not aligned to the sector size is the following:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
        $ xfs_io -f -c "fpunch 695K 820K" $SCRATCH_MNT/bar
        $ xfs_io -c "pwrite -S 0xaa 1008K 307K" $SCRATCH_MNT/bar
        $ xfs_io -c "pwrite -S 0xbb -b 630K 1073K 630K" $SCRATCH_MNT/bar
        $ xfs_io -c "pwrite -S 0xcc -b 459K 1068K 459K" $SCRATCH_MNT/bar
        $ umount /mnt
      
      During the unmount operation we get similar traces for the same reasons as
      in the first example.
      
      So fix the hole punching operation to make sure it never creates extent
      maps with a length that is not aligned to the sector size nor with a start
      offset that is not aligned to the sector size, as this breaks all
      assumptions and it's a land mine.
      
      Fixes: d7781546 ("btrfs: Avoid trucating page or punching hole in a already existed hole.")
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dad8a6e1
    • Brian Norris's avatar
      mwifiex: fixup error cases in mwifiex_add_virtual_intf() · a88266ef
      Brian Norris authored
      commit 8535107a upstream.
      
      If we fail to add an interface in mwifiex_add_virtual_intf(), we might
      hit a BUG_ON() in the networking code, because we didn't tear things
      down properly. Among the problems:
      
       (a) when failing to allocate workqueues, we fail to unregister the
           netdev before calling free_netdev()
       (b) even if we do try to unregister the netdev, we're still holding the
           rtnl lock, so the device never properly unregistered; we'll be at
           state NETREG_UNREGISTERING, and then hit free_netdev()'s:
      	BUG_ON(dev->reg_state != NETREG_UNREGISTERED);
       (c) we're allocating some dependent resources (e.g., DFS workqueues)
           after we've registered the interface; this may or may not cause
           problems, but it's good practice to allocate these before registering
       (d) we're not even trying to unwind anything when mwifiex_send_cmd() or
           mwifiex_sta_init_cmd() fail
      
      To fix these issues, let's:
      
       * add a stacked set of error handling labels, to keep error handling
         consistent and properly ordered (resolving (a) and (d))
       * move the workqueue allocations before the registration (to resolve
         (c); also resolves (b) by avoiding error cases where we have to
         unregister)
      
      [Incidentally, it's pretty easy to interrupt the alloc_workqueue() in,
      e.g., the following:
      
        iw phy phy0 interface add mlan0 type station
      
      by sending it SIGTERM.]
      
      This bugfix covers commits like commit 7d652034 ("mwifiex: channel
      switch support for mwifiex"), but parts of this bug exist all the way
      back to the introduction of dynamic interface handling in commit
      93a1df48 ("mwifiex: add cfg80211 handlers add/del_virtual_intf").
      Signed-off-by: default avatarBrian Norris <briannorris@chromium.org>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a88266ef
    • Ankit Kumar's avatar
      pstore: Don't warn if data is uncompressed and type is not PSTORE_TYPE_DMESG · ad000510
      Ankit Kumar authored
      commit 4a16d1cb upstream.
      
      commit 9abdcccc ("pstore: Extract common arguments into structure")
      moved record decompression to function. decompress_record() gets
      called without checking type and compressed flag. Warning will be
      reported if data is uncompressed. Pstore type PSTORE_TYPE_PPC_OPAL,
      PSTORE_TYPE_PPC_COMMON doesn't contain compressed data and warning get
      printed part of dmesg.
      
      Partial dmesg log:
      [   35.848914] pstore: ignored compressed record type 6
      [   35.848927] pstore: ignored compressed record type 8
      
      Above warning should not get printed as it is known that data won't be
      compressed for above type and it is valid condition.
      
      This patch returns if data is not compressed and print warning only if
      data is compressed and type is not PSTORE_TYPE_DMESG.
      Reported-by: default avatarAnton Blanchard <anton@au1.ibm.com>
      Signed-off-by: default avatarAnkit Kumar <ankit@linux.vnet.ibm.com>
      Reviewed-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Fixes: 9abdcccc ("pstore: Extract common arguments into structure")
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad000510
    • Arnd Bergmann's avatar
      wlcore: fix 64K page support · 1566e059
      Arnd Bergmann authored
      commit 4a4274bf upstream.
      
      In the stable linux-3.16 branch, I ran into a warning in the
      wlcore driver:
      
      drivers/net/wireless/ti/wlcore/spi.c: In function 'wl12xx_spi_raw_write':
      drivers/net/wireless/ti/wlcore/spi.c:315:1: error: the frame size of 12848 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
      
      Newer kernels no longer show the warning, but the bug is still there,
      as the allocation is based on the CPU page size rather than the
      actual capabilities of the hardware.
      
      This replaces the PAGE_SIZE macro with the SZ_4K macro, i.e. 4096 bytes
      per buffer.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1566e059
    • Jason A. Donenfeld's avatar
      Bluetooth: use constant time memory comparison for secret values · b53b6793
      Jason A. Donenfeld authored
      commit 329d8230 upstream.
      
      This file is filled with complex cryptography. Thus, the comparisons of
      MACs and secret keys and curve points and so forth should not add timing
      attacks, which could either result in a direct forgery, or, given the
      complexity, some other type of attack.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b53b6793
    • Adrian Hunter's avatar
      perf intel-pt: Clear FUP flag on error · 3f30aa79
      Adrian Hunter authored
      commit 6a558f12 upstream.
      
      Sometimes a FUP packet is associated with a TSX transaction and a flag is
      set to indicate that. Ensure that flag is cleared on any error condition
      because at that point the decoder can no longer assume it is correct.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f30aa79
    • Adrian Hunter's avatar
      perf intel-pt: Use FUP always when scanning for an IP · ea6c2549
      Adrian Hunter authored
      commit 622b7a47 upstream.
      
      The decoder will try to use branch packets to find an IP to start decoding
      or to recover from errors. Currently the FUP packet is used only in the
      case of an overflow, however there is no reason for that to be a special
      case. So just use FUP always when scanning for an IP.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea6c2549
    • Adrian Hunter's avatar
      perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero · c68f8e30
      Adrian Hunter authored
      commit f952eace upstream.
      
      Intel PT uses IP compression based on the last IP. For decoding purposes,
      'last IP' is not updated when a branch target has been suppressed, which is
      indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
      ensure never to set 'last_ip' when packet 'count' is zero.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c68f8e30
    • Adrian Hunter's avatar
      perf intel-pt: Fix last_ip usage · d7cccb4b
      Adrian Hunter authored
      commit ee14ac0e upstream.
      
      Intel PT uses IP compression based on the last IP. For decoding
      purposes, 'last IP' is considered to be reset to zero whenever there is
      a synchronization packet (PSB). The decoder wasn't doing that, and was
      treating the zero value to mean that there was no last IP, whereas
      compression can be done against the zero value. Fix by setting last_ip
      to zero when a PSB is received and keep track of have_last_ip.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7cccb4b
    • Adrian Hunter's avatar
      perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP · c93c534c
      Adrian Hunter authored
      commit ad7167a8 upstream.
      
      A value of zero is used to indicate that there is no IP. Ensure the
      value is zero when the state is INTEL_PT_STATE_NO_IP.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c93c534c
    • Adrian Hunter's avatar
      perf intel-pt: Fix missing stack clear · 14f2cad1
      Adrian Hunter authored
      commit 12b70806 upstream.
      
      The return compression stack must be cleared whenever there is a PSB. Fix
      one case where that was not happening.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      14f2cad1
    • Adrian Hunter's avatar
      perf intel-pt: Improve sample timestamp · 309721dd
      Adrian Hunter authored
      commit 3f04d98e upstream.
      
      The decoder uses its current timestamp in samples. Usually that is a
      timestamp that has already passed, but in some cases it is a timestamp
      for a branch that the decoder is walking towards, and consequently
      hasn't reached. Improve that situation by using the pkt_state to
      determine when to use the current or previous timestamp.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      309721dd
    • Adrian Hunter's avatar
      perf intel-pt: Move decoder error setting into one condition · 84c366c6
      Adrian Hunter authored
      commit 22c06892 upstream.
      
      Move decoder error setting into one condition.
      
      Cc'ed to stable because later fixes depend on it.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      84c366c6
    • Mateusz Jurczyk's avatar
      NFC: Add sockaddr length checks before accessing sa_family in bind handlers · c9270e37
      Mateusz Jurczyk authored
      commit f6a5885f upstream.
      
      Verify that the caller-provided sockaddr structure is large enough to
      contain the sa_family field, before accessing it in bind() handlers of the
      AF_NFC socket. Since the syscall doesn't enforce a minimum size of the
      corresponding memory region, very short sockaddrs (zero or one byte long)
      result in operating on uninitialized memory while referencing .sa_family.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c9270e37
    • Mateusz Jurczyk's avatar
      nfc: Fix the sockaddr length sanitization in llcp_sock_connect · e66553a5
      Mateusz Jurczyk authored
      commit 608c4adf upstream.
      
      Fix the sockaddr length verification in the connect() handler of NFC/LLCP
      sockets, to compare against the size of the actual structure expected on
      input (sockaddr_nfc_llcp) instead of its shorter version (sockaddr_nfc).
      
      Both structures are defined in include/uapi/linux/nfc.h. The fields
      specific to the _llcp extended struct are as follows:
      
         276		__u8 dsap; /* Destination SAP, if known */
         277		__u8 ssap; /* Source SAP to be bound to */
         278		char service_name[NFC_LLCP_MAX_SERVICE_NAME]; /* Service name URI */;
         279		size_t service_name_len;
      
      If the caller doesn't provide a sufficiently long sockaddr buffer, these
      fields remain uninitialized (and they currently originate from the stack
      frame of the top-level sys_connect handler). They are then copied by
      llcp_sock_connect() into internal storage (nfc_llcp_sock structure), and
      could be subsequently read back through the user-mode getsockname()
      function (handled by llcp_sock_getname()). This would result in the
      disclosure of up to ~70 uninitialized bytes from the kernel stack to
      user-mode clients capable of creating AFC_NFC sockets.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e66553a5
    • Mateusz Jurczyk's avatar
      nfc: Ensure presence of required attributes in the activate_target handler · b6a39af4
      Mateusz Jurczyk authored
      commit a0323b97 upstream.
      
      Check that the NFC_ATTR_TARGET_INDEX and NFC_ATTR_PROTOCOLS attributes (in
      addition to NFC_ATTR_DEVICE_INDEX) are provided by the netlink client
      prior to accessing them. This prevents potential unhandled NULL pointer
      dereference exceptions which can be triggered by malicious user-mode
      programs, if they omit one or both of these attributes.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6a39af4
    • Johan Hovold's avatar
      NFC: nfcmrvl: fix firmware-management initialisation · 36a8a4e9
      Johan Hovold authored
      commit 45dd39b9 upstream.
      
      The nci-device was never deregistered in the event that
      fw-initialisation failed.
      
      Fix this by moving the firmware initialisation before device
      registration since the firmware work queue should be available before
      registering.
      
      Note that this depends on a recent fix that moved device-name
      initialisation back to to nci_allocate_device() as the
      firmware-workqueue name is now derived from the nfc-device name.
      
      Fixes: 3194c687 ("NFC: nfcmrvl: add firmware download support")
      Cc: Vincent Cuissard <cuissard@marvell.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36a8a4e9
    • Johan Hovold's avatar
      NFC: nfcmrvl: use nfc-device for firmware download · 7b69ecce
      Johan Hovold authored
      commit e5834ac2 upstream.
      
      Use the nfc- rather than phy-device in firmware-management code that
      needs a valid struct device.
      
      This specifically fixes a NULL-pointer dereference in
      nfcmrvl_fw_dnld_init() during registration when the underlying tty is
      one end of a Unix98 pty.
      
      Note that the driver still uses the phy device for any debugging, which
      is fine for now.
      
      Fixes: 3194c687 ("NFC: nfcmrvl: add firmware download support")
      Cc: Vincent Cuissard <cuissard@marvell.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b69ecce
    • Johan Hovold's avatar
      NFC: nfcmrvl: do not use device-managed resources · 6cb06562
      Johan Hovold authored
      commit 0cbe4011 upstream.
      
      This specifically fixes resource leaks in the registration error paths.
      
      Device-managed resources is a bad fit for this driver as devices can be
      registered from the n_nci line discipline. Firstly, a tty may not even
      have a corresponding device (should it be part of a Unix98 pty)
      something which would lead to a NULL-pointer dereference when
      registering resources.
      
      Secondly, if the tty has a class device, its lifetime exceeds that of
      the line discipline, which means that resources would leak every time
      the line discipline is closed (or if registration fails).
      
      Currently, the devres interface was only being used to request a reset
      gpio despite the fact that it was already explicitly freed in
      nfcmrvl_nci_unregister_dev() (along with the private data), something
      which also prevented the resource leak at close.
      
      Note that the driver treats gpio number 0 as invalid despite it being
      perfectly valid. This will be addressed in a follow-up patch.
      
      Fixes: b2fe288e ("NFC: nfcmrvl: free reset gpio")
      Fixes: 4a2b947f ("NFC: nfcmrvl: add chip reset management")
      Cc: Vincent Cuissard <cuissard@marvell.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6cb06562
    • Johan Hovold's avatar
      NFC: nfcmrvl_uart: add missing tty-device sanity check · 5d714615
      Johan Hovold authored
      commit 15e0c59f upstream.
      
      Make sure to check the tty-device pointer before trying to access the
      parent device to avoid dereferencing a NULL-pointer when the tty is one
      end of a Unix98 pty.
      
      Fixes: e097dc62 ("NFC: nfcmrvl: add UART driver")
      Cc: Vincent Cuissard <cuissard@marvell.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d714615