1. 10 Jul, 2015 40 commits
    • KVM: s390: clear floating interrupt bitmap and parameters · 534c9f98
      Jens Freimann authored
      commit f2ae45ed upstream.
      
      commit 6d3da241 ("KVM: s390: deliver floating interrupts in order
      of priority") introduced a regression for the reset handling.
      
      We don't clear the bitmap of pending floating interrupts
      and interrupt parameters. This could result in stale interrupts
      even after a reset. Let's fix this by clearing the pending bitmap
      and the parameters for service and machine check interrupts.

      Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: s390: fix external call injection without sigp interpretation · 14fe2f14
      David Hildenbrand authored
      commit b938eace upstream.
      
      Commit ea5f4969 ("KVM: s390: only one external call may be pending
      at a time") introduced a bug on machines that don't have the SIGP
      interpretation facility installed.
      The injection of an external call will now always fail with -EBUSY
      (if none is already pending).
      
      This leads to the following symptoms:
      - An external call will be injected but with the wrong "src cpu id",
        as this id will not be remembered.
      - The target vcpu will not be woken up, therefore the guest will hang if
        it cannot deal with unexpected failures of the SIGP EXTERNAL CALL
        instruction.
      - If an external call is already pending, -EBUSY will not be reported.

      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Fix KVM guest fixmap address · 4f3d3bc2
      James Hogan authored
      commit 8e748c8d upstream.
      
      KVM guest kernels for trap & emulate run in user mode, with a modified
      set of kernel memory segments. However the fixmap address is still in
      the normal KSeg3 region at 0xfffe0000 regardless, causing problems when
      cache alias handling makes use of them when handling copy on write.
      
      Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped
      region when CONFIG_KVM_GUEST is defined.
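
The effect of the two FIXADDR_TOP values can be sketched in plain C. This is a minimal illustration, not the kernel header: `fixaddr_top()` and `in_user_mapped_region()` are hypothetical helpers, and the boundary at 0x80000000 is assumed to be the usual MIPS32 split between the user-mapped segment and the kernel segments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Addresses taken from the commit message. */
#define FIXADDR_TOP_NATIVE    0xfffe0000u  /* normal KSeg3 location */
#define FIXADDR_TOP_KVM_GUEST 0x7ffe0000u  /* guest kernel mapped region */

/* Hypothetical helper: selects the value the way an
 * #ifdef CONFIG_KVM_GUEST would at build time. */
uint32_t fixaddr_top(bool kvm_guest)
{
    return kvm_guest ? FIXADDR_TOP_KVM_GUEST : FIXADDR_TOP_NATIVE;
}

/* Assumption: addresses below 0x80000000 are reachable from user mode,
 * which is where a trap & emulate guest kernel runs. */
bool in_user_mapped_region(uint32_t addr)
{
    return addr < 0x80000000u;
}
```

With these definitions, only the guest value of the fixmap top falls in the region a user-mode guest kernel can map.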

      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9887/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: mips: use id_to_memslot correctly · dff1316f
      Paolo Bonzini authored
      commit 69a12200 upstream.
      
      The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
      position in the memslots array, which is sorted by gfn.
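
The difference between a slot id and an array position can be sketched as follows. This is a simplified model rather than the KVM data structures: `struct memslot`, `struct memslots`, `lookup_by_id()` and `example_memslots()` are hypothetical names, and the sort direction is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 3

struct memslot {
    int id;             /* id as passed in by userspace */
    uint64_t base_gfn;  /* first guest frame number of the slot */
};

struct memslots {
    struct memslot slots[NSLOTS];  /* kept sorted by base_gfn, not by id */
    int id_to_index[NSLOTS];       /* maps a slot id to its array position */
};

/* Correct lookup: translate the id through the index table first. */
const struct memslot *lookup_by_id(const struct memslots *s, int id)
{
    if (id < 0 || id >= NSLOTS)
        return NULL;
    return &s->slots[s->id_to_index[id]];
}

/* Example layout in which array position and id disagree for every slot. */
struct memslots example_memslots(void)
{
    struct memslots s = {
        .slots = {
            { .id = 2, .base_gfn = 0x300 },
            { .id = 0, .base_gfn = 0x200 },
            { .id = 1, .base_gfn = 0x100 },
        },
        .id_to_index = { 1, 2, 0 },
    };
    return s;
}
```

Indexing `slots[id]` directly would return the slot with id 2 when asked for id 0 in this layout, which is the kind of mismatch the commit message describes.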

      Reviewed-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A · b6f2faff
      Bjorn Helgaas authored
      commit 1dace011 upstream.
      
      The Foxconn K8M890-8237A has two PCI host bridges, and we can't assign
      resources correctly without the information from _CRS that tells us which
      address ranges are claimed by which bridge.  In the bugs mentioned below,
      we incorrectly assign a sound card address (this example is from 1033299):
      
        bus: 00 index 2 [mem 0x80000000-0xfcffffffff]
        ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
        pci_root PNP0A08:00: host bridge window [mem 0x80000000-0xbfefffff] (ignored)
        pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff] (ignored)
        pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
        ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
        pci_root PNP0A08:01: host bridge window [mem 0xbff00000-0xbfffffff] (ignored)
        pci 0000:80:01.0: [1106:3288] type 0 class 0x000403
        pci 0000:80:01.0: reg 10: [mem 0xbfffc000-0xbfffffff 64bit]
        pci 0000:80:01.0: address space collision: [mem 0xbfffc000-0xbfffffff 64bit] conflicts with PCI Bus #00 [mem 0x80000000-0xfcffffffff]
        pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
        BUG: unable to handle kernel paging request at ffffc90000378000
        IP: [<ffffffffa0345f63>] azx_create+0x37c/0x822 [snd_hda_intel]
      
      We assigned 0xfd_0000_0000, but that is not in any of the host bridge
      windows, and the sound card doesn't work.
      
      Turn on pci=use_crs automatically for this system.
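
The failure in the log can be checked with a small sketch that tests an address range against the host bridge windows reported by _CRS. The window values come from the log above; `struct window` and `in_any_window()` are hypothetical helpers, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct window {
    uint64_t start;
    uint64_t end;  /* inclusive */
};

/* Host bridge windows reported by _CRS in the log (PCI0 and PCI1). */
static const struct window crs_windows[] = {
    { 0x80000000ULL, 0xbfefffffULL },
    { 0xc0000000ULL, 0xdfffffffULL },
    { 0xf0000000ULL, 0xfebfffffULL },
    { 0xbff00000ULL, 0xbfffffffULL },
};

/* Returns true if [start, end] lies entirely inside one window. */
bool in_any_window(uint64_t start, uint64_t end)
{
    size_t n = sizeof(crs_windows) / sizeof(crs_windows[0]);

    for (size_t i = 0; i < n; i++) {
        if (start >= crs_windows[i].start && end <= crs_windows[i].end)
            return true;
    }
    return false;
}
```

The BAR value left by firmware, 0xbfffc000-0xbfffffff, fits the PCI1 window, while the reassigned range starting at 0xfd00000000 fits none of them.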
      
      Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368
      Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing · 9dc6d435
      Bjorn Helgaas authored
      commit 3d9fecf6 upstream.
      
      We enable _CRS on all systems from 2008 and later.  On older systems, we
      ignore _CRS and assume the whole physical address space (excluding RAM and
      other devices) is available for PCI devices, but on systems that support
      physical address spaces larger than 4GB, it's doubtful that the area above
      4GB is really available for PCI.
      
      After d56dbf5b ("PCI: Allocate 64-bit BARs above 4G when possible"), we
      try to use that space above 4GB *first*, so we're more likely to put a
      device there.
      
      On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394,
      and card reader devices unassigned (but only after Windows had been
      booted).  Only the sound device had a 64-bit BAR, so it was the only device
      placed above 4GB, and hence the only device that didn't work.
      
      Keep _CRS enabled even on pre-2008 systems if they support physical address
      space larger than 4GB.
      
      Fixes: d56dbf5b ("PCI: Allocate 64-bit BARs above 4G when possible")
      Reported-and-tested-by: Juan Dayer <jdayer@outlook.com>
      Reported-and-tested-by: Alan Horsfield <alan@hazelgarth.co.uk>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221
      Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/perf: Fix book3s kernel to userspace backtraces · f6707abd
      Anton Blanchard authored
      commit 72e349f1 upstream.
      
      When we take a PMU exception or a software event we call
      perf_read_regs(). This overloads regs->result with a boolean that
      describes if we should use the sampled instruction address register
      (SIAR) or the regs.
      
      If the exception is in kernel, we start with the kernel regs and
      backtrace through the kernel stack. At this point we switch to the
      userspace regs and backtrace the user stack with perf_callchain_user().
      
      Unfortunately these regs have not got the perf_read_regs() treatment,
      so regs->result could be anything. If it is non zero,
      perf_instruction_pointer() decides to use the SIAR, and we get issues
      like this:
      
      0.11%  qemu-system-ppc  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
             |
             ---_raw_spin_lock_irqsave
                |
                |--52.35%-- 0
                |          |
                |          |--46.39%-- __hrtimer_start_range_ns
                |          |          kvmppc_run_core
                |          |          kvmppc_vcpu_run_hv
                |          |          kvmppc_vcpu_run
                |          |          kvm_arch_vcpu_ioctl_run
                |          |          kvm_vcpu_ioctl
                |          |          do_vfs_ioctl
                |          |          sys_ioctl
                |          |          system_call
                |          |          |
                |          |          |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
                |          |          |          |
                |          |          |           --100.00%-- 0x7e714
                |          |          |                     0x7e714
      
      Notice the bogus _raw_spin_lock_irqsave when we transition from kernel
      (system_call) to userspace (0x7e714). We inserted what was in the SIAR.
      
      Add a check in regs_use_siar() to ensure that the regs in question
      are from a PMU exception. With this fix the backtrace makes sense:
      
           0.47%  qemu-system-ppc  [kernel.vmlinux]         [k] _raw_spin_lock_irqsave
                  |
                  ---_raw_spin_lock_irqsave
                     |
                     |--53.83%-- 0
                     |          |
                     |          |--44.73%-- hrtimer_try_to_cancel
                     |          |          kvmppc_start_thread
                     |          |          kvmppc_run_core
                     |          |          kvmppc_vcpu_run_hv
                     |          |          kvmppc_vcpu_run
                     |          |          kvm_arch_vcpu_ioctl_run
                     |          |          kvm_vcpu_ioctl
                     |          |          do_vfs_ioctl
                     |          |          sys_ioctl
                     |          |          system_call
                     |          |          __ioctl
                     |          |          0x7e714
                     |          |          0x7e714
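
The idea behind the added check can be sketched in C. This is a simplified model, not the kernel code: `struct fake_regs` and both functions are hypothetical, and the trap value 0xf00 for the performance monitor exception is an assumption made for illustration.

```c
#include <stdbool.h>

/* Assumption: trap number identifying a PMU exception. */
#define PERF_INTERRUPT_TRAP 0xf00UL

/* Hypothetical stand-in for the register frame. */
struct fake_regs {
    unsigned long trap;    /* exception that produced this frame */
    unsigned long result;  /* overloaded as "use SIAR" only for PMU frames */
};

/* Before: any non-zero result selects the SIAR, even for frames
 * that never went through perf_read_regs(). */
bool regs_use_siar_unchecked(const struct fake_regs *regs)
{
    return regs->result != 0;
}

/* After: result is trusted only if the frame came from a PMU exception. */
bool regs_use_siar_checked(const struct fake_regs *regs)
{
    return regs->trap == PERF_INTERRUPT_TRAP && regs->result != 0;
}
```

A userspace frame saved at system call entry with a stale non-zero result selects the SIAR in the unchecked variant and is ignored by the checked one.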

      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode · b9d118e1
      preeti authored
      commit cc5a2f7b upstream.
      
      On some archs, the local clockevent device stops in deep cpuidle states.
      The broadcast framework is used to wake up cpus in these idle states:
      either an external clockevent device is used to send wakeup ipis,
      or the hrtimer broadcast framework kicks in when no such device is
      present. One cpu is nominated as the broadcast cpu and this cpu sends
      wakeup ipis to sleeping cpus at the appropriate time. This is the
      implementation in the oneshot mode of broadcast.

      In periodic mode of broadcast, however, the presence of such cpuidle
      states results in the cpuidle driver calling tick_broadcast_enable(),
      which shuts down the local clockevent devices of all the cpus and
      appoints the tick broadcast device as the clockevent device for each of
      them. This works on those archs where the tick broadcast device is a
      real clockevent device.  But on archs which depend on the hrtimer mode
      of broadcast, the tick broadcast device happens to be a pseudo device.
      The consequence is that the local clockevent devices of all cpus are
      shut down and the kernel hangs at boot time in periodic mode.
      
      Let us thus not register the cpuidle states which have
      CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
      mode of broadcast in periodic mode. This patch takes care of doing this
      on powerpc. The cpus would not have entered into such deep cpuidle
      states in periodic mode on powerpc anyway. So there is no loss here.

      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: mvebu: fix suspend to RAM on big-endian configurations · 301773b6
      Thomas Petazzoni authored
      commit 2f5bc307 upstream.
      
      The current Armada XP suspend to RAM implementation, as added in
      commit 27432825 ("ARM: mvebu: Armada XP GP specific
      suspend/resume code") does not handle big-endian configurations
      properly: the small bit of assembly code putting the DRAM in
      self-refresh and toggling the GPIOs to turn off power forgets to
      convert the values to little-endian.
      
      This commit fixes that by making sure the two values we will write to
      the DRAM controller register and GPIO register are already in
      little-endian before entering the critical assembly code.
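
The conversion step can be illustrated with a portable C sketch. `to_le32()` is a hypothetical stand-in for the kernel's cpu_to_le32(); it is not the code from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Return a 32-bit word whose in-memory byte order is little-endian,
 * regardless of the endianness of the CPU running this code. */
uint32_t to_le32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v & 0xff),          /* least significant byte first */
        (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff),
        (uint8_t)((v >> 24) & 0xff),
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}
```

On a little-endian CPU this returns the value unchanged; on a big-endian CPU it swaps the bytes, so a raw store from assembly code writes the layout a little-endian register expects.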

      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Fixes: 27432825 ("ARM: mvebu: Armada XP GP specific suspend/resume code")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: tegra20: Store CPU "resettable" status in IRAM · 3544f27e
      Dmitry Osipenko authored
      commit 4d48edb3 upstream.
      
      Commit 7232398a ("ARM: tegra: Convert PMC to a driver") moved the storing of
      the tegra_resume() location from late to early and, as a result, broke suspend
      on Tegra20. PMC scratch register 41 is used by the Tegra LP1 resume code to
      retrieve the stored physical memory address of the common resume function, and
      at the same time it is used by tegra20_cpu_shutdown() (shared by the Tegra20
      cpuidle driver and the platform SMP code), which stores the CPU1 "resettable"
      status in it. This implies a strict order of scratch register usage; otherwise
      the resume function address is lost on Tegra20 after disabling the non-boot
      CPUs on suspend. Fix it by storing the "resettable" status in IRAM instead of
      the PMC scratch register.

      Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
      Fixes: 7232398a (ARM: tegra: Convert PMC to a driver)
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: kvm: psci: fix handling of unimplemented functions · 3f3587c4
      Lorenzo Pieralisi authored
      commit e2d99736 upstream.
      
      According to the PSCI specification and the SMC/HVC calling
      convention, PSCI function_ids that are not implemented must
      return NOT_SUPPORTED as return value.
      
      The current KVM implementation treats an unhandled PSCI function_id
      as an error and injects an undefined instruction into the guest
      whenever the PSCI implementation is called with a function_id that
      the resident PSCI version does not handle (i.e. one that is not
      implemented). A guest does not expect this behaviour when calling
      a PSCI function_id that is not implemented.
      
      This patch fixes this issue by returning NOT_SUPPORTED whenever
      the kvm PSCI call is executed for a function_id that is not
      implemented by the PSCI kvm layer.
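
The resulting behaviour can be sketched as a dispatcher whose default case returns NOT_SUPPORTED. This is an illustration only: `psci_call()` is a hypothetical function, the two function ids are shown as examples of handled calls, and the return values follow the PSCI convention that NOT_SUPPORTED is -1.

```c
#include <stdint.h>

/* Example function ids (PSCI 0.2, 32-bit calling convention). */
#define PSCI_FN_PSCI_VERSION 0x84000000u
#define PSCI_FN_CPU_OFF      0x84000002u

#define PSCI_RET_SUCCESS       0L
#define PSCI_RET_NOT_SUPPORTED (-1L)

/* Hypothetical dispatcher: the value is what the guest sees in its
 * return register. Nothing is injected into the guest for unknown ids. */
long psci_call(uint32_t function_id)
{
    switch (function_id) {
    case PSCI_FN_PSCI_VERSION:
        return 2;  /* version 0.2: major in bits 31:16, minor in bits 15:0 */
    case PSCI_FN_CPU_OFF:
        return PSCI_RET_SUCCESS;
    default:
        /* Unimplemented function_id: report it, do not treat it as an error. */
        return PSCI_RET_NOT_SUPPORTED;
    }
}
```

A guest probing for an optional function can then test the return value instead of having to survive an undefined instruction exception.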
      
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm: KVM: force execution of HCPTR access on VM exit · c8bdf091
      Marc Zyngier authored
      commit 85e84ba3 upstream.
      
      On VM entry, we disable access to the VFP registers in order to
      perform a lazy save/restore of these registers.
      
      On VM exit, we restore access, test if we did enable them before,
      and save/restore the guest/host registers if necessary. In this
      sequence, the FPEXC register is always accessed, irrespective
      of the trapping configuration.
      
      If the guest didn't touch the VFP registers, then the HCPTR access
      has now enabled such access, but we're missing a barrier to ensure
      architectural execution of the new HCPTR configuration. If the HCPTR
      access has been delayed/reordered, the subsequent access to FPEXC
      will cause a trap, which we aren't prepared to handle at all.
      
      The same condition exists when trapping to enable VFP for the guest.
      
      The fix is to introduce a barrier after enabling VFP access. In the
      vmexit case, it can be relaxed to only take place if the guest hasn't
      accessed its view of the VFP registers, making the access to FPEXC safe.
      
      The set_hcptr macro is modified to deal with both vmenter/vmexit and
      vmtrap operations, and now takes an optional label that is branched to
      when the guest hasn't touched the VFP registers.

      Reported-by: Vikram Sethi <vikrams@codeaurora.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selinux: fix setting of security labels on NFS · 805f18e0
      J. Bruce Fields authored
      commit 9fc2b4b4 upstream.
      
      Before calling into the filesystem, vfs_setxattr calls
      security_inode_setxattr, which ends up calling selinux_inode_setxattr in
      our case.  That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
      SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
      only if selinux_is_sblabel_mnt returns true.
      
      The selinux_is_sblabel_mnt logic was broken by eadcabc6 "SELinux: do
      all flags twiddling in one place", which didn't take into account
      the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
      with eb9ae686 "SELinux: Add new labeling type native labels".
      
      This caused setxattr's of security labels over NFSv4.2 to fail.
      
      Cc: Eric Paris <eparis@redhat.com>
      Cc: David Quigley <dpquigl@davequigley.com>
      Reported-by: Richard Chan <rc556677@outlook.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      [PM: added the stable dependency]
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • intel_pstate: set BYT MSR with wrmsrl_on_cpu() · cd430d3e
      Joe Konno authored
      commit 0dd23f94 upstream.
      
      Commit 007bea09 (intel_pstate: Add setting voltage value for
      baytrail P states.) introduced byt_set_pstate() with the assumption that
      it would always be run by the CPU whose MSR is to be written by it.  It
      turns out, however, that is not always the case in practice, so modify
      byt_set_pstate() to enforce the MSR write done by it to always happen on
      the right CPU.
      
      Fixes: 007bea09 (intel_pstate: Add setting voltage value for baytrail P states.)
      Signed-off-by: Joe Konno <joe.konno@intel.com>
      Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: sdhci: fix low memory corruption · 4b81f9f8
      Jiri Slaby authored
      commit 62a7f368 upstream.
      
      When dma mapping (dma_map_sg) fails in sdhci_pre_dma_transfer, -EINVAL
      is returned. There are 3 callers of sdhci_pre_dma_transfer:
      * sdhci_pre_req and sdhci_adma_table_pre: handle negative return
      * sdhci_prepare_data: handles 0 (error) and "else" (good) only
      
      sdhci_prepare_data is therefore broken. When it receives -EINVAL from
      sdhci_pre_dma_transfer, it assumes 1 sg mapping was mapped. Later,
      this non-existent mapping with address 0 is kmap'ped and written to:
      Corrupted low memory at ffff880000001000 (1000 phys) = 22b7d67df2f6d1cf
      Corrupted low memory at ffff880000001008 (1008 phys) = 63848a5216b7dd95
      Corrupted low memory at ffff880000001010 (1010 phys) = 330eb7ddef39e427
      Corrupted low memory at ffff880000001018 (1018 phys) = 8017ac7295039bda
      Corrupted low memory at ffff880000001020 (1020 phys) = 8ce039eac119074f
      ...
      
      So teach sdhci_prepare_data to understand negative return values from
      sdhci_pre_dma_transfer and disable DMA in that case, as well as for
      zero.
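
The difference between the two return-value checks can be sketched in C. The helper names are hypothetical and model only the decision, not the driver: the argument is the value returned by the mapping step, which is a positive segment count on success, 0 when nothing was mapped, or a negative errno.

```c
#include <errno.h>
#include <stdbool.h>

/* Broken check: only 0 counts as failure, so -EINVAL looks like success. */
bool use_dma_zero_check_only(int sg_count)
{
    return sg_count != 0;
}

/* Fixed check: both 0 and negative errno values disable DMA. */
bool use_dma_checked(int sg_count)
{
    return sg_count > 0;
}
```

With the first variant a failed mapping that returns -EINVAL is treated as a valid mapping, which matches the corruption path described above.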
      
      The bug was introduced in 348487cb (mmc:
      sdhci: use pipeline mmc requests to improve performance). That commit
      also looks suspicious because it assigns host->sg_count in both
      sdhci_pre_dma_transfer and sdhci_adma_table_pre.

      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Fixes: 348487cb
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: Haibo Chen <haibo.chen@freescale.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu/amd: Handle large pages correctly in free_pagetable · 396887ba
      Joerg Roedel authored
      commit 0b3fff54 upstream.
      
      Make sure that we are skipping over large PTEs while walking
      the page-table tree.
      
      Fixes: 5c34c403 ("iommu/amd: Fix memory leak in free_pagetable")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu/arm-smmu: Fix broken ATOS check · 72e09509
      Will Deacon authored
      commit d38f0ff9 upstream.
      
      Commit 83a60ed8 ("iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS
      condition") accidentally negated the ID0_ATOSNS predicate in the ATOS
      feature check, causing the driver to attempt ATOS requests on SMMUv2
      hardware without the ATOS feature implemented.
      
      This patch restores the predicate to the correct value.
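
A negated feature predicate of this kind can be sketched as follows. This is an illustration under stated assumptions: the bit positions are placeholders, the helper names are hypothetical, and ID0_ATOSNS is assumed to mean "ATOS not supported" when set, as its name suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, chosen for illustration only. */
#define ID0_S1TS   (1u << 30)  /* stage 1 translation supported */
#define ID0_ATOSNS (1u << 26)  /* assumed: set when ATOS is NOT supported */

/* Intended check: stage 1 is present and ATOS is not marked unsupported. */
bool has_trans_ops(uint32_t id0)
{
    return (id0 & ID0_S1TS) && !(id0 & ID0_ATOSNS);
}

/* Accidentally negated variant: enables the feature exactly when
 * the hardware says it is absent. */
bool has_trans_ops_negated(uint32_t id0)
{
    return (id0 & ID0_S1TS) && (id0 & ID0_ATOSNS);
}
```

For an ID register with both bits set, the negated variant reports the feature as available although the hardware does not implement it.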

      Reported-by: Varun Sethi <varun.sethi@freescale.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "crypto: talitos - convert to use be16_add_cpu()" · 44cb6ff1
      Horia Geantă authored
      commit 69d9cd8c upstream.
      
      This reverts commit 7291a932.
      
      The conversion to be16_add_cpu() is incorrect in case cryptlen is
      negative due to premature (i.e. before addition / subtraction)
      implicit conversion of cryptlen (int -> u16) leading to sign loss.
      
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: talitos - avoid memleak in talitos_alg_alloc() · 0e566fe9
      Horia Geantă authored
      commit 5fa7dadc upstream.
      
      Fixes: 1d11911a ("crypto: talitos - fix warning: 'alg' may be used uninitialized in this function")
      Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: gadget: f_fs: add extra check before unregister_gadget_item · 1e4205d4
      Rui Miguel Silva authored
      commit f14e9ad1 upstream.
      
      ffs_closed can race with configfs_rmdir, which will call config_item_release, so
      add an extra check to avoid calling unregister_gadget_item with a null
      gadget item.

      Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
      Signed-off-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: mvneta: disable IP checksum with jumbo frames for Armada 370 · bfa06e62
      Simon Guinot authored
      [ Upstream commit b65657fc ]
      
      The Ethernet controller found in the Armada 370, 380 and 385 SoCs doesn't
      support TCP/IP checksumming with frame sizes larger than 1600 bytes.
      
      This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
      NETIF_F_TSO for the Armada 370 and compatible SoCs when the MTU is set
      to a value greater than 1600 bytes.
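
The feature masking can be sketched in C. This is a model of the rule only: the flag names match the commit message, but their bit values here are placeholders, and `fix_features()` with its `csum_limited` parameter is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values; only the names come from the commit message. */
#define NETIF_F_IP_CSUM (1u << 0)
#define NETIF_F_TSO     (1u << 1)
#define NETIF_F_SG      (1u << 2)

#define CSUM_MTU_LIMIT 1600  /* largest MTU with working hardware checksums */

/* csum_limited is true for controllers with the 1600-byte limitation. */
uint32_t fix_features(uint32_t features, int mtu, bool csum_limited)
{
    if (csum_limited && mtu > CSUM_MTU_LIMIT)
        features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO);
    return features;
}
```

Unrelated features such as scatter-gather stay enabled, and controllers without the limitation keep their full feature set at any MTU.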

      Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
      Cc: <stable@vger.kernel.org> # v3.8+
      Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: mvebu: update Ethernet compatible string for Armada XP · 5c40e8bf
      Simon Guinot authored
      [ Upstream commit ea3b55fe ]
      
      This patch updates the Ethernet DT nodes for Armada XP SoCs with the
      compatible string "marvell,armada-xp-neta".

      Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
      Fixes: 77916519 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces")
      Cc: <stable@vger.kernel.org> # v3.8+
      Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: mvneta: introduce compatible string "marvell, armada-xp-neta" · b5aded83
      Simon Guinot authored
      [ Upstream commit f522a975 ]
      
      The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
      380 and 385 SoCs. Since at least one more hardware feature is available
      on the Armada XP SoCs, a way to identify them is needed.
      
      This patch introduces a new compatible string "marvell,armada-xp-neta".

      Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
      Cc: <stable@vger.kernel.org> # v3.8+
      Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation · 8c6e5415
      Tom Lendacky authored
      [ Upstream commit 472cfe71 ]
      
      When allocating Rx related buffers, alloc_pages is called using an order
      number that is decreased until successful. A system under stress can
      experience failures during this allocation process resulting in a warning
      being issued. This message can be of concern to end users even though the
      failure is not fatal. Since the failure is not fatal and can occur
      multiple times, the driver should include the __GFP_NOWARN flag to
      suppress the warning message from being issued.
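
The allocation pattern can be modelled in userspace C. Everything here is a stand-in: `struct fake_allocator`, `fake_alloc_pages()` and `NOWARN_FLAG` are hypothetical and only imitate an allocator that warns on failure unless asked not to.

```c
#include <stdbool.h>

#define NOWARN_FLAG 0x1u  /* stand-in for __GFP_NOWARN */

/* Toy allocator: succeeds for orders up to largest_free_order. */
struct fake_allocator {
    int largest_free_order;
    int warnings;  /* number of failure warnings emitted */
};

bool fake_alloc_pages(struct fake_allocator *a, unsigned int flags, int order)
{
    if (order <= a->largest_free_order)
        return true;
    if (!(flags & NOWARN_FLAG))
        a->warnings++;  /* a failed attempt warns unless suppressed */
    return false;
}

/* Try the largest order first and fall back to smaller ones.
 * Returns the order that succeeded, or -1 if none did. */
int alloc_rx_order(struct fake_allocator *a, unsigned int flags, int max_order)
{
    for (int order = max_order; order >= 0; order--) {
        if (fake_alloc_pages(a, flags, order))
            return order;
    }
    return -1;
}
```

Both calls end up with the same allocation order; the only difference is whether the expected intermediate failures are reported.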

      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: Fix race between OOTB responce and route removal · 67866a8c
      Alexander Sverdlin authored
      [ Upstream commit 29c4afc4 ]
      
      A NULL pointer dereference is possible during the statistics update if the route
      used for the OOTB response is removed at an unfortunate time. If the route exists
      when we receive the OOTB packet and we finally jump into sctp_packet_transmit() to
      send ABORT, but in the meantime the route is removed from under our feet, we take
      the "no_route" path and try to update stats with
      IP_INC_STATS(sock_net(asoc->base.sk), ...).
      
      But sctp_ootb_pkt_new(), used to prepare the response packet, doesn't call
      sctp_transport_set_owner(), and therefore there is no asoc associated with this
      packet. A temporary asoc just for OOTB responses is probably overkill, so just
      introduce a check like in all other places in sctp_packet_transmit() where
      "asoc" is dereferenced.
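
The guard can be sketched with a few stand-in types. `struct stats`, `struct association` and `account_no_route()` are hypothetical and much simpler than the SCTP structures; the sketch shows only the pattern of skipping the statistics update when no association exists.

```c
#include <stddef.h>

/* Stand-in for the per-namespace counters. */
struct stats {
    int out_no_routes;
};

/* Stand-in for an association; OOTB packets have none. */
struct association {
    struct stats *net_stats;
};

/* "no_route" path: count the event only if an association is attached. */
void account_no_route(struct association *asoc)
{
    if (asoc)
        asoc->net_stats->out_no_routes++;
}
```

Called with NULL, as happens for an OOTB response, the function returns without touching any counter instead of dereferencing the missing association.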
      
      To reproduce this, one needs to
      0. ensure that sctp module is loaded (otherwise ABORT is not generated)
      1. remove default route on the machine
      2. while true; do
           ip route del [interface-specific route]
           ip route add [interface-specific route]
         done
      3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
         response
      
      On x86_64 the crash looks like this:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
      PGD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: ...
      CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.0.5-1-ARCH #1
      Hardware name: ...
      task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000
      RIP: 0010:[<ffffffffa05ec9ac>]  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
      RSP: 0018:ffff880127c037b8  EFLAGS: 00010296
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480
      RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700
      RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af
      R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28
      R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0
      FS:  0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0
      Stack:
       ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400
       ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520
       0000000000000000 0000000000000001 0000000000000000 0000000000000000
      Call Trace:
       <IRQ>
       [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
       [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp]
       [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp]
       [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210
       [<ffffffff810e0329>] ? update_process_times+0x59/0x60
       [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0
       [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0
       [<ffffffff8101f599>] ? read_tsc+0x9/0x10
       [<ffffffff8116d4b5>] ? put_page+0x55/0x60
       [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100
       [<ffffffff81462b68>] ? skb_free_head+0x58/0x80
       [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic]
       [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0
       [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
       [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp]
       [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp]
       [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
       [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp]
       [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30
       [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150
       [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210
       [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90
       [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370
       [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0
       [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50
       [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60
       [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0
       [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120
       [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169]
       [<ffffffff8147896a>] net_rx_action+0x21a/0x360
       [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0
       [<ffffffff8107912d>] irq_exit+0xad/0xb0
       [<ffffffff8157d158>] do_IRQ+0x58/0xf0
       [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d
       <EOI>
       [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20
       [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
       [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0
       [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20
       [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480
       [<ffffffff8156b365>] rest_init+0x85/0x90
       [<ffffffff818eb035>] start_kernel+0x48b/0x4ac
       [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120
       [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c
       [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184
      Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
      RIP  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
       RSP <ffff880127c037b8>
      CR2: 0000000000000020
      ---[ end trace 5aec7fd2dc983574 ]---
      Kernel panic - not syncing: Fatal exception in interrupt
      Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
      drm_kms_helper: panic occurred, switching back to text console
      ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnx2x: fix lockdep splat · 7e2a3d66
      Eric Dumazet authored
      [ Upstream commit d53c66a5 ]
      
      Michel reported the following lockdep splat:
      
      [   44.718117] INFO: trying to register non-static key.
      [   44.723081] the code is fine but needs lockdep annotation.
      [   44.728559] turning off the locking correctness validator.
      [   44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0
      [   44.770289] Call Trace:
      [   44.772741]  [<ffffffff816eb1cd>] dump_stack+0x4c/0x65
      [   44.777879]  [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510
      [   44.783708]  [<ffffffff811121f5>] __lock_acquire+0x1d05/0x1f10
      [   44.789538]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
      [   44.795276]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
      [   44.801967]  [<ffffffff8111390d>] ? trace_hardirqs_on+0xd/0x10
      [   44.807793]  [<ffffffff811330fa>] ? hrtimer_try_to_cancel+0x4a/0x250
      [   44.814142]  [<ffffffff81112ba6>] lock_acquire+0xb6/0x290
      [   44.819537]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
      [   44.824844]  [<ffffffff810d66ad>] flush_work+0x3d/0x280
      [   44.830061]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
      [   44.835366]  [<ffffffff816f3c43>] ? schedule_hrtimeout_range+0x13/0x20
      [   44.841889]  [<ffffffff8112ec9b>] ? usleep_range+0x4b/0x50
      [   44.847365]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
      [   44.853102]  [<ffffffff810d8585>] ? __cancel_work_timer+0x105/0x1c0
      [   44.859359]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
      [   44.866045]  [<ffffffff810d851f>] __cancel_work_timer+0x9f/0x1c0
      [   44.872048]  [<ffffffffa0010982>] ? bnx2x_func_stop+0x42/0x90 [bnx2x]
      [   44.878481]  [<ffffffff810d8670>] cancel_work_sync+0x10/0x20
      [   44.884134]  [<ffffffffa00259e5>] bnx2x_chip_cleanup+0x245/0x730 [bnx2x]
      [   44.890829]  [<ffffffff8110ce02>] ? up+0x32/0x50
      [   44.895439]  [<ffffffff811306b5>] ? del_timer_sync+0x5/0xd0
      [   44.901005]  [<ffffffffa005596d>] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x]
      [   44.907527]  [<ffffffff811f1aef>] ? might_fault+0x5f/0xb0
      [   44.912921]  [<ffffffffa005851c>] bnx2x_reload_if_running+0x2c/0x50 [bnx2x]
      [   44.919879]  [<ffffffffa005a3c5>] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x]
      [   44.926664]  [<ffffffff815d498b>] dev_ethtool+0x55b/0x1c40
      [   44.932148]  [<ffffffff815dfdc7>] ? rtnl_lock+0x17/0x20
      [   44.937364]  [<ffffffff815e7f8b>] dev_ioctl+0x17b/0x630
      [   44.942582]  [<ffffffff815abf8d>] sock_do_ioctl+0x5d/0x70
      [   44.947972]  [<ffffffff815ac013>] sock_ioctl+0x73/0x280
      [   44.953192]  [<ffffffff8124c1c8>] do_vfs_ioctl+0x88/0x5b0
      [   44.958587]  [<ffffffff8110d0b3>] ? up_read+0x23/0x40
      [   44.963631]  [<ffffffff812584cc>] ? __fget_light+0x6c/0xa0
      [   44.969105]  [<ffffffff8124c781>] SyS_ioctl+0x91/0xb0
      [   44.974149]  [<ffffffff816f4dd7>] system_call_fastpath+0x12/0x6f
      
      As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED,
      we also need to guard bnx2x_stop_ptp() with the same condition; otherwise
      the ptp_task work is not initialized and the kernel complains in
      cancel_work_sync().
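The symmetry described above (tear down only what was set up) can be sketched in user-space C. This is an assumption-laden illustration, not the driver code: struct nic, the nic_* functions and PTP_SUPPORTED_FLAG are hypothetical stand-ins for the bnx2x structures and the PTP_SUPPORTED flag.

```c
#include <stdbool.h>

#define PTP_SUPPORTED_FLAG (1u << 0) /* hypothetical stand-in for PTP_SUPPORTED */

struct nic {
    unsigned int flags;
    bool ptp_work_initialized; /* set by the init path only */
    int ptp_cancel_calls;      /* how often the stop path ran */
};

static void nic_init_ptp(struct nic *n) { n->ptp_work_initialized = true; }

/* Stands in for the code that would cancel the (initialized) work. */
static void nic_stop_ptp(struct nic *n) { n->ptp_cancel_calls++; }

static void nic_load(struct nic *n)
{
    if (n->flags & PTP_SUPPORTED_FLAG)
        nic_init_ptp(n);
}

static void nic_unload(struct nic *n)
{
    /* Same condition as in nic_load(): never stop what was never started. */
    if (n->flags & PTP_SUPPORTED_FLAG)
        nic_stop_ptp(n);
}
```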
      
      Fixes: eeed018c ("bnx2x: Add timestamping and PTP hardware clock support")
      Reported-by: Michel Lespinasse <walken@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Michal Kalderon <Michal.Kalderon@qlogic.com>
      Cc: Ariel Elior <Ariel.Elior@qlogic.com>
      Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Cc: David Decotigny <decot@google.com>
      Acked-by: Sony Chacko <sony.chacko@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: phy: fix phy link up when limiting speed via device tree · 6c10c841
      Mugunthan V N authored
      [ Upstream commit eb686231 ]
      
      When the PHY link speed is limited to 100 Mbps or less via "max-speed" on a
      gigabit PHY, the PHY never completes auto-negotiation and the PHY state
      machine stays in PHY_AN. Fix this by comparing the gigabit advertisement
      even when phydev->supported does not include gigabit modes, as long as the
      PHY has BMSR_ESTATEN set. Auto-negotiation is then restarted because the
      old and new advertisements differ, and the link comes up fine.
      Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mlx4: Disable HA for SRIOV PF RoCE devices · 62a9ad17
      Or Gerlitz authored
      [ Upstream commit 7254acff ]
      
      When in HA mode, the driver exposes an IB (RoCE) device instance with only
      one port. Under SRIOV, the existing implementation doesn't go well with
      the PF RoCE driver's role of Special QPs Para-Virtualization, etc.
      
      As such, disable HA for the mlx4 PF RoCE device in SRIOV mode.
      
      Fixes: a5750090 ('IB/mlx4: Add port aggregation support')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled · 1b740800
      Ido Shamay authored
      [ Upstream commit 79a25852 ]
      
      The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
      offload is disabled. This is wrong since rxvlan offload can be switched
      on/off regardless of hwtstamp_rx_filter.
      
      Also moved check_csum to query CQE information to identify VLAN packets
      and removed the check of IP packets, since it has been validated before.
      
      Fixes: f8c6455b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx4_en: Wake TX queues only when there's enough room · 7a9aa8ab
      Ido Shamay authored
      [ Upstream commit 488a9b48 ]
      
      Indication of a single completed packet, marked by txbbs_skipped
      being bigger than zero, is not enough to wake up a
      stopped TX queue. The completed packet may contain a single TXBB,
      while the next packet to be sent (after the wake up) may have multiple
      TXBBs (LSO/TSO packets for example), causing overflow in the queue followed
      by WQE corruption and TX queue timeout.
      Instead, wake the stopped queue only when there's enough room for the
      worst case (maximum sized WQE) packet that we should need to handle after
      the queue is opened again.
      
      Also created a helper routine, mlx4_en_is_tx_ring_full, which checks
      if the current TX ring is full or not. It provides better code readability
      and removes code duplication.
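The wake-up rule described above can be sketched as follows. This is a minimal model under stated assumptions, not the driver's implementation: struct tx_ring, its fields, and the helpers tx_ring_is_full() and tx_should_wake() are hypothetical, and "slots" stands in for TXBBs.

```c
#include <stdbool.h>
#include <stdint.h>

struct tx_ring {
    uint32_t prod;     /* slots handed to the hardware (free-running) */
    uint32_t cons;     /* slots completed by the hardware (free-running) */
    uint32_t size;     /* total number of slots in the ring */
    uint32_t max_desc; /* worst-case slots needed by a single packet */
};

/* The ring counts as full when a worst-case packet would no longer fit. */
static bool tx_ring_is_full(const struct tx_ring *r)
{
    uint32_t used = r->prod - r->cons; /* unsigned wrap-around is well defined */

    return used > r->size - r->max_desc;
}

/* Wake a stopped queue only when there is room for the worst case. */
static bool tx_should_wake(const struct tx_ring *r, bool stopped)
{
    return stopped && !tx_ring_is_full(r);
}
```

With a 16-slot ring and a 4-slot worst case, a single completion that leaves 13 slots in use does not wake the queue; the queue is woken once usage drops to 12 or fewer.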
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx4_en: Release TX QP when destroying TX ring · f3f6617f
      Eran Ben Elisha authored
      [ Upstream commit 0eb08514 ]
      
      TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
      used the deprecated base_tx_qpn field. Move TX QP release to
      mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.
      
      Fixes: ddae0349 ('net/mlx4: Change QP allocation scheme')
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip: report the original address of ICMP messages · 66634bb1
      Julian Anastasov authored
      [ Upstream commit 34b99df4 ]
      
      ICMP messages can trigger ICMP and local errors. In this case
      serr->port is 0 and starting from Linux 4.0 we do not return
      the original target address to the error queue readers.
      Add function to define which errors provide addr_offset.
      With this fix my ping command is not silent anymore.
      
      Fixes: c247f053 ("ip: fix error queue empty skb handling")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xen-netback: fix a BUG() during initialization · 6fc8b947
      Palik, Imre authored
      [ Upstream commit 12b322ac ]
      
      Commit edafc132 ("xen-netback: making the bandwidth limiter runtime settable")
      introduced the capability to change the bandwidth rate limit at runtime.
      But it also introduced a possible crashing bug.
      
      If netback receives two XenbusStateConnected without getting the
      hotplug-status watch firing in between, then it will try to register the
      watches for the rate limiter again.  But this triggers a BUG() in the watch
      registration code.
      
      The fix modifies connect() to remove the possibly existing packet-rate
      watches before trying to install those watches.  This behaviour is in line
      with how connect() deals with the hotplug-status watch.
      Signed-off-by: Imre Palik <imrep@amazon.de>
      Cc: Matt Wilson <msw@amazon.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: Do not call tcp_fastopen_reset_cipher from interrupt context · c31967d4
      Christoph Paasch authored
      [ Upstream commit dfea2aa6 ]
      
      tcp_fastopen_reset_cipher really cannot be called from interrupt
      context. It allocates the tcp_fastopen_context with GFP_KERNEL and
      calls crypto_alloc_cipher, which allocates all kind of stuff with
      GFP_KERNEL.
      
      Thus, we might sleep when the key-generation is triggered by an
      incoming TFO cookie-request which would then happen in interrupt-
      context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:
      
      [   36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266
      [   36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
      [   36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
      [   36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [   36.008250]  00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8
      [   36.009630]  ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928
      [   36.011076]  0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00
      [   36.012494] Call Trace:
      [   36.012953]  <IRQ>  [<ffffffff8171d53a>] dump_stack+0x4f/0x6d
      [   36.014085]  [<ffffffff810967d3>] ___might_sleep+0x103/0x170
      [   36.015117]  [<ffffffff81096892>] __might_sleep+0x52/0x90
      [   36.016117]  [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190
      [   36.017266]  [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130
      [   36.018485]  [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130
      [   36.019679]  [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70
      [   36.020884]  [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60
      [   36.022058]  [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730
      [   36.023118]  [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0
      [   36.024185]  [<ffffffff810e3872>] ? __module_text_address+0x12/0x60
      [   36.025327]  [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60
      [   36.026410]  [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0
      [   36.027556]  [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170
      [   36.028784]  [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0
      [   36.029832]  [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20
      [   36.030936]  [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0
      [   36.031875]  [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70
      [   36.032953]  [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0
      [   36.034065]  [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0
      [   36.035069]  [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0
      [   36.035963]  [<ffffffff81657569>] ip_rcv_finish+0x119/0x330
      [   36.036950]  [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0
      [   36.037847]  [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930
      [   36.038994]  [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70
      [   36.040033]  [<ffffffff81610b72>] process_backlog+0xd2/0x1f0
      [   36.041025]  [<ffffffff81611482>] net_rx_action+0x122/0x310
      [   36.042007]  [<ffffffff81076743>] __do_softirq+0x103/0x2f0
      [   36.042978]  [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30
      
      This patch moves the call to tcp_fastopen_init_key_once to the places
      where a listener socket creates its TFO-state, which always happens in
      user-context (either from the setsockopt, or implicitly during the
      listen()-call)
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Fixes: 222e83d2 ("tcp: switch tcp_fastopen key generation to net_get_random_once")
      Signed-off-by: Christoph Paasch <cpaasch@apple.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mvneta: add forgotten initialization of autonegotiation bits · 1bc31b1e
      Stas Sergeev authored
      [ Upstream commit 538761b7 ]
      
      The commit 898b2970 ("mvneta: implement SGMII-based in-band link state
      signaling")
      changed mvneta_adjust_link() so that it does not clear the auto-negotiation
      bits in MVNETA_GMAC_AUTONEG_CONFIG register. This was necessary for
      auto-negotiation mode to work.
      Unfortunately I haven't checked if these bits are ever initialized.
      It appears they are not.
      This patch adds the missing initialization of the auto-negotiation bits
      in the MVNETA_GMAC_AUTONEG_CONFIG register.
      It fixes the following regression:
      https://www.mail-archive.com/netdev@vger.kernel.org/msg67928.html
      
      Since the patch was tested to fix a regression, it should be applied to
      stable tree.
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      
      CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      CC: stable@vger.kernel.org
      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: fix locking in update_vlan_tailroom_need_count() · 80b856db
      Johannes Berg authored
      [ Upstream commit 51f458d9 ]
      
      Unfortunately, Michal's change to fix AP_VLAN crypto tailroom
      caused a locking issue that was reported by lockdep, but only
      in a few cases - the issue was a classic ABBA deadlock caused
      by taking the mtx after the key_mtx, where normally they're
      taken the other way around.
      
      As the key mutex protects the field in question (I'm adding a
      few annotations to make that clear) only the iteration needs
      to be protected, but we can also iterate the interface list
      with just RCU protection while holding the key mutex.
      
      Fixes: f9dca80b ("mac80211: fix AP_VLAN crypto tailroom calculation")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • neigh: do not modify unlinked entries · 914b0ef2
      Julian Anastasov authored
      [ Upstream commit 2c51a97f ]
      
      The lockless lookups can return entry that is unlinked.
      Sometimes they get reference before last neigh_cleanup_and_release,
      sometimes they do not need reference. Later, any
      modification attempts may result in the following problems:
      
      1. entry is not destroyed immediately because neigh_update
      can start the timer for dead entry, eg. on change to NUD_REACHABLE
      state. As result, entry lives for some time but is invisible
      and out of control.
      
      2. __neigh_event_send can run in parallel with neigh_destroy
      while refcnt=0 but if timer is started and expired refcnt can
      reach 0 for second time leading to second neigh_destroy and
      possible crash.
      
      Thanks to Eric Dumazet and Ying Xue for their work and analysis
      on the __neigh_event_send change.
      
      Fixes: 767e97e1 ("neigh: RCU conversion of struct neighbour")
      Fixes: a263b309 ("ipv4: Make neigh lookups directly in output packet path.")
      Fixes: 6fd6ce20 ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packet: avoid out of bounds read in round robin fanout · 2c330edb
      Willem de Bruijn authored
      [ Upstream commit 468479e6 ]
      
      PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
      f->num_members. It returns the old value unconditionally, but
      f->num_members may have changed since the last store. Ensure
      that the return value is always < num.
      
      When modifying the logic, simplify it further by replacing the loop
      with an unconditional atomic increment.
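A user-space sketch of the idea, using C11 atomics in place of the kernel's atomic helpers. The names struct fanout_rr and fanout_pick_rr() are hypothetical and the caller is assumed to pass a non-zero num; this illustrates the technique rather than reproducing the patch.

```c
#include <stdatomic.h>

struct fanout_rr {
    atomic_uint rr_cur; /* free-running round-robin counter */
};

/*
 * Pick the next member index. The counter is incremented unconditionally
 * and the modulo is applied to the value being returned, so the result is
 * always < num even if num changed since the previous call.
 * Assumes num != 0.
 */
static unsigned int fanout_pick_rr(struct fanout_rr *f, unsigned int num)
{
    unsigned int val = atomic_fetch_add(&f->rr_cur, 1u) + 1u;

    return val % num;
}
```

Because the stored counter is never reduced modulo the member count, a stale member count cannot leak into a later return value.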
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packet: read num_members once in packet_rcv_fanout() · d7884e43
      Eric Dumazet authored
      [ Upstream commit f98f4514 ]
      
      We need to tell compiler it must not read f->num_members multiple
      times. Otherwise testing if num is not zero is flaky, and we could
      attempt an invalid divide by 0 in fanout_demux_cpu()
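The pattern can be sketched in user-space C as below. The macro READ_ONCE_UINT, struct fanout_group and fanout_pick_cpu() are hypothetical names for illustration; the volatile cast is one common way to force a single load and is assumed here, not taken from the patch.

```c
/* Force exactly one load of an unsigned int through a volatile access. */
#define READ_ONCE_UINT(x) (*(const volatile unsigned int *)&(x))

struct fanout_group {
    unsigned int num_members; /* may be changed concurrently by other CPUs */
};

/*
 * Map a CPU number to a member index, or return -1 if the group is empty.
 * num_members is read once into a local, so the zero test and the modulo
 * are guaranteed to use the same value.
 */
static int fanout_pick_cpu(struct fanout_group *f, unsigned int cpu)
{
    unsigned int num = READ_ONCE_UINT(f->num_members);

    if (num == 0)
        return -1;

    return (int)(cpu % num);
}
```

Without the single read, the compiler would be free to reload f->num_members after the zero test, and the divisor could then be 0.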
      
      Note bug was present in packet_rcv_fanout_hash() and
      packet_rcv_fanout_lb() but final 3.1 had a simple location
      after commit 95ec3eb4 ("packet: Add 'cpu' fanout policy.")
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bridge: fix br_stp_set_bridge_priority race conditions · 08be544e
      Nikolay Aleksandrov authored
      [ Upstream commit 2dab80a8 ]
      
      After the ->set() spinlocks were removed br_stp_set_bridge_priority
      was left running without any protection when used via sysfs. It can
      race with port add/del and could result in use-after-free cases and
      corrupted lists. Tested by running port add/del in a loop with stp
      enabled while setting priority in a loop, crashes are easily
      reproducible.
      The spinlocks around sysfs ->set() were removed in commit:
      14f98f25 ("bridge: range check STP parameters")
      There's also a race condition in the netlink priority support that is
      fixed by this change, but it was introduced recently and the fixes tag
      covers it, just in case it's needed the commit is:
      af615762 ("bridge: add ageing_time, stp_state, priority over netlink")
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Fixes: 14f98f25 ("bridge: range check STP parameters")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>