1. 15 Jun, 2019 40 commits
    • Douglas Anderson's avatar
      soc: rockchip: Set the proper PWM for rk3288 · 55c355b1
      Douglas Anderson authored
      [ Upstream commit bbdc00a7 ]
      
      The rk3288 SoC has two PWM implementations available, the "old"
      implementation and the "new" one.  You can switch between the two of
      them by flipping a bit in the grf.
      
      The "old" implementation is the default at chip power up but isn't the
      one that's officially supposed to be used.  ...and, in fact, the
      driver that gets selected in Linux using the rk3288 device tree only
      supports the "new" implementation.
      
      Long ago I tried to get a switch to the right IP block landed in the
      PWM driver (search for "rk3288: Switch to use the proper PWM IP") but
      that got rejected.  In the mean time the grf has grown a full-fledged
      driver that already sets other random bits like this.  That means we
      can now get the fix landed.
      
      For those wondering how things could have possibly worked for the last
      4.5 years, folks have mostly been relying on the bootloader to set
      this bit.  ...but occasionally folks have pointed back to my old patch
      series [1] in downstream kernels.
      
      [1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1391597.htmlSigned-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      55c355b1
    • Lu Baolu's avatar
      iommu/vt-d: Flush IOTLB for untrusted device in time · 189a9522
      Lu Baolu authored
      [ Upstream commit f7b0c4ce ]
      
      By default, for performance consideration, Intel IOMMU
      driver won't flush IOTLB immediately after a buffer is
      unmapped. It schedules a thread and flushes IOTLB in a
      batched mode. This isn't suitable for untrusted device
      since it still can access the memory even if it isn't
      supposed to do so.
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Tested-by: default avatarXu Pengfei <pengfei.xu@intel.com>
      Tested-by: default avatarMika Westerberg <mika.westerberg@intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      189a9522
    • Bartosz Golaszewski's avatar
      usb: ohci-da8xx: disable the regulator if the overcurrent irq fired · 80245980
      Bartosz Golaszewski authored
      [ Upstream commit d3273301 ]
      
      Historically the power supply management in this driver has been handled
      in two separate places in parallel. Device-tree users simply defined an
      appropriate regulator, while two boards with no DT support (da830-evm and
      omapl138-hawk) passed functions defined in their respective board files
      over platform data. These functions simply used legacy GPIO calls to
      watch the oc GPIO for interrupts and disable the vbus GPIO when the irq
      fires.
      
      Commit d193abf1 ("usb: ohci-da8xx: add vbus and overcurrent gpios")
      updated these GPIO calls to the modern API and moved them inside the
      driver.
      
      This however is not the optimal solution for the vbus GPIO which should
      be modeled as a fixed regulator that can be controlled with a GPIO.
      
      In order to keep the overcurrent protection available once we move the
      board files to using fixed regulators we need to disable the enable_reg
      regulator when the overcurrent indicator interrupt fires. Since we
      cannot call regulator_disable() from interrupt context, we need to
      switch to using a oneshot threaded interrupt.
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarSekhar Nori <nsekhar@ti.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      80245980
    • Douglas Anderson's avatar
      clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288 · 4fce6bfe
      Douglas Anderson authored
      [ Upstream commit 57a20248 ]
      
      Experimentally it can be seen that going into deep sleep (specifically
      setting PMU_CLR_DMA and PMU_CLR_BUS in RK3288_PMU_PWRMODE_CON1)
      appears to fail unless "aclk_dmac1" is on.  The failure is that the
      system never signals that it made it into suspend on the GLOBAL_PWROFF
      pin and it just hangs.
      
      NOTE that it's confirmed that it's the actual suspend that fails, not
      one of the earlier calls to read/write registers.  Specifically if you
      comment out the "PMU_GLOBAL_INT_DISABLE" setting in
      rk3288_slp_mode_set() and then comment out the "cpu_do_idle()" call in
      rockchip_lpmode_enter() then you can exercise the whole suspend path
      without any crashing.
      
      This is currently not a problem with suspend upstream because there is
      no current way to exercise the deep suspend code.  However, anyone
      trying to make it work will run into this issue.
      
      This was not a problem on shipping rk3288-based Chromebooks because
      those devices all ran on an old kernel based on 3.14.  On that kernel
      "aclk_dmac1" appears to be left on all the time.
      
      There are several ways to skin this problem.
      
      A) We could add "aclk_dmac1" to the list of critical clocks and that
      apperas to work, but presumably that wastes power.
      
      B) We could keep a list of "struct clk" objects to enable at suspend
      time in clk-rk3288.c and use the standard clock APIs.
      
      C) We could make the rk3288-pmu driver keep a list of clocks to enable
      at suspend time.  Presumably this would require a dts and bindings
      change.
      
      D) We could just whack the clock on in the existing syscore suspend
      function where we whack a bunch of other clocks.  This is particularly
      easy because we know for sure that the clock's only parent
      ("aclk_cpu") is a critical clock so we don't need to do anything more
      than ungate it.
      
      In this case I have chosen D) because it seemed like the least work,
      but any of the other options would presumably also work fine.
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Reviewed-by: default avatarElaine Zhang <zhangqing@rock-chips.com>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4fce6bfe
    • Nathan Chancellor's avatar
      soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher · bd9704bc
      Nathan Chancellor authored
      [ Upstream commit 89e28da8 ]
      
      When building with -Wsometimes-uninitialized, Clang warns:
      
      drivers/soc/mediatek/mtk-pmic-wrap.c:1358:6: error: variable 'rdata' is
      used uninitialized whenever '||' condition is true
      [-Werror,-Wsometimes-uninitialized]
      
      If pwrap_write returns non-zero, pwrap_read will not be called to
      initialize rdata, meaning that we will use some random uninitialized
      stack value in our print statement. Zero initialize rdata in case this
      happens.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/401Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarMatthias Brugger <matthias.bgg@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bd9704bc
    • Kishon Vijay Abraham I's avatar
      PCI: keystone: Prevent ARM32 specific code to be compiled for ARM64 · 02cb726b
      Kishon Vijay Abraham I authored
      [ Upstream commit f316a2b5 ]
      
      hook_fault_code() is an ARM32 specific API for hooking into data abort.
      
      AM65X platforms (that integrate ARM v8 cores and select CONFIG_ARM64 as
      arch) rely on pci-keystone.c but on them the enumeration of a
      non-present BDF does not trigger a bus error, so the fixup exception
      provided by calling hook_fault_code() is not needed and can be guarded
      with CONFIG_ARM.
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      [lorenzo.pieralisi@arm.com: commit log]
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      02cb726b
    • Kishon Vijay Abraham I's avatar
      PCI: keystone: Invoke phy_reset() API before enabling PHY · e178094a
      Kishon Vijay Abraham I authored
      [ Upstream commit b22af42b ]
      
      SERDES connected to the PCIe controller in AM654 requires
      power on reset enable (POR_EN) to be set in the SERDES. The
      SERDES driver sets POR_EN in the reset ops and it has to be
      invoked before init or enable ops. In order for SERDES driver
      to set POR_EN, invoke the phy_reset() API in pci-keystone driver.
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e178094a
    • Enrico Granata's avatar
      platform/chrome: cros_ec_proto: check for NULL transfer function · 1f9dc1d5
      Enrico Granata authored
      [ Upstream commit 94d4e7af ]
      
      As new transfer mechanisms are added to the EC codebase, they may
      not support v2 of the EC protocol.
      
      If the v3 initial handshake transfer fails, the kernel will try
      and call cmd_xfer as a fallback. If v2 is not supported, cmd_xfer
      will be NULL, and the code will end up causing a kernel panic.
      
      Add a check for NULL before calling the transfer function, along
      with a helpful comment explaining how one might end up in this
      situation.
      Signed-off-by: default avatarEnrico Granata <egranata@chromium.org>
      Reviewed-by: default avatarJett Rink <jettrink@chromium.org>
      Signed-off-by: default avatarEnric Balletbo i Serra <enric.balletbo@collabora.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1f9dc1d5
    • Tony Lindgren's avatar
      power: supply: cpcap-battery: Fix signed counter sample register · bec4c3b3
      Tony Lindgren authored
      [ Upstream commit c68b901a ]
      
      The accumulator sample register is signed 32-bits wide register on
      droid 4. And only the earlier version of cpcap has a signed 24-bits
      wide register. We're currently passing it around as unsigned, so
      let's fix that and use sign_extend32() for the earlier revision.
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarPavel Machek <pavel@ucw.cz>
      Signed-off-by: default avatarSebastian Reichel <sebastian.reichel@collabora.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bec4c3b3
    • Adam Ludkiewicz's avatar
      i40e: Queues are reserved despite "Invalid argument" error · 6ace9bde
      Adam Ludkiewicz authored
      [ Upstream commit 3e957b37 ]
      
      Added a new local variable in the i40e_setup_tc function named
      old_queue_pairs so num_queue_pairs can be restored to the correct
      value in case configuring queue channels fails. Additionally, moved
      the exit label in the i40e_setup_tc function so the if (need_reset)
      block can be executed.
      Also, fixed data packing in the i40e_setup_tc function.
      Signed-off-by: default avatarAdam Ludkiewicz <adam.ludkiewicz@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6ace9bde
    • Jon Hunter's avatar
      soc/tegra: pmc: Remove reset sysfs entries on error · ee3798fc
      Jon Hunter authored
      [ Upstream commit a46b51cd ]
      
      Commit 5f84bb1a ("soc/tegra: pmc: Add sysfs entries for reset info")
      added sysfs entries for Tegra reset source and level. However, these
      sysfs are not removed on error and so if the registering of PMC device
      is probe deferred, then the next time we attempt to probe the PMC device
      warnings such as the following will be displayed on boot ...
      
        sysfs: cannot create duplicate filename '/devices/platform/7000e400.pmc/reset_reason'
      
      Fix this by calling device_remove_file() for each sysfs entry added on
      failure. Note that we call device_remove_file() unconditionally without
      checking if the sysfs entry was created in the first place, but this
      should be OK because kernfs_remove_by_name_ns() will fail silently.
      
      Fixes: 5f84bb1a ("soc/tegra: pmc: Add sysfs entries for reset info")
      Signed-off-by: default avatarJon Hunter <jonathanh@nvidia.com>
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ee3798fc
    • Wenwen Wang's avatar
      x86/PCI: Fix PCI IRQ routing table memory leak · 47de03be
      Wenwen Wang authored
      [ Upstream commit ea094d53 ]
      
      In pcibios_irq_init(), the PCI IRQ routing table 'pirq_table' is first
      found through pirq_find_routing_table().  If the table is not found and
      CONFIG_PCI_BIOS is defined, the table is then allocated in
      pcibios_get_irq_routing_table() using kmalloc().  Later, if the I/O APIC is
      used, this table is actually not used.  In that case, the allocated table
      is not freed, which is a memory leak.
      
      Free the allocated table if it is not used.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      [bhelgaas: added Ingo's reviewed-by, since the only change since v1 was to
      use the irq_routing_table local variable name he suggested]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      47de03be
    • Mika Westerberg's avatar
      net: thunderbolt: Unregister ThunderboltIP protocol handler when suspending · 18d4decc
      Mika Westerberg authored
      [ Upstream commit 9872760e ]
      
      The XDomain protocol messages may start as soon as Thunderbolt control
      channel is started. This means that if the other host starts sending
      ThunderboltIP packets early enough they will be passed to the network
      driver which then gets confused because its resume hook is not called
      yet.
      
      Fix this by unregistering the ThunderboltIP protocol handler when
      suspending and registering it back on resume.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      18d4decc
    • Wesley Sheng's avatar
      switchtec: Fix unintended mask of MRPC event · e86a0475
      Wesley Sheng authored
      [ Upstream commit 083c1b5e ]
      
      When running application tool switchtec-user's `firmware update` and `event
      wait` commands concurrently, sometimes the firmware update speed reduced
      significantly.
      
      It is because when the MRPC event happened after MRPC event occurrence
      check but before the event mask loop reaches its header register in event
      ISR, the MRPC event would be masked unintentionally.  Since there's no
      chance to enable it again except for a module reload, all the following
      MRPC execution completion checks time out.
      
      Fix this bug by skipping the mask operation for MRPC event in event ISR,
      same as what we already do for LINK event.
      
      Fixes: 52eabba5 ("switchtec: Add IOCTLs to the Switchtec driver")
      Signed-off-by: default avatarWesley Sheng <wesley.sheng@microchip.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e86a0475
    • Will Deacon's avatar
      iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel · ddfffa59
      Will Deacon authored
      [ Upstream commit 3f54c447 ]
      
      Disabling the SMMU when probing from within a kdump kernel so that all
      incoming transactions are terminated can prevent the core of the crashed
      kernel from being transferred off the machine if all I/O devices are
      behind the SMMU.
      
      Instead, continue to probe the SMMU after it is disabled so that we can
      reinitialise it entirely and re-attach the DMA masters as they are reset.
      Since the kdump kernel may not have drivers for all of the active DMA
      masters, we suppress fault reporting to avoid spamming the console and
      swamping the IRQ threads.
      Reported-by: default avatar"Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
      Tested-by: default avatar"Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
      Tested-by: default avatarBhupesh Sharma <bhsharma@redhat.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ddfffa59
    • Farhan Ali's avatar
      vfio: Fix WARNING "do not call blocking ops when !TASK_RUNNING" · 6a3bde70
      Farhan Ali authored
      [ Upstream commit 41be3e26 ]
      
      vfio_dev_present() which is the condition to
      wait_event_interruptible_timeout(), will call vfio_group_get_device
      and try to acquire the mutex group->device_lock.
      
      wait_event_interruptible_timeout() will set the state of the current
      task to TASK_INTERRUPTIBLE, before doing the condition check. This
      means that we will try to acquire the mutex while already in a
      sleeping state. The scheduler warns us by giving the following
      warning:
      
      [ 4050.264464] ------------[ cut here ]------------
      [ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188
      [ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90
      ....
      
       4050.264756] Call Trace:
      [ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90)
      [ 4050.264774]  [<0000000000b97edc>] __mutex_lock+0x44/0x8c0
      [ 4050.264782]  [<0000000000b9878a>] mutex_lock_nested+0x32/0x40
      [ 4050.264793]  [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio]
      [ 4050.264803]  [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio]
      [ 4050.264813]  [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev]
      [ 4050.264825]  [<00000000008e01b0>] device_release_driver_internal+0x168/0x268
      [ 4050.264834]  [<00000000008de692>] bus_remove_device+0x162/0x190
      [ 4050.264843]  [<00000000008daf42>] device_del+0x1e2/0x368
      [ 4050.264851]  [<00000000008db12c>] device_unregister+0x64/0x88
      [ 4050.264862]  [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev]
      [ 4050.264872]  [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev]
      [ 4050.264881]  [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8
      [ 4050.264890]  [<00000000003c1530>] __vfs_write+0x38/0x1a8
      [ 4050.264899]  [<00000000003c187c>] vfs_write+0xb4/0x198
      [ 4050.264908]  [<00000000003c1af2>] ksys_write+0x5a/0xb0
      [ 4050.264916]  [<0000000000b9e270>] system_call+0xdc/0x2d8
      [ 4050.264925] 4 locks held by sh/35924:
      [ 4050.264933]  #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198
      [ 4050.264948]  #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8
      [ 4050.264963]  #2: 0000000034831ab8 (kn->count#297){++++}, at: kernfs_remove_self+0x12e/0x150
      [ 4050.264979]  #3: 00000000e152484f (&dev->mutex){....}, at: device_release_driver_internal+0x5c/0x268
      [ 4050.264993] Last Breaking-Event-Address:
      [ 4050.265002]  [<000000000017bbaa>] __might_sleep+0x72/0x90
      [ 4050.265010] irq event stamp: 7039
      [ 4050.265020] hardirqs last  enabled at (7047): [<00000000001cee7a>] console_unlock+0x6d2/0x740
      [ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>] console_unlock+0xd6/0x740
      [ 4050.265040] softirqs last  enabled at (6416): [<0000000000b8fe26>] __udelay+0xb6/0x100
      [ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>] __udelay+0x96/0x100
      [ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]---
      
      Let's fix this as described in the article
      https://lwn.net/Articles/628628/.
      Signed-off-by: default avatarFarhan Ali <alifm@linux.ibm.com>
      [remove now redundant vfio_dev_present()]
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6a3bde70
    • Arnd Bergmann's avatar
      nfsd: avoid uninitialized variable warning · 00cf52ed
      Arnd Bergmann authored
      [ Upstream commit 0ab88ca4 ]
      
      clang warns that 'contextlen' may be accessed without an initialization:
      
      fs/nfsd/nfs4xdr.c:2911:9: error: variable 'contextlen' is uninitialized when used here [-Werror,-Wuninitialized]
                                                                      contextlen);
                                                                      ^~~~~~~~~~
      fs/nfsd/nfs4xdr.c:2424:16: note: initialize the variable 'contextlen' to silence this warning
              int contextlen;
                            ^
                             = 0
      
      Presumably this cannot happen, as FATTR4_WORD2_SECURITY_LABEL is
      set if CONFIG_NFSD_V4_SECURITY_LABEL is enabled.
      Adding another #ifdef like the other two in this function
      avoids the warning.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      00cf52ed
    • J. Bruce Fields's avatar
      nfsd: allow fh_want_write to be called twice · 82ade407
      J. Bruce Fields authored
      [ Upstream commit 0b8f6262 ]
      
      A fuzzer recently triggered lockdep warnings about potential sb_writers
      deadlocks caused by fh_want_write().
      
      Looks like we aren't careful to pair each fh_want_write() with an
      fh_drop_write().
      
      It's not normally a problem since fh_put() will call fh_drop_write() for
      us.  And was OK for NFSv3 where we'd do one operation that might call
      fh_want_write(), and then put the filehandle.
      
      But an NFSv4 protocol fuzzer can do weird things like call unlink twice
      in a compound, and then we get into trouble.
      
      I'm a little worried about this approach of just leaving everything to
      fh_put().  But I think there are probably a lot of
      fh_want_write()/fh_drop_write() imbalances so for now I think we need it
      to be more forgiving.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      82ade407
    • Kirill Smelkov's avatar
      fuse: retrieve: cap requested size to negotiated max_write · a5ee124e
      Kirill Smelkov authored
      [ Upstream commit 7640682e ]
      
      FUSE filesystem server and kernel client negotiate during initialization
      phase, what should be the maximum write size the client will ever issue.
      Correspondingly the filesystem server then queues sys_read calls to read
      requests with buffer capacity large enough to carry request header + that
      max_write bytes. A filesystem server is free to set its max_write in
      anywhere in the range between [1*page, fc->max_pages*page]. In particular
      go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages
      corresponds to 128K. Libfuse also allows users to configure max_write, but
      by default presets it to possible maximum.
      
      If max_write is < fc->max_pages*page, and in NOTIFY_RETRIEVE handler we
      allow to retrieve more than max_write bytes, corresponding prepared
      NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
      filesystem server, in full correspondence with server/client contract, will
      be only queuing sys_read with ~max_write buffer capacity, and
      fuse_dev_do_read throws away requests that cannot fit into server request
      buffer. In turn the filesystem server could get stuck waiting indefinitely
      for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is
      understood by clients as that NOTIFY_REPLY was queued and will be sent
      back.
      
      Cap requested size to negotiate max_write to avoid the problem.  This
      aligns with the way NOTIFY_RETRIEVE handler works, which already
      unconditionally caps requested retrieve size to fuse_conn->max_pages.  This
      way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than
      was originally requested.
      
      Please see [1] for context where the problem of stuck filesystem was hit
      for real, how the situation was traced and for more involving patch that
      did not make it into the tree.
      
      [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
      [2] https://github.com/hanwen/go-fuseSigned-off-by: Kirill Smelkov's avatarKirill Smelkov <kirr@nexedi.com>
      Cc: Han-Wen Nienhuys <hanwen@google.com>
      Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a5ee124e
    • Chen-Yu Tsai's avatar
      nvmem: sunxi_sid: Support SID on A83T and H5 · 4c55b1f8
      Chen-Yu Tsai authored
      [ Upstream commit da75b890 ]
      
      The device tree binding already lists compatible strings for these two
      SoCs. They don't have the defect as seen on the H3, and the size and
      register layout is the same as the A64. Furthermore, the driver does
      not include nvmem cell definitions.
      
      Add support for these two compatible strings, re-using the config for
      the A64.
      Signed-off-by: default avatarChen-Yu Tsai <wens@csie.org>
      Acked-by: default avatarMaxime Ripard <maxime.ripard@bootlin.com>
      Signed-off-by: default avatarSrinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4c55b1f8
    • Jorge Ramirez-Ortiz's avatar
      nvmem: core: fix read buffer in place · f6fdf2d1
      Jorge Ramirez-Ortiz authored
      [ Upstream commit 2fe518fe ]
      
      When the bit_offset in the cell is zero, the pointer to the msb will
      not be properly initialized (ie, will still be pointing to the first
      byte in the buffer).
      
      This being the case, if there are bits to clear in the msb, those will
      be left untouched while the mask will incorrectly clear bit positions
      on the first byte.
      
      This commit also makes sure that any byte unused in the cell is
      cleared.
      Signed-off-by: default avatarJorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
      Signed-off-by: default avatarSrinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f6fdf2d1
    • Lu Baolu's avatar
      iommu/vt-d: Don't request page request irq under dmar_global_lock · 2a5fd4fa
      Lu Baolu authored
      [ Upstream commit a7755c3c ]
      
      Requesting page reqest irq under dmar_global_lock could cause
      potential lock race condition (caught by lockdep).
      
      [    4.100055] ======================================================
      [    4.100063] WARNING: possible circular locking dependency detected
      [    4.100072] 5.1.0-rc4+ #2169 Not tainted
      [    4.100078] ------------------------------------------------------
      [    4.100086] swapper/0/1 is trying to acquire lock:
      [    4.100094] 000000007dcbe3c3 (dmar_lock){+.+.}, at: dmar_alloc_hwirq+0x35/0x140
      [    4.100112] but task is already holding lock:
      [    4.100120] 0000000060bbe946 (dmar_global_lock){++++}, at: intel_iommu_init+0x191/0x1438
      [    4.100136] which lock already depends on the new lock.
      [    4.100146] the existing dependency chain (in reverse order) is:
      [    4.100155]
                     -> #2 (dmar_global_lock){++++}:
      [    4.100169]        down_read+0x44/0xa0
      [    4.100178]        intel_irq_remapping_alloc+0xb2/0x7b0
      [    4.100186]        mp_irqdomain_alloc+0x9e/0x2e0
      [    4.100195]        __irq_domain_alloc_irqs+0x131/0x330
      [    4.100203]        alloc_isa_irq_from_domain.isra.4+0x9a/0xd0
      [    4.100212]        mp_map_pin_to_irq+0x244/0x310
      [    4.100221]        setup_IO_APIC+0x757/0x7ed
      [    4.100229]        x86_late_time_init+0x17/0x1c
      [    4.100238]        start_kernel+0x425/0x4e3
      [    4.100247]        secondary_startup_64+0xa4/0xb0
      [    4.100254]
                     -> #1 (irq_domain_mutex){+.+.}:
      [    4.100265]        __mutex_lock+0x7f/0x9d0
      [    4.100273]        __irq_domain_add+0x195/0x2b0
      [    4.100280]        irq_domain_create_hierarchy+0x3d/0x40
      [    4.100289]        msi_create_irq_domain+0x32/0x110
      [    4.100297]        dmar_alloc_hwirq+0x111/0x140
      [    4.100305]        dmar_set_interrupt.part.14+0x1a/0x70
      [    4.100314]        enable_drhd_fault_handling+0x2c/0x6c
      [    4.100323]        apic_bsp_setup+0x75/0x7a
      [    4.100330]        x86_late_time_init+0x17/0x1c
      [    4.100338]        start_kernel+0x425/0x4e3
      [    4.100346]        secondary_startup_64+0xa4/0xb0
      [    4.100352]
                     -> #0 (dmar_lock){+.+.}:
      [    4.100364]        lock_acquire+0xb4/0x1c0
      [    4.100372]        __mutex_lock+0x7f/0x9d0
      [    4.100379]        dmar_alloc_hwirq+0x35/0x140
      [    4.100389]        intel_svm_enable_prq+0x61/0x180
      [    4.100397]        intel_iommu_init+0x1128/0x1438
      [    4.100406]        pci_iommu_init+0x16/0x3f
      [    4.100414]        do_one_initcall+0x5d/0x2be
      [    4.100422]        kernel_init_freeable+0x1f0/0x27c
      [    4.100431]        kernel_init+0xa/0x110
      [    4.100438]        ret_from_fork+0x3a/0x50
      [    4.100444]
                     other info that might help us debug this:
      
      [    4.100454] Chain exists of:
                       dmar_lock --> irq_domain_mutex --> dmar_global_lock
      [    4.100469]  Possible unsafe locking scenario:
      
      [    4.100476]        CPU0                    CPU1
      [    4.100483]        ----                    ----
      [    4.100488]   lock(dmar_global_lock);
      [    4.100495]                                lock(irq_domain_mutex);
      [    4.100503]                                lock(dmar_global_lock);
      [    4.100512]   lock(dmar_lock);
      [    4.100518]
                      *** DEADLOCK ***
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Reported-by: default avatarDave Jiang <dave.jiang@intel.com>
      Fixes: a222a7f0 ("iommu/vt-d: Implement page request handling")
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2a5fd4fa
    • Valentin Schneider's avatar
      arm64: defconfig: Update UFSHCD for Hi3660 soc · 3b837ba0
      Valentin Schneider authored
      [ Upstream commit 7b3320e6 ]
      
      Commit 7ee7ef24 ("scsi: arm64: defconfig: enable configs for Hisilicon ufs")
      set 'CONFIG_SCSI_UFS_HISI=y', but the configs it depends
      on
      
        (CONFIG_SCSI_HFSHCD_PLATFORM && CONFIG_SCSI_UFSHCD)
      
      were left to being built as modules.
      
      Commit 1f4fa50d ("arm64: defconfig: Regenerate for v4.20") "fixed"
      that by reverting to 'CONFIG_SCSI_UFS_HISI=m'.
      
      Thing is, if the rootfs is stored in the on-board flash (which
      is the "canonical" way of doing things), we either need these drivers
      to be built-in, or we need to fiddle with an initramfs to access that
      flash and eventually load the modules installed over there.
      
      The former is the easiest, do that.
      Signed-off-by: default avatarValentin Schneider <valentin.schneider@arm.com>
      Reviewed-by: default avatarLeo Yan <leo.yan@linaro.org>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3b837ba0
    • Nathan Fontenot's avatar
      powerpc/pseries: Track LMB nid instead of using device tree · 7bd4b91a
      Nathan Fontenot authored
      [ Upstream commit b2d3b5ee ]
      
      When removing memory we need to remove the memory from the node
      it was added to instead of looking up the node it should be in
      in the device tree.
      
      During testing we have seen scenarios where the affinity for a
      LMB changes due to a partition migration or PRRN event. In these
      cases the node the LMB exists in may not match the node the device
      tree indicates it belongs in. This can lead to a system crash
      when trying to DLPAR remove the LMB after a migration or PRRN
      event. The current code looks up the node in the device tree to
      remove the LMB from, the crash occurs when we try to offline this
      node and it does not have any data, i.e. node_data[nid] == NULL.
      
      36:mon> e
      cpu 0x36: Vector: 300 (Data Access) at [c0000001828b7810]
          pc: c00000000036d08c: try_offline_node+0x2c/0x1b0
          lr: c0000000003a14ec: remove_memory+0xbc/0x110
          sp: c0000001828b7a90
         msr: 800000000280b033
         dar: 9a28
       dsisr: 40000000
        current = 0xc0000006329c4c80
        paca    = 0xc000000007a55200   softe: 0        irq_happened: 0x01
          pid   = 76926, comm = kworker/u320:3
      
      36:mon> t
      [link register   ] c0000000003a14ec remove_memory+0xbc/0x110
      [c0000001828b7a90] c00000000006a1cc arch_remove_memory+0x9c/0xd0 (unreliable)
      [c0000001828b7ad0] c0000000003a14e0 remove_memory+0xb0/0x110
      [c0000001828b7b20] c0000000000c7db4 dlpar_remove_lmb+0x94/0x160
      [c0000001828b7b60] c0000000000c8ef8 dlpar_memory+0x7e8/0xd10
      [c0000001828b7bf0] c0000000000bf828 handle_dlpar_errorlog+0xf8/0x160
      [c0000001828b7c60] c0000000000bf8cc pseries_hp_work_fn+0x3c/0xa0
      [c0000001828b7c90] c000000000128cd8 process_one_work+0x298/0x5a0
      [c0000001828b7d20] c000000000129068 worker_thread+0x88/0x620
      [c0000001828b7dc0] c00000000013223c kthread+0x1ac/0x1c0
      [c0000001828b7e30] c00000000000b45c ret_from_kernel_thread+0x5c/0x80
      
      To resolve this we need to track the node a LMB belongs to when
      it is added to the system so we can remove it from that node instead
      of the node that the device tree indicates it should belong to.
      Signed-off-by: default avatarNathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7bd4b91a
    • Takashi Iwai's avatar
      ALSA: hda - Register irq handler after the chip initialization · 996fffea
      Takashi Iwai authored
      [ Upstream commit f495222e ]
      
      Currently the IRQ handler in HD-audio controller driver is registered
      before the chip initialization.  That is, we have some window opened
      between the azx_acquire_irq() call and the CORB/RIRB setup.  If an
      interrupt is triggered in this small window, the IRQ handler may
      access to the uninitialized RIRB buffer, which leads to a NULL
      dereference Oops.
      
      This is usually no big problem since most of Intel chips do register
      the IRQ via MSI, and we've already fixed the order of the IRQ
      enablement and the CORB/RIRB setup in the former commit b61749a8
      ("sound: enable interrupt after dma buffer initialization"), hence the
      IRQ won't be triggered in that room.  However, some platforms use a
      shared IRQ, and this may allow the IRQ trigger by another source.
      
      Another possibility is the kdump environment: a stale interrupt might
      be present in there, the IRQ handler can be falsely triggered as well.
      
      For covering this small race, let's move the azx_acquire_irq() call
      after hda_intel_init_chip() call.  Although this is a bit radical
      change, it can cover more widely than checking the CORB/RIRB setup
      locally in the callee side.
      Reported-by: default avatarLiwei Song <liwei.song@windriver.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      996fffea
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: fix netdev refcnt leak · 11ae693f
      Taehee Yoo authored
      [ Upstream commit 26a302af ]
      
      flow_offload_alloc() calls nf_route() to get a dst_entry. Internally,
      nf_route() calls ip_route_output_key() that allocates a dst_entry and
      holds it. So, a dst_entry should be released by dst_release() if
      nf_route() is successful.
      
      Otherwise, netns exit routine cannot be finished and the following
      message is printed:
      
      [  257.490952] unregister_netdevice: waiting for lo to become free. Usage count = 1
      
      Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      11ae693f
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: check ttl value in flow offload data path · c155f374
      Taehee Yoo authored
      [ Upstream commit 33cc3c0c ]
      
      nf_flow_offload_ip_hook() and nf_flow_offload_ipv6_hook() do not check
      ttl value. So, ttl value overflow may occur.
      
      Fixes: 97add9f0 ("netfilter: flow table support for IPv4")
      Fixes: 09952107 ("netfilter: flow table support for IPv6")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c155f374
    • Keith Busch's avatar
      nvme-pci: shutdown on timeout during deletion · c6508f86
      Keith Busch authored
      [ Upstream commit 9dc1a38e ]
      
      We do not restart a controller in a deleting state for timeout errors.
      When in this state, unblock potential request dispatchers with failed
      completions by shutting down the controller on timeout detection.
      Reported-by: default avatarYufen Yu <yuyufen@huawei.com>
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c6508f86
    • Keith Busch's avatar
      nvme-pci: unquiesce admin queue on shutdown · ca6bcc03
      Keith Busch authored
      [ Upstream commit c8e9e9b7 ]
      
      Just like IO queues, the admin queue also will not be restarted after a
      controller shutdown. Unquiesce this queue so that we do not block
      request dispatch on a permanently disabled controller.
      Reported-by: default avatarYufen Yu <yuyufen@huawei.com>
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ca6bcc03
    • Kishon Vijay Abraham I's avatar
      PCI: designware-ep: Use aligned ATU window for raising MSI interrupts · d6f90fae
      Kishon Vijay Abraham I authored
      [ Upstream commit 6b733030 ]
      
      Certain platforms like K2G reguires the outbound ATU window to be
      aligned. The alignment size is already present in mem->page_size.
      Use the alignment size present in mem->page_size to configure an
      aligned ATU window. In order to raise an interrupt, CPU has to write
      to address offset from the start of the window unlike before where
      writes were always to the beginning of the ATU window.
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d6f90fae
    • Kishon Vijay Abraham I's avatar
      misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test · 449b9fd4
      Kishon Vijay Abraham I authored
      [ Upstream commit 8f220664 ]
      
      commit 834b9051 ("misc: pci_endpoint_test: Add support for
      PCI_ENDPOINT_TEST regs to be mapped to any BAR") while adding
      test_reg_bar in order to map PCI_ENDPOINT_TEST regs to be mapped to any
      BAR failed to update test_reg_bar in pci_endpoint_test, resulting in
      test_reg_bar having invalid value when used outside probe.
      
      Fix it.
      
      Fixes: 834b9051 ("misc: pci_endpoint_test: Add support for PCI_ENDPOINT_TEST regs to be mapped to any BAR")
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      449b9fd4
    • Greg Kurz's avatar
      vfio-pci/nvlink2: Fix potential VMA leak · dcaa5e20
      Greg Kurz authored
      [ Upstream commit 2c85f2bd ]
      
      If vfio_pci_register_dev_region() fails then we should rollback
      previous changes, ie. unmap the ATSD registers.
      
      Fixes: 7f928917 ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
      Signed-off-by: default avatarGreg Kurz <groug@kaod.org>
      Reviewed-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dcaa5e20
    • Lu Baolu's avatar
      iommu/vt-d: Set intel_iommu_gfx_mapped correctly · 47e80978
      Lu Baolu authored
      [ Upstream commit cf1ec453 ]
      
      The intel_iommu_gfx_mapped flag is exported by the Intel
      IOMMU driver to indicate whether an IOMMU is used for the
      graphic device. In a virtualized IOMMU environment (e.g.
      QEMU), an include-all IOMMU is used for graphic device.
      This flag is found to be clear even the IOMMU is used.
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Reported-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      Fixes: c0771df8 ("intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.")
      Suggested-by: default avatarKevin Tian <kevin.tian@intel.com>
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      47e80978
    • Ming Lei's avatar
      blk-mq: move cancel of requeue_work into blk_mq_release · 9e2c1b3d
      Ming Lei authored
      [ Upstream commit fbc2a15e ]
      
      With holding queue's kobject refcount, it is safe for driver
      to schedule requeue. However, blk_mq_kick_requeue_list() may
      be called after blk_sync_queue() is done because of concurrent
      requeue activities, then requeue work may not be completed when
      freeing queue, and kernel oops is triggered.
      
      So moving the cancel of requeue_work into blk_mq_release() for
      avoiding race between requeue and freeing queue.
      
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: James Smart <james.smart@broadcom.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: linux-scsi@vger.kernel.org,
      Cc: Martin K . Petersen <martin.petersen@oracle.com>,
      Cc: Christoph Hellwig <hch@lst.de>,
      Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>,
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarJames Smart <james.smart@broadcom.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9e2c1b3d
    • Vladimir Zapolskiy's avatar
      watchdog: fix compile time error of pretimeout governors · 1b6349c8
      Vladimir Zapolskiy authored
      [ Upstream commit a223770b ]
      
      CONFIG_WATCHDOG_PRETIMEOUT_GOV build symbol adds watchdog_pretimeout.o
      object to watchdog.o, the latter is compiled only if CONFIG_WATCHDOG_CORE
      is selected, so it rightfully makes sense to add it as a dependency.
      
      The change fixes the next compilation errors, if CONFIG_WATCHDOG_CORE=n
      and CONFIG_WATCHDOG_PRETIMEOUT_GOV=y are selected:
      
        drivers/watchdog/pretimeout_noop.o: In function `watchdog_gov_noop_register':
        drivers/watchdog/pretimeout_noop.c:35: undefined reference to `watchdog_register_governor'
        drivers/watchdog/pretimeout_noop.o: In function `watchdog_gov_noop_unregister':
        drivers/watchdog/pretimeout_noop.c:40: undefined reference to `watchdog_unregister_governor'
      
        drivers/watchdog/pretimeout_panic.o: In function `watchdog_gov_panic_register':
        drivers/watchdog/pretimeout_panic.c:35: undefined reference to `watchdog_register_governor'
        drivers/watchdog/pretimeout_panic.o: In function `watchdog_gov_panic_unregister':
        drivers/watchdog/pretimeout_panic.c:40: undefined reference to `watchdog_unregister_governor'
      Reported-by: default avatarKuo, Hsuan-Chi <hckuo2@illinois.edu>
      Fixes: ff84136c ("watchdog: add watchdog pretimeout governor framework")
      Signed-off-by: default avatarVladimir Zapolskiy <vz@mleia.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@linux-watchdog.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1b6349c8
    • Georg Hofmann's avatar
      watchdog: imx2_wdt: Fix set_timeout for big timeout values · 81a0eaaa
      Georg Hofmann authored
      [ Upstream commit b07e228e ]
      
      The documentated behavior is: if max_hw_heartbeat_ms is implemented, the
      minimum of the set_timeout argument and max_hw_heartbeat_ms should be used.
      This patch implements this behavior.
      Previously only the first 7bits were used and the input argument was
      returned.
      Signed-off-by: default avatarGeorg Hofmann <georg@hofmannsweb.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@linux-watchdog.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      81a0eaaa
    • Florian Westphal's avatar
      netfilter: nf_tables: fix base chain stat rcu_dereference usage · 1f468942
      Florian Westphal authored
      [ Upstream commit edbd82c5 ]
      
      Following splat gets triggered when nfnetlink monitor is running while
      xtables-nft selftests are running:
      
      net/netfilter/nf_tables_api.c:1272 suspicious rcu_dereference_check() usage!
      other info that might help us debug this:
      
      1 lock held by xtables-nft-mul/27006:
       #0: 00000000e0f85be9 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1a/0x50
      Call Trace:
       nf_tables_fill_chain_info.isra.45+0x6cc/0x6e0
       nf_tables_chain_notify+0xf8/0x1a0
       nf_tables_commit+0x165c/0x1740
      
      nf_tables_fill_chain_info() can be called both from dumps (rcu read locked)
      or from the transaction path if a userspace process subscribed to nftables
      notifications.
      
      In the 'table dump' case, rcu_access_pointer() cannot be used: We do not
      hold transaction mutex so the pointer can be NULLed right after the check.
      Just unconditionally fetch the value, then have the helper return
      immediately if its NULL.
      
      In the notification case we don't hold the rcu read lock, but updates are
      prevented due to transaction mutex. Use rcu_dereference_check() to make lockdep
      aware of this.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1f468942
    • Serge Semin's avatar
      mips: Make sure dt memory regions are valid · b0e7ae14
      Serge Semin authored
      [ Upstream commit 93fa5b28 ]
      
      There are situations when memory regions coming from dts may be
      too big for the platform physical address space. This especially
      concerns XPA-capable systems. Bootloader may determine more than 4GB
      memory available and pass it to the kernel over dts memory node, while
      kernel is built without XPA/64BIT support. In this case the region
      may either simply be truncated by add_memory_region() method
      or by u64->phys_addr_t type casting. But in worst case the method
      can even drop the memory region if it exceeds PHYS_ADDR_MAX size.
      So lets make sure the retrieved from dts memory regions are valid,
      and if some of them aren't, just manually truncate them with a warning
      printed out.
      Signed-off-by: default avatarSerge Semin <fancer.lancer@gmail.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Serge Semin <Sergey.Semin@t-platforms.ru>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b0e7ae14
    • Jakub Jankowski's avatar
      netfilter: nf_conntrack_h323: restore boundary check correctness · 58c8298c
      Jakub Jankowski authored
      [ Upstream commit f5e85ce8 ]
      
      Since commit bc7d811a ("netfilter: nf_ct_h323: Convert
      CHECK_BOUND macro to function"), NAT traversal for H.323
      doesn't work, failing to parse H323-UserInformation.
      nf_h323_error_boundary() compares contents of the bitstring,
      not the addresses, preventing valid H.323 packets from being
      conntrack'd.
      
      This looks like an oversight from when CHECK_BOUND macro was
      converted to a function.
      
      To fix it, stop dereferencing bs->cur and bs->end.
      
      Fixes: bc7d811a ("netfilter: nf_ct_h323: Convert CHECK_BOUND macro to function")
      Signed-off-by: default avatarJakub Jankowski <shasta@toxcorp.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      58c8298c
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast · d34b034d
      Taehee Yoo authored
      [ Upstream commit 43c8f131 ]
      
      rhashtable_insert_fast() may return an error value when memory
      allocation fails, but flow_offload_add() does not check for errors.
      This patch just adds missing error checking.
      
      Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d34b034d