1. 31 Jul, 2018 6 commits
  2. 30 Jul, 2018 8 commits
  3. 26 Jul, 2018 17 commits
  4. 25 Jul, 2018 9 commits
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 6e77b267
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "One more round of updates for problems seen this -rc series. Drivers
        fixes are:
      
         - Amlogic Meson audio divider fix and CPU clk critical marking
      
         - Qualcomm multimedia GDSC marked as 'always on' to keep display
           working
      
         - Aspeed fixes for critical clks, resets causing clks to stay
           disabled, and an incorrect HPLL frequency calculation
      
         - Marvell Armada 3700 cpu clks would undervolt when switching from
           low frequencies to high frequencies because the voltage didn't
           stabilize in time so now we switch to an intermediate frequency
      
        Plus we have a core framework thinko that messed up the debugfs flag
        printing logic to make it not very useful"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: aspeed: Support HPLL strapping on ast2400
        clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz
        clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical
        clk/mmcc-msm8996: Make mmagic_bimc_gdsc ALWAYS_ON
        clk: aspeed: Treat a gate in reset as disabled
        clk: Really show symbolic clock flags in debugfs
        clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock
        clk: meson: audio-divider is one based
        clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
      6e77b267
    • Linus Torvalds's avatar
      Merge tag 'fscache-fixes-20180725' of... · 5c61ef1b
      Linus Torvalds authored
      Merge tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      Pull fscache/cachefiles fixes from David Howells:
      
       - Allow cancelled operations to be queued so they can be cleaned up.
      
       - Fix a refcounting bug in the monitoring of reads on backend files
         whereby a race can occur between monitor objects being listed for
         work, the work processing being queued and the work processor running
         and destroying the monitor objects.
      
       - Fix a ref overput in object attachment, whereby a tentatively
         considered object is put in error handling without first being 'got'.
      
       - Fix a missing clear of the CACHEFILES_OBJECT_ACTIVE flag whereby an
         assertion occurs when we retry because it seems the object is now
         active.
      
       - Wait rather BUG'ing on an object collision in the depths of
         cachefiles as the active object should be being cleaned up - also
         depends on the one above.
      
      * tag 'fscache-fixes-20180725' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        cachefiles: Wait rather than BUG'ing on "Unexpected object collision"
        cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
        fscache: Fix reference overput in fscache_attach_object() error handling
        cachefiles: Fix refcounting bug in backing-file read monitoring
        fscache: Allow cancelled operations to be enqueued
      5c61ef1b
    • Kiran Kumar Modukuri's avatar
      cachefiles: Wait rather than BUG'ing on "Unexpected object collision" · c2412ac4
      Kiran Kumar Modukuri authored
      If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
      the active object tree, we have been emitting a BUG after logging
      information about it and the new object.
      
      Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
      on the old object (or return an error).  The ACTIVE flag should be cleared
      after it has been removed from the active object tree.  A timeout of 60s is
      used in the wait, so we shouldn't be able to get stuck there.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      c2412ac4
    • Kiran Kumar Modukuri's avatar
      cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag · 5ce83d4b
      Kiran Kumar Modukuri authored
      In cachefiles_mark_object_active(), the new object is marked active and
      then we try to add it to the active object tree.  If a conflicting object
      is already present, we want to wait for that to go away.  After the wait,
      we go round again and try to re-mark the object as being active - but it's
      already marked active from the first time we went through and a BUG is
      issued.
      
      Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.
      
      Analysis from Kiran Kumar Modukuri:
      
      [Impact]
      Oops during heavy NFS + FSCache + Cachefiles
      
      CacheFiles: Error: Overlong wait for old active object to go away.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      
      CacheFiles: Error: Object already active kernel BUG at
      fs/cachefiles/namei.c:163!
      
      [Cause]
      In a heavily loaded system with big files being read and truncated, an
      fscache object for a cookie is being dropped and a new object being
      looked. The new object being looked for has to wait for the old object
      to go away before the new object is moved to active state.
      
      [Fix]
      Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
      retrying the object lookup.
      
      [Testcase]
      Have run ~100 hours of NFS stress tests and have not seen this bug recur.
      
      [Regression Potential]
       - Limited to fscache/cachefiles.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      5ce83d4b
    • Kiran Kumar Modukuri's avatar
      fscache: Fix reference overput in fscache_attach_object() error handling · f29507ce
      Kiran Kumar Modukuri authored
      When a cookie is allocated that causes fscache_object structs to be
      allocated, those objects are initialised with the cookie pointer, but
      aren't blessed with a ref on that cookie unless the attachment is
      successfully completed in fscache_attach_object().
      
      If attachment fails because the parent object was dying or there was a
      collision, fscache_attach_object() returns without incrementing the cookie
      counter - but upon failure of this function, the object is released which
      then puts the cookie, whether or not a ref was taken on the cookie.
      
      Fix this by taking a ref on the cookie when it is assigned in
      fscache_object_init(), even when we're creating a root object.
      
      
      Analysis from Kiran Kumar:
      
      This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel
      
      BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277
      
      fscache cookie ref count updated incorrectly during fscache object
      allocation resulting in following Oops.
      
      kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
      kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
      
      [Cause]
      Two threads are trying to do operate on a cookie and two objects.
      
      (1) One thread tries to unmount the filesystem and in process goes over a
          huge list of objects marking them dead and deleting the objects.
          cookie->usage is also decremented in following path:
      
            nfs_fscache_release_super_cookie
             -> __fscache_relinquish_cookie
              ->__fscache_cookie_put
              ->BUG_ON(atomic_read(&cookie->usage) <= 0);
      
      (2) A second thread tries to lookup an object for reading data in following
          path:
      
          fscache_alloc_object
          1) cachefiles_alloc_object
              -> fscache_object_init
                 -> assign cookie, but usage not bumped.
          2) fscache_attach_object -> fails in cant_attach_object because the
               cookie's backing object or cookie's->parent object are going away
          3) fscache_put_object
              -> cachefiles_put_object
                ->fscache_object_destroy
                  ->fscache_cookie_put
                     ->BUG_ON(atomic_read(&cookie->usage) <= 0);
      
      [NOTE from dhowells] It's unclear as to the circumstances in which (2) can
      take place, given that thread (1) is in nfs_kill_super(), however a
      conflicting NFS mount with slightly different parameters that creates a
      different superblock would do it.  A backtrace from Kiran seems to show
      that this is a possibility:
      
          kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
          ...
          RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
          Call Trace:
           __fscache_relinquish_cookie+0x87/0x120 [fscache]
           nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
           nfs_kill_super+0x29/0x40 [nfs]
           deactivate_locked_super+0x48/0x80
           deactivate_super+0x5c/0x60
           cleanup_mnt+0x3f/0x90
           __cleanup_mnt+0x12/0x20
           task_work_run+0x86/0xb0
           exit_to_usermode_loop+0xc2/0xd0
           syscall_return_slowpath+0x4e/0x60
           int_ret_from_sys_call+0x25/0x9f
      
      [Fix] Bump up the cookie usage in fscache_object_init, when it is first
      being assigned a cookie atomically such that the cookie is added and bumped
      up if its refcount is not zero.  Remove the assignment in
      fscache_attach_object().
      
      [Testcase]
      I have run ~100 hours of NFS stress tests and not seen this bug recur.
      
      [Regression Potential]
       - Limited to fscache/cachefiles.
      
      Fixes: ccc4fc3d ("FS-Cache: Implement the cookie management part of the netfs API")
      Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      f29507ce
    • Kiran Kumar Modukuri's avatar
      cachefiles: Fix refcounting bug in backing-file read monitoring · 934140ab
      Kiran Kumar Modukuri authored
      cachefiles_read_waiter() has the right to access a 'monitor' object by
      virtue of being called under the waitqueue lock for one of the pages in its
      purview.  However, it has no ref on that monitor object or on the
      associated operation.
      
      What it is allowed to do is to move the monitor object to the operation's
      to_do list, but once it drops the work_lock, it's actually no longer
      permitted to access that object.  However, it is trying to enqueue the
      retrieval operation for processing - but it can only do this via a pointer
      in the monitor object, something it shouldn't be doing.
      
      If it doesn't enqueue the operation, the operation may not get processed.
      If the order is flipped so that the enqueue is first, then it's possible
      for the work processor to look at the to_do list before the monitor is
      enqueued upon it.
      
      Fix this by getting a ref on the operation so that we can trust that it
      will still be there once we've added the monitor to the to_do list and
      dropped the work_lock.  The op can then be enqueued after the lock is
      dropped.
      
      The bug can manifest in one of a couple of ways.  The first manifestation
      looks like:
      
       FS-Cache:
       FS-Cache: Assertion failed
       FS-Cache: 6 == 5 is false
       ------------[ cut here ]------------
       kernel BUG at fs/fscache/operation.c:494!
       RIP: 0010:fscache_put_operation+0x1e3/0x1f0
       ...
       fscache_op_work_func+0x26/0x50
       process_one_work+0x131/0x290
       worker_thread+0x45/0x360
       kthread+0xf8/0x130
       ? create_worker+0x190/0x190
       ? kthread_cancel_work_sync+0x10/0x10
       ret_from_fork+0x1f/0x30
      
      This is due to the operation being in the DEAD state (6) rather than
      INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
      fscache_put_operation().
      
      The bug can also manifest like the following:
      
       kernel BUG at fs/fscache/operation.c:69!
       ...
          [exception RIP: fscache_enqueue_operation+246]
       ...
       #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
       #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
       #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028
      
      I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
      entirely clear which assertion failed.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Reported-by: default avatarLei Xue <carmark.dlut@gmail.com>
      Reported-by: default avatarVegard Nossum <vegard.nossum@gmail.com>
      Reported-by: default avatarAnthony DeRobertis <aderobertis@metrics.net>
      Reported-by: default avatarNeilBrown <neilb@suse.com>
      Reported-by: default avatarDaniel Axtens <dja@axtens.net>
      Reported-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      934140ab
    • Kiran Kumar Modukuri's avatar
      fscache: Allow cancelled operations to be enqueued · d0eb06af
      Kiran Kumar Modukuri authored
      Alter the state-check assertion in fscache_enqueue_operation() to allow
      cancelled operations to be given processing time so they can be cleaned up.
      
      Also fix a debugging statement that was requiring such operations to have
      an object assigned.
      
      Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
      Reported-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      d0eb06af
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_4.18_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 9981b4fb
      Linus Torvalds authored
      Pull MIPS fixes from Paul Burton:
       "A couple more MIPS fixes for 4.18:
      
         - Fix an off-by-one in reporting PCI resource sizes to userland which
           regressed in v3.12.
      
         - Fix writes to DDR controller registers used to flush write buffers,
           which regressed with some refactoring in v4.2"
      
      * tag 'mips_fixes_4.18_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: ath79: fix register address in ath79_ddr_wb_flush()
        MIPS: Fix off-by-one in pci_resource_to_user()
      9981b4fb
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 07230906
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Handle stations tied to AP_VLANs properly during mac80211 hw
          reconfig. From Manikanta Pubbisetty.
      
       2) Fix jump stack depth validation in nf_tables, from Taehee Yoo.
      
       3) Fix quota handling in aRFS flow expiration of mlx5 driver, from Eran
          Ben Elisha.
      
       4) Exit path handling fix in powerpc64 BPF JIT, from Daniel Borkmann.
      
       5) Use ptr_ring_consume_bh() in page pool code, from Tariq Toukan.
      
       6) Fix cached netdev name leak in nf_tables, from Florian Westphal.
      
       7) Fix memory leaks on chain rename, also from Florian Westphal.
      
       8) Several fixes to DCTCP congestion control ACK handling, from Yuchunk
          Cheng.
      
       9) Missing rcu_read_unlock() in CAIF protocol code, from Yue Haibing.
      
      10) Fix link local address handling with VRF, from David Ahern.
      
      11) Don't clobber 'err' on a successful call to __skb_linearize() in
          skb_segment(). From Eric Dumazet.
      
      12) Fix vxlan fdb notification races, from Roopa Prabhu.
      
      13) Hash UDP fragments consistently, from Paolo Abeni.
      
      14) If TCP receives lots of out of order tiny packets, we do really
          silly stuff. Make the out-of-order queue ending more robust to this
          kind of behavior, from Eric Dumazet.
      
      15) Don't leak netlink dump state in nf_tables, from Florian Westphal.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net: axienet: Fix double deregister of mdio
        qmi_wwan: fix interface number for DW5821e production firmware
        ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull
        bnx2x: Fix invalid memory access in rss hash config path.
        net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
        r8169: restore previous behavior to accept BIOS WoL settings
        cfg80211: never ignore user regulatory hint
        sock: fix sg page frag coalescing in sk_alloc_sg
        netfilter: nf_tables: move dumper state allocation into ->start
        tcp: add tcp_ooo_try_coalesce() helper
        tcp: call tcp_drop() from tcp_data_queue_ofo()
        tcp: detect malicious patterns in tcp_collapse_ofo_queue()
        tcp: avoid collapses in tcp_prune_queue() if possible
        tcp: free batches of packets in tcp_prune_ofo_queue()
        ip: hash fragments consistently
        ipv6: use fib6_info_hold_safe() when necessary
        can: xilinx_can: fix power management handling
        can: xilinx_can: fix incorrect clear of non-processed interrupts
        can: xilinx_can: fix RX overflow interrupt not being enabled
        can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
        ...
      07230906