1. 15 Jan, 2018 1 commit
    • ring-buffer: Bring back context level recursive checks · a0e3a18f
      Steven Rostedt (VMware) authored
      Commit 1a149d7d ("ring-buffer: Rewrite trace_recursive_(un)lock() to be
      simpler") replaced the context level recursion checks with a simple counter.
      This would prevent the ring buffer code from recursively calling itself more
      than the max number of contexts that exist (Normal, softirq, irq, nmi). But
      this change caused a lockup in a specific case, which was during suspend and
      resume using a global clock. A stack dump added to see where this occurred
      showed that the issue was in the trace global clock itself:
      
        trace_buffer_lock_reserve+0x1c/0x50
        __trace_graph_entry+0x2d/0x90
        trace_graph_entry+0xe8/0x200
        prepare_ftrace_return+0x69/0xc0
        ftrace_graph_caller+0x78/0xa8
        queued_spin_lock_slowpath+0x5/0x1d0
        trace_clock_global+0xb0/0xc0
        ring_buffer_lock_reserve+0xf9/0x390
      
      The function graph tracer traced queued_spin_lock_slowpath(), which was
      called by trace_clock_global(). This showed that trace_clock_global() is not
      reentrant, as it takes a spin lock. It depended on the ring buffer recursive
      lock to prevent that from happening.
      
      By removing the context detection and adding just a max number of allowable
      recursions, it allowed the trace_clock_global() to be entered again and try
      to retake the spinlock it already held, causing a deadlock.
      
      Fixes: 1a149d7d ("ring-buffer: Rewrite trace_recursive_(un)lock() to be simpler")
      Reported-by: David Weinehall <david.weinehall@gmail.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
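The difference between the two recursion checks described in the commit above can be sketched in user-space C. This is only an illustration under stated assumptions: the struct, enum, macro, and function names below are hypothetical and are not the kernel's actual trace_recursive_lock() code. It shows why a per-context bit rejects a second entry from the same context, while a plain counter lets it through.

```c
#include <stdbool.h>

/* Hypothetical context levels, matching the four named in the commit text. */
enum ctx_level { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI };

/* Hypothetical limit: one nesting level per context. */
#define MAX_CONTEXTS 4U

struct rb_sketch {
    unsigned int context_bits; /* one bit per context (context-level check) */
    unsigned int depth;        /* plain nesting counter (simplified check) */
};

/* Context-level check: refuse entry if this context is already inside. */
bool ctx_lock(struct rb_sketch *b, enum ctx_level c)
{
    unsigned int bit = 1U << (unsigned int)c;

    if (b->context_bits & bit)
        return false; /* same context re-entered: recursion detected */
    b->context_bits |= bit;
    return true;
}

void ctx_unlock(struct rb_sketch *b, enum ctx_level c)
{
    b->context_bits &= ~(1U << (unsigned int)c);
}

/* Counter-only check: any caller may nest until the limit is reached. */
bool counter_lock(struct rb_sketch *b)
{
    if (b->depth >= MAX_CONTEXTS)
        return false;
    b->depth++;
    return true;
}

void counter_unlock(struct rb_sketch *b)
{
    if (b->depth > 0)
        b->depth--;
}
```

With the counter, a second entry from the same context is accepted as long as the total depth stays under the limit, which is the situation that let trace_clock_global() be re-entered while it already held its spin lock.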
  2. 27 Dec, 2017 5 commits
    • tracing: Fix possible double free on failure of allocating trace buffer · 4397f045
      Steven Rostedt (VMware) authored
      Jing Xia and Chunyan Zhang reported that on failing to allocate part of the
      tracing buffer, memory is freed, but the pointers that point to them are not
      initialized back to NULL, and later paths may try to free the freed memory
      again. Jing and Chunyan fixed one of the locations that does this, but
      missed a spot.
      
      Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com
      
      Cc: stable@vger.kernel.org
      Fixes: 737223fb ("tracing: Consolidate buffer allocation code")
      Reported-by: Jing Xia <jing.xia@spreadtrum.com>
      Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Fix crash when it fails to alloc ring buffer · 24f2aaf9
      Jing Xia authored
      A double free of the ring buffer happens when allocating a new ring
      buffer instance for max_buffer fails and TRACER_MAX_TRACE is configured.
      The root cause is that the pointer is not set to NULL after the buffer
      is freed in allocate_trace_buffers(), so the ring buffer is freed again
      later because the pointer is still not NULL, as shown here:
      
      instance_mkdir()
          |-allocate_trace_buffers()
              |-allocate_trace_buffer(tr, &tr->trace_buffer...)
              |-allocate_trace_buffer(tr, &tr->max_buffer...)

                // allocate fail(-ENOMEM), first free
                // and the buffer pointer is not set to null
              |-ring_buffer_free(tr->trace_buffer.buffer)

             // out_free_tr
          |-free_trace_buffers()
              |-free_trace_buffer(&tr->trace_buffer);

                    // if trace_buffer is not null, free again
                  |-ring_buffer_free(buf->buffer)
                      |-rb_free_cpu_buffer(buffer->buffers[cpu])
                          // ring_buffer_per_cpu is null, and
                          // crash in ring_buffer_per_cpu->pages
      
      Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com
      
      Cc: stable@vger.kernel.org
      Fixes: 737223fb ("tracing: Consolidate buffer allocation code")
      Signed-off-by: Jing Xia <jing.xia@spreadtrum.com>
      Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
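The pattern behind both double-free fixes above is to clear a pointer right after freeing what it points to, so that a later cleanup path sees NULL and does nothing. A minimal user-space sketch, assuming hypothetical names (struct trace_buffer_sketch, free_buffer_once(), allocate_both()) and using malloc()/free() in place of the kernel's ring buffer allocator:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a trace buffer holding one allocation. */
struct trace_buffer_sketch {
    void *buffer;
};

/* Free the buffer and clear the pointer so a second call is a no-op. */
void free_buffer_once(struct trace_buffer_sketch *buf)
{
    if (buf->buffer) {
        free(buf->buffer);
        buf->buffer = NULL; /* without this, a later cleanup frees it again */
    }
}

/*
 * Allocate a main buffer and a max buffer. fail_second simulates the
 * -ENOMEM case from the commit text. Returns 0 on success, -1 on failure.
 */
int allocate_both(struct trace_buffer_sketch *main_buf,
                  struct trace_buffer_sketch *max_buf,
                  size_t size, int fail_second)
{
    main_buf->buffer = malloc(size);
    if (!main_buf->buffer)
        return -1;

    max_buf->buffer = fail_second ? NULL : malloc(size);
    if (!max_buf->buffer) {
        /* First free: release the main buffer and reset its pointer. */
        free_buffer_once(main_buf);
        return -1;
    }
    return 0;
}
```

After a failed allocate_both(), a caller's error path can call free_buffer_once() on both buffers again without touching freed memory, because both pointers are NULL.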
    • ring-buffer: Do no reuse reader page if still in use · ae415fa4
      Steven Rostedt (VMware) authored
      To free the reader page that is allocated with ring_buffer_alloc_read_page(),
      ring_buffer_free_read_page() must be called. For faster performance, this
      page can be reused by the ring buffer to avoid having to free and allocate
      new pages.
      
      The issue arises when the page is used with a splice pipe into the
      networking code. The networking code may increment the page's reference
      count and keep the page active while it is queued to go out to the network.
      Incrementing the page ref does not prevent the page from being reused in the
      ring buffer, so the page being sent out to the network can be overwritten
      with newly read data before it is sent.
      
      Add a check to the page ref counter, and only reuse the page if it is not
      being used anywhere else.
      
      Cc: stable@vger.kernel.org
      Fixes: 73a757e6 ("ring-buffer: Return reader page back into existing ring buffer")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
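The check described in the commit above can be sketched as follows. This is an illustration only: struct page_sketch, struct reader_cache_sketch, and try_cache_reader_page() are hypothetical names, and a plain integer stands in for the kernel's page reference count. The idea is that a page is kept for reuse only when the ring buffer holds the sole reference to it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page with a simple reference count. */
struct page_sketch {
    int refcount;
};

/* Hypothetical one-slot cache of a spare reader page. */
struct reader_cache_sketch {
    struct page_sketch *free_page;
};

/*
 * Try to keep a returned reader page for reuse.
 * Returns true if the page was cached, false if the caller must
 * release it through the normal free path instead.
 */
bool try_cache_reader_page(struct reader_cache_sketch *cache,
                           struct page_sketch *page)
{
    /* A count above 1 means another user (e.g. a splice consumer)
     * still holds the page, so reusing it could change data in flight. */
    if (page->refcount > 1)
        return false;

    /* Only one spare page is kept in this sketch. */
    if (cache->free_page != NULL)
        return false;

    cache->free_page = page;
    return true;
}
```

A page that is still referenced elsewhere is never cached, so it cannot be handed back to a writer while a consumer is still reading it.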
    • tracing: Remove extra zeroing out of the ring buffer page · 6b7e633f
      Steven Rostedt (VMware) authored
      The ring_buffer_read_page() takes care of zeroing out any extra data in the
      page that it returns. There's no need to zero it out again from the
      consumer. It was removed from one consumer of this function, but
      read_buffers_splice_read() did not remove it, and worse, it contained a
      nasty bug because of it.
      
      Cc: stable@vger.kernel.org
      Fixes: 2711ca23 ("ring-buffer: Move zeroing out excess in page to ring buffer code")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ring-buffer: Mask out the info bits when returning buffer page length · 45d8b80c
      Steven Rostedt (VMware) authored
      Two info bits were added to the "commit" part of the ring buffer data page
      when returned to be consumed. This was to inform the user space readers that
      events have been missed, and that the count may be stored at the end of the
      page.
      
      What wasn't handled was the splice code, which called a function to
      return the length of the data in order to zero out the rest of the page
      before sending it up to user space. These info bits were returned with the
      length, making the value negative, and that negative value was not checked.
      It was compared to PAGE_SIZE, and only used if the size was less than
      PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
      unsigned compare, meaning the negative size value did not end up causing a
      large portion of memory to be randomly zeroed out.
      
      Cc: stable@vger.kernel.org
      Fixes: 66a8cb95 ("ring-buffer: Add place holder recording of dropped events")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
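The two effects described in the commit above, stripping the info bits from the length and the unsigned comparison that hid the negative value, can be sketched in user-space C. The macro and function names are hypothetical, and the bit positions (the top two bits of a 32-bit value) and the 4096-byte page size are assumptions made for illustration.

```c
/* Hypothetical flag bits stored in the top of the "commit" field. */
#define SKETCH_MISSED_EVENTS (1UL << 31)
#define SKETCH_MISSED_STORED (1UL << 30)
#define SKETCH_MISSED_MASK   (SKETCH_MISSED_EVENTS | SKETCH_MISSED_STORED)

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096UL

/* Return the data length with the info bits masked out. */
unsigned long page_len_masked(unsigned long commit)
{
    return commit & ~SKETCH_MISSED_MASK;
}

/*
 * Return how many trailing bytes of the page would be zeroed for a
 * given length, or 0 when the length is not smaller than the page.
 */
unsigned long bytes_to_zero(int len)
{
    /* Converting a negative int to unsigned long yields a huge value,
     * so the comparison fails and nothing is zeroed. */
    if ((unsigned long)len < SKETCH_PAGE_SIZE)
        return SKETCH_PAGE_SIZE - (unsigned long)len;
    return 0;
}
```

In this sketch a negative length never passes the size check, which mirrors why the unmasked value did not lead to memory being zeroed; masking the bits makes the returned length correct in the first place.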
  3. 18 Dec, 2017 1 commit
  4. 17 Dec, 2017 20 commits
  5. 16 Dec, 2017 4 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f3b5ad89
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "More fixes from testing done on the rc kernel, including more SELinux
        testing. Looking forward, lockdep found a regression today in ipoib
        which is still being fixed.
      
        Summary:
      
         - Fix for SELinux on the umad SMI path. Some old hardware does not
           fill the PKey properly exposing another bug in the newer SELinux
           code.
      
         - Check the input port as we can exceed array bounds from this user
           supplied value
      
         - Users are unable to use the hash field support as they want due to
           incorrect checks on the field restrictions, correct that so the
           feature works as intended
      
         - User triggerable oops in the NETLINK_RDMA handler
      
         - cxgb4 driver fix for a bad interaction with CQ flushing in iser
           caused by patches in this merge window, and bad CQ flushing during
           normal close.
      
         - Unbalanced memalloc_noio in ipoib in an error path"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/ipoib: Restore MM behavior in case of tx_ring allocation failure
        iw_cxgb4: only insert drain cqes if wq is flushed
        iw_cxgb4: only clear the ARMED bit if a notification is needed
        RDMA/netlink: Fix general protection fault
        IB/mlx4: Fix RSS hash fields restrictions
        IB/core: Don't enforce PKey security on SMI MADs
        IB/core: Bound check alternate path port number
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · f25e2295
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes for the AT24 I2C eeprom driver and some minor corrections
        for I2C bus drivers"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: piix4: Fix port number check on release
        i2c: stm32: Fix copyrights
        i2c-cht-wc: constify platform_device_id
        eeprom: at24: change nvmem stride to 1
        eeprom: at24: fix I2C device selection for runtime PM
    • Merge tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · d025fbf1
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "This has two stable bugfixes, one to fix a BUG_ON() when
        nfs_commit_inode() is called with no outstanding commit requests and
        another to fix a race in the SUNRPC receive codepath.
      
        Additionally, there are also fixes for an NFS client deadlock and an
        xprtrdma performance regression.
      
        Summary:
      
        Stable bugfixes:
         - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a
           commit in the case that there were no commit requests.
         - SUNRPC: Fix a race in the receive code path
      
        Other fixes:
         - NFS: Fix a deadlock in nfs client initialization
         - xprtrdma: Fix a performance regression for small IOs"
      
      * tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        SUNRPC: Fix a race in the receive code path
        nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
        xprtrdma: Spread reply processing over more CPUs
        nfs: fix a deadlock in nfs client initialization
    • Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" · f6f37321
      Linus Torvalds authored
      This reverts commits 5c9d2d5c, c7da82b8, and e7fe7b5c.
      
      We'll probably need to revisit this, but basically we should not
      complicate the get_user_pages_fast() case, and checking the actual page
      table protection key bits will require more care anyway, since the
      protection keys depend on the exact state of the VM in question.
      
      Particularly when doing a "remote" page lookup (ie in somebody else's VM,
      not your own), you need to be much more careful than this was.  Dave
      Hansen says:
      
       "So, the underlying bug here is that we now do a get_user_pages_remote()
        and then go ahead and do the p*_access_permitted() checks against the
        current PKRU. This was introduced recently with the addition of the
        new p??_access_permitted() calls.
      
        We have checks in the VMA path for the "remote" gups and we avoid
        consulting PKRU for them. This got missed in the pkeys selftests
        because I did a ptrace read, but not a *write*. I also didn't
        explicitly test it against something where a COW needed to be done"
      
      It's also not entirely clear that it makes sense to check the protection
      key bits at this level at all.  But one possible eventual solution is to
      make the get_user_pages_fast() case just abort if it sees protection key
      bits set, which makes us fall back to the regular get_user_pages() case,
      which then has a vma and can do the check there if we want to.
      
      We'll see.
      
      Somewhat related to this all: what we _do_ want to do some day is to
      check the PAGE_USER bit - it should obviously always be set for user
      pages, but it would be a good check to have back.  Because we have no
      generic way to test for it, we lost it as part of moving over from the
      architecture-specific x86 GUP implementation to the generic one in
      commit e585513b ("x86/mm/gup: Switch GUP to the generic
      get_user_page_fast() implementation").
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 15 Dec, 2017 9 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7a3c296a
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Clamp timeouts to INT_MAX in conntrack, from Jay Elliot.
      
       2) Fix broken UAPI for BPF_PROG_TYPE_PERF_EVENT, from Hendrik
          Brueckner.
      
       3) Fix locking in ieee80211_sta_tear_down_BA_sessions, from Johannes
          Berg.
      
       4) Add missing barriers to ptr_ring, from Michael S. Tsirkin.
      
       5) Don't advertise gigabit in sh_eth when not available, from Thomas
          Petazzoni.
      
       6) Check network namespace when delivering to netlink taps, from Kevin
          Cernekee.
      
       7) Kill a race in raw_sendmsg(), from Mohamed Ghannam.
      
       8) Use correct address in TCP md5 lookups when replying to an incoming
          segment, from Christoph Paasch.
      
       9) Add schedule points to BPF map alloc/free, from Eric Dumazet.
      
      10) Don't allow silly mtu values to be used in ipv4/ipv6 multicast, also
          from Eric Dumazet.
      
      11) Fix SKB leak in tipc, from Jon Maloy.
      
      12) Disable MAC learning on OVS ports of mlxsw, from Yuval Mintz.
      
      13) SKB leak fix in skb_complete_tx_timestamp(), from Willem de Bruijn.
      
      14) Add some new qmi_wwan device IDs, from Daniele Palmas.
      
      15) Fix static key imbalance in ingress qdisc, from Jiri Pirko.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net: qcom/emac: Reduce timeout for mdio read/write
        net: sched: fix static key imbalance in case of ingress/clsact_init error
        net: sched: fix clsact init error path
        ip_gre: fix wrong return value of erspan_rcv
        net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
        pkt_sched: Remove TC_RED_OFFLOADED from uapi
        net: sched: Move to new offload indication in RED
        net: sched: Add TCA_HW_OFFLOAD
        net: aquantia: Increment driver version
        net: aquantia: Fix typo in ethtool statistics names
        net: aquantia: Update hw counters on hw init
        net: aquantia: Improve link state and statistics check interval callback
        net: aquantia: Fill in multicast counter in ndev stats from hardware
        net: aquantia: Fill ndev stat couters from hardware
        net: aquantia: Extend stat counters to 64bit values
        net: aquantia: Fix hardware DMA stream overload on large MRRS
        net: aquantia: Fix actual speed capabilities reporting
        sock: free skb in skb_complete_tx_timestamp on error
        s390/qeth: update takeover IPs after configuration change
        s390/qeth: lock IP table while applying takeover changes
        ...
    • Merge tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c36c7a7c
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some USB fixes for 4.15-rc4.
      
        There is the usual handful of gadget/dwc2/dwc3 fixes as always, for
        reported issues. But the most important thing in here is the core fix
        from Alan Stern to resolve a nasty security bug (my first attempt is
        reverted, Alan's was much cleaner), as well as a number of usbip fixes
        from Shuah Khan to resolve those reported security issues.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: core: prevent malicious bNumInterfaces overflow
        Revert "USB: core: only clean up what we allocated"
        USB: core: only clean up what we allocated
        Revert "usb: gadget: allow to enable legacy drivers without USB_ETH"
        usb: gadget: webcam: fix V4L2 Kconfig dependency
        usb: dwc2: Fix TxFIFOn sizes and total TxFIFO size issues
        usb: dwc3: gadget: Fix PCM1 for ISOC EP with ep->mult less than 3
        usb: dwc3: of-simple: set dev_pm_ops
        usb: dwc3: of-simple: fix missing clk_disable_unprepare
        usb: dwc3: gadget: Wait longer for controller to end command processing
        usb: xhci: fix TDS for MTK xHCI1.1
        xhci: Don't add a virt_dev to the devs array before it's fully allocated
        usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
        usbip: prevent vhci_hcd driver from leaking a socket pointer address
        usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
        usbip: fix stub_rx: get_pipe() to validate endpoint number
        tools/usbip: fixes potential (minor) "buffer overflow" (detected on recent gcc with -Werror)
        USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
        usb: musb: da8xx: fix babble condition handling
    • Merge tag 'staging-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · a84ec723
      Linus Torvalds authored
      Pull staging fixes from Greg KH:
       "Here are some small staging driver fixes for 4.15-rc4.
      
        One patch for the ccree driver to prevent an uninitialized value from
        being returned to a caller, and the other fixes a logic error in the
        pi433 driver"
      
      * tag 'staging-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: pi433: Fixes issue with bit shift in rf69_get_modulation
        staging: ccree: Uninitialized return in ssi_ahash_import()
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d6e47eed
      Linus Torvalds authored
      Pull virtio regression fixes from Michael Tsirkin:
       "Fixes two issues in the latest kernel"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_mmio: fix devm cleanup
        ptr_ring: fix up after recent ptr_ring changes
    • Merge tag 'for-4.15/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · ee1b43ec
      Linus Torvalds authored
      Pull device mapper fixes from Mike Snitzer:
      
       - fix a particularly nasty DM core bug in a 4.15 refcount_t conversion.
      
       - fix various targets to dm_register_target after module __init
         resources created; otherwise racing lvm2 commands could result in a
         NULL pointer during initialization of associated DM kernel module.
      
       - fix regression in bio-based DM multipath queue_if_no_path handling.
      
       - fix DM bufio's shrinker to reclaim more than one buffer per scan.
      
      * tag 'for-4.15/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm bufio: fix shrinker scans when (nr_to_scan < retain_target)
        dm mpath: fix bio-based multipath queue_if_no_path handling
        dm: fix various targets to dm_register_target after module __init resources created
        dm table: fix regression from improper dm_dev_internal.count refcount_t conversion
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 66dbbd72
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "The most important one is the bfa fix because it's easy to oops the
        kernel with this driver (this includes the commit that corrects the
        compiler warning in the original), a regression in the new timespec
        conversion in aacraid and a regression in the Fibre Channel ELS
        handling patch.
      
        The other three are a theoretical problem with termination in the
        vendor/host matching code and a use after free in lpfc.
      
        The additional patches are a fix for an I/O hang in the mq code under
        certain circumstances and a rare oops in some debugging code"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: core: Fix a scsi_show_rq() NULL pointer dereference
        scsi: MAINTAINERS: change FCoE list to linux-scsi
        scsi: libsas: fix length error in sas_smp_handler()
        scsi: bfa: fix type conversion warning
        scsi: core: run queue if SCSI device queue isn't ready and queue is idle
        scsi: scsi_devinfo: cleanly zero-pad devinfo strings
        scsi: scsi_devinfo: handle non-terminated strings
        scsi: bfa: fix access to bfad_im_port_s
        scsi: aacraid: address UBSAN warning regression
        scsi: libfc: fix ELS request handling
        scsi: lpfc: Use after free in lpfc_rq_buf_free()
    • Merge tag 'mmc-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 07a20ed1
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "A couple of MMC fixes:
      
         - fix use of uninitialized drv_type variable
      
         - apply NO_CMD23 quirk to some specific SD cards to make them work"
      
      * tag 'mmc-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: apply NO_CMD23 quirk to some specific cards
        mmc: core: properly init drv_type
    • Merge tag 'ceph-for-4.15-rc4' of git://github.com/ceph/ceph-client · dd3d66b8
      Linus Torvalds authored
      Pull ceph fix from Ilya Dryomov:
       "CephFS inode trimming fix from Zheng, marked for stable"
      
      * tag 'ceph-for-4.15-rc4' of git://github.com/ceph/ceph-client:
        ceph: drop negative child dentries before try pruning inode's alias
    • Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 227701e0
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
      
       - fix incomplete syncing of filesystem
      
       - fix regression in readdir on ovl over 9p
      
       - only follow redirects when needed
      
       - misc fixes and cleanups
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: fix overlay: warning prefix
        ovl: Use PTR_ERR_OR_ZERO()
        ovl: Sync upper dirty data when syncing overlayfs
        ovl: update ctx->pos on impure dir iteration
        ovl: Pass ovl_get_nlink() parameters in right order
        ovl: don't follow redirects if redirect_dir=off