1. 29 Dec, 2017 9 commits
    • Thomas Gleixner's avatar
      nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() · 5d62c183
      Thomas Gleixner authored
      The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
      subsequently invokes tick_nohz_stop_sched_tick() are:
      
        if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
      
      If need_resched() is not set, but a timer softirq is pending then this is
      an indication that the softirq code punted and delegated the execution to
      softirqd. need_resched() is not true because the current interrupted task
      takes precedence over softirqd.
      
      Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
      timer interrupts because the timer wheel contains an expired timer, but
      softirqs are not yet executed. So it returns an immediate expiry request,
      which causes the timer to fire immediately again. Lather, rinse and
      repeat....
      
      Prevent that by adding a check for a pending timer soft interrupt to the
      conditions in tick_nohz_stop_sched_tick() which avoid calling
      get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
      prevents a repetitive programming of an already expired timer.
      Reported-by: default avatarSebastian Siewior <bigeasy@linutronix.d>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
      5d62c183
    • Thomas Gleixner's avatar
      timers: Reinitialize per cpu bases on hotplug · 26456f87
      Thomas Gleixner authored
      The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
      them with a potentially stale clk and next_expiry valuem, which can cause
      trouble then the CPU is plugged.
      
      Add a prepare callback which forwards the clock, sets next_expiry to far in
      the future and reset the control flags to a known state.
      
      Set base->must_forward_clk so the first timer which is queued will try to
      forward the clock to current jiffies.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Reported-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
      26456f87
    • Anna-Maria Gleixner's avatar
      timers: Use deferrable base independent of base::nohz_active · ced6d5c1
      Anna-Maria Gleixner authored
      During boot and before base::nohz_active is set in the timer bases, deferrable
      timers are enqueued into the standard timer base. This works correctly as
      long as base::nohz_active is false.
      
      Once it base::nohz_active is set and a timer which was enqueued before that
      is accessed the lock selector code choses the lock of the deferred
      base. This causes unlocked access to the standard base and in case the
      timer is removed it does not clear the pending flag in the standard base
      bitmap which causes get_next_timer_interrupt() to return bogus values.
      
      To prevent that, the deferrable timers must be enqueued in the deferrable
      base, even when base::nohz_active is not set. Those deferrable timers also
      need to be expired unconditional.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: default avatarAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
      ced6d5c1
    • Linus Torvalds's avatar
      Merge tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 61233580
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "This fixes a schedutil cpufreq governor regression from the 4.14 cycle
        that may cause a CPU idleness check to return incorrect results in
        some cases which leads to suboptimal decisions (Joel Fernandes)"
      
      * tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: schedutil: Use idle_calls counter of the remote CPU
      61233580
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2758b3e3
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) IPv6 gre tunnels end up with different default features enabled
          depending upon whether netlink or ioctls are used to bring them up.
          Fix from Alexey Kodanev.
      
       2) Fix read past end of user control message in RDS< from Avinash
          Repaka.
      
       3) Missing RCU barrier in mini qdisc code, from Cong Wang.
      
       4) Missing policy put when reusing per-cpu route entries, from Florian
          Westphal.
      
       5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G.
          Piccoli.
      
       6) Run nested transport mode IPSEC packets via tasklet, from Herbert
          Xu.
      
       7) Fix handling poll() for stream sockets in tipc, from Parthasarathy
          Bhuvaragan.
      
       8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.
      
       9) Another zerocopy ubuf handling fix, from Willem de Bruijn.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        strparser: Call sock_owned_by_user_nocheck
        sock: Add sock_owned_by_user_nocheck
        skbuff: in skb_copy_ubufs unclone before releasing zerocopy
        tipc: fix hanging poll() for stream sockets
        sctp: Replace use of sockets_allocated with specified macro.
        bnx2x: Improve reliability in case of nested PCI errors
        tg3: Enable PHY reset in MTU change path for 5720
        tg3: Add workaround to restrict 5762 MRRS to 2048
        tg3: Update copyright
        net: fec: unmap the xmit buffer that are not transferred by DMA
        tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
        tipc: error path leak fixes in tipc_enable_bearer()
        RDS: Check cmsg_len before dereferencing CMSG_DATA
        tcp: Avoid preprocessor directives in tracepoint macro args
        tipc: fix memory leak of group member when peer node is lost
        net: sched: fix possible null pointer deref in tcf_block_put
        tipc: base group replicast ack counter on number of actual receivers
        net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
        net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
        ip6_gre: fix device features for ioctl setup
        ...
      2758b3e3
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux · fd84b751
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "nouveau and i915 regression fixes"
      
      * tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux:
        drm/nouveau: fix race when adding delayed work items
        i915: Reject CCS modifiers for pipe C on Geminilake
        drm/i915/gvt: Fix pipe A enable as default for vgpu
      fd84b751
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · c0208a33
      Linus Torvalds authored
      Pull clk fix from Stephen Boyd:
       "One more fix for the runtime PM clk patches. We're calling a runtime
        PM API that may schedule from somewhere that we can't do that. We
        change to the async version of pm_runtime_put() to fix it"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: use atomic runtime pm api in clk_core_is_enabled
      c0208a33
    • Linus Torvalds's avatar
      Merge tag 'led_fixes_for_4.15-rc6' of... · 4f2382f3
      Linus Torvalds authored
      Merge tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
      
      Pull LED fix from Jacek Anaszewski:
       "A single LED fix for brightness setting when delay_off is 0"
      
      * tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
        led: core: Fix brightness setting when setting delay_off=0
      4f2382f3
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 19286e4a
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "This is the next batch of for-rc patches from RDMA. It includes the
        fix for the ipoib regression I mentioned last time, and the result of
        a fairly major debugging effort to get iser working reliably on cxgb4
        hardware - it turns out the cxgb4 driver was not handling QP error
        flushing properly causing iser to fail.
      
         - cxgb4 fix for an iser testing failure as debugged by Steve and
           Sagi. The problem was a driver bug in the handling of shutting down
           a QP.
      
         - Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on
           error unwind and a use after free bug
      
         - Improper congestion counter values on mlx5 when link aggregation is
           enabled
      
         - ipoib lockdep regression introduced in this merge window
      
         - hfi1 regression supporting the device in a VM introduced in a
           recent patch
      
         - Typo that breaks future uAPI compatibility in the verbs core
      
         - More SELinux related oops fixing
      
         - Fix an oops during error unwind in mlx5"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/mlx5: Fix mlx5_ib_alloc_mr error flow
        IB/core: Verify that QP is security enabled in create and destroy
        IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
        IB/mlx5: Serialize access to the VMA list
        IB/hfi: Only read capability registers if the capability exists
        IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
        IB/mlx5: Fix congestion counters in LAG mode
        RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
        RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
        RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
        iw_cxgb4: when flushing, complete all wrs in a chain
        iw_cxgb4: reflect the original WR opcode in drain cqes
        iw_cxgb4: Only validate the MSN for successful completions
      19286e4a
  2. 28 Dec, 2017 7 commits
  3. 27 Dec, 2017 24 commits
    • Nitzan Carmi's avatar
      IB/mlx5: Fix mlx5_ib_alloc_mr error flow · 45e6ae7e
      Nitzan Carmi authored
      ibmr.device is being set only after ib_alloc_mr() is
      (successfully) complete. Therefore, in case mlx5_core_create_mkey()
      return with error, the error flow calls mlx5_free_priv_descs()
      which uses ibmr.device (which doesn't exist yet), causing
      a NULL dereference oops.
      
      To fix this, the IB device should be set in the mr struct earlier
      stage (e.g. prior to calling mlx5_core_create_mkey()).
      
      Fixes: 8a187ee5 ("IB/mlx5: Support the new memory registration API")
      Signed-off-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: default avatarNitzan Carmi <nitzanc@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      45e6ae7e
    • Moni Shoua's avatar
      IB/core: Verify that QP is security enabled in create and destroy · 4a50881b
      Moni Shoua authored
      The XRC target QP create flow sets up qp_sec only if there is an IB link with
      LSM security enabled. However, several other related uAPI entry points blindly
      follow the qp_sec NULL pointer, resulting in a possible oops.
      
      Check for NULL before using qp_sec.
      
      Cc: <stable@vger.kernel.org> # v4.12
      Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs")
      Reviewed-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      4a50881b
    • Moni Shoua's avatar
      IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp() · 05d14e7b
      Moni Shoua authored
      If the input command length is larger than the kernel supports an error should
      be returned in case the unsupported bytes are not cleared, instead of the
      other way aroudn. This matches what all other callers of ib_is_udata_cleared
      do and will avoid user ABI problems in the future.
      
      Cc: <stable@vger.kernel.org> # v4.10
      Fixes: 189aba99 ("IB/uverbs: Extend modify_qp and support packet pacing")
      Reviewed-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      05d14e7b
    • Majd Dibbiny's avatar
      IB/mlx5: Serialize access to the VMA list · ad9a3668
      Majd Dibbiny authored
      User-space applications can do mmap and munmap directly at
      any time.
      
      Since the VMA list is not protected with a mutex, concurrent
      accesses to the VMA list from the mmap and munmap can cause
      data corruption. Add a mutex around the list.
      
      Cc: <stable@vger.kernel.org> # v4.7
      Fixes: 7c2344c3 ("IB/mlx5: Implements disassociate_ucontext API")
      Reviewed-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: default avatarMajd Dibbiny <majd@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      ad9a3668
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 5f520fc3
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "While doing tests on tracing over the network, I found that the
        packets were getting corrupted.
      
        In the process I found three bugs.
      
        One was the culprit, but the other two scared me. After deeper
        investigation, they were not as major as I thought they were, due to a
        signed compared to an unsigned that prevented a negative number from
        doing actual harm.
      
        The two bigger bugs:
      
         - Mask the ring buffer data page length. There are data flags at the
           high bits of the length field. These were not cleared via the
           length function, and the length could return a negative number.
           (Although the number returned was unsigned, but was assigned to a
           signed number) Luckily, this value was compared to PAGE_SIZE which
           is unsigned and kept it from entering the path that could have
           caused damage.
      
         - Check the page usage before reusing the ring buffer reader page.
           TCP increments the page ref when passing the page off to the
           network. The page is passed back to the ring buffer for use on
           free. But the page could still be in use by the TCP stack.
      
        Minor bugs:
      
         - Related to the first bug. No need to clear out the unused ring
           buffer data before sending to user space. It is now done by the
           ring buffer code itself.
      
         - Reset pointers after free on error path. There were some cases in
           the error path that pointers were freed but not set to NULL, and
           could have them freed again, having a pointer freed twice"
      
      * tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix possible double free on failure of allocating trace buffer
        tracing: Fix crash when it fails to alloc ring buffer
        ring-buffer: Do no reuse reader page if still in use
        tracing: Remove extra zeroing out of the ring buffer page
        ring-buffer: Mask out the info bits when returning buffer page length
      5f520fc3
    • Linus Torvalds's avatar
      Merge tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 9b957794
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "It seems that Santa overslept with a bunch of gifts; the majority of
        changes here are various device-specific ASoC fixes, most notably the
        revert of rcar IOMMU support and fsl_ssi AC97 fixes, but also lots of
        small fixes for codecs. Besides that, the usual HD-audio quirks and
        fixes are included, too"
      
      * tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
        ALSA: hda - Fix missing COEF init for ALC225/295/299
        ALSA: hda: Drop useless WARN_ON()
        ALSA: hda - change the location for one mic on a Lenovo machine
        ALSA: hda - fix headset mic detection issue on a Dell machine
        ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
        ASoC: rsnd: fixup ADG register mask
        ASoC: rt5514-spi: only enable wakeup when fully initialized
        ASoC: nau8825: fix issue that pop noise when start capture
        ASoC: rt5663: Fix the wrong result of the first jack detection
        ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update
        ASoC: Intel: Change kern log level to avoid unwanted messages
        ASoC: atmel-classd: select correct Kconfig symbol
        ASoC: wm_adsp: Fix validation of firmware and coeff lengths
        ASoC: Intel: Skylake: Do not check dev_type for dmic link type
        ASoC: rockchip: disable clock on error
        ASoC: tlv320aic31xx: Fix GPIO1 register definition
        ASoC: codecs: msm8916-wcd: Fix supported formats
        ASoC: fsl_asrc: Fix typo in a field define
        ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes
        ASoC: da7218: Correct IRQ level in DT binding example
        ...
      9b957794
    • Matthieu CASTET's avatar
      led: core: Fix brightness setting when setting delay_off=0 · 2b83ff96
      Matthieu CASTET authored
      With the current code, the following sequence won't work :
      echo timer > trigger
      
      echo 0 >  delay_off
      * at this point we call
      ** led_delay_off_store
      ** led_blink_set
      *** stop timer
      ** led_blink_setup
      ** led_set_software_blink
      *** if !delay_on, led off
      *** if !delay_off, set led_set_brightness_nosleep <--- LED_BLINK_SW is set but timer is stop
      *** otherwise start timer/set LED_BLINK_SW flag
      
      echo xxx > brightness
      * led_set_brightness
      ** if LED_BLINK_SW
      *** if brightness=0, led off
      *** else apply brightness if next timer <--- timer is stop, and will never apply new setting
      ** otherwise set led_set_brightness_nosleep
      
      To fix that, when we delete the timer, we should clear LED_BLINK_SW.
      
      Cc: linux-leds@vger.kernel.org
      Signed-off-by: default avatarMatthieu CASTET <matthieu.castet@parrot.com>
      Signed-off-by: default avatarJacek Anaszewski <jacek.anaszewski@gmail.com>
      2b83ff96
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix possible double free on failure of allocating trace buffer · 4397f045
      Steven Rostedt (VMware) authored
      Jing Xia and Chunyan Zhang reported that on failing to allocate part of the
      tracing buffer, memory is freed, but the pointers that point to them are not
      initialized back to NULL, and later paths may try to free the freed memory
      again. Jing and Chunyan fixed one of the locations that does this, but
      missed a spot.
      
      Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com
      
      Cc: stable@vger.kernel.org
      Fixes: 737223fb ("tracing: Consolidate buffer allocation code")
      Reported-by: default avatarJing Xia <jing.xia@spreadtrum.com>
      Reported-by: default avatarChunyan Zhang <chunyan.zhang@spreadtrum.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      4397f045
    • Jing Xia's avatar
      tracing: Fix crash when it fails to alloc ring buffer · 24f2aaf9
      Jing Xia authored
      Double free of the ring buffer happens when it fails to alloc new
      ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured.
      The root cause is that the pointer is not set to NULL after the buffer
      is freed in allocate_trace_buffers(), and the freeing of the ring
      buffer is invoked again later if the pointer is not equal to Null,
      as:
      
      instance_mkdir()
          |-allocate_trace_buffers()
              |-allocate_trace_buffer(tr, &tr->trace_buffer...)
      	|-allocate_trace_buffer(tr, &tr->max_buffer...)
      
                // allocate fail(-ENOMEM),first free
                // and the buffer pointer is not set to null
              |-ring_buffer_free(tr->trace_buffer.buffer)
      
             // out_free_tr
          |-free_trace_buffers()
              |-free_trace_buffer(&tr->trace_buffer);
      
      	      //if trace_buffer is not null, free again
      	    |-ring_buffer_free(buf->buffer)
                      |-rb_free_cpu_buffer(buffer->buffers[cpu])
                          // ring_buffer_per_cpu is null, and
                          // crash in ring_buffer_per_cpu->pages
      
      Link: http://lkml.kernel.org/r/20171226071253.8968-1-chunyan.zhang@spreadtrum.com
      
      Cc: stable@vger.kernel.org
      Fixes: 737223fb ("tracing: Consolidate buffer allocation code")
      Signed-off-by: default avatarJing Xia <jing.xia@spreadtrum.com>
      Signed-off-by: default avatarChunyan Zhang <chunyan.zhang@spreadtrum.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      24f2aaf9
    • Steven Rostedt (VMware)'s avatar
      ring-buffer: Do no reuse reader page if still in use · ae415fa4
      Steven Rostedt (VMware) authored
      To free the reader page that is allocated with ring_buffer_alloc_read_page(),
      ring_buffer_free_read_page() must be called. For faster performance, this
      page can be reused by the ring buffer to avoid having to free and allocate
      new pages.
      
      The issue arises when the page is used with a splice pipe into the
      networking code. The networking code may up the page counter for the page,
      and keep it active while sending it is queued to go to the network. The
      incrementing of the page ref does not prevent it from being reused in the
      ring buffer, and this can cause the page that is being sent out to the
      network to be modified before it is sent by reading new data.
      
      Add a check to the page ref counter, and only reuse the page if it is not
      being used anywhere else.
      
      Cc: stable@vger.kernel.org
      Fixes: 73a757e6 ("ring-buffer: Return reader page back into existing ring buffer")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      ae415fa4
    • Steven Rostedt (VMware)'s avatar
      tracing: Remove extra zeroing out of the ring buffer page · 6b7e633f
      Steven Rostedt (VMware) authored
      The ring_buffer_read_page() takes care of zeroing out any extra data in the
      page that it returns. There's no need to zero it out again from the
      consumer. It was removed from one consumer of this function, but
      read_buffers_splice_read() did not remove it, and worse, it contained a
      nasty bug because of it.
      
      Cc: stable@vger.kernel.org
      Fixes: 2711ca23 ("ring-buffer: Move zeroing out excess in page to ring buffer code")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      6b7e633f
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2017-12-22-1' of... · 03bfd4e1
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2017-12-22-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      GLK pipe C related fix, and a gvt fix.
      
      * tag 'drm-intel-fixes-2017-12-22-1' of git://anongit.freedesktop.org/drm/drm-intel:
        i915: Reject CCS modifiers for pipe C on Geminilake
        drm/i915/gvt: Fix pipe A enable as default for vgpu
      03bfd4e1
    • Steven Rostedt (VMware)'s avatar
      ring-buffer: Mask out the info bits when returning buffer page length · 45d8b80c
      Steven Rostedt (VMware) authored
      Two info bits were added to the "commit" part of the ring buffer data page
      when returned to be consumed. This was to inform the user space readers that
      events have been missed, and that the count may be stored at the end of the
      page.
      
      What wasn't handled, was the splice code that actually called a function to
      return the length of the data in order to zero out the rest of the page
      before sending it up to user space. These data bits were returned with the
      length making the value negative, and that negative value was not checked.
      It was compared to PAGE_SIZE, and only used if the size was less than
      PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
      unsigned compare, meaning the negative size value did not end up causing a
      large portion of memory to be randomly zeroed out.
      
      Cc: stable@vger.kernel.org
      Fixes: 66a8cb95 ("ring-buffer: Add place holder recording of dropped events")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      45d8b80c
    • Tonghao Zhang's avatar
      sctp: Replace use of sockets_allocated with specified macro. · 8cb38a60
      Tonghao Zhang authored
      The patch(180d8cd9) replaces all uses of struct sock fields'
      memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem
      to accessor macros. But the sockets_allocated field of sctp sock is
      not replaced at all. Then replace it now for unifying the code.
      
      Fixes: 180d8cd9 ("foundations of per-cgroup memory pressure controlling.")
      Cc: Glauber Costa <glommer@parallels.com>
      Signed-off-by: default avatarTonghao Zhang <zhangtonghao@didichuxing.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8cb38a60
    • Guilherme G. Piccoli's avatar
      bnx2x: Improve reliability in case of nested PCI errors · f7084059
      Guilherme G. Piccoli authored
      While in recovery process of PCI error (called EEH on PowerPC arch),
      another PCI transaction could be corrupted causing a situation of
      nested PCI errors. Also, this scenario could be reproduced with
      error injection mechanisms (for debug purposes).
      
      We observe that in case of nested PCI errors, bnx2x might attempt to
      initialize its shmem and cause a kernel crash due to bad addresses
      read from MCP. Multiple different stack traces were observed depending
      on the point the second PCI error happens.
      
      This patch avoids the crashes by:
      
       * failing PCI recovery in case of nested errors (since multiple
       PCI errors in a row are not expected to lead to a functional
       adapter anyway), and by,
      
       * preventing access to adapter FW when MCP is failed (we mark it as
       failed when shmem cannot get initialized properly).
      Reported-by: default avatarAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-by: default avatarShahed Shaikh <Shahed.Shaikh@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f7084059
    • David S. Miller's avatar
      Merge branch 'tg3-fixes' · 67538790
      David S. Miller authored
      Siva Reddy Kallam says:
      
      ====================
      tg3: update on copyright and couple of fixes
      
      First patch:
      	Update copyright
      
      Second patch:
      	Add workaround to restrict 5762 MRRS
      
      Third patch:
      	Add PHY reset in change MTU path for 5720
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67538790
    • Siva Reddy Kallam's avatar
      tg3: Enable PHY reset in MTU change path for 5720 · e60ee41a
      Siva Reddy Kallam authored
      A customer noticed RX path hang when MTU is changed on the fly while
      running heavy traffic with NCSI enabled for 5717 and 5719. Since 5720
      belongs to same ASIC family, we observed same issue and same fix
      could solve this problem for 5720.
      Signed-off-by: default avatarSiva Reddy Kallam <siva.kallam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e60ee41a
    • Siva Reddy Kallam's avatar
      tg3: Add workaround to restrict 5762 MRRS to 2048 · 4419bb1c
      Siva Reddy Kallam authored
      One of AMD based server with 5762 hangs with jumbo frame traffic.
      This AMD platform has southbridge limitation which is restricting MRRS
      to 4000. As a work around, driver to restricts the MRRS to 2048 for
      this particular 5762 NX1 card.
      Signed-off-by: default avatarSiva Reddy Kallam <siva.kallam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4419bb1c
    • Siva Reddy Kallam's avatar
      5a8bae97
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 65bbbf6c
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2017-12-22
      
      1) Check for valid id proto in validate_tmpl(), otherwise
         we may trigger a warning in xfrm_state_fini().
         From Cong Wang.
      
      2) Fix a typo on XFRMA_OUTPUT_MARK policy attribute.
         From Michal Kubecek.
      
      3) Verify the state is valid when encap_type < 0,
         otherwise we may crash on IPsec GRO .
         From Aviv Heller.
      
      4) Fix stack-out-of-bounds read on socket policy lookup.
         We access the flowi of the wrong address family in the
         IPv4 mapped IPv6 case, fix this by catching address
         family missmatches before we do the lookup.
      
      5) fix xfrm_do_migrate() with AEAD to copy the geniv
         field too. Otherwise the state is not fully initialized
         and migration fails. From Antony Antony.
      
      6) Fix stack-out-of-bounds with misconfigured transport
         mode policies. Our policy template validation is not
         strict enough. It is possible to configure policies
         with transport mode template where the address family
         of the template does not match the selectors address
         family. Fix this by refusing such a configuration,
         address family can not change on transport mode.
      
      7) Fix a policy reference leak when reusing pcpu xdst
         entry. From Florian Westphal.
      
      8) Reinject transport-mode packets through tasklet,
         otherwise it is possible to reate a recursion
         loop. From Herbert Xu.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      65bbbf6c
    • Fugang Duan's avatar
      net: fec: unmap the xmit buffer that are not transferred by DMA · 178e5f57
      Fugang Duan authored
      The enet IP only support 32 bit, it will use swiotlb buffer to do dma
      mapping when xmit buffer DMA memory address is bigger than 4G in i.MX
      platform. After stress suspend/resume test, it will print out:
      
      log:
      [12826.352864] fec 5b040000.ethernet: swiotlb buffer is full (sz: 191 bytes)
      [12826.359676] DMA: Out of SW-IOMMU space for 191 bytes at device 5b040000.ethernet
      [12826.367110] fec 5b040000.ethernet eth0: Tx DMA memory map failed
      
      The issue is that the ready xmit buffers that are dma mapped but DMA still
      don't copy them into fifo, once MAC restart, these DMA buffers are not unmapped.
      So it should check the dma mapping buffer and unmap them.
      Signed-off-by: default avatarFugang Duan <fugang.duan@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      178e5f57
    • Tommi Rantala's avatar
      tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path · 642a8439
      Tommi Rantala authored
      Calling tipc_mon_delete() before the monitor has been created will oops.
      This can happen in tipc_enable_bearer() error path if tipc_disc_create()
      fails.
      
      [   48.589074] BUG: unable to handle kernel paging request at 0000000000001008
      [   48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc]
      [   48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0
      [   48.592230] Oops: 0000 [#1] SMP KASAN
      [   48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G    B            4.15.0-rc4-pc64-dirty #5
      [   48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc]
      [   48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282
      [   48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000
      [   48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297
      [   48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533
      [   48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050
      [   48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020
      [   48.608228] FS:  00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000
      [   48.610189] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0
      [   48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   48.615038] Call Trace:
      [   48.615635]  tipc_enable_bearer+0x415/0x5e0 [tipc]
      [   48.620623]  tipc_nl_bearer_enable+0x1ab/0x200 [tipc]
      [   48.625118]  genl_family_rcv_msg+0x36b/0x570
      [   48.631233]  genl_rcv_msg+0x5a/0xa0
      [   48.631867]  netlink_rcv_skb+0x1cc/0x220
      [   48.636373]  genl_rcv+0x24/0x40
      [   48.637306]  netlink_unicast+0x29c/0x350
      [   48.639664]  netlink_sendmsg+0x439/0x590
      [   48.642014]  SYSC_sendto+0x199/0x250
      [   48.649912]  do_syscall_64+0xfd/0x2c0
      [   48.650651]  entry_SYSCALL64_slow_path+0x25/0x25
      [   48.651843] RIP: 0033:0x7fc6859848e3
      [   48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3
      [   48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003
      [   48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c
      [   48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010
      [   48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000
      [   48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668
      [   48.664073] CR2: 0000000000001008
      [   48.664576] ---[ end trace e811818d54d5ce88 ]---
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      642a8439
    • Tommi Rantala's avatar
      tipc: error path leak fixes in tipc_enable_bearer() · 19142551
      Tommi Rantala authored
      Fix memory leak in tipc_enable_bearer() if enable_media() fails, and
      cleanup with bearer_disable() if tipc_mon_create() fails.
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      19142551
    • Avinash Repaka's avatar
      RDS: Check cmsg_len before dereferencing CMSG_DATA · 14e138a8
      Avinash Repaka authored
      RDS currently doesn't check if the length of the control message is
      large enough to hold the required data, before dereferencing the control
      message data. This results in following crash:
      
      BUG: KASAN: stack-out-of-bounds in rds_rdma_bytes net/rds/send.c:1013
      [inline]
      BUG: KASAN: stack-out-of-bounds in rds_sendmsg+0x1f02/0x1f90
      net/rds/send.c:1066
      Read of size 8 at addr ffff8801c928fb70 by task syzkaller455006/3157
      
      CPU: 0 PID: 3157 Comm: syzkaller455006 Not tainted 4.15.0-rc3+ #161
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:252
       kasan_report_error mm/kasan/report.c:351 [inline]
       kasan_report+0x25b/0x340 mm/kasan/report.c:409
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
       rds_rdma_bytes net/rds/send.c:1013 [inline]
       rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066
       sock_sendmsg_nosec net/socket.c:628 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:638
       ___sys_sendmsg+0x320/0x8b0 net/socket.c:2018
       __sys_sendmmsg+0x1ee/0x620 net/socket.c:2108
       SYSC_sendmmsg net/socket.c:2139 [inline]
       SyS_sendmmsg+0x35/0x60 net/socket.c:2134
       entry_SYSCALL_64_fastpath+0x1f/0x96
      RIP: 0033:0x43fe49
      RSP: 002b:00007fffbe244ad8 EFLAGS: 00000217 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fe49
      RDX: 0000000000000001 RSI: 000000002020c000 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004017b0
      R13: 0000000000401840 R14: 0000000000000000 R15: 0000000000000000
      
      To fix this, we verify that the cmsg_len is large enough to hold the
      data to be read, before proceeding further.
      Reported-by: default avatarsyzbot <syzkaller-bugs@googlegroups.com>
      Signed-off-by: default avatarAvinash Repaka <avinash.repaka@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Reviewed-by: default avatarYuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      14e138a8