1. 26 Oct, 2017 1 commit
    • Tejun Heo's avatar
      cgroup, sched: Move basic cpu stats from cgroup.stat to cpu.stat · d41bf8c9
      Tejun Heo authored
      The basic cpu stat is currently shown with "cpu." prefix in
      cgroup.stat, and the same information is duplicated in cpu.stat when
      cpu controller is enabled.  This is ugly and not very scalable as we
      want to expand the coverage of stat information which is always
      available.
      
      This patch makes cgroup core always create "cpu.stat" file and show
      the basic cpu stat there and calls the cpu controller to show the
      extra stats when enabled.  This ensures that the same information
      isn't presented in multiple places and makes future expansion of basic
      stats easier.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      d41bf8c9
  2. 29 Sep, 2017 2 commits
    • Tejun Heo's avatar
      sched: Implement interface for cgroup unified hierarchy · 0d593634
      Tejun Heo authored
      There are a couple interface issues which can be addressed in cgroup2
      interface.
      
      * Stats from cpuacct being reported separately from the cpu stats.
      
      * Use of different time units.  Writable control knobs use
        microseconds, some stat fields use nanoseconds while other cpuacct
        stat fields use centiseconds.
      
      * Control knobs which can't be used in the root cgroup still show up
        in the root.
      
      * Control knob names and semantics aren't consistent with other
        controllers.
      
      This patchset implements cpu controller's interface on cgroup2 which
      adheres to the controller file conventions described in
      Documentation/cgroups/cgroup-v2.txt.  Overall, the following changes
      are made.
      
      * cpuacct is implictly enabled and disabled by cpu and its information
        is reported through "cpu.stat" which now uses microseconds for all
        time durations.  All time duration fields now have "_usec" appended
        to them for clarity.
      
        Note that cpuacct.usage_percpu is currently not included in
        "cpu.stat".  If this information is actually called for, it will be
        added later.
      
      * "cpu.shares" is replaced with "cpu.weight" and operates on the
        standard scale defined by CGROUP_WEIGHT_MIN/DFL/MAX (1, 100, 10000).
        The weight is scaled to scheduler weight so that 100 maps to 1024
        and the ratio relationship is preserved - if weight is W and its
        scaled value is S, W / 100 == S / 1024.  While the mapped range is a
        bit smaller than the orignal scheduler weight range, the dead zones
        on both sides are relatively small and covers wider range than the
        nice value mappings.  This file doesn't make sense in the root
        cgroup and isn't created on root.
      
      * "cpu.weight.nice" is added. When read, it reads back the nice value
        which is closest to the current "cpu.weight".  When written, it sets
        "cpu.weight" to the weight value which matches the nice value.  This
        makes it easy to configure cgroups when they're competing against
        threads in threaded subtrees.
      
      * "cpu.cfs_quota_us" and "cpu.cfs_period_us" are replaced by "cpu.max"
        which contains both quota and period.
      
      v4: - Use cgroup2 basic usage stat as the information source instead
            of cpuacct.
      
      v3: - Added "cpu.weight.nice" to allow using nice values when
            configuring the weight.  The feature is requested by PeterZ.
          - Merge the patch to enable threaded support on cpu and cpuacct.
          - Dropped the bits about getting rid of cpuacct from patch
            description as there is a pretty strong case for making cpuacct
            an implicit controller so that basic cpu usage stats are always
            available.
          - Documentation updated accordingly.  "cpu.rt.max" section is
            dropped for now.
      
      v2: - cpu_stats_show() was incorrectly using CONFIG_FAIR_GROUP_SCHED
            for CFS bandwidth stats and also using raw division for u64.
            Use CONFIG_CFS_BANDWITH and do_div() instead.  "cpu.rt.max" is
            not included yet.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      0d593634
    • Tejun Heo's avatar
      sched: Misc preps for cgroup unified hierarchy interface · a1f7164c
      Tejun Heo authored
      Make the following changes in preparation for the cpu controller
      interface implementation for cgroup2.  This patch doesn't cause any
      functional differences.
      
      * s/cpu_stats_show()/cpu_cfs_stat_show()/
      
      * s/cpu_files/cpu_legacy_files/
      
      v2: Dropped cpuacct changes as it won't be used by cpu controller
          interface anymore.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      a1f7164c
  3. 25 Sep, 2017 5 commits
    • Tejun Heo's avatar
      sched/cputime: Add dummy cputime_adjust() implementation for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE · 8157a7fa
      Tejun Heo authored
      cfb766da ("sched/cputime: Expose cputime_adjust()") made
      cputime_adjust() public for cgroup basic cpu stat support; however,
      the commit forgot to add a dummy implementaiton for
      CONFIG_VIRT_CPU_ACCOUNTING_NATIVE leading to compiler errors on some
      s390 configurations.
      
      Fix it by adding the missing dummy implementation.
      Reported-by: default avatar“kbuild-all@01.org” <kbuild-all@01.org>
      Fixes: cfb766da ("sched/cputime: Expose cputime_adjust()")
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      8157a7fa
    • Tejun Heo's avatar
      cgroup: statically initialize init_css_set->dfl_cgrp · 38683148
      Tejun Heo authored
      Like other csets, init_css_set's dfl_cgrp is initialized when the cset
      gets linked.  init_css_set gets linked in cgroup_init().  This has
      been fine till now but the recently added basic CPU usage accounting
      may end up accessing dfl_cgrp of init before cgroup_init() leading to
      the following oops.
      
        SELinux:  Initializing.
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
        IP: account_system_index_time+0x60/0x90
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc2-00003-g041cd640 #10
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
        +1.9.3-20161025_171302-gandalf 04/01/2014
        task: ffffffff81e10480 task.stack: ffffffff81e00000
        RIP: 0010:account_system_index_time+0x60/0x90
        RSP: 0000:ffff880011e03cb8 EFLAGS: 00010002
        RAX: ffffffff81ef8800 RBX: ffffffff81e10480 RCX: 0000000000000003
        RDX: 0000000000000000 RSI: 00000000000f4240 RDI: 0000000000000000
        RBP: ffff880011e03cc0 R08: 0000000000010000 R09: 0000000000000000
        R10: 0000000000000020 R11: 0000003b9aca0000 R12: 000000000001c100
        R13: 0000000000000000 R14: ffffffff81e10480 R15: ffffffff81e03cd8
        FS:  0000000000000000(0000) GS:ffff880011e00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000b0 CR3: 0000000001e09000 CR4: 00000000000006b0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         <IRQ>
         account_system_time+0x45/0x60
         account_process_tick+0x5a/0x140
         update_process_times+0x22/0x60
         tick_periodic+0x2b/0x90
         tick_handle_periodic+0x25/0x70
         timer_interrupt+0x15/0x20
         __handle_irq_event_percpu+0x7e/0x1b0
         handle_irq_event_percpu+0x23/0x60
         handle_irq_event+0x42/0x70
         handle_level_irq+0x83/0x100
         handle_irq+0x6f/0x110
         do_IRQ+0x46/0xd0
         common_interrupt+0x9d/0x9d
      
      Fix it by statically initializing init_css_set.dfl_cgrp so that init's
      default cgroup is accessible from the get-go.
      
      Fixes: 041cd640 ("cgroup: Implement cgroup2 basic CPU usage accounting")
      Reported-by: default avatar“kbuild-all@01.org” <kbuild-all@01.org>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      38683148
    • Tejun Heo's avatar
      cgroup: Implement cgroup2 basic CPU usage accounting · 041cd640
      Tejun Heo authored
      In cgroup1, while cpuacct isn't actually controlling any resources, it
      is a separate controller due to combination of two factors -
      1. enabling cpu controller has significant side effects, and 2. we
      have to pick one of the hierarchies to account CPU usages on.  cpuacct
      controller is effectively used to designate a hierarchy to track CPU
      usages on.
      
      cgroup2's unified hierarchy removes the second reason and we can
      account basic CPU usages by default.  While we can use cpuacct for
      this purpose, both its interface and implementation leave a lot to be
      desired - it collects and exposes two sources of truth which don't
      agree with each other and some of the exposed statistics don't make
      much sense.  Also, it propagates all the way up the hierarchy on each
      accounting event which is unnecessary.
      
      This patch adds basic resource accounting mechanism to cgroup2's
      unified hierarchy and accounts CPU usages using it.
      
      * All accountings are done per-cpu and don't propagate immediately.
        It just bumps the per-cgroup per-cpu counters and links to the
        parent's updated list if not already on it.
      
      * On a read, the per-cpu counters are collected into the global ones
        and then propagated upwards.  Only the per-cpu counters which have
        changed since the last read are propagated.
      
      * CPU usage stats are collected and shown in "cgroup.stat" with "cpu."
        prefix.  Total usage is collected from scheduling events.  User/sys
        breakdown is sourced from tick sampling and adjusted to the usage
        using cputime_adjust().
      
      This keeps the accounting side hot path O(1) and per-cpu and the read
      side O(nr_updated_since_last_read).
      
      v2: Minor changes and documentation updates as suggested by Waiman and
          Roman.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Roman Gushchin <guro@fb.com>
      041cd640
    • Tejun Heo's avatar
      cpuacct: Introduce cgroup_account_cputime[_field]() · d2cc5ed6
      Tejun Heo authored
      Introduce cgroup_account_cputime[_field]() which wrap cpuacct_charge()
      and cgroup_account_field().  This doesn't introduce any functional
      changes and will be used to add cgroup basic resource accounting.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      d2cc5ed6
    • Tejun Heo's avatar
      sched/cputime: Expose cputime_adjust() · cfb766da
      Tejun Heo authored
      Will be used by basic cgroup resource stat reporting later.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      cfb766da
  4. 24 Sep, 2017 14 commits
  5. 23 Sep, 2017 18 commits
    • Linus Torvalds's avatar
      Merge branch 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · cd4175b1
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
      
       - Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert
         multiple byte-accesses into one word-access.
      
       - Add missing HWPOISON page fault handler code. I completely missed
         that when I added HWPOISON support during this merge window and it
         only showed up now with the madvise07 LTP test case.
      
       - Fix backtrace unwinding to stop when stack start has been reached.
      
       - Issue warning if initrd has been loaded into memory regions with
         broken RAM modules.
      
       - Fix HPMC handler (parisc hardware fault handler) to comply with
         architecture specification.
      
       - Avoid compiler warnings about too large frame sizes.
      
       - Minor init-section fixes.
      
      * 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Unbreak bootloader due to gcc-7 optimizations
        parisc: Reintroduce option to gzip-compress the kernel
        parisc: Add HWPOISON page fault handler code
        parisc: Move init_per_cpu() into init section
        parisc: Check if initrd was loaded into broken RAM
        parisc: Add PDCE_CHECK instruction to HPMC handler
        parisc: Add wrapper for pdc_instr() firmware function
        parisc: Move start_parisc() into init section
        parisc: Stop unwinding at start of stack
        parisc: Fix too large frame size warnings
      cd4175b1
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · ded85032
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
      
       - Smattering of miscellanous fixes
      
       - A five patch series for i40iw that had a patch (5/5) that was larger
         than I would like, but I took it because it's needed for large scale
         users
      
       - An 8 patch series for bnxt_re that landed right as I was leaving on
         PTO and so had to wait until now...they are all appropriate fixes for
         -rc IMO
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (22 commits)
        bnxt_re: Don't issue cmd to delete GID for QP1 GID entry before the QP is destroyed
        bnxt_re: Fix memory leak in FRMR path
        bnxt_re: Remove RTNL lock dependency in bnxt_re_query_port
        bnxt_re: Fix race between the netdev register and unregister events
        bnxt_re: Free up devices in module_exit path
        bnxt_re: Fix compare and swap atomic operands
        bnxt_re: Stop issuing further cmds to FW once a cmd times out
        bnxt_re: Fix update of qplib_qp.mtu when modified
        i40iw: Add support for port reuse on active side connections
        i40iw: Add missing VLAN priority
        i40iw: Call i40iw_cm_disconn on modify QP to disconnect
        i40iw: Prevent multiple netdev event notifier registrations
        i40iw: Fail open if there are no available MSI-X vectors
        RDMA/vmw_pvrdma: Fix reporting correct opcodes for completion
        IB/bnxt_re: Fix frame stack compilation warning
        IB/mlx5: fix debugfs cleanup
        IB/ocrdma: fix incorrect fall-through on switch statement
        IB/ipoib: Suppress the retry related completion errors
        iw_cxgb4: remove the stid on listen create failure
        iw_cxgb4: drop listen destroy replies if no ep found
        ...
      ded85032
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 71aa60f6
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix NAPI poll list corruption in enic driver, from Christian
          Lamparter.
      
       2) Fix route use after free, from Eric Dumazet.
      
       3) Fix regression in reuseaddr handling, from Josef Bacik.
      
       4) Assert the size of control messages in compat handling since we copy
          it in from userspace twice. From Meng Xu.
      
       5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
          from Ursula Braun.
      
       6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.
      
       7) Don't use ARRAY_SIZE on spinlock array which might have zero
          entries, from Geert Uytterhoeven.
      
       8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
          Kasiviswanathan.
      
       9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
          Long.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
        inet: fix improper empty comparison
        net: use inet6_rcv_saddr to compare sockets
        net: set tb->fast_sk_family
        net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
        MAINTAINERS: update git tree locations for ieee802154 subsystem
        net: prevent dst uses after free
        net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
        net/smc: no close wait in case of process shut down
        net/smc: introduce a delay
        net/smc: terminate link group if out-of-sync is received
        net/smc: longer delay for client link group removal
        net/smc: adapt send request completion notification
        net/smc: adjust net_device refcount
        net/smc: take RCU read lock for routing cache lookup
        net/smc: add receive timeout check
        net/smc: add missing dev_put
        net: stmmac: Cocci spatch "of_table"
        lan78xx: Use default values loaded from EEPROM/OTP after reset
        lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
        lan78xx: Fix for eeprom read/write when device auto suspend
        ...
      71aa60f6
    • Linus Torvalds's avatar
      Merge tag 'apparmor-pr-2017-09-22' of... · 79444df4
      Linus Torvalds authored
      Merge tag 'apparmor-pr-2017-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
      
      Pull apparmor updates from John Johansen:
       "This is the apparmor pull request, similar to SELinux and seccomp.
      
        It's the same series that I was sent to James' security tree + one
        regression fix that was found after the series was sent to James and
        would have been sent for v4.14-rc2.
      
        Features:
        - in preparation for secid mapping add support for absolute root view
          based labels
        - add base infastructure for socket mediation
        - add mount mediation
        - add signal mediation
      
        minor cleanups and changes:
        - be defensive, ensure unconfined profiles have dfas initialized
        - add more debug asserts to apparmorfs
        - enable policy unpacking to audit different reasons for failure
        - cleanup conditional check for label in label_print
        - Redundant condition: prev_ns. in [label.c:1498]
      
        Bug Fixes:
        - fix regression in apparmorfs DAC access permissions
        - fix build failure on sparc caused by undeclared signals
        - fix sparse report of incorrect type assignment when freeing label proxies
        - fix race condition in null profile creation
        - Fix an error code in aafs_create()
        - Fix logical error in verify_header()
        - Fix shadowed local variable in unpack_trans_table()"
      
      * tag 'apparmor-pr-2017-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
        apparmor: fix apparmorfs DAC access permissions
        apparmor: fix build failure on sparc caused by undeclared signals
        apparmor: fix incorrect type assignment when freeing proxies
        apparmor: ensure unconfined profiles have dfas initialized
        apparmor: fix race condition in null profile creation
        apparmor: move new_null_profile to after profile lookup fns()
        apparmor: add base infastructure for socket mediation
        apparmor: add more debug asserts to apparmorfs
        apparmor: make policy_unpack able to audit different info messages
        apparmor: add support for absolute root view based labels
        apparmor: cleanup conditional check for label in label_print
        apparmor: add mount mediation
        apparmor: add the ability to mediate signals
        apparmor: Redundant condition: prev_ns. in [label.c:1498]
        apparmor: Fix an error code in aafs_create()
        apparmor: Fix logical error in verify_header()
        apparmor: Fix shadowed local variable in unpack_trans_table()
      79444df4
    • Josh Poimboeuf's avatar
      x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Josh Poimboeuf authored
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp))
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f5caf621
    • Josh Poimboeuf's avatar
      objtool: Handle another GCC stack pointer adjustment bug · 0d0970ee
      Josh Poimboeuf authored
      The kbuild bot reported the following warning with GCC 4.4 and a
      randconfig:
      
        net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0
      
      This is caused by another GCC non-optimization, where it backs up and
      restores the stack pointer for no apparent reason:
      
          2f91:       48 89 e0                mov    %rsp,%rax
          2f94:       4c 89 e7                mov    %r12,%rdi
          2f97:       4c 89 f6                mov    %r14,%rsi
          2f9a:       ba 20 00 00 00          mov    $0x20,%edx
          2f9f:       48 89 c4                mov    %rax,%rsp
      
      This issue would have been happily ignored before the following commit:
      
        dd88a0a0 ("objtool: Handle GCC stack pointer adjustment bug")
      
      But now that objtool is paying attention to such stack pointer writes
      to/from a register, it needs to understand them properly.  In this case
      that means recognizing that the "mov %rsp, %rax" instruction is
      potentially a backup of the stack pointer.
      Reported-by: default avatarkbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: dd88a0a0 ("objtool: Handle GCC stack pointer adjustment bug")
      Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0d0970ee
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · c65da8e2
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix the initialization of resources in the ACPI WDAT watchdog
        driver, a recent regression in the ACPI device properties handling, a
        recent change in behavior causing the ACPI_HANDLE() macro to only work
        for GPL code and create a MAINTAINERS entry for ACPI PMIC drivers in
        order to specify the official reviewers for that code.
      
        Specifics:
      
         - Fix the initialization of resources in the ACPI WDAT watchdog
           driver that uses unititialized memory which causes compiler
           warnings to be triggered (Arnd Bergmann).
      
         - Fix a recent regression in the ACPI device properties handling that
           causes some device properties data to be skipped during enumeration
           (Sakari Ailus).
      
         - Fix a recent change in behavior that caused the ACPI_HANDLE() macro
           to stop working for non-GPL code which is a problem for the NVidia
           binary graphics driver, for example (John Hubbard).
      
         - Add a MAINTAINERS entry for the ACPI PMIC drivers to specify the
           official reviewers for that code (Rafael Wysocki)"
      
      * tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly
        ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again
        ACPI / watchdog: properly initialize resources
        ACPI / PMIC: Add code reviewers to MAINTAINERS
      c65da8e2
    • David S. Miller's avatar
      Merge branch 'net-fix-reuseaddr-regression' · 4e683f49
      David S. Miller authored
      Josef Bacik says:
      
      ====================
      net: fix reuseaddr regression
      
      I introduced a regression when reworking the fastreuse port stuff that allows
      bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
      The root cause is I reversed an if statement which caused us to set the tb as if
      there were no owners on the socket if there were, which obviously is not
      correct.
      
      Dave could you please queue these changes up for -stable, I've run them through
      the net tests and added another test to check for this problem specifically.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e683f49
    • Josef Bacik's avatar
      inet: fix improper empty comparison · fbed24bc
      Josef Bacik authored
      When doing my reuseport rework I screwed up and changed a
      
      if (hlist_empty(&tb->owners))
      
      to
      
      if (!hlist_empty(&tb->owners))
      
      This is obviously bad as all of the reuseport/reuse logic was reversed,
      which caused weird problems like allowing an ipv4 bind conflict if we
      opened an ipv4 only socket on a port followed by an ipv6 only socket on
      the same port.
      
      Fixes: b9470c27 ("inet: kill smallest_size and smallest_port")
      Reported-by: default avatarCole Robinson <crobinso@redhat.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fbed24bc
    • Josef Bacik's avatar
      net: use inet6_rcv_saddr to compare sockets · 7a56673b
      Josef Bacik authored
      In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the
      ipv6 compare with the fast socket information to make sure we're doing
      the proper comparisons.
      
      Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk")
      Reported-and-tested-by: default avatarCole Robinson <crobinso@redhat.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7a56673b
    • Josef Bacik's avatar
      net: set tb->fast_sk_family · cbb2fb5c
      Josef Bacik authored
      We need to set the tb->fast_sk_family properly so we can use the proper
      comparison function for all subsequent reuseport bind requests.
      
      Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk")
      Reported-and-tested-by: default avatarCole Robinson <crobinso@redhat.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cbb2fb5c
    • Willem de Bruijn's avatar
      net: orphan frags on stand-alone ptype in dev_queue_xmit_nit · 581fe0ea
      Willem de Bruijn authored
      Zerocopy skbs frags are copied when the skb is looped to a local sock.
      Commit 1080e512 ("net: orphan frags on receive") introduced calls
      to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
      
      With msg_zerocopy, these skbs can also exist in the tx path and thus
      loop from dev_queue_xmit_nit. This already calls deliver_skb in its
      loop. But it does not orphan before a separate pt_prev->func().
      
      Add the missing skb_orphan_frags_rx.
      
      Changes
        v1->v2: handle skb_orphan_frags_rx failure
      
      Fixes: 1f8b977a ("sock: enable MSG_ZEROCOPY")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      581fe0ea
    • Linus Torvalds's avatar
      Merge tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 6876eb37
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a cpufreq regression introduced by recent changes related to
        the generic DT driver, an initialization time memory leak in cpuidle
        on ARM, a PM core bug that may cause system suspend/resume to fail on
        some systems, a request type validation issue in the PM QoS framework
        and two documentation-related issues.
      
        Specifics:
      
         - Fix a regression in cpufreq on systems using DT as the source of
           CPU configuration information where two different code paths
           attempt to create the cpufreq-dt device object (there can be only
           one) and fix up the "compatible" matching for some TI platforms on
           top of that (Viresh Kumar, Dave Gerlach).
      
         - Fix an initialization time memory leak in cpuidle on ARM which
           occurs if the cpuidle driver initialization fails (Stefan Wahren).
      
         - Fix a PM core function that checks whether or not there are any
           system suspend/resume callbacks for a device, but forgets to check
           legacy callbacks which then may be skipped incorrectly and the
           system may crash and/or the device may become unusable after a
           suspend-resume cycle (Rafael Wysocki).
      
         - Fix request type validation for latency tolerance PM QoS requests
           which may lead to unexpected behavior (Jan Schönherr).
      
         - Fix a broken link to PM documentation from a header file and a typo
           in a PM document (Geert Uytterhoeven, Rafael Wysocki)"
      
      * tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: ti-cpufreq: Support additional am43xx platforms
        ARM: cpuidle: Avoid memleak if init fail
        cpufreq: dt-platdev: Add some missing platforms to the blacklist
        PM: core: Fix device_pm_check_callbacks()
        PM: docs: Drop an excess character from devices.rst
        PM / QoS: Use the correct variable to check the QoS request type
        driver core: Fix link to device power management documentation
      6876eb37
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · d32e5f44
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
      
       - fixes for two long standing issues (lock up and a crash) in force
         feedback handling in uinput driver
      
       - tweak to firmware update timing in Elan I2C touchpad driver.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: elan_i2c - extend Flash-Write delay
        Input: uinput - avoid crash when sending FF request to device going away
        Input: uinput - avoid FF flush when destroying device
      d32e5f44
    • Linus Torvalds's avatar
      Merge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · c0a3a64e
      Linus Torvalds authored
      Pull seccomp updates from Kees Cook:
       "Major additions:
      
         - sysctl and seccomp operation to discover available actions
           (tyhicks)
      
         - new per-filter configurable logging infrastructure and sysctl
           (tyhicks)
      
         - SECCOMP_RET_LOG to log allowed syscalls (tyhicks)
      
         - SECCOMP_RET_KILL_PROCESS as the new strictest possible action
      
         - self-tests for new behaviors"
      
      [ This is the seccomp part of the security pull request during the merge
        window that was nixed due to unrelated problems   - Linus ]
      
      * tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        samples: Unrename SECCOMP_RET_KILL
        selftests/seccomp: Test thread vs process killing
        seccomp: Implement SECCOMP_RET_KILL_PROCESS action
        seccomp: Introduce SECCOMP_RET_KILL_PROCESS
        seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
        seccomp: Action to log before allowing
        seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
        seccomp: Selftest for detection of filter flag support
        seccomp: Sysctl to configure actions that are allowed to be logged
        seccomp: Operation for checking if an action is available
        seccomp: Sysctl to display available actions
        seccomp: Provide matching filter for introspection
        selftests/seccomp: Refactor RET_ERRNO tests
        selftests/seccomp: Add simple seccomp overhead benchmark
        selftests/seccomp: Add tests for basic ptrace actions
      c0a3a64e
    • Linus Torvalds's avatar
      Merge tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of... · 69c902f5
      Linus Torvalds authored
      Merge tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6
      
      Pull cifs fixes from Steve French:
       "Various SMB3 fixes for stable and security improvements from the
        recently completed SMB3/Samba test events
      
      * tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6:
        SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
        SMB3: handle new statx fields
        SMB: Validate negotiate (to protect against downgrade) even if signing off
        cifs: release auth_key.response for reconnect.
        cifs: release cifs root_cred after exit_cifs
        CIFS: make arrays static const, reduces object code size
        [SMB3] Update session and share information displayed for debugging SMB2/SMB3
        cifs: show 'soft' in the mount options for hard mounts
        SMB3: Warn user if trying to sign connection that authenticated as guest
        SMB3: Fix endian warning
        Fix SMB3.1.1 guest authentication to Samba
      69c902f5
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client · b03fcfae
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Two small but important fixes: RADOS semantic change in upcoming v12.2.1
        release and a rare NULL dereference in create_session_open_msg()"
      
      * tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client:
        ceph: avoid panic in create_session_open_msg() if utsname() returns NULL
        libceph: don't allow bidirectional swap of pg-upmap-items
      b03fcfae
    • Stefan Schmidt's avatar
      MAINTAINERS: update git tree locations for ieee802154 subsystem · b9b95da9
      Stefan Schmidt authored
      Patches for ieee802154 will go through my new trees towards netdev from
      now on. The 6LoWPAN subsystem will stay as is (shared between ieee802154
      and bluetooth) and go through the bluetooth tree as usual.
      Signed-off-by: default avatarStefan Schmidt <stefan@osg.samsung.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9b95da9