1. 18 Dec, 2017 16 commits
  2. 17 Dec, 2017 18 commits
    • Josef Bacik's avatar
      trace: reenable preemption if we modify the ip · 46df3d20
      Josef Bacik authored
      Things got moved around between the original bpf_override_return patches
      and the final version, and now the ftrace kprobe dispatcher assumes if
      you modified the ip that you also enabled preemption.  Make a comment of
      this and enable preemption, this fixes the lockdep splat that happened
      when using this feature.
      
      Fixes: 9802d865 ("bpf: add a bpf_override_function helper")
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      46df3d20
    • Jakub Kicinski's avatar
      nfp: set flags in the correct member of netdev_bpf · 4a29c0db
      Jakub Kicinski authored
      netdev_bpf.flags is the input member for installing the program.
      netdev_bpf.prog_flags is the output member for querying.  Set
      the correct one on query.
      
      Fixes: 92f0292b ("net: xdp: report flags program was installed with on query")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarQuentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      4a29c0db
    • Jakub Kicinski's avatar
      libbpf: fix Makefile exit code if libelf not found · 21567ede
      Jakub Kicinski authored
      /bin/sh's exit does not recognize -1 as a number, leading to
      the following error message:
      
      /bin/sh: 1: exit: Illegal number: -1
      
      Use 1 as the exit code.
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarQuentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      21567ede
    • Daniel Borkmann's avatar
      Merge branch 'bpf-to-bpf-function-calls' · ef9fde06
      Daniel Borkmann authored
      Alexei Starovoitov says:
      
      ====================
      First of all huge thank you to Daniel, John, Jakub, Edward and others who
      reviewed multiple iterations of this patch set over the last many months
      and to Dave and others who gave critical feedback during netconf/netdev.
      
      The patch is solid enough and we thought through numerous corner cases,
      but it's not the end. More followups with code reorg and features to follow.
      
      TLDR: Allow arbitrary function calls from bpf function to another bpf function.
      
      Since the beginning of bpf all bpf programs were represented as a single function
      and program authors were forced to use always_inline for all functions
      in their C code. That was causing llvm to unnecessary inflate the code size
      and forcing developers to move code to header files with little code reuse.
      
      With a bit of additional complexity teach verifier to recognize
      arbitrary function calls from one bpf function to another as long as
      all of functions are presented to the verifier as a single bpf program.
      Extended program layout:
      ..
      r1 = ..    // arg1
      r2 = ..    // arg2
      call pc+1  // function call pc-relative
      exit
      .. = r1    // access arg1
      .. = r2    // access arg2
      ..
      call pc+20 // second level of function call
      ...
      
      It allows for better optimized code and finally allows to introduce
      the core bpf libraries that can be reused in different projects,
      since programs are no longer limited by single elf file.
      With function calls bpf can be compiled into multiple .o files.
      
      This patch is the first step. It detects programs that contain
      multiple functions and checks that calls between them are valid.
      It splits the sequence of bpf instructions (one program) into a set
      of bpf functions that call each other. Calls to only known
      functions are allowed. Since all functions are presented to
      the verifier at once conceptually it is 'static linking'.
      
      Future plans:
      - introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions
        to be loaded into the kernel that can be later linked to other
        programs with concrete program types. Aka 'dynamic linking'.
      
      - introduce function pointer type and indirect calls to allow
        bpf functions call other dynamically loaded bpf functions while
        the caller bpf function is already executing. Aka 'runtime linking'.
        This will be more generic and more flexible alternative
        to bpf_tail_calls.
      
      FAQ:
      Q: Interpreter and JIT changes mean that new instruction is introduced ?
      A: No. The call instruction technically stays the same. Now it can call
         both kernel helpers and other bpf functions.
         Calling convention stays the same as well.
         From uapi point of view the call insn got new 'relocation' BPF_PSEUDO_CALL
         similar to BPF_PSEUDO_MAP_FD 'relocation' of bpf_ldimm64 insn.
      
      Q: What had to change on LLVM side?
      A: Trivial LLVM patch to allow calls was applied to upcoming 6.0 release:
         https://reviews.llvm.org/rL318614
         with few bugfixes as well.
         Make sure to build the latest llvm to have bpf_call support.
      
      More details in the patches.
      ====================
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      ef9fde06
    • Daniel Borkmann's avatar
      selftests/bpf: additional bpf_call tests · 28ab173e
      Daniel Borkmann authored
      Add some additional checks for few more corner cases.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      28ab173e
    • Alexei Starovoitov's avatar
      bpf: arm64: add JIT support for multi-function programs · db496944
      Alexei Starovoitov authored
      similar to x64 add support for bpf-to-bpf calls.
      When program has calls to in-kernel helpers the target call offset
      is known at JIT time and arm64 architecture needs 2 passes.
      With bpf-to-bpf calls the dynamically allocated function start
      is unknown until all functions of the program are JITed.
      Therefore (just like x64) arm64 JIT needs one extra pass over
      the program to emit correct call offsets.
      
      Implementation detail:
      Avoid being too clever in 64-bit immediate moves and
      always use 4 instructions (instead of 3-4 depending on the address)
      to make sure only one extra pass is needed.
      If some future optimization would make it worth while to optimize
      'call 64-bit imm' further, the JIT would need to do 4 passes
      over the program instead of 3 as in this patch.
      For typical bpf program address the mov needs 3 or 4 insns,
      so unconditional 4 insns to save extra pass is a worthy trade off
      at this state of JIT.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      db496944
    • Alexei Starovoitov's avatar
      bpf: x64: add JIT support for multi-function programs · 1c2a088a
      Alexei Starovoitov authored
      Typical JIT does several passes over bpf instructions to
      compute total size and relative offsets of jumps and calls.
      With multitple bpf functions calling each other all relative calls
      will have invalid offsets intially therefore we need to additional
      last pass over the program to emit calls with correct offsets.
      For example in case of three bpf functions:
      main:
        call foo
        call bpf_map_lookup
        exit
      foo:
        call bar
        exit
      bar:
        exit
      
      We will call bpf_int_jit_compile() indepedently for main(), foo() and bar()
      x64 JIT typically does 4-5 passes to converge.
      After these initial passes the image for these 3 functions
      will be good except call targets, since start addresses of
      foo() and bar() are unknown when we were JITing main()
      (note that call bpf_map_lookup will be resolved properly
      during initial passes).
      Once start addresses of 3 functions are known we patch
      call_insn->imm to point to right functions and call
      bpf_int_jit_compile() again which needs only one pass.
      Additional safety checks are done to make sure this
      last pass doesn't produce image that is larger or smaller
      than previous pass.
      
      When constant blinding is on it's applied to all functions
      at the first pass, since doing it once again at the last
      pass can change size of the JITed code.
      
      Tested on x64 and arm64 hw with JIT on/off, blinding on/off.
      x64 jits bpf-to-bpf calls correctly while arm64 falls back to interpreter.
      All other JITs that support normal BPF_CALL will behave the same way
      since bpf-to-bpf call is equivalent to bpf-to-kernel call from
      JITs point of view.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      1c2a088a
    • Alexei Starovoitov's avatar
      bpf: fix net.core.bpf_jit_enable race · 60b58afc
      Alexei Starovoitov authored
      global bpf_jit_enable variable is tested multiple times in JITs,
      blinding and verifier core. The malicious root can try to toggle
      it while loading the programs. This race condition was accounted
      for and there should be no issues, but it's safer to avoid
      this race condition.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      60b58afc
    • Alexei Starovoitov's avatar
      bpf: add support for bpf_call to interpreter · 1ea47e01
      Alexei Starovoitov authored
      though bpf_call is still the same call instruction and
      calling convention 'bpf to bpf' and 'bpf to helper' is the same
      the interpreter has to oparate on 'struct bpf_insn *'.
      To distinguish these two cases add a kernel internal opcode and
      mark call insns with it.
      This opcode is seen by interpreter only. JITs will never see it.
      Also add tiny bit of debug code to aid interpreter debugging.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      1ea47e01
    • Alexei Starovoitov's avatar
      selftests/bpf: add xdp noinline test · b0b04fc4
      Alexei Starovoitov authored
      add large semi-artificial XDP test with 18 functions to stress test
      bpf call verification logic
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      b0b04fc4
    • Alexei Starovoitov's avatar
      selftests/bpf: add bpf_call test · 3bc35c63
      Alexei Starovoitov authored
      strip always_inline from test_l4lb.c and compile it with -fno-inline
      to let verifier go through 11 function with various function arguments
      and return values
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      3bc35c63
    • Alexei Starovoitov's avatar
      libbpf: add support for bpf_call · 48cca7e4
      Alexei Starovoitov authored
      - recognize relocation emitted by llvm
      - since all regular function will be kept in .text section and llvm
        takes care of pc-relative offsets in bpf_call instruction
        simply copy all of .text to relevant program section while adjusting
        bpf_call instructions in program section to point to newly copied
        body of instructions from .text
      - do so for all programs in the elf file
      - set all programs types to the one passed to bpf_prog_load()
      
      Note for elf files with multiple programs that use different
      functions in .text section we need to do 'linker' style logic.
      This work is still TBD
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      48cca7e4
    • Alexei Starovoitov's avatar
      selftests/bpf: add tests for stack_zero tracking · d98588ce
      Alexei Starovoitov authored
      adjust two tests, since verifier got smarter
      and add new one to test stack_zero logic
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      d98588ce
    • Alexei Starovoitov's avatar
      bpf: teach verifier to recognize zero initialized stack · cc2b14d5
      Alexei Starovoitov authored
      programs with function calls are often passing various
      pointers via stack. When all calls are inlined llvm
      flattens stack accesses and optimizes away extra branches.
      When functions are not inlined it becomes the job of
      the verifier to recognize zero initialized stack to avoid
      exploring paths that program will not take.
      The following program would fail otherwise:
      
      ptr = &buffer_on_stack;
      *ptr = 0;
      ...
      func_call(.., ptr, ...) {
        if (..)
          *ptr = bpf_map_lookup();
      }
      ...
      if (*ptr != 0) {
        // Access (*ptr)->field is valid.
        // Without stack_zero tracking such (*ptr)->field access
        // will be rejected
      }
      
      since stack slots are no longer uniform invalid | spill | misc
      add liveness marking to all slots, but do it in 8 byte chunks.
      So if nothing was read or written in [fp-16, fp-9] range
      it will be marked as LIVE_NONE.
      If any byte in that range was read, it will be marked LIVE_READ
      and stacksafe() check will perform byte-by-byte verification.
      If all bytes in the range were written the slot will be
      marked as LIVE_WRITTEN.
      This significantly speeds up state equality comparison
      and reduces total number of states processed.
      
                          before   after
      bpf_lb-DLB_L3.o       2051    2003
      bpf_lb-DLB_L4.o       3287    3164
      bpf_lb-DUNKNOWN.o     1080    1080
      bpf_lxc-DDROP_ALL.o   24980   12361
      bpf_lxc-DUNKNOWN.o    34308   16605
      bpf_netdev.o          15404   10962
      bpf_overlay.o         7191    6679
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      cc2b14d5
    • Alexei Starovoitov's avatar
      selftests/bpf: add verifier tests for bpf_call · a7ff3eca
      Alexei Starovoitov authored
      Add extensive set of tests for bpf_call verification logic:
      
      calls: basic sanity
      calls: using r0 returned by callee
      calls: callee is using r1
      calls: callee using args1
      calls: callee using wrong args2
      calls: callee using two args
      calls: callee changing pkt pointers
      calls: two calls with args
      calls: two calls with bad jump
      calls: recursive call. test1
      calls: recursive call. test2
      calls: unreachable code
      calls: invalid call
      calls: jumping across function bodies. test1
      calls: jumping across function bodies. test2
      calls: call without exit
      calls: call into middle of ld_imm64
      calls: call into middle of other call
      calls: two calls with bad fallthrough
      calls: two calls with stack read
      calls: two calls with stack write
      calls: spill into caller stack frame
      calls: two calls with stack write and void return
      calls: ambiguous return value
      calls: two calls that return map_value
      calls: two calls that return map_value with bool condition
      calls: two calls that return map_value with incorrect bool check
      calls: two calls that receive map_value via arg=ptr_stack_of_caller. test1
      calls: two calls that receive map_value via arg=ptr_stack_of_caller. test2
      calls: two jumps that receive map_value via arg=ptr_stack_of_jumper. test3
      calls: two calls that receive map_value_ptr_or_null via arg. test1
      calls: two calls that receive map_value_ptr_or_null via arg. test2
      calls: pkt_ptr spill into caller stack
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      a7ff3eca
    • Alexei Starovoitov's avatar
      bpf: introduce function calls (verification) · f4d7e40a
      Alexei Starovoitov authored
      Allow arbitrary function calls from bpf function to another bpf function.
      
      To recognize such set of bpf functions the verifier does:
      1. runs control flow analysis to detect function boundaries
      2. proceeds with verification of all functions starting from main(root) function
      It recognizes that the stack of the caller can be accessed by the callee
      (if the caller passed a pointer to its stack to the callee) and the callee
      can store map_value and other pointers into the stack of the caller.
      3. keeps track of the stack_depth of each function to make sure that total
      stack depth is still less than 512 bytes
      4. disallows pointers to the callee stack to be stored into the caller stack,
      since they will be invalid as soon as the callee returns
      5. to reuse all of the existing state_pruning logic each function call
      is considered to be independent call from the verifier point of view.
      The verifier pretends to inline all function calls it sees are being called.
      It stores the callsite instruction index as part of the state to make sure
      that two calls to the same callee from two different places in the caller
      will be different from state pruning point of view
      6. more safety checks are added to liveness analysis
      
      Implementation details:
      . struct bpf_verifier_state is now consists of all stack frames that
        led to this function
      . struct bpf_func_state represent one stack frame. It consists of
        registers in the given frame and its stack
      . propagate_liveness() logic had a premature optimization where
        mark_reg_read() and mark_stack_slot_read() were manually inlined
        with loop iterating over parents for each register or stack slot.
        Undo this optimization to reuse more complex mark_*_read() logic
      . skip_callee() logic is not necessary from safety point of view,
        but without it mark_*_read() markings become too conservative,
        since after returning from the funciton call a read of r6-r9
        will incorrectly propagate the read marks into callee causing
        inefficient pruning later
      . mark_*_read() logic is now aware of control flow which makes it
        more complex. In the future the plan is to rewrite liveness
        to be hierarchical. So that liveness can be done within
        basic block only and control flow will be responsible for
        propagation of liveness information along cfg and between calls.
      . tail_calls and ld_abs insns are not allowed in the programs with
        bpf-to-bpf calls
      . returning stack pointers to the caller or storing them into stack
        frame of the caller is not allowed
      
      Testing:
      . no difference in cilium processed_insn numbers
      . large number of tests follows in next patches
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      f4d7e40a
    • Alexei Starovoitov's avatar
      bpf: introduce function calls (function boundaries) · cc8b0b92
      Alexei Starovoitov authored
      Allow arbitrary function calls from bpf function to another bpf function.
      
      Since the beginning of bpf all bpf programs were represented as a single function
      and program authors were forced to use always_inline for all functions
      in their C code. That was causing llvm to unnecessary inflate the code size
      and forcing developers to move code to header files with little code reuse.
      
      With a bit of additional complexity teach verifier to recognize
      arbitrary function calls from one bpf function to another as long as
      all of functions are presented to the verifier as a single bpf program.
      New program layout:
      r6 = r1    // some code
      ..
      r1 = ..    // arg1
      r2 = ..    // arg2
      call pc+1  // function call pc-relative
      exit
      .. = r1    // access arg1
      .. = r2    // access arg2
      ..
      call pc+20 // second level of function call
      ...
      
      It allows for better optimized code and finally allows to introduce
      the core bpf libraries that can be reused in different projects,
      since programs are no longer limited by single elf file.
      With function calls bpf can be compiled into multiple .o files.
      
      This patch is the first step. It detects programs that contain
      multiple functions and checks that calls between them are valid.
      It splits the sequence of bpf instructions (one program) into a set
      of bpf functions that call each other. Calls to only known
      functions are allowed. In the future the verifier may allow
      calls to unresolved functions and will do dynamic linking.
      This logic supports statically linked bpf functions only.
      
      Such function boundary detection could have been done as part of
      control flow graph building in check_cfg(), but it's cleaner to
      separate function boundary detection vs control flow checks within
      a subprogram (function) into logically indepedent steps.
      Follow up patches may split check_cfg() further, but not check_subprogs().
      
      Only allow bpf-to-bpf calls for root only and for non-hw-offloaded programs.
      These restrictions can be relaxed in the future.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      cc8b0b92
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c30abd5e
      David S. Miller authored
      Three sets of overlapping changes, two in the packet scheduler
      and one in the meson-gxl PHY driver.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c30abd5e
  3. 16 Dec, 2017 4 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f3b5ad89
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "More fixes from testing done on the rc kernel, including more SELinux
        testing. Looking forward, lockdep found regression today in ipoib
        which is still being fixed.
      
        Summary:
      
         - Fix for SELinux on the umad SMI path. Some old hardware does not
           fill the PKey properly exposing another bug in the newer SELinux
           code.
      
         - Check the input port as we can exceed array bounds from this user
           supplied value
      
         - Users are unable to use the hash field support as they want due to
           incorrect checks on the field restrictions, correct that so the
           feature works as intended
      
         - User triggerable oops in the NETLINK_RDMA handler
      
         - cxgb4 driver fix for a bad interaction with CQ flushing in iser
           caused by patches in this merge window, and bad CQ flushing during
           normal close.
      
         - Unbalanced memalloc_noio in ipoib in an error path"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/ipoib: Restore MM behavior in case of tx_ring allocation failure
        iw_cxgb4: only insert drain cqes if wq is flushed
        iw_cxgb4: only clear the ARMED bit if a notification is needed
        RDMA/netlink: Fix general protection fault
        IB/mlx4: Fix RSS hash fields restrictions
        IB/core: Don't enforce PKey security on SMI MADs
        IB/core: Bound check alternate path port number
      f3b5ad89
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · f25e2295
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes for the AT24 I2C eeprom driver and some minor corrections
        for I2C bus drivers"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: piix4: Fix port number check on release
        i2c: stm32: Fix copyrights
        i2c-cht-wc: constify platform_device_id
        eeprom: at24: change nvmem stride to 1
        eeprom: at24: fix I2C device selection for runtime PM
      f25e2295
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · d025fbf1
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "This has two stable bugfixes, one to fix a BUG_ON() when
        nfs_commit_inode() is called with no outstanding commit requests and
        another to fix a race in the SUNRPC receive codepath.
      
        Additionally, there are also fixes for an NFS client deadlock and an
        xprtrdma performance regression.
      
        Summary:
      
        Stable bugfixes:
         - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a
           commit in the case that there were no commit requests.
         - SUNRPC: Fix a race in the receive code path
      
        Other fixes:
         - NFS: Fix a deadlock in nfs client initialization
         - xprtrdma: Fix a performance regression for small IOs"
      
      * tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        SUNRPC: Fix a race in the receive code path
        nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
        xprtrdma: Spread reply processing over more CPUs
        nfs: fix a deadlock in nfs client initialization
      d025fbf1
    • Linus Torvalds's avatar
      Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" · f6f37321
      Linus Torvalds authored
      This reverts commits 5c9d2d5c, c7da82b8, and e7fe7b5c.
      
      We'll probably need to revisit this, but basically we should not
      complicate the get_user_pages_fast() case, and checking the actual page
      table protection key bits will require more care anyway, since the
      protection keys depend on the exact state of the VM in question.
      
      Particularly when doing a "remote" page lookup (ie in somebody elses VM,
      not your own), you need to be much more careful than this was.  Dave
      Hansen says:
      
       "So, the underlying bug here is that we now a get_user_pages_remote()
        and then go ahead and do the p*_access_permitted() checks against the
        current PKRU. This was introduced recently with the addition of the
        new p??_access_permitted() calls.
      
        We have checks in the VMA path for the "remote" gups and we avoid
        consulting PKRU for them. This got missed in the pkeys selftests
        because I did a ptrace read, but not a *write*. I also didn't
        explicitly test it against something where a COW needed to be done"
      
      It's also not entirely clear that it makes sense to check the protection
      key bits at this level at all.  But one possible eventual solution is to
      make the get_user_pages_fast() case just abort if it sees protection key
      bits set, which makes us fall back to the regular get_user_pages() case,
      which then has a vma and can do the check there if we want to.
      
      We'll see.
      
      Somewhat related to this all: what we _do_ want to do some day is to
      check the PAGE_USER bit - it should obviously always be set for user
      pages, but it would be a good check to have back.  Because we have no
      generic way to test for it, we lost it as part of moving over from the
      architecture-specific x86 GUP implementation to the generic one in
      commit e585513b ("x86/mm/gup: Switch GUP to the generic
      get_user_page_fast() implementation").
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f6f37321
  4. 15 Dec, 2017 2 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7a3c296a
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Clamp timeouts to INT_MAX in conntrack, from Jay Elliot.
      
       2) Fix broken UAPI for BPF_PROG_TYPE_PERF_EVENT, from Hendrik
          Brueckner.
      
       3) Fix locking in ieee80211_sta_tear_down_BA_sessions, from Johannes
          Berg.
      
       4) Add missing barriers to ptr_ring, from Michael S. Tsirkin.
      
       5) Don't advertise gigabit in sh_eth when not available, from Thomas
          Petazzoni.
      
       6) Check network namespace when delivering to netlink taps, from Kevin
          Cernekee.
      
       7) Kill a race in raw_sendmsg(), from Mohamed Ghannam.
      
       8) Use correct address in TCP md5 lookups when replying to an incoming
          segment, from Christoph Paasch.
      
       9) Add schedule points to BPF map alloc/free, from Eric Dumazet.
      
      10) Don't allow silly mtu values to be used in ipv4/ipv6 multicast, also
          from Eric Dumazet.
      
      11) Fix SKB leak in tipc, from Jon Maloy.
      
      12) Disable MAC learning on OVS ports of mlxsw, from Yuval Mintz.
      
      13) SKB leak fix in skB_complete_tx_timestamp(), from Willem de Bruijn.
      
      14) Add some new qmi_wwan device IDs, from Daniele Palmas.
      
      15) Fix static key imbalance in ingress qdisc, from Jiri Pirko.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net: qcom/emac: Reduce timeout for mdio read/write
        net: sched: fix static key imbalance in case of ingress/clsact_init error
        net: sched: fix clsact init error path
        ip_gre: fix wrong return value of erspan_rcv
        net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support
        pkt_sched: Remove TC_RED_OFFLOADED from uapi
        net: sched: Move to new offload indication in RED
        net: sched: Add TCA_HW_OFFLOAD
        net: aquantia: Increment driver version
        net: aquantia: Fix typo in ethtool statistics names
        net: aquantia: Update hw counters on hw init
        net: aquantia: Improve link state and statistics check interval callback
        net: aquantia: Fill in multicast counter in ndev stats from hardware
        net: aquantia: Fill ndev stat couters from hardware
        net: aquantia: Extend stat counters to 64bit values
        net: aquantia: Fix hardware DMA stream overload on large MRRS
        net: aquantia: Fix actual speed capabilities reporting
        sock: free skb in skb_complete_tx_timestamp on error
        s390/qeth: update takeover IPs after configuration change
        s390/qeth: lock IP table while applying takeover changes
        ...
      7a3c296a
    • Linus Torvalds's avatar
      Merge tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c36c7a7c
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some USB fixes for 4.15-rc4.
      
        There is the usual handful gadget/dwc2/dwc3 fixes as always, for
        reported issues. But the most important things in here is the core fix
        from Alan Stern to resolve a nasty security bug (my first attempt is
        reverted, Alan's was much cleaner), as well as a number of usbip fixes
        from Shuah Khan to resolve those reported security issues.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: core: prevent malicious bNumInterfaces overflow
        Revert "USB: core: only clean up what we allocated"
        USB: core: only clean up what we allocated
        Revert "usb: gadget: allow to enable legacy drivers without USB_ETH"
        usb: gadget: webcam: fix V4L2 Kconfig dependency
        usb: dwc2: Fix TxFIFOn sizes and total TxFIFO size issues
        usb: dwc3: gadget: Fix PCM1 for ISOC EP with ep->mult less than 3
        usb: dwc3: of-simple: set dev_pm_ops
        usb: dwc3: of-simple: fix missing clk_disable_unprepare
        usb: dwc3: gadget: Wait longer for controller to end command processing
        usb: xhci: fix TDS for MTK xHCI1.1
        xhci: Don't add a virt_dev to the devs array before it's fully allocated
        usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
        usbip: prevent vhci_hcd driver from leaking a socket pointer address
        usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
        usbip: fix stub_rx: get_pipe() to validate endpoint number
        tools/usbip: fixes potential (minor) "buffer overflow" (detected on recent gcc with -Werror)
        USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
        usb: musb: da8xx: fix babble condition handling
      c36c7a7c