1. 05 Feb, 2019 9 commits
    • selftests/bpf: add "any alignment" annotation for some tests · e2c6f50e
      Björn Töpel authored
      RISC-V does not, in general, have "efficient unaligned access". When
      testing the RISC-V BPF JIT, some selftests failed in the verification
      due to misaligned access. Annotate these tests with the
      F_NEEDS_EFFICIENT_UNALIGNED_ACCESS flag.
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf, doc: add RISC-V JIT to BPF documentation · e8cb0167
      Björn Töpel authored
      Update Documentation/networking/filter.txt and
      Documentation/sysctl/net.txt to mention RISC-V.
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • MAINTAINERS: add RISC-V BPF JIT maintainer · 8a9e0aff
      Björn Töpel authored
      Add Björn Töpel as RISC-V BPF JIT maintainer.
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf, riscv: add BPF JIT for RV64G · 2353ecc6
      Björn Töpel authored
      This commit adds a BPF JIT for RV64G.
      
      The JIT is a two-pass JIT, and has a dynamic prologue/epilogue (similar
      to the MIPS64 BPF JIT) instead of static ones (e.g. x86_64).
      
      At the moment the RISC-V Linux port does not support
      CONFIG_HAVE_KPROBES, which means that CONFIG_BPF_EVENTS is not
      supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT,
      BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and
      BPF_PROG_TYPE_RAW_TRACEPOINT pass.
      
      The implementation does not support "far branching" (>4KiB).
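
As a rough illustration of the limit behind "far branching": RISC-V conditional branches encode a signed 13-bit, even byte offset, so a target must lie within roughly 4 KiB in either direction. The sketch below shows such a range check. The helper name demo_branch_in_range is hypothetical and is not taken from the JIT itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when off fits a RISC-V conditional-branch (B-type) immediate:
 * a signed 13-bit byte offset that must be even. */
static bool demo_branch_in_range(int64_t off)
{
	return off >= -4096 && off <= 4094 && (off & 1) == 0;
}
```

A JIT that only emits this branch form has to reject or rewrite any jump for which the check fails.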
      
      Test results:
        # modprobe test_bpf
        test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed]
      
        # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
        # ./test_verifier
        ...
        Summary: 761 PASSED, 507 SKIPPED, 2 FAILED
      
      Note that "test_verifier" was run with one build with
      CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise
      many of the tests that require unaligned access were skipped.
      
      CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y:
        # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
        # ./test_verifier | grep -c 'NOTE.*unknown align'
        0
      
      No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS:
        # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
        # ./test_verifier | grep -c 'NOTE.*unknown align'
        59
      
      The two failing test_verifier tests are:
        "ld_abs: vlan + abs, test 1"
        "ld_abs: jump around ld_abs"
      
      This is due to the "far branching" involved in those tests.
      
      All tests were done on QEMU (QEMU emulator version 3.1.50
      (v3.1.0-688-g8ae951fbc106)).
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'bpf-btf-dedup' · 31de3897
      Daniel Borkmann authored
      Andrii Nakryiko says:
      
      ====================
      This patch series adds a BTF deduplication algorithm to libbpf. The algorithm
      takes BTF type information containing duplicate per-compilation-unit
      information and reduces it to an equivalent set of BTF types with no duplication
      and no loss of information. It also deduplicates strings and removes those strings that
      are not referenced from any BTF type (and line information in .BTF.ext section,
      if any).
      
      Algorithm also resolves struct/union forward declarations into concrete BTF types
      across multiple compilation units to facilitate better deduplication ratio. If
      undesired, this resolution can be disabled through specifying corresponding options.
      
      When applied to BTF data emitted by pahole's DWARF->BTF converter, it reduces
      the overall size of .BTF section by about 65x, from about 112MB to 1.75MB, leaving
      only 29247 out of initial 3073497 BTF type descriptors.
      
      Algorithm with minor differences and preliminary results before FUNC/FUNC_PROTO
      support is also described more verbosely at:
      
      https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html
      
      v1->v2:
      - rebase on latest bpf-next
      - err_log/elog -> pr_debug
      - btf__dedup, btf__get_strings, btf__get_nr_types listed under 0.0.2 version
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/btf: add initial BTF dedup tests · 9c651127
      Andrii Nakryiko authored
      This patch sets up a new kind of tests (BTF dedup tests) and tests a few aspects of
      the BTF dedup algorithm. A more complete set of tests will come in follow-up patches.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • btf: add BTF types deduplication algorithm · d5caef5b
      Andrii Nakryiko authored
      This patch implements a BTF type deduplication algorithm. It can
      greatly compress the typical output of pahole's DWARF-to-BTF conversion or
      LLVM's compilation output by detecting and collapsing identical types emitted in
      isolation per compilation unit. Algorithm also resolves struct/union forward
      declarations into concrete BTF types representing referenced struct/union. If
      undesired, this resolution can be disabled through specifying corresponding options.
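
To make the idea of collapsing identical types concrete, here is a toy sketch. It is not the libbpf algorithm: the descriptor struct, the equality rule, and the names demo_type and demo_dedup are all assumptions for illustration, and the quadratic scan ignores references between types and forward-declaration resolution.

```c
#include <assert.h>
#include <string.h>

/* Toy type descriptor; real BTF descriptors are far richer. */
struct demo_type {
	int kind;
	const char *name;
	int size;
};

static int demo_type_equal(const struct demo_type *a, const struct demo_type *b)
{
	return a->kind == b->kind && a->size == b->size &&
	       strcmp(a->name, b->name) == 0;
}

/* Fill canon[i] with the index of the first descriptor equal to types[i]
 * and return the number of unique descriptors. Quadratic on purpose. */
static int demo_dedup(const struct demo_type *types, int n, int *canon)
{
	int unique = 0;

	for (int i = 0; i < n; i++) {
		canon[i] = i;
		for (int j = 0; j < i; j++) {
			/* Only compare against canonical entries. */
			if (canon[j] == j && demo_type_equal(&types[i], &types[j])) {
				canon[i] = j;
				break;
			}
		}
		if (canon[i] == i)
			unique++;
	}
	return unique;
}
```

Each compilation unit that re-emits the same "int" or the same struct maps back to one canonical entry, which is where the size reduction comes from.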
      
      The algorithm itself and its application to the Linux kernel's BTF types are
      described in detail at:
      https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • btf: extract BTF type size calculation · 69eaab04
      Andrii Nakryiko authored
      This pre-patch extracts calculation of amount of space taken by BTF type descriptor
      for later reuse by btf_dedup functionality.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: fix libbpf_print · a8a1f7d0
      Stanislav Fomichev authored
      With the recent print rework we now have the following problem:
      pr_{warning,info,debug} expand to __pr which calls libbpf_print.
      libbpf_print does va_start and calls __libbpf_pr with va_list argument.
      In __base_pr we again do va_start. Because the next argument is a
      va_list, we don't get the correct pointer to the argument (and print nothing
      in my case, I don't know why it doesn't crash tbh).
      
      Fix this by changing libbpf_print_fn_t signature to accept va_list and
      remove unneeded calls to va_start in the existing users.
      
      Alternatively, this can be solved by exporting __libbpf_pr and
      changing __pr macro to (and killing libbpf_print):
      {
      	if (__libbpf_pr)
      		__libbpf_pr(level, "libbpf: " fmt, ##__VA_ARGS__)
      }
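
The va_list hand-off described above can be sketched in plain C. This is a minimal illustration, not libbpf code: print_fn_t, capture_print and demo_print are hypothetical names, and the callback writes into a buffer only so that the result is easy to inspect.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type that takes a va_list, modeled on the
 * signature change the commit describes; not copied from libbpf. */
typedef int (*print_fn_t)(int level, const char *fmt, va_list ap);

static char last_msg[128];

/* The callback receives an already-started va_list and must not call
 * va_start again; it forwards the list to a v*printf function. */
static int capture_print(int level, const char *fmt, va_list ap)
{
	(void)level;
	return vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
}

static print_fn_t current_print = capture_print;

/* The variadic entry point is the only place that calls va_start. */
static int demo_print(int level, const char *fmt, ...)
{
	va_list ap;
	int ret;

	if (!current_print)
		return 0;
	va_start(ap, fmt);
	ret = current_print(level, fmt, ap);
	va_end(ap);
	return ret;
}
```

Calling va_start a second time inside the callback would treat the va_list itself as the first variadic argument, which matches the symptom reported in the message.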
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. 04 Feb, 2019 13 commits
    • Merge branch 'libbpf-btf_ext' · 1728b111
      Alexei Starovoitov authored
      Yonghong Song says:
      
      ====================
      This patch set exposed a few functions in libbpf.
      All these newly added API functions are helpful for
      JIT based bpf compilation where .BTF and .BTF.ext
      are available as in-memory data blobs.
      
      Patch #1 exposed several btf_ext__* API functions which
      are used to handle .BTF.ext ELF sections.
      Patch #2 refactored the function bpf_map_find_btf_info()
      and exposed API function btf__get_map_kv_tids() to
      retrieve the map key/value type id's generated by
      bpf program through BPF_ANNOTATE_KV_PAIR macro.
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpf: implement libbpf btf__get_map_kv_tids() API function · 96408c43
      Yonghong Song authored
      Currently, to get map key/value type id's, the macro
        BPF_ANNOTATE_KV_PAIR(<map_name>, <key_type>, <value_type>)
      needs to be defined in the bpf program for the
      corresponding map.
      
      During program/map loading time,
      the local static function bpf_map_find_btf_info()
      in libbpf.c is implemented to retrieve the key/value
      type ids given the map name.
      
      The patch refactored function bpf_map_find_btf_info()
      to create an API btf__get_map_kv_tids() which includes
      the bulk of implementation for the original function.
      The API btf__get_map_kv_tids() can be used by bcc,
      a JIT based bpf compilation system, which uses the
      same BPF_ANNOTATE_KV_PAIR to record map key/value types.
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpf: expose functions btf_ext__* as API functions · b8dcf8d1
      Yonghong Song authored
      The following set of functions, which manipulate the .BTF.ext
      section, are exposed as API functions:
        . btf_ext__new
        . btf_ext__free
        . btf_ext__reloc_func_info
        . btf_ext__reloc_line_info
        . btf_ext__func_info_rec_size
        . btf_ext__line_info_rec_size
      
      These functions are useful for JIT based bpf codegen, e.g.,
      bcc, to manipulate in-memory .BTF.ext sections.
      
      The signature of function btf_ext__reloc_func_info()
      is also changed to be the same as its definition in btf.c.
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: use localhost in tcp_{server,client}.py · 7e8a5903
      Stanislav Fomichev authored
      Bind and connect to localhost. There is no reason for this test to
      use non-localhost interface. This lets us run this test in a network
      namespace.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • s390: bpf: fix JMP32 code-gen · ecc15f11
      Heiko Carstens authored
      Commit 626a5f66 ("s390: bpf: implement jitting of JMP32") added
      JMP32 code-gen support for s390. However it triggers the warning below
      due to some unusual gotos in the original s390 bpf jit code.
      
      Add a couple of additional "is_jmp32" initializations to fix this.
      Also fix the wrong opcode for the "llilf" instruction that was
      introduced with the same commit.
      
      arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_insn':
      arch/s390/net/bpf_jit_comp.c:248:55: warning: 'is_jmp32' may be used uninitialized in this function [-Wmaybe-uninitialized]
        _EMIT6(op1 | reg(b1, b2) << 16 | (rel & 0xffff), op2 | mask); \
                                                             ^
      arch/s390/net/bpf_jit_comp.c:1211:8: note: 'is_jmp32' was declared here
         bool is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
      
      Fixes: 626a5f66 ("s390: bpf: implement jitting of JMP32")
      Cc: Jiong Wang <jiong.wang@netronome.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Jiong Wang <jiong.wang@netronome.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge branch 'change-libbpf-print-api' · 9fa3b473
      Alexei Starovoitov authored
      Yonghong Song says:
      
      ====================
      These are patches responding to my comments for
      Magnus's patch (https://patchwork.ozlabs.org/patch/1032848/).
      The goal is to make pr_* macros available to other C files
      than libbpf.c, and to simplify API function libbpf_set_print().
      
      Specifically, Patch #1 used global functions
      to facilitate pr_* macros in the header files so they
      are available in different C files.
      Patch #2 removes the global function libbpf_print_level_available()
      which is added in Patch 1.
      Patch #3 simplified libbpf_set_print() which takes only one print
      function with a debug level argument among others.
      
      Changelogs:
       v3 -> v4:
         . rename libbpf internal header util.h to libbpf_util.h
         . rename libbpf internal function libbpf_debug_print() to libbpf_print()
       v2 -> v3:
         . bailed out earlier in libbpf_debug_print() if __libbpf_pr is NULL
         . added missing LIBBPF_DEBUG level check in libbpf.c __base_pr().
       v1 -> v2:
         . Renamed global function libbpf_dprint() to libbpf_debug_print()
           to be more expressive.
         . Removed libbpf_dprint_level_available() as it is used only
           once in btf.c and we can remove it by optimizing for common cases.
      ====================
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpf: simplify libbpf API function libbpf_set_print() · 6f1ae8b6
      Yonghong Song authored
      Currently, the libbpf API function libbpf_set_print()
      takes three function pointer parameters for warning, info
      and debug printout respectively.
      
      This patch changes the API to have just one function pointer
      parameter and the function pointer has one additional
      parameter "debugging level". So if more debug levels are added
      in the future, the function signature
      won't change.
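
The shape of such a single-callback API might look like the sketch below. All names here (demo_level, demo_set_print, quiet_print, demo_emit) are hypothetical stand-ins rather than the libbpf declarations, and the callback is written to take a va_list as an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical level enum and callback type, for illustration only. */
enum demo_level { DEMO_WARN, DEMO_INFO, DEMO_DEBUG };
typedef int (*demo_print_fn)(enum demo_level level, const char *fmt, va_list ap);

static demo_print_fn demo_fn;
static int printed[3];	/* messages emitted per level */

/* One setter replaces three per-level function pointers. */
static void demo_set_print(demo_print_fn fn)
{
	demo_fn = fn;
}

/* A single callback decides per level; this one drops debug output. */
static int quiet_print(enum demo_level level, const char *fmt, va_list ap)
{
	if (level == DEMO_DEBUG)
		return 0;
	printed[level]++;
	return vfprintf(stderr, fmt, ap);
}

static int demo_emit(enum demo_level level, const char *fmt, ...)
{
	va_list ap;
	int ret;

	if (!demo_fn)
		return 0;
	va_start(ap, fmt);
	ret = demo_fn(level, fmt, ap);
	va_end(ap);
	return ret;
}
```

Adding a fourth level to the enum would leave both demo_set_print and the callback type unchanged, which is the point the message makes.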
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpf: print out btf log at LIBBPF_WARN level · 9d100a19
      Yonghong Song authored
      Currently, the btf log is allocated and printed out in case
      of error at LIBBPF_DEBUG level.
      Such logs from kernel are very important for debugging.
      For example, bpf syscall BPF_PROG_LOAD command can get
      verifier logs back to user space. In function load_program()
      of libbpf.c, the log buffer is allocated unconditionally
      and printed out at pr_warning() level.
      
      Let us do the similar thing here for btf. Allocate buffer
      unconditionally and print out error logs at pr_warning() level.
      This can reduce one global function and
      optimize for common situations where pr_warning()
      is activated either by default or by user supplied
      debug output function.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpf: move libbpf pr_* debug print functions to headers · 8461ef8b
      Yonghong Song authored
      A global function libbpf_print, which is invisible
      outside the shared library, is defined to print based
      on levels. The pr_warning, pr_info and pr_debug
      macros are moved into the newly created header
      common.h. So any .c file including common.h can
      use these macros directly.
      
      Currently btf__new and btf_ext__new API has an argument getting
      __pr_debug function pointer into btf.c so the debugging information
      can be printed there. This patch removes this parameter
      from btf__new and btf_ext__new and uses pr_debug directly in btf.c.
      
      Another global function libbpf_print_level_available, also
      invisible outside the shared library, can test
      whether a particular level debug printing is
      available or not. It is used in btf.c to
      test whether DEBUG level debug printing is available or not,
      based on which the log buffer will be allocated when loading
      btf to the kernel.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • socket: fix for Add SO_TIMESTAMP[NS]_NEW · cc733578
      Stephen Rothwell authored
      Fixes: 887feae3 ("socket: Add SO_TIMESTAMP[NS]_NEW")
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevice.h: Add __cold to netdev_<level> logging functions · ce3fdb69
      Joe Perches authored
      Add __cold to the netdev_<level> logging functions similar to
      the use of __cold in the generic printk function.
      
      Using __cold moves all the netdev_<level> logging functions
      out-of-line possibly improving code locality and runtime
      performance.
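
A userspace sketch of the same idea, assuming GCC or Clang: the "cold" function attribute marks a rarely executed logging helper so the compiler can keep it away from the hot path. The macro demo_cold and the functions below are hypothetical and stand in for the kernel's __cold and netdev_<level> helpers.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's __cold; GCC and Clang both
 * accept the "cold" function attribute. */
#define demo_cold __attribute__((cold))

static int errors_logged;

/* Rarely executed logging path. */
static demo_cold void demo_log_err(const char *msg)
{
	errors_logged++;
	fprintf(stderr, "demo: %s\n", msg);
}

/* Hot path: the call into the cold helper sits on the unlikely branch. */
static int demo_process(int len)
{
	if (len < 0) {
		demo_log_err("negative length");
		return -1;
	}
	return len * 2;
}
```

Whether this yields a measurable gain depends on the workload; the commit message itself only says "possibly".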
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix fall through warning in y2038 tstamp changes. · ff7653f9
      David S. Miller authored
      net/core/sock.c: In function 'sock_setsockopt':
      net/core/sock.c:914:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
         sock_set_flag(sk, SOCK_TSTAMP_NEW);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      net/core/sock.c:915:2: note: here
        case SO_TIMESTAMPING_OLD:
        ^~~~
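
One common way to address a warning of this kind, when the fall through is intentional, is to annotate it with a comment that GCC recognizes. The sketch below shows the pattern with hypothetical option and flag names. It is not the code from net/core/sock.c and does not claim to reproduce the actual fix.

```c
#include <assert.h>

enum { DEMO_TS_OLD, DEMO_TS_NEW };	/* hypothetical option values */
#define DEMO_FLAG_NEW     0x1
#define DEMO_FLAG_ENABLED 0x2

static int demo_setopt(int opt)
{
	int flags = 0;

	switch (opt) {
	case DEMO_TS_NEW:
		flags |= DEMO_FLAG_NEW;
		/* fall through */
	case DEMO_TS_OLD:
		flags |= DEMO_FLAG_ENABLED;
		break;
	default:
		return -1;
	}
	return flags;
}
```

The NEW case sets its extra flag and then shares the rest of the handling with the OLD case.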
      
      Fixes: 9718475e ("socket: Add SO_TIMESTAMPING_NEW")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpfilter: remove extra header search paths for bpfilter_umh · 303a339f
      Masahiro Yamada authored
      Currently, the header search paths -Itools/include and
      -Itools/include/uapi are not used. Let's drop the unused code.
      
      We can remove -I. too by fixing up one C file.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 03 Feb, 2019 18 commits
    • Merge branch 'phy-aquantia-improvements' · ee825e8b
      David S. Miller authored
      Heiner Kallweit says:
      
      ====================
      net: phy: aquantia: number of improvements
      
      This patch series is based on work from Andrew. I adjusted and added
      certain parts. The series improves a few aspects of the driver, no functional
      change intended.
      
      v2:
      - add my SoB to patch 1
      - leave kernel.h in in patch 2
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: aquantia: replace magic numbers with constants · 278f6b67
      Heiner Kallweit authored
      Replace magic numbers with proper constants. The original patch is
      from Andrew, I extended / adjusted certain parts:
      - Use decimal bit numbers. The datasheet uses hex bit numbers 0 .. F.
      - Order defines from highest to lowest bit numbers
      - correct some typos
      - add constant MDIO_AN_TX_VEND_INT_MASK2_LINK
      - Remove few functional improvements from the patch, they will come as
        a separate patch.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: aquantia: use macro PHY_ID_MATCH_MODEL · 4d5dfb66
      Heiner Kallweit authored
      Make use of macro PHY_ID_MATCH_MODEL to simplify the code.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: aquantia: remove unneeded includes · 81e6578c
      Heiner Kallweit authored
      Remove unneeded header includes.
      
      v2:
      - leave kernel.h in
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: aquantia: Shorten name space prefix to aqr_ · b37ecb59
      Andrew Lunn authored
      aquantia_ as a name space prefix is rather long, resulting in lots of
      lines needing wrapping, reducing readability. Use the prefix aqr_
      instead, which fits with the vendor naming their devices aqr107, for
      example.
      
      v2:
      - add SoB from Heiner
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix ip_mc_{dec,inc}_group allocation context · 9fb20801
      Florian Fainelli authored
      After 4effd28c ("bridge: join all-snoopers multicast address"), I
      started seeing the following sleep in atomic warnings:
      
      [   26.763893] BUG: sleeping function called from invalid context at mm/slab.h:421
      [   26.771425] in_atomic(): 1, irqs_disabled(): 0, pid: 1658, name: sh
      [   26.777855] INFO: lockdep is turned off.
      [   26.781916] CPU: 0 PID: 1658 Comm: sh Not tainted 5.0.0-rc4 #20
      [   26.787943] Hardware name: BCM97278SV (DT)
      [   26.792118] Call trace:
      [   26.794645]  dump_backtrace+0x0/0x170
      [   26.798391]  show_stack+0x24/0x30
      [   26.801787]  dump_stack+0xa4/0xe4
      [   26.805182]  ___might_sleep+0x208/0x218
      [   26.809102]  __might_sleep+0x78/0x88
      [   26.812762]  kmem_cache_alloc_trace+0x64/0x28c
      [   26.817301]  igmp_group_dropped+0x150/0x230
      [   26.821573]  ip_mc_dec_group+0x1b0/0x1f8
      [   26.825585]  br_ip4_multicast_leave_snoopers.isra.11+0x174/0x190
      [   26.831704]  br_multicast_toggle+0x78/0xcc
      [   26.835887]  store_bridge_parm+0xc4/0xfc
      [   26.839894]  multicast_snooping_store+0x3c/0x4c
      [   26.844517]  dev_attr_store+0x44/0x5c
      [   26.848262]  sysfs_kf_write+0x50/0x68
      [   26.852006]  kernfs_fop_write+0x14c/0x1b4
      [   26.856102]  __vfs_write+0x60/0x190
      [   26.859668]  vfs_write+0xc8/0x168
      [   26.863059]  ksys_write+0x70/0xc8
      [   26.866449]  __arm64_sys_write+0x24/0x30
      [   26.870458]  el0_svc_common+0xa0/0x11c
      [   26.874291]  el0_svc_handler+0x38/0x70
      [   26.878120]  el0_svc+0x8/0xc
      
      while toggling the bridge's multicast_snooping attribute dynamically.
      
      Pass a gfp_t down to igmpv3_add_delrec(), introduce
      __igmp_group_dropped() and introduce __ip_mc_dec_group() to take a gfp_t
      argument.
      
      Similarly introduce ____ip_mc_inc_group() and __ip_mc_inc_group() to
      allow caller to specify gfp_t.
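
The wrapper pattern described here can be sketched outside the kernel as follows. The enum and function names are hypothetical stand-ins for gfp_t and the ip_mc_* helpers. The sketch only records which allocation context was requested, instead of allocating anything.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel allocation contexts. */
enum demo_gfp { DEMO_GFP_KERNEL, DEMO_GFP_ATOMIC };

static enum demo_gfp last_gfp;

/* Inner helper: the caller states which allocation context is safe. */
static void demo_group_drop_gfp(int group, enum demo_gfp gfp)
{
	(void)group;
	last_gfp = gfp;	/* a real helper would pass gfp to its allocator */
}

/* Existing entry point keeps its signature and the sleeping default. */
static void demo_group_drop(int group)
{
	demo_group_drop_gfp(group, DEMO_GFP_KERNEL);
}
```

Callers running in atomic context, such as the bridge path in the trace above, would use the inner helper with the non-sleeping context, while all other callers stay unchanged.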
      
      IPv6 part of the patch appears fine.
      
      Fixes: 4effd28c ("bridge: join all-snoopers multicast address")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: devlink: report cell size of shared buffers · bff5731d
      Jakub Kicinski authored
      Shared buffer allocation is usually done in cell increments.
      Drivers will either round up the allocation or refuse the
      configuration if it's not an exact multiple of cell size.
      Drivers know exactly the cell size of shared buffer, so help
      out users by providing this information in dumps.
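
The two driver policies mentioned above, rounding up or refusing, reduce to simple arithmetic on the cell size. The helpers below are a hypothetical sketch that assumes a nonzero cell size and sizes small enough not to overflow. They are not devlink code.

```c
#include <assert.h>
#include <stdint.h>

/* Round a requested size up to the next multiple of cell_size. */
static uint32_t demo_round_up_to_cell(uint32_t size, uint32_t cell_size)
{
	return (size + cell_size - 1) / cell_size * cell_size;
}

/* Strict policy: accept only exact multiples of cell_size. */
static int demo_is_cell_multiple(uint32_t size, uint32_t cell_size)
{
	return size % cell_size == 0;
}
```

With a 96-byte cell, a request for 1000 bytes would be rounded up to 1056 by the first policy and rejected by the second, which is why reporting the cell size helps users pick valid values.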
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-y2038-safe-socket-timestamps' · a98dc6ae
      David S. Miller authored
      Deepa Dinamani says:
      
      ====================
      net: y2038-safe socket timestamps
      
      The series introduces new socket timestamps that are
      y2038 safe.
      
      The time data types used for the existing socket timestamp
      options: SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING
      are not y2038 safe. The series introduces SO_TIMESTAMP_NEW,
      SO_TIMESTAMPNS_NEW and SO_TIMESTAMPING_NEW to replace these.
      These new timestamps can be used on all architectures.
      
      The alternative considered was to extend the sys_setsockopt()
      by using the flags. We did not receive any strong opinions about
      either of the approaches. Hence, this was chosen, as glibc folks
      preferred this.
      
      The series does not deal with updating the internal kernel socket
      calls like rxrpc to make them y2038 safe. This will be dealt
      with separately.
      
      Note that the timestamps behavior already does not match the
      man page specific behavior:
      SIOCGSTAMP
          This ioctl should only be used if the socket option SO_TIMESTAMP
      	is not set on the socket. Otherwise, it returns the timestamp of
      	the last packet that was received while SO_TIMESTAMP was not set,
      	or it fails if no such packet has been received,
      	(i.e., ioctl(2) returns -1 with errno set to ENOENT).
      
      The recommendation is to update the man page to remove the above statement.
      
      The overview of the socket timestamp series is as below:
      1. Delete asm specific socket.h when possible.
      2. Support SO/SCM_TIMESTAMP* options only in userspace.
      3. Rename current SO/SCM_TIMESTAMP* to SO/SCM_TIMESTAMP*_OLD.
      4. Alter socket options so that SOCK_RCVTSTAMPNS does
         not rely on SOCK_RCVTSTAMP.
      5. Introduce y2038 safe types for socket timestamp.
      6. Introduce new y2038 safe socket options SO/SCM_TIMESTAMP*_NEW.
      7. Introduce new y2038 safe socket timeout options.
      
      Changes since v4:
      * Fixed the typo in calling sock_get_timeout()
      
      Changes since v3:
      * Rebased onto net-next and fixups as per review comments
      * Merged the socket timeout series
      * Integrated Arnd's patch to simplify compat handling of timeout syscalls
      
      Changes since v2:
      * Removed extra functions to reduce diff churn as per code review
      
      Changes since v1:
      * Dropped the change to disentangle sock flags
      * Renamed sock_timeval to __kernel_sock_timeval
      * Updated a few comments
      * Added documentation changes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW · a9beb86a
      Deepa Dinamani authored
      Add new socket timeout options that are y2038 safe.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: ccaulfie@redhat.com
      Cc: davem@davemloft.net
      Cc: deller@gmx.de
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: cluster-devel@redhat.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes · 45bdc661
      Deepa Dinamani authored
      SO_RCVTIMEO and SO_SNDTIMEO socket options use struct timeval
      as the time format. struct timeval is not y2038 safe.
      The subsequent patches in the series add support for new socket
      timeout options with _NEW suffix that will use y2038 safe
      data structures. Although the existing struct timeval layout
      is sufficiently wide to represent timeouts, because of the way
      libc will interpret time_t based on user defined flag, these
      new flags provide a way of having a structure that is the same
      for all architectures consistently.
      Rename the existing options with _OLD suffix forms so that the
      right option is enabled for userspace applications according
      to the architecture and time_t definition of libc.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: ccaulfie@redhat.com
      Cc: deller@gmx.de
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: cluster-devel@redhat.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • socket: Update timestamping Documentation · 9dd49211
      Deepa Dinamani authored
      With the new y2038 safe timestamping options added, update the
      documentation to reflect the changes.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • socket: Add SO_TIMESTAMPING_NEW · 9718475e
      Deepa Dinamani authored
      Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
      This is the y2038 safe version of SO_TIMESTAMPING_OLD
      for all architectures.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: chris@zankel.net
      Cc: fenghua.yu@intel.com
      Cc: rth@twiddle.net
      Cc: tglx@linutronix.de
      Cc: ubraun@linux.ibm.com
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9718475e
    • Deepa Dinamani's avatar
      socket: Add SO_TIMESTAMP[NS]_NEW · 887feae3
      Deepa Dinamani authored
      Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
      socket timestamp options.
      These are the y2038 safe versions of SO_TIMESTAMP_OLD
      and SO_TIMESTAMPNS_OLD for all architectures.
      
      Note that the format of scm_timestamping.ts[0] is not changed
      in this patch.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: jejb@parisc-linux.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-rdma@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      887feae3
    • Deepa Dinamani's avatar
      socket: Add struct __kernel_sock_timeval · 98bb03c8
      Deepa Dinamani authored
      The new type is meant to be a y2038 safe structure
      used as part of cmsg data.
      Presently the SO_TIMESTAMP socket option uses struct timeval
      for timestamps. This is not y2038 safe.
      Subsequent patches in the series add a new y2038 safe socket
      option to be used in place of SO_TIMESTAMP_OLD.
      struct __kernel_sock_timeval will be used as the timestamp
      format at that time.
      
      struct __kernel_sock_timeval also maintains the same layout
      across 32 bit and 64 bit ABIs.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98bb03c8
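To illustrate what "the same layout across 32 bit and 64 bit ABIs" means for the commit above, here is a minimal sketch. The struct `demo_sock_timeval` is a hypothetical stand-in, and its two fixed-width 64-bit fields are an assumption made for the example rather than a copy of the kernel definition. Because neither field depends on the width of `long`, the size and field offsets come out the same on both kinds of ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a y2038 safe socket timestamp (assumption:
 * both fields are fixed-width 64-bit integers). */
struct demo_sock_timeval {
    int64_t tv_sec;   /* seconds; 64 bits wide even on 32-bit ABIs */
    int64_t tv_usec;  /* microseconds */
};

/* The layout does not depend on sizeof(long), so these hold everywhere. */
_Static_assert(sizeof(struct demo_sock_timeval) == 16,
               "two 64-bit fields, no padding");
_Static_assert(offsetof(struct demo_sock_timeval, tv_usec) == 8,
               "tv_usec follows tv_sec directly");

/* Hypothetical helper: flatten the timestamp into microseconds. */
static int64_t demo_sock_timeval_to_usec(const struct demo_sock_timeval *tv)
{
    return tv->tv_sec * INT64_C(1000000) + tv->tv_usec;
}
```

A struct timeval built on a 32-bit `long`, by contrast, would overflow its seconds field in 2038 and would be laid out differently from its 64-bit counterpart.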
    • Deepa Dinamani's avatar
      socket: Use old_timeval types for socket timestamps · 13c6ee2a
      Deepa Dinamani authored
      As part of the y2038 solution, all internal uses of
      struct timeval are replaced by struct __kernel_old_timeval
      and struct compat_timeval by struct old_timeval32.
      Make socket timestamps use these new types.
      
      This is mainly to be able to verify that the kernel build
      is y2038 safe when such non y2038 safe types are not
      supported anymore.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: isdn@linux-pingi.de
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13c6ee2a
    • Deepa Dinamani's avatar
      arch: sparc: Override struct __kernel_old_timeval · bcb3fc32
      Deepa Dinamani authored
      struct __kernel_old_timeval is supposed to have the same
      layout as struct timeval. But it was inadvertently missed
      that __kernel_suseconds has a different definition for
      sparc64.
      Provide an asm-specific override that fixes it.
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bcb3fc32
    • Deepa Dinamani's avatar
      sockopt: Rename SO_TIMESTAMP* to SO_TIMESTAMP*_OLD · 7f1bc6e9
      Deepa Dinamani authored
      The SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
      way they are currently defined, are not y2038 safe.
      Subsequent patches in the series add new y2038 safe versions
      of these options which provide 64 bit timestamps on all
      architectures uniformly.
      Hence, rename the existing options with an OLD suffix.
      
      Also note that the kernel will not use the untagged SO_TIMESTAMP*
      and SCM_TIMESTAMP* options internally anymore.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: deller@gmx.de
      Cc: dhowells@redhat.com
      Cc: jejb@parisc-linux.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: linux-afs@lists.infradead.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-rdma@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f1bc6e9
    • Deepa Dinamani's avatar
      arch: Use asm-generic/socket.h when possible · 2edfd8e0
      Deepa Dinamani authored
      Many architectures maintain an arch-specific copy of
      socket.h even though it does not differ from the asm-generic
      one. Allow these architectures to use the generic one instead.
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: chris@zankel.net
      Cc: fenghua.yu@intel.com
      Cc: tglx@linutronix.de
      Cc: schwidefsky@de.ibm.com
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: linux-s390@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2edfd8e0