1. 26 Feb, 2018 3 commits
    • Ido Schimmel's avatar
      mlxsw: spectrum_switchdev: Allow port enslavement to a VLAN-unaware bridge · 65b53bfd
      Ido Schimmel authored
      Up until now we only allowed VLAN devices to be put in a VLAN-unaware
      bridge, but some users need the ability to enslave physical ports as
      well.
      
      This is achieved by mapping the port and VID 1 to the bridge's vFID,
      instead of the port and the VID used by the VLAN device.
      
      The above is valid because as long as the port is not enslaved to a
      bridge, VID 1 is guaranteed to be configured as PVID and egress
      untagged.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Tested-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      65b53bfd
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ba6056a4
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2018-02-26
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) Various improvements for BPF kselftests: i) skip unprivileged tests
         when kernel.unprivileged_bpf_disabled sysctl knob is set, ii) count
         the number of skipped tests from unprivileged, iii) when a test case
         had an unexpected error then print the actual but also the unexpected
         one for better comparison, from Joe.
      
      2) Add a sample program for collecting CPU state statistics with regards
         to how long the CPU resides in cstate and pstate levels. Based on
         cpu_idle and cpu_frequency trace points, from Leo.
      
      3) Various x64 BPF JIT optimizations to further shrink the generated
         image size in order to make it more icache friendly. When tested on
         the Cilium generated programs, image size reduced by approx 4-5% in
         best case mainly due to how LLVM emits unsigned 32 bit constants,
         from Daniel.
      
      4) Improvements and fixes on the BPF sockmap sample programs: i) fix
         the sockmap's Makefile to include nlattr.o for libbpf, ii) detach
         the sock ops programs from the cgroup before exit, from Prashant.
      
      5) Avoid including xdp.h in filter.h by just forward declaring the
         struct xdp_rxq_info in filter.h, from Jesper.
      
      6) Fix the BPF kselftests Makefile for cgroup_helpers.c by only declaring
         it a dependency for test_dev_cgroup.c but not every other test case
         where it is not needed, from Jesper.
      
      7) Adjust rlimit RLIMIT_MEMLOCK for test_tcpbpf_user selftest since the
         default is insufficient for creating the 'global_map' used in the
         corresponding BPF program, from Yonghong.
      
      8) Likewise, for the xdp_redirect sample, Tushar ran into the same when
         invoking xdp_redirect and xdp_monitor at the same time, therefore
         in order to have the sample generically work bump the limit here,
         too. Fix from Tushar.
      
      9) Avoid an unnecessary NULL check in BPF_CGROUP_RUN_PROG_INET_SOCK()
         since sk is always guaranteed to be non-NULL, from Yafang.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ba6056a4
    • Leo Yan's avatar
      samples/bpf: Add program for CPU state statistics · c5350777
      Leo Yan authored
      CPU is active when have running tasks on it and CPUFreq governor can
      select different operating points (OPP) according to different workload;
      we use 'pstate' to present CPU state which have running tasks with one
      specific OPP.  On the other hand, CPU is idle which only idle task on
      it, CPUIdle governor can select one specific idle state to power off
      hardware logics; we use 'cstate' to present CPU idle state.
      
      Based on trace events 'cpu_idle' and 'cpu_frequency' we can accomplish
      the duration statistics for every state.  Every time when CPU enters
      into or exits from idle states, the trace event 'cpu_idle' is recorded;
      trace event 'cpu_frequency' records the event for CPU OPP changing, so
      it's easily to know how long time the CPU stays in the specified OPP,
      and the CPU must be not in any idle state.
      
      This patch is to utilize the mentioned trace events for pstate and
      cstate statistics.  To achieve more accurate profiling data, the program
      uses below sequence to insure CPU running/idle time aren't missed:
      
      - Before profiling the user space program wakes up all CPUs for once, so
        can avoid to missing account time for CPU staying in idle state for
        long time; the program forces to set 'scaling_max_freq' to lowest
        frequency and then restore 'scaling_max_freq' to highest frequency,
        this can ensure the frequency to be set to lowest frequency and later
        after start to run workload the frequency can be easily to be changed
        to higher frequency;
      
      - User space program reads map data and update statistics for every 5s,
        so this is same with other sample bpf programs for avoiding big
        overload introduced by bpf program self;
      
      - When send signal to terminate program, the signal handler wakes up
        all CPUs, set lowest frequency and restore highest frequency to
        'scaling_max_freq'; this is exactly same with the first step so
        avoid to missing account CPU pstate and cstate time during last
        stage.  Finally it reports the latest statistics.
      
      The program has been tested on Hikey board with octa CA53 CPUs, below
      is one example for statistics result, the format mainly follows up
      Jesper Dangaard Brouer suggestion.
      
      Jesper reminds to 'get printf to pretty print with thousands separators
      use %' and setlocale(LC_NUMERIC, "en_US")', tried three different arm64
      GCC toolchains (5.4.0 20160609, 6.2.1 20161016, 6.3.0 20170516) but all
      of them cannot support printf flag character %' on arm64 platform, so go
      back print number without grouping mode.
      
      CPU states statistics:
      state(ms)  cstate-0    cstate-1    cstate-2    pstate-0    pstate-1    pstate-2    pstate-3    pstate-4
      CPU-0      767         6111        111863      561         31          756         853         190
      CPU-1      241         10606       107956      484         125         646         990         85
      CPU-2      413         19721       98735       636         84          696         757         89
      CPU-3      84          11711       79989       17516       909         4811        5773        341
      CPU-4      152         19610       98229       444         53          649         708         1283
      CPU-5      185         8781        108697      666         91          671         677         1365
      CPU-6      157         21964       95825       581         67          566         684         1284
      CPU-7      125         15238       102704      398         20          665         786         1197
      
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      c5350777
  2. 24 Feb, 2018 8 commits
  3. 23 Feb, 2018 28 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 9cb9c07d
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix TTL offset calculation in mac80211 mesh code, from Peter Oh.
      
       2) Fix races with procfs in ipt_CLUSTERIP, from Cong Wang.
      
       3) Memory leak fix in lpm_trie BPF map code, from Yonghong Song.
      
       4) Need to use GFP_ATOMIC in BPF cpumap allocations, from Jason Wang.
      
       5) Fix potential deadlocks in netfilter getsockopt() code paths, from
          Paolo Abeni.
      
       6) Netfilter stackpointer size checks really are needed to validate
          user input, from Florian Westphal.
      
       7) Missing timer init in x_tables, from Paolo Abeni.
      
       8) Don't use WQ_MEM_RECLAIM in mac80211 hwsim, from Johannes Berg.
      
       9) When an ibmvnic device is brought down then back up again, it can be
          sent queue entries from a previous session, handle this properly
          instead of crashing. From Thomas Falcon.
      
      10) Fix TCP checksum on LRO buffers in mlx5e, from Gal Pressman.
      
      11) When we are dumping filters in cls_api, the output SKB is empty, and
          the filter we are dumping is too large for the space in the SKB, we
          should return -EMSGSIZE like other netlink dump operations do.
          Otherwise userland has no signal that is needs to increase the size
          of its read buffer. From Roman Kapl.
      
      12) Several XDP fixes for virtio_net, from Jesper Dangaard Brouer.
      
      13) Module refcount leak in netlink when a dump start fails, from Jason
          Donenfeld.
      
      14) Handle sub-optimal GSO sizes better in TCP BBR congestion control,
          from Eric Dumazet.
      
      15) Releasing bpf per-cpu arraymaps can take a long time, add a
          condtional scheduling point. From Eric Dumazet.
      
      16) Implement retpolines for tail calls in x64 and arm64 bpf JITs. From
          Daniel Borkmann.
      
      17) Fix page leak in gianfar driver, from Andy Spencer.
      
      18) Missed clearing of estimator scratch buffer, from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net_sched: gen_estimator: fix broken estimators based on percpu stats
        gianfar: simplify FCS handling and fix memory leak
        ipv6 sit: work around bogus gcc-8 -Wrestrict warning
        macvlan: fix use-after-free in macvlan_common_newlink()
        bpf, arm64: fix out of bounds access in tail call
        bpf, x64: implement retpoline for tail call
        rxrpc: Fix send in rxrpc_send_data_packet()
        net: aquantia: Fix error handling in aq_pci_probe()
        bpf: fix rcu lockdep warning for lpm_trie map_free callback
        bpf: add schedule points in percpu arrays management
        regulatory: add NUL to request alpha2
        ibmvnic: Fix early release of login buffer
        net/smc9194: Remove bogus CONFIG_MAC reference
        net: ipv4: Set addr_type in hash_keys for forwarded case
        tcp_bbr: better deal with suboptimal GSO
        smsc75xx: fix smsc75xx_set_features()
        netlink: put module reference if dump start fails
        selftests/bpf/test_maps: exit child process without error in ENOMEM case
        selftests/bpf: update gitignore with test_libbpf_open
        selftests/bpf: tcpbpf_kern: use in6_* macros from glibc
        ..
      9cb9c07d
    • Linus Torvalds's avatar
      Merge branch 'fixes-v4.16-rc3' of... · 2eb02aa9
      Linus Torvalds authored
      Merge branch 'fixes-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull security subsystem fixes from James Morris:
      
       - keys fixes via David Howells:
            "A collection of fixes for Linux keyrings, mostly thanks to Eric
             Biggers:
      
              - Fix some PKCS#7 verification issues.
      
              - Fix handling of unsupported crypto in X.509.
      
              - Fix too-large allocation in big_key"
      
       - Seccomp updates via Kees Cook:
            "These are fixes for the get_metadata interface that landed during
             -rc1. While the new selftest is strictly not a bug fix, I think
             it's in the same spirit of avoiding bugs"
      
       - an IMA build fix from Randy Dunlap
      
      * 'fixes-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        integrity/security: fix digsig.c build error with header file
        KEYS: Use individual pages in big_key for crypto buffers
        X.509: fix NULL dereference when restricting key with unsupported_sig
        X.509: fix BUG_ON() when hash algorithm is unsupported
        PKCS#7: fix direct verification of SignerInfo signature
        PKCS#7: fix certificate blacklisting
        PKCS#7: fix certificate chain verification
        seccomp: add a selftest for get_metadata
        ptrace, seccomp: tweak get_metadata behavior slightly
        seccomp, ptrace: switch get_metadata types to arch independent
      2eb02aa9
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 65738c6b
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
       "arm64 and perf fixes:
      
         - build error when accessing MPIDR_HWID_BITMASK from .S
      
         - fix CTR_EL0 field definitions
      
         - remove/disable some kernel messages on user faults (unhandled
           signals, unimplemented syscalls)
      
         - fix kernel page fault in unwind_frame() with function graph tracing
      
         - fix perf sleeping while atomic errors when booting with ACPI"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: fix unwind_frame() for filtered out fn for function graph tracing
        arm64: Enforce BBM for huge IO/VMAP mappings
        arm64: perf: correct PMUVer probing
        arm_pmu: acpi: request IRQs up-front
        arm_pmu: note IRQs and PMUs per-cpu
        arm_pmu: explicitly enable/disable SPIs at hotplug
        arm_pmu: acpi: check for mismatched PPIs
        arm_pmu: add armpmu_alloc_atomic()
        arm_pmu: fold platform helpers into platform code
        arm_pmu: kill arm_pmu_platdata
        ARM: ux500: remove PMU IRQ bouncer
        arm64: __show_regs: Only resolve kernel symbols when running at EL1
        arm64: Remove unimplemented syscall log message
        arm64: Disable unhandled signal log messages by default
        arm64: cpufeature: Fix CTR_EL0 field definitions
        arm64: uaccess: Formalise types for access_ok()
        arm64: Fix compilation error while accessing MPIDR_HWID_BITMASK from .S files
      65738c6b
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_4.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips · 2bd06ce7
      Linus Torvalds authored
      Pull MIPS fix from James Hogan:
       "A single MIPS fix for mismatching struct compat_flock, resulting in
        bus errors starting Firefox on Debian 8 since 4.13"
      
      * tag 'mips_fixes_4.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
        MIPS: Drop spurious __unused in struct compat_flock
      2bd06ce7
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk · 13f514be
      Linus Torvalds authored
      Pull printk fixlet from Petr Mladek:
       "People expect to see the real pointer value for %px.
      
        Let's substitute '(null)' only for the other %p? format modifiers that
        need to deference the pointer"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
        vsprintf: avoid misleading "(null)" for %px
      13f514be
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 938e1426
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes, one v4.16 regression fix, and two documentation fixes"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: designware: Consider SCL GPIO optional
        i2c: busses: i2c-sirf: Fix spelling: "formular" -> "formula".
        i2c: bcm2835: Set up the rising/falling edge delays
        i2c: i801: Add missing documentation entries for Braswell and Kaby Lake
        i2c: designware: must wait for enable
      938e1426
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 170e07bf
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "These are mostly fixes for problems with merge window code.
      
        In addition we have one doc update (alua) and two dead code removals
        (aiclib and octogon) a spurious assignment removal (csiostor) and a
        performance improvement for storvsc involving better interrupt
        spreading and increasing the command per lun handling"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla4xxx: skip error recovery in case of register disconnect.
        scsi: aacraid: fix shutdown crash when init fails
        scsi: qedi: Cleanup local str variable
        scsi: qedi: Fix truncation of CHAP name and secret
        scsi: qla2xxx: Fix incorrect handle for abort IOCB
        scsi: qla2xxx: Fix double free bug after firmware timeout
        scsi: storvsc: Increase cmd_per_lun for higher speed devices
        scsi: qla2xxx: Fix a locking imbalance in qlt_24xx_handle_els()
        scsi: scsi_dh: Document alua_rtpg_queue() arguments
        scsi: Remove Makefile entry for oktagon files
        scsi: aic7xxx: remove aiclib.c
        scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
        scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
        scsi: sym53c8xx_2: iterator underflow in sym_getsync()
        scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
        scsi: csiostor: remove redundant assignment to pointer 'ln'
        scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
        scsi: ibmvfc: fix misdefined reserved field in ibmvfc_fcp_rsp_info
        scsi: qla2xxx: Fix memory corruption during hba reset test
        scsi: mpt3sas: fix an out of bound write
      170e07bf
    • Donald Sharp's avatar
      net: fib_rules: Add new attribute to set protocol · 1b71af60
      Donald Sharp authored
      For ages iproute2 has used `struct rtmsg` as the ancillary header for
      FIB rules and in the process set the protocol value to RTPROT_BOOT.
      Until ca56209a66 ("net: Allow a rule to track originating protocol")
      the kernel rules code ignored the protocol value sent from userspace
      and always returned 0 in notifications. To avoid incompatibility with
      existing iproute2, send the protocol as a new attribute.
      
      Fixes: cac56209 ("net: Allow a rule to track originating protocol")
      Signed-off-by: default avatarDonald Sharp <sharpd@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1b71af60
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.16-rc3' of git://people.freedesktop.org/~airlied/linux · 8961ca44
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "A bunch of fixes for rc3:
      
        Exynos:
         - fixes for using monotonic timestamps
         - register definitions
         - removal of unused file
      
        ipu-v3L
         - minor changes
         - make some register arrays const+static
         - fix some leaks
      
        meson:
         - fix for vsync
      
        atomic:
         - fix for memory leak
      
        EDID parser:
         - add quirks for some more non-desktop devices
         - 6-bit panel fix.
      
        drm_mm:
         - fix a bug in the core drm mm hole handling
      
        cirrus:
         - fix lut loading regression
      
        Lastly there is a deadlock fix around runtime suspend for secondary
        GPUs.
      
        There was a deadlock between one thread trying to wait for a workqueue
        job to finish in the runtime suspend path, and the workqueue job it
        was waiting for in turn waiting for a runtime_get_sync to return.
      
        The fixes avoids it by not doing the runtime sync in the workqueue as
        then we always wait for all those tasks to complete before we runtime
        suspend"
      
      * tag 'drm-fixes-for-v4.16-rc3' of git://people.freedesktop.org/~airlied/linux: (25 commits)
        drm/tve200: fix kernel-doc documentation comment include
        drm/edid: quirk Sony PlayStation VR headset as non-desktop
        drm/edid: quirk Windows Mixed Reality headsets as non-desktop
        drm/edid: quirk Oculus Rift headsets as non-desktop
        drm/meson: fix vsync buffer update
        drm: Handle unexpected holes in color-eviction
        drm: exynos: Use proper macro definition for HDMI_I2S_PIN_SEL_1
        drm/exynos: remove exynos_drm_rotator.h
        drm/exynos: g2d: Delete an error message for a failed memory allocation in two functions
        drm/exynos: fix comparison to bitshift when dealing with a mask
        drm/exynos: g2d: use monotonic timestamps
        drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA
        gpu: ipu-csi: add 10/12-bit grayscale support to mbus_code_to_bus_cfg
        gpu: ipu-cpmem: add 16-bit grayscale support to ipu_cpmem_set_image
        gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
        gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
        drm/amdgpu: Fix deadlock on runtime suspend
        drm/radeon: Fix deadlock on runtime suspend
        drm/nouveau: Fix deadlock on runtime suspend
        drm: Allow determining if current task is output poll worker
        ...
      8961ca44
    • Willem de Bruijn's avatar
      selftests/net: ignore background traffic in psock_fanout · cc30c93f
      Willem de Bruijn authored
      The packet fanout test generates UDP traffic and reads this with
      a pair of packet sockets, testing the various fanout algorithms.
      
      Avoid non-determinism from reading unrelated background traffic.
      Fanout decisions are made before unrelated packets can be dropped with
      a filter, so that is an insufficient strategy [*]. Run the packet
      socket tests in a network namespace, similar to msg_zerocopy.
      
      It it still good practice to install a filter on a packet socket
      before accepting traffic. Because this is example code, demonstrate
      that pattern. Open the socket initially bound to no protocol, install
      a filter, and only then bind to ETH_P_IP.
      
      Another source of non-determinism is hash collisions in FANOUT_HASH.
      The hash function used to select a socket in the fanout group includes
      the pseudorandom number hashrnd, which is not visible from userspace.
      To work around this, the test tries to find a pair of UDP source ports
      that do not collide. It gives up too soon (5 times, every 32 runs) and
      output is confusing. Increase tries to 20 and revise the error msg.
      
      [*] another approach would be to add a third socket to the fanout
          group and direct all unexpected traffic here. This is possible
          only when reimplementing methods like RR or HASH alongside this
          extra catch-all bucket, using the BPF fanout method.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cc30c93f
    • Colin Ian King's avatar
      atm: idt77252: remove redundant bit-wise or'ing of zero · 3c69f792
      Colin Ian King authored
      Zero is being bit-wise or'd in a calculation twice; these are redundant
      and can be removed.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3c69f792
    • Eric Dumazet's avatar
      net_sched: gen_estimator: fix broken estimators based on percpu stats · a5f7add3
      Eric Dumazet authored
      pfifo_fast got percpu stats lately, uncovering a bug I introduced last
      year in linux-4.10.
      
      I missed the fact that we have to clear our temporary storage
      before calling __gnet_stats_copy_basic() in the case of percpu stats.
      
      Without this fix, rate estimators (tc qd replace dev xxx root est 1sec
      4sec pfifo_fast) are utterly broken.
      
      Fixes: 1c0d32fd ("net_sched: gen_estimator: complete rewrite of rate estimators")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a5f7add3
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 22170094
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2018-02-22
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) two urgent fixes for bpf_tail_call logic for x64 and arm64 JITs, from Daniel.
      
      2) cond_resched points in percpu array alloc/free paths, from Eric.
      
      3) lockdep and other minor fixes, from Yonghong, Arnd, Anders, Li.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22170094
    • Sowmini Varadhan's avatar
      rds: rds_msg_zcopy should return error of null rm->data.op_mmp_znotifier · 79a5b972
      Sowmini Varadhan authored
      if either or both of MSG_ZEROCOPY and SOCK_ZEROCOPY have not been
      specified, the rm->data.op_mmp_znotifier allocation will be skipped.
      In this case, it is invalid ot pass down a cmsghdr with
      RDS_CMSG_ZCOPY_COOKIE, so return EINVAL from rds_msg_zcopy for this
      case.
      
      Reported-by: syzbot+f893ae7bb2f6456dfbc3@syzkaller.appspotmail.com
      Fixes: 0cebacce ("rds: zerocopy Tx support.")
      Signed-off-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      79a5b972
    • Heiner Kallweit's avatar
      r8169: simplify and improve check for dash · 9dbe7896
      Heiner Kallweit authored
      r8168_check_dash() returns false anyway for all chip versions not
      supporting dash. So we can simplify the check conditions.
      
      In addition change the check functions to return bool instead of int,
      because they actually return a bool value.
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9dbe7896
    • Heiner Kallweit's avatar
      r8169: disable WOL per default · 7edf6d31
      Heiner Kallweit authored
      Currently, if BIOS enables WOL in the chip, settings are inconsistent
      because the device isn't marked as wakeup-enabled (if not done
      explicitly via userspace tools). This causes issues with suspend/
      resume because mdio_bus_phy_may_suspend() checks whether device is
      wakeup-enabled. In detail MDIO bus access in phy_suspend() can fail
      because the MDIO bus is disabled.
      
      In the history of the driver we find two competing approaches:
      8f9d5138 "r8169: remember WOL preferences on driver load" prefers
      to preserve what the BIOS may have set, whilst bde135a6
      "r8169: only enable PCI wakeups when WOL is active" disabled PCI
      wakeup per default to work around a bug on one platform.
      
      Seems like nobody complained after the latter patch about non-working
      WOL, what makes me think that nobody uses WOL w/o configuring it
      explicitly.
      
      My opinion:
      Vast majority of users doesn't use WOL even if the BIOS enables it in
      the chip. And having WOL being active keeps the PHY(s) from powering
      down if being idle.
      If somebody needs WOL, he can enable it during boot, e.g. by
      configuring systemd.link/WakeOnLan.
      
      Therefore, to make WOL consistent again, disable it per default.
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7edf6d31
    • Andy Spencer's avatar
      gianfar: simplify FCS handling and fix memory leak · d903ec77
      Andy Spencer authored
      Previously, buffer descriptors containing only the frame check sequence
      (FCS) were skipped and not added to the skb. However, the page reference
      count was still incremented, leading to a memory leak.
      
      Fixing this inside gfar_add_rx_frag() is difficult due to reserved
      memory handling and page reuse. Instead, move the FCS handling to
      gfar_process_frame() and trim off the FCS before passing the skb up the
      networking stack.
      Signed-off-by: default avatarAndy Spencer <aspencer@spacex.com>
      Signed-off-by: default avatarJim Gruen <jgruen@spacex.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d903ec77
    • Arnd Bergmann's avatar
      ipv6 sit: work around bogus gcc-8 -Wrestrict warning · ca79bec2
      Arnd Bergmann authored
      gcc-8 has a new warning that detects overlapping input and output arguments
      in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
      which is actually correct:
      
      net/ipv6/sit.c: In function 'sit_init_net':
      net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
      
      The problem here is that the logic detecting the memcpy() arguments finds them
      to be the same, but the conditional that tests for the input and output of
      ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.
      
      We know that netdev_priv(t->dev) is the same as t for a tunnel device,
      and comparing "dev" directly here lets the compiler figure out as well
      that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
      it no longer warns.
      
      This code is old, so Cc stable to make sure that we don't get the warning
      for older kernels built with new gcc.
      
      Cc: Martin Sebor <msebor@gmail.com>
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ca79bec2
    • Alexey Kodanev's avatar
      macvlan: fix use-after-free in macvlan_common_newlink() · 4e14bf42
      Alexey Kodanev authored
      The following use-after-free was reported by KASan when running
      LTP macvtap01 test on 4.16-rc2:
      
      [10642.528443] BUG: KASAN: use-after-free in
                     macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450
      ...
      [10642.963873] Call Trace:
      [10642.994352]  dump_stack+0x5c/0x7c
      [10643.035325]  print_address_description+0x75/0x290
      [10643.092938]  kasan_report+0x28d/0x390
      [10643.137971]  ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.207963]  macvlan_common_newlink+0x12ef/0x14a0 [macvlan]
      [10643.275978]  macvtap_newlink+0x171/0x260 [macvtap]
      [10643.334532]  rtnl_newlink+0xd4f/0x1300
      ...
      [10646.256176] Allocated by task 18450:
      [10646.299964]  kasan_kmalloc+0xa6/0xd0
      [10646.343746]  kmem_cache_alloc_trace+0xf1/0x210
      [10646.397826]  macvlan_common_newlink+0x6de/0x14a0 [macvlan]
      [10646.464386]  macvtap_newlink+0x171/0x260 [macvtap]
      [10646.522728]  rtnl_newlink+0xd4f/0x1300
      ...
      [10647.022028] Freed by task 18450:
      [10647.061549]  __kasan_slab_free+0x138/0x180
      [10647.111468]  kfree+0x9e/0x1c0
      [10647.147869]  macvlan_port_destroy+0x3db/0x650 [macvlan]
      [10647.211411]  rollback_registered_many+0x5b9/0xb10
      [10647.268715]  rollback_registered+0xd9/0x190
      [10647.319675]  register_netdevice+0x8eb/0xc70
      [10647.370635]  macvlan_common_newlink+0xe58/0x14a0 [macvlan]
      [10647.437195]  macvtap_newlink+0x171/0x260 [macvtap]
      
      Commit d02fd6e7 ("macvlan: Fix one possible double free") handles
      the case when register_netdevice() invokes ndo_uninit() on error and
      as a result free the port. But 'macvlan_port_get_rtnl(dev))' check
      (returns dev->rx_handler_data), which was added by this commit in order
      to prevent double free, is not quite correct:
      
      * for macvlan it always returns NULL because 'lowerdev' is the one that
        was used to register rx handler (port) in macvlan_port_create() as
        well as to unregister it in macvlan_port_destroy().
      * for macvtap it always returns a valid pointer because macvtap registers
        its own rx handler before macvlan_common_newlink().
      
      Fixes: d02fd6e7 ("macvlan: Fix one possible double free")
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e14bf42
    • Yafang Shao's avatar
      bpf: NULL pointer check is not needed in BPF_CGROUP_RUN_PROG_INET_SOCK · ee07862f
      Yafang Shao authored
      sk is already allocated in inet_create/inet6_create, hence when
      BPF_CGROUP_RUN_PROG_INET_SOCK is executed sk will never be NULL.
      
      The logic is as bellow,
      	sk = sk_alloc();
      	if (!sk)
      		goto out;
      	BPF_CGROUP_RUN_PROG_INET_SOCK(sk);
      Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      ee07862f
    • Pratyush Anand's avatar
      arm64: fix unwind_frame() for filtered out fn for function graph tracing · 9f416319
      Pratyush Anand authored
      do_task_stat() calls get_wchan(), which further does unwind_frame().
      unwind_frame() restores frame->pc to original value in case function
      graph tracer has modified a return address (LR) in a stack frame to hook
      a function return. However, if function graph tracer has hit a filtered
      function, then we can't unwind it as ftrace_push_return_trace() has
      biased the index(frame->graph) with a 'huge negative'
      offset(-FTRACE_NOTRACE_DEPTH).
      
      Moreover, arm64 stack walker defines index(frame->graph) as unsigned
      int, which can not compare a -ve number.
      
      Similar problem we can have with calling of walk_stackframe() from
      save_stack_trace_tsk() or dump_backtrace().
      
      This patch fixes unwind_frame() to test the index for -ve value and
      restore index accordingly before we can restore frame->pc.
      
      Reproducer:
      
      cd /sys/kernel/debug/tracing/
      echo schedule > set_graph_notrace
      echo 1 > options/display-graph
      echo wakeup > current_tracer
      ps -ef | grep -i agent
      
      Above commands result in:
      Unable to handle kernel paging request at virtual address ffff801bd3d1e000
      pgd = ffff8003cbe97c00
      [ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
      Internal error: Oops: 96000006 [#1] SMP
      [...]
      CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
      [...]
      task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
      PC is at unwind_frame+0x12c/0x180
      LR is at get_wchan+0xd4/0x134
      pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
      sp : ffff8003cc6c3ab0
      x29: ffff8003cc6c3ab0 x28: 0000000000000001
      x27: 0000000000000026 x26: 0000000000000026
      x25: 00000000000012d8 x24: 0000000000000000
      x23: ffff8003c1c04000 x22: ffff000008c83000
      x21: ffff8003c1c00000 x20: 000000000000000f
      x19: ffff8003c1bc0000 x18: 0000fffffc593690
      x17: 0000000000000000 x16: 0000000000000001
      x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
      x13: 0000000000000001 x12: 0000000000000000
      x11: 00000000e8f4883e x10: 0000000154f47ec8
      x9 : 0000000070f367c0 x8 : 0000000000000000
      x7 : 00008003f7290000 x6 : 0000000000000018
      x5 : 0000000000000000 x4 : ffff8003c1c03cb0
      x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
      x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
      
      Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
      Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
      [...]
      [<ffff00000808892c>] unwind_frame+0x12c/0x180
      [<ffff000008305008>] do_task_stat+0x864/0x870
      [<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
      [<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
      [<ffff0000082b27e0>] seq_read+0x160/0x414
      [<ffff000008289e6c>] __vfs_read+0x58/0x164
      [<ffff00000828b164>] vfs_read+0x88/0x144
      [<ffff00000828c2e8>] SyS_read+0x60/0xc0
      [<ffff0000080834a0>] __sys_trace_return+0x0/0x4
      
      Fixes: 20380bb3 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
      Signed-off-by: default avatarPratyush Anand <panand@redhat.com>
      Signed-off-by: default avatarJerome Marchand <jmarchan@redhat.com>
      [catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      9f416319
    • Randy Dunlap's avatar
      integrity/security: fix digsig.c build error with header file · 120f3b11
      Randy Dunlap authored
      security/integrity/digsig.c has build errors on some $ARCH due to a
      missing header file, so add it.
      
        security/integrity/digsig.c:146:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
      Reported-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: linux-integrity@vger.kernel.org
      Link: http://kisskb.ellerman.id.au/kisskb/head/13396/Signed-off-by: default avatarJames Morris <james.morris@microsoft.com>
      120f3b11
    • James Morris's avatar
      Merge tag 'keys-fixes-20180222-2' of... · 16c4db3b
      James Morris authored
      Merge tag 'keys-fixes-20180222-2' of https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into fixes-v4.16-rc3
      
      Keyrings fixes.
      16c4db3b
    • Dave Airlie's avatar
      Merge tag 'imx-drm-next-2018-02-22' of git://git.pengutronix.de/git/pza/linux into drm-fixes · b17800e9
      Dave Airlie authored
      drm/imx: ipu-v3 fixups and grayscale support
      
      - Make const interrupt register arrays static, reduces object size.
      - Fix device_node leaks in PRE/PRG phandle lookup functions.
      - Add 8-bit and 16-bit grayscale buffer support to ipu_cpmem_set_image,
      - add 10-bit and 12-bit grayscale media bus support to ipu-csi,
        to be used by the imx-media driver.
      
      * tag 'imx-drm-next-2018-02-22' of git://git.pengutronix.de/git/pza/linux:
        gpu: ipu-csi: add 10/12-bit grayscale support to mbus_code_to_bus_cfg
        gpu: ipu-cpmem: add 16-bit grayscale support to ipu_cpmem_set_image
        gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
        gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
        gpu: ipu-cpmem: add 8-bit grayscale support to ipu_cpmem_set_image
        gpu: ipu-v3: make const arrays int_reg static, shrinks object size
      b17800e9
    • Kees Cook's avatar
      MIPS: boot: Define __ASSEMBLY__ for its.S build · 0f9da844
      Kees Cook authored
      The MIPS %.its.S compiler command did not define __ASSEMBLY__, which meant
      when compiler_types.h was added to kconfig.h, unexpected things appeared
      (e.g. struct declarations) which should not have been present. As done in
      the general %.S compiler command, __ASSEMBLY__ is now included here too.
      
      The failure was:
      
          Error: arch/mips/boot/vmlinux.gz.its:201.1-2 syntax error
          FATAL ERROR: Unable to parse input tree
          /usr/bin/mkimage: Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
          /usr/bin/mkimage Can't add hashes to FIT blob
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Fixes: 28128c61 ("kconfig.h: Include compiler types to avoid missed struct attributes")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0f9da844
    • Linus Torvalds's avatar
      Merge branch 'siginfo-linus' of... · bae6cfe8
      Linus Torvalds authored
      Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull siginfo fix from Eric Biederman:
       "This fixes a build error that only shows up on blackfin"
      
      * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        fs/signalfd: fix build error for BUS_MCEERR_AR
      bae6cfe8
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 0bb78166
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "Fix an oops in the s5p-sss driver when used with ecb(aes)"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: s5p-sss - Fix kernel Oops in AES-ECB mode
      0bb78166
    • Daniel Borkmann's avatar
      bpf, arm64: fix out of bounds access in tail call · 16338a9b
      Daniel Borkmann authored
      I recently noticed a crash on arm64 when feeding a bogus index
      into BPF tail call helper. The crash would not occur when the
      interpreter is used, but only in case of JIT. Output looks as
      follows:
      
        [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
        [...]
        [  347.043065] [fffb850e96492510] address between user and kernel address ranges
        [  347.050205] Internal error: Oops: 96000004 [#1] SMP
        [...]
        [  347.190829] x13: 0000000000000000 x12: 0000000000000000
        [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
        [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
        [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
        [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
        [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
        [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
        [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
        [  347.235221] Call trace:
        [  347.237656]  0xffff000002f3a4fc
        [  347.240784]  bpf_test_run+0x78/0xf8
        [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
        [  347.248694]  SyS_bpf+0x77c/0x1110
        [  347.251999]  el0_svc_naked+0x30/0x34
        [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
        [...]
      
      In this case the index used in BPF r3 is the same as in r1
      at the time of the call, meaning we fed a pointer as index;
      here, it had the value 0xffff808fd7cf0500 which sits in x2.
      
      While I found tail calls to be working in general (also for
      hitting the error cases), I noticed the following in the code
      emission:
      
        # bpftool p d j i 988
        [...]
        38:   ldr     w10, [x1,x10]
        3c:   cmp     w2, w10
        40:   b.ge    0x000000000000007c              <-- signed cmp
        44:   mov     x10, #0x20                      // #32
        48:   cmp     x26, x10
        4c:   b.gt    0x000000000000007c
        50:   add     x26, x26, #0x1
        54:   mov     x10, #0x110                     // #272
        58:   add     x10, x1, x10
        5c:   lsl     x11, x2, #3
        60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
        64:   cbz     x11, 0x000000000000007c
        [...]
      
      Meaning, the tests passed because commit ddb55992 ("arm64:
      bpf: implement bpf_tail_call() helper") was using signed compares
      instead of unsigned which as a result had the test wrongly passing.
      
      Change this but also the tail call count test both into unsigned
      and cap the index as u32. Latter we did as well in 90caccdd
      ("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
      too. Tested on HiSilicon Hi1616.
      
      Result after patch:
      
        # bpftool p d j i 268
        [...]
        38:	ldr	w10, [x1,x10]
        3c:	add	w2, w2, #0x0
        40:	cmp	w2, w10
        44:	b.cs	0x0000000000000080
        48:	mov	x10, #0x20                  	// #32
        4c:	cmp	x26, x10
        50:	b.hi	0x0000000000000080
        54:	add	x26, x26, #0x1
        58:	mov	x10, #0x110                 	// #272
        5c:	add	x10, x1, x10
        60:	lsl	x11, x2, #3
        64:	ldr	x11, [x10,x11]
        68:	cbz	x11, 0x0000000000000080
        [...]
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      16338a9b
  4. 22 Feb, 2018 1 commit
    • Daniel Borkmann's avatar
      bpf, x64: implement retpoline for tail call · a493a87f
      Daniel Borkmann authored
      Implement a retpoline [0] for the BPF tail call JIT'ing that converts
      the indirect jump via jmp %rax that is used to make the long jump into
      another JITed BPF image. Since this is subject to speculative execution,
      we need to control the transient instruction sequence here as well
      when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
      The latter aligns also with what gcc / clang emits (e.g. [1]).
      
      JIT dump after patch:
      
        # bpftool p d x i 1
         0: (18) r2 = map[id:1]
         2: (b7) r3 = 0
         3: (85) call bpf_tail_call#12
         4: (b7) r0 = 2
         5: (95) exit
      
      With CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000072  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000072  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000072  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	callq  0x000000000000006d  |+
        66:	pause                      |
        68:	lfence                     |
        6b:	jmp    0x0000000000000066  |
        6d:	mov    %rax,(%rsp)         |
        71:	retq                       |
        72:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        + retpoline for indirect jump
      
      Without CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000063  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000063  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000063  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	jmpq   *%rax               |-
        63:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        - plain indirect jump as before
      
        [0] https://support.google.com/faqs/answer/7625886
        [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2bSigned-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      a493a87f