1. 09 Aug, 2019 1 commit
    • netfilter: nf_tables: use-after-free in failing rule with bound set · 6a0a8d10
      Pablo Neira Ayuso authored
      If a rule that already has a bound anonymous set fails to be added, the
      preparation phase releases the rule and the bound set. However, the
      transaction object from the abort path still has a reference to the set
      object that is stale, leading to a use-after-free when checking for the
      set->bound field. Add a new field to the transaction that specifies if
      the set is bound, so the abort path can skip releasing it since the rule
      command owns it and it takes care of releasing it. After this update,
      the set->bound field is removed.
      
      [   24.649883] Unable to handle kernel paging request at virtual address 0000000000040434
      [   24.657858] Mem abort info:
      [   24.660686]   ESR = 0x96000004
      [   24.663769]   Exception class = DABT (current EL), IL = 32 bits
      [   24.669725]   SET = 0, FnV = 0
      [   24.672804]   EA = 0, S1PTW = 0
      [   24.675975] Data abort info:
      [   24.678880]   ISV = 0, ISS = 0x00000004
      [   24.682743]   CM = 0, WnR = 0
      [   24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000
      [   24.692207] [0000000000040434] pgd=0000000000000000
      [   24.697119] Internal error: Oops: 96000004 [#1] SMP
      [...]
      [   24.889414] Call trace:
      [   24.891870]  __nf_tables_abort+0x3f0/0x7a0
      [   24.895984]  nf_tables_abort+0x20/0x40
      [   24.899750]  nfnetlink_rcv_batch+0x17c/0x588
      [   24.904037]  nfnetlink_rcv+0x13c/0x190
      [   24.907803]  netlink_unicast+0x18c/0x208
      [   24.911742]  netlink_sendmsg+0x1b0/0x350
      [   24.915682]  sock_sendmsg+0x4c/0x68
      [   24.919185]  ___sys_sendmsg+0x288/0x2c8
      [   24.923037]  __sys_sendmsg+0x7c/0xd0
      [   24.926628]  __arm64_sys_sendmsg+0x2c/0x38
      [   24.930744]  el0_svc_common.constprop.0+0x94/0x158
      [   24.935556]  el0_svc_handler+0x34/0x90
      [   24.939322]  el0_svc+0x8/0xc
      [   24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863)
      [   24.948336] ---[ end trace cebbb9dcbed3b56f ]---
      
      Fixes: f6ac8585 ("netfilter: nf_tables: unbind set in rule from commit path")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
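      
      The ownership rule described in this commit can be sketched in plain C.
      This is a minimal illustration, not the nf_tables code: the toy_* types
      and functions are hypothetical, and only the idea of keeping a "bound"
      flag in the transaction (rather than reading it from a possibly freed
      set) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real nf_tables types. */
struct toy_set {
	int id;
};

struct toy_trans {
	struct toy_set *set;   /* may be stale once the rule path freed it */
	bool bound;            /* ownership flag kept in the transaction itself */
};

/* The rule command binds the set: from now on the rule owns it. */
void toy_bind_set(struct toy_trans *trans)
{
	trans->bound = true;
}

/* Failing rule preparation releases the rule together with its bound set. */
void toy_rule_release(struct toy_trans *trans)
{
	assert(trans->bound);
	free(trans->set);
	/* trans->set now dangles and must not be dereferenced. */
}

/*
 * Abort path: decide from the transaction alone, never from the set.
 * Returns true when this path released the set itself.
 */
bool toy_abort_newset(struct toy_trans *trans)
{
	if (trans->bound)
		return false;  /* the rule already released it */
	free(trans->set);
	trans->set = NULL;
	return true;
}
```

      Because the flag lives in the transaction, the abort path never touches
      the memory that the rule path may already have freed.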
  2. 05 Aug, 2019 2 commits
    • netfilter: nf_flow_table: fix offload for flows that are subject to xfrm · 589b474a
      Florian Westphal authored
      This makes the previously added 'encap test' pass.
      Because it's possible that the xfrm dst entry becomes stale while such
      a flow is offloaded, we need to call dst_check() -- the notifier that
      handles this for non-tunneled traffic isn't sufficient, because SA or
      policies might have changed.
      
      If the dst becomes stale, the flow offload entry will be tagged for
      teardown and packets will be passed to the 'classic' forwarding path.
      
      Removing the entry right away is problematic, as this would
      introduce a race condition with the gc worker.
      
      In case flow is long-lived, it could eventually be offloaded again
      once the gc worker removes the entry from the flow table.
      
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
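      
      The "tag for teardown instead of removing" behaviour described in this
      commit can be sketched as follows. All toy_* names are hypothetical, and
      the valid flag merely stands in for whatever dst_check() would report;
      the real flow table code is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a flow table entry; not the kernel's types. */
struct toy_dst {
	bool valid;            /* stand-in for what dst_check() would report */
};

struct toy_flow {
	struct toy_dst *dst;
	bool teardown;         /* set once; the gc worker removes the entry later */
};

enum toy_verdict {
	TOY_FAST_PATH,         /* packet forwarded by the offload fast path */
	TOY_CLASSIC_PATH       /* packet handed back to normal forwarding */
};

/* Per-packet check: a stale route tags the entry instead of deleting it. */
enum toy_verdict toy_flow_forward(struct toy_flow *flow)
{
	assert(flow != NULL && flow->dst != NULL);

	if (flow->teardown)
		return TOY_CLASSIC_PATH;
	if (!flow->dst->valid) {
		flow->teardown = true;   /* no removal here, to avoid racing with gc */
		return TOY_CLASSIC_PATH;
	}
	return TOY_FAST_PATH;
}
```

      In this sketch a tagged entry keeps returning the classic verdict until
      something else (the gc worker in the commit's description) removes it.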
    • selftests: netfilter: extend flowtable test script for ipsec · 0ca1bbb7
      Florian Westphal authored
      'flow offload' expression should not offload flows that will be subject
      to ipsec, but it does.
      
      This results in a connectivity blackhole for the affected flows -- first
      packets will go through (offload happens after established state is
      reached), but all remaining ones bypass ipsec encryption and are thus
      discarded by the peer.
      
      This can be worked around by adding "rt ipsec exists accept"
      before the 'flow offload' rule matches.
      
      This test case will fail; support for such flows is added in the
      next patch.
      
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 03 Aug, 2019 7 commits
    • net/socket: fix GCC8+ Wpacked-not-aligned warnings · 5e5412c3
      Qian Cai authored
      There are a lot of these warnings with GCC8+ on 64-bit:
      
      In file included from ./include/linux/sctp.h:42,
                       from net/core/skbuff.c:47:
      ./include/uapi/linux/sctp.h:395:1: warning: alignment 4 of 'struct
      sctp_paddr_change' is less than 8 [-Wpacked-not-aligned]
       } __attribute__((packed, aligned(4)));
       ^
      ./include/uapi/linux/sctp.h:728:1: warning: alignment 4 of 'struct
      sctp_setpeerprim' is less than 8 [-Wpacked-not-aligned]
       } __attribute__((packed, aligned(4)));
       ^
      ./include/uapi/linux/sctp.h:727:26: warning: 'sspp_addr' offset 4 in
      'struct sctp_setpeerprim' isn't aligned to 8 [-Wpacked-not-aligned]
        struct sockaddr_storage sspp_addr;
                                ^~~~~~~~~
      ./include/uapi/linux/sctp.h:741:1: warning: alignment 4 of 'struct
      sctp_prim' is less than 8 [-Wpacked-not-aligned]
       } __attribute__((packed, aligned(4)));
       ^
      ./include/uapi/linux/sctp.h:740:26: warning: 'ssp_addr' offset 4 in
      'struct sctp_prim' isn't aligned to 8 [-Wpacked-not-aligned]
        struct sockaddr_storage ssp_addr;
                                ^~~~~~~~
      ./include/uapi/linux/sctp.h:792:1: warning: alignment 4 of 'struct
      sctp_paddrparams' is less than 8 [-Wpacked-not-aligned]
       } __attribute__((packed, aligned(4)));
       ^
      ./include/uapi/linux/sctp.h:784:26: warning: 'spp_address' offset 4 in
      'struct sctp_paddrparams' isn't aligned to 8 [-Wpacked-not-aligned]
        struct sockaddr_storage spp_address;
                                ^~~~~~~~~~~
      ./include/uapi/linux/sctp.h:905:1: warning: alignment 4 of 'struct
      sctp_paddrinfo' is less than 8 [-Wpacked-not-aligned]
       } __attribute__((packed, aligned(4)));
       ^
      ./include/uapi/linux/sctp.h:899:26: warning: 'spinfo_address' offset 4
      in 'struct sctp_paddrinfo' isn't aligned to 8 [-Wpacked-not-aligned]
        struct sockaddr_storage spinfo_address;
                                ^~~~~~~~~~~~~~
      
      This is because commit 20c9c825 ("[SCTP] Fix SCTP socket options
      to work with 32-bit apps on 64-bit kernels.") added "packed, aligned(4)"
      GCC attributes to some structures, but one of their members, i.e.,
      "struct sockaddr_storage", has the attribute
      "aligned(__alignof__ (struct sockaddr *))", which is 8 bytes on 64-bit
      systems, so the commit overrides the designed alignment of
      "sockaddr_storage".
      
      To fix this, "struct sockaddr_storage" needs to be 4-byte aligned, as
      it is only used in those packed sctp structures, which are part of UAPI,
      and "struct __kernel_sockaddr_storage" is used in some other
      places of UAPI whose alignment must not change, in order not to
      break userspace.
      
      Use an implicit alignment for "struct __kernel_sockaddr_storage" so it
      can keep the same alignments as a member in both packed and un-packed
      structures without breaking UAPI.
      
      Suggested-by: David Laight <David.Laight@ACULAB.COM>
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: David S. Miller <davem@davemloft.net>
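      
      A small, self-contained illustration of "implicit alignment" as this
      commit uses the term: the storage type gets pointer alignment from a
      union member rather than from an aligned() attribute. The toy_* names
      and the 128-byte size are assumptions made for the sketch, and the
      attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SS_MAXSIZE 128   /* hypothetical size, chosen for the sketch */

/*
 * Alignment comes implicitly from the pointer member of the union, so
 * placing this type at a 4-byte offset inside a packed struct does not
 * contradict any explicit aligned() attribute.
 */
struct toy_storage {
	union {
		struct {
			unsigned short family;
			char data[TOY_SS_MAXSIZE - sizeof(unsigned short)];
		};
		void *align;   /* only present to request pointer alignment */
	};
};

/* Un-packed user: the member keeps pointer alignment. */
struct toy_plain {
	int id;
	struct toy_storage addr;
};

/* Packed user in the style of the sctp structures: member sits at offset 4. */
struct toy_packed {
	int id;
	struct toy_storage addr;
} __attribute__((packed, aligned(4)));

size_t toy_plain_offset(void)  { return offsetof(struct toy_plain, addr); }
size_t toy_packed_offset(void) { return offsetof(struct toy_packed, addr); }
```

      The same member type therefore serves both layouts: naturally aligned
      in the un-packed structure and at offset 4 in the packed one.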
    • r8152: fix typo in register name · 59c0b47a
      Kevin Lo authored
      It is likely that PAL_BDC_CR should be PLA_BDC_CR.
      
      Signed-off-by: Kevin Lo <kevlo@kevlo.org>
      Acked-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix race in genphy_update_link · aa6b1956
      Heiner Kallweit authored
      In phy_start_aneg() autoneg is started, and immediately after that
      link and autoneg status are read. As reported in [0], it can happen that
      at the time of this read the PHY has reset the "aneg complete" bit but
      not yet the "link up" bit, which can result in a false link-up detection.
      To fix this, don't report link as up if we're in aneg mode and the PHY
      doesn't signal "aneg complete".
      
      [0] https://marc.info/?t=156413509900003&r=1&w=2
      
      Fixes: 4950c2ba ("net: phy: fix autoneg mismatch case in genphy_read_status")
      Reported-by: liuyonglong <liuyonglong@huawei.com>
      Tested-by: liuyonglong <liuyonglong@huawei.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
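      
      The rule stated in this commit ("don't report link as up if we're in
      aneg mode and the PHY doesn't signal aneg complete") reduces to a small
      predicate. The sketch below is an assumption-laden illustration, not
      the genphy_update_link() code: toy_link_is_up() is a hypothetical
      helper, and the two bit values are the status register bits from IEEE
      802.3 clause 22.

```c
#include <assert.h>
#include <stdbool.h>

/* Basic mode status register bits as defined by IEEE 802.3 clause 22. */
#define TOY_BMSR_LSTATUS       0x0004   /* link status */
#define TOY_BMSR_ANEGCOMPLETE  0x0020   /* auto-negotiation complete */

/* Report link up only if autoneg, when enabled, has also completed. */
bool toy_link_is_up(unsigned int bmsr, bool autoneg_enabled)
{
	bool link = (bmsr & TOY_BMSR_LSTATUS) != 0;

	if (autoneg_enabled && !(bmsr & TOY_BMSR_ANEGCOMPLETE))
		link = false;
	return link;
}
```

      With autoneg disabled the link bit alone decides, so forced-mode links
      are unaffected by the extra condition.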
    • enetc: Select PHYLIB while CONFIG_FSL_ENETC_VF is set · 2802d2cf
      YueHaibing authored
      Like FSL_ENETC, when CONFIG_FSL_ENETC_VF is set,
      we should select PHYLIB, otherwise building still fails:
      
      drivers/net/ethernet/freescale/enetc/enetc.o: In function `enetc_open':
      enetc.c:(.text+0x2744): undefined reference to `phy_start'
      enetc.c:(.text+0x282c): undefined reference to `phy_disconnect'
      drivers/net/ethernet/freescale/enetc/enetc.o: In function `enetc_close':
      enetc.c:(.text+0x28f8): undefined reference to `phy_stop'
      enetc.c:(.text+0x2904): undefined reference to `phy_disconnect'
      drivers/net/ethernet/freescale/enetc/enetc_ethtool.o:(.rodata+0x3f8): undefined reference to `phy_ethtool_get_link_ksettings'
      drivers/net/ethernet/freescale/enetc/enetc_ethtool.o:(.rodata+0x400): undefined reference to `phy_ethtool_set_link_ksettings'
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ethernet/qlogic/qed: force the string buffer NULL-terminated · 3690c8c9
      Wang Xiayang authored
      strncpy() does not ensure NULL-termination when the input string
      size equals the destination buffer size of 30.
      The output string is passed to qed_int_deassertion_aeu_bit(),
      which calls DP_INFO() and relies on NULL-termination.
      
      Use strlcpy instead. The other conditional branch above strncpy()
      needs no fix as snprintf() ensures NULL-termination.
      
      This issue is identified by a Coccinelle script.
      
      Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
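      
      To make the termination problem in this commit concrete, here is a
      user-space sketch. strlcpy() is not available in every C library, so
      the sketch uses a hypothetical helper, toy_copy_terminated(), that
      truncates the same way; it is an illustration rather than the kernel's
      implementation, and its return value differs from strlcpy()'s.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DESC_LEN 30   /* same size as the buffer mentioned above */

/*
 * Hypothetical helper with strlcpy-like truncation: copies at most
 * size - 1 bytes and always terminates dst. Returns bytes copied.
 */
size_t toy_copy_terminated(char *dst, const char *src, size_t size)
{
	size_t n;

	assert(size > 0);
	n = strlen(src);
	if (n > size - 1)
		n = size - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}

/* Returns 1 if buf holds a terminator within its first size bytes. */
int toy_is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}
```

      A 30-character source copied with strncpy(dst, src, 30) fills all 30
      bytes and leaves no room for the terminator, whereas the helper keeps
      29 characters plus the terminator.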
    • atm: iphase: Fix Spectre v1 vulnerability · ea443e5e
      Gustavo A. R. Silva authored
      board is controlled by user-space, hence leading to a potential
      exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/atm/iphase.c:2765 ia_ioctl() warn: potential spectre issue 'ia_dev' [r] (local cap)
      drivers/atm/iphase.c:2774 ia_ioctl() warn: possible spectre second half.  'iadev'
      drivers/atm/iphase.c:2782 ia_ioctl() warn: possible spectre second half.  'iadev'
      drivers/atm/iphase.c:2816 ia_ioctl() warn: possible spectre second half.  'iadev'
      drivers/atm/iphase.c:2823 ia_ioctl() warn: possible spectre second half.  'iadev'
      drivers/atm/iphase.c:2830 ia_ioctl() warn: potential spectre issue '_ia_dev' [r] (local cap)
      drivers/atm/iphase.c:2845 ia_ioctl() warn: possible spectre second half.  'iadev'
      drivers/atm/iphase.c:2856 ia_ioctl() warn: possible spectre second half.  'iadev'
      
      Fix this by sanitizing board before using it to index ia_dev and _ia_dev.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
      
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
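      
      "Sanitizing" an index in this sense means clamping it with a
      branch-free mask after the bounds check, which is what the kernel's
      array_index_nospec() helper does. The sketch below reproduces only the
      mask arithmetic in user space under stated assumptions; the toy_*
      names are hypothetical, and an optimising compiler gives no guarantee
      that it preserves the masking the way the kernel helper is built to.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Branch-free bounds mask: all ones if index < size, zero otherwise.
 * Assumes index and size fit in a long and that >> on a negative long is an
 * arithmetic shift (true for GCC and Clang, implementation-defined in ISO C).
 */
unsigned long toy_index_mask(unsigned long index, unsigned long size)
{
	long v = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an untrusted index to 0 when it is out of range. */
unsigned long toy_index_nospec(unsigned long index, unsigned long size)
{
	return index & toy_index_mask(index, size);
}

#define TOY_MAX_BOARDS 8
static int toy_boards[TOY_MAX_BOARDS];

/* Bounds check first, then sanitise before the index reaches the array. */
int toy_lookup_board(unsigned long board)
{
	if (board >= TOY_MAX_BOARDS)
		return -1;
	board = toy_index_nospec(board, TOY_MAX_BOARDS);
	return toy_boards[board];
}
```

      Even if the bounds check is mispredicted, the masked index cannot
      select an element outside the array, which is the "kill the
      speculation on the first load" policy mentioned above.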
    • hv_sock: Fix hang when a connection is closed · 685703b4
      Dexuan Cui authored
      There is a race condition for an established connection that is being closed
      by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the
      'remove_sock' is false):
      
      1 for the initial value;
      1 for the sk being in the bound list;
      1 for the sk being in the connected list;
      1 for the delayed close_work.
      
      After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
      decrease the refcnt to 3.
      
      Concurrently, hvs_close_connection() runs in another thread:
        calls vsock_remove_sock() to decrease the refcnt by 2;
        calls sock_put() to decrease the refcnt to 0, and frees the sk;
        next, the "release_sock(sk)" may hang due to use-after-free.
      
      In the above, after hvs_release() finishes, if hvs_close_connection() runs
      faster than "__vsock_release() -> sock_put(sk)", then there is not any issue,
      because at the beginning of hvs_close_connection(), the refcnt is still 4.
      
      The issue can be resolved if an extra reference is taken when the
      connection is established.
      
      Fixes: a9eeb998 ("hv_sock: Add support for delayed close")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Reviewed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
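      
      The reference counting in this commit can be replayed as simple
      arithmetic. The sketch below is a toy model with hypothetical names; it
      only walks through the problematic ordering from the commit message
      and does not model where the extra reference is eventually dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reference counter; hypothetical, not the kernel's struct sock. */
struct toy_sk {
	int refcnt;
	bool freed;
};

/* Drop n references; the object counts as freed when the count hits 0. */
void toy_put(struct toy_sk *sk, int n)
{
	assert(!sk->freed);
	sk->refcnt -= n;
	if (sk->refcnt == 0)
		sk->freed = true;
}

/*
 * Replays the problematic ordering described above and reports whether the
 * socket is still alive when the closing thread reaches release_sock().
 * extra_ref is 1 when a reference was taken at connection establishment.
 */
bool toy_alive_at_release_sock(int extra_ref)
{
	/* initial + bound list + connected list + delayed close_work */
	struct toy_sk sk = { .refcnt = 4 + extra_ref, .freed = false };

	toy_put(&sk, 1);   /* __vsock_release() -> sock_put(sk) ran first */
	toy_put(&sk, 2);   /* vsock_remove_sock(): bound and connected lists */
	toy_put(&sk, 1);   /* sock_put() in the close path */
	return !sk.freed;  /* release_sock(sk) comes next */
}
```

      Without the extra reference the count goes 4, 3, 1, 0 before
      release_sock(); with it the count goes 5, 4, 2, 1 and the socket is
      still valid at that point.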
  4. 01 Aug, 2019 12 commits
  5. 31 Jul, 2019 14 commits
  6. 30 Jul, 2019 4 commits
    • net: dsa: qca8k: enable port flow control · abb48f80
      xiaofeis authored
      Set phy device advertising to enable MAC flow control.
      
      Signed-off-by: Xiaofei Shen <xiaofeis@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • compat_ioctl: pppoe: fix PPPOEIOCSFWD handling · 055d8824
      Arnd Bergmann authored
      Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
      linux-2.5.69 along with hundreds of other commands, but was always broken
      since only the structure is compatible, but the command number is not,
      due to the size being sizeof(size_t), or at first sizeof(sizeof(struct
      sockaddr_pppox)), which is different on 64-bit architectures.
      
      Guillaume Nault adds:
      
        And the implementation was broken until 2016 (see 29e73269 ("pppoe:
        fix reference counting in PPPoE proxy")), and nobody ever noticed. I
        should probably have removed this ioctl entirely instead of fixing it.
        Clearly, it has never been used.
      
      Fix it by adding a compat_ioctl handler for all pppoe variants that
      translates the command number and then calls the regular ioctl function.
      
      All other ioctl commands handled by pppoe are compatible between 32-bit
      and 64-bit, and require compat_ptr() conversion.
      
      This should apply to all stable kernels.
      
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
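      
      Why the command number differs between 32-bit and 64-bit callers can
      be shown with the ioctl number encoding itself. The sketch assumes the
      asm-generic bit layout (some architectures use a different one); the
      TOY_* macros, the type 0xB1 and number 0, and toy_compat_translate()
      are illustrative and not the handler added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Command encoding following the asm-generic ioctl layout (an assumption:
 * some architectures use a different bit layout).
 * bits 31-30 direction, 29-16 argument size, 15-8 type, 7-0 number.
 */
#define TOY_IOC_WRITE 1U
#define TOY_IOW(type, nr, size) \
	((TOY_IOC_WRITE << 30) | ((uint32_t)(size) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Same type and number, but the encoded size follows the caller's size_t. */
#define TOY_PPPOEIOCSFWD_64 TOY_IOW(0xB1, 0, 8)   /* sizeof(size_t) == 8 */
#define TOY_PPPOEIOCSFWD_32 TOY_IOW(0xB1, 0, 4)   /* sizeof(size_t) == 4 */

/* Hypothetical compat shim: rewrite the 32-bit number, pass the rest through. */
uint32_t toy_compat_translate(uint32_t cmd)
{
	if (cmd == TOY_PPPOEIOCSFWD_32)
		return TOY_PPPOEIOCSFWD_64;
	return cmd;
}
```

      Since the argument size is baked into the number, a 32-bit process
      issues a command the 64-bit kernel does not recognise unless a compat
      handler translates it first.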
    • tipc: fix unitilized skb list crash · 2948a1fc
      Jon Maloy authored
      Our test suite sometimes provokes the following crash:
      
      Description of problem:
      [ 1092.597234] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
      [ 1092.605072] PGD 0 P4D 0
      [ 1092.607620] Oops: 0000 [#1] SMP PTI
      [ 1092.611118] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Not tainted 4.18.0-122.el8.x86_64 #1
      [ 1092.619724] Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 1.3.7 02/08/2018
      [ 1092.627215] RIP: 0010:tipc_mcast_filter_msg+0x93/0x2d0 [tipc]
      [ 1092.632955] Code: 0f 84 aa 01 00 00 89 cf 4d 01 ca 4c 8b 26 c1 ef 19 83 e7 0f 83 ff 0c 4d 0f 45 d1 41 8b 6a 10 0f cd 4c 39 e6 0f 84 81 01 00 00 <4d> 8b 9c 24 e8 00 00 00 45 8b 13 41 0f ca 44 89 d7 c1 ef 13 83 e7
      [ 1092.651703] RSP: 0018:ffff929e5fa83a18 EFLAGS: 00010282
      [ 1092.656927] RAX: ffff929e3fb38100 RBX: 00000000069f29ee RCX: 00000000416c0045
      [ 1092.664058] RDX: ffff929e5fa83a88 RSI: ffff929e31a28420 RDI: 0000000000000000
      [ 1092.671209] RBP: 0000000029b11821 R08: 0000000000000000 R09: ffff929e39b4407a
      [ 1092.678343] R10: ffff929e39b4407a R11: 0000000000000007 R12: 0000000000000000
      [ 1092.685475] R13: 0000000000000001 R14: ffff929e3fb38100 R15: ffff929e39b4407a
      [ 1092.692614] FS:  0000000000000000(0000) GS:ffff929e5fa80000(0000) knlGS:0000000000000000
      [ 1092.700702] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1092.706447] CR2: 00000000000000e8 CR3: 000000031300a004 CR4: 00000000007606e0
      [ 1092.713579] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1092.720712] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1092.727843] PKRU: 55555554
      [ 1092.730556] Call Trace:
      [ 1092.733010]  <IRQ>
      [ 1092.735034]  tipc_sk_filter_rcv+0x7ca/0xb80 [tipc]
      [ 1092.739828]  ? __kmalloc_node_track_caller+0x1cb/0x290
      [ 1092.744974]  ? dev_hard_start_xmit+0xa5/0x210
      [ 1092.749332]  tipc_sk_rcv+0x389/0x640 [tipc]
      [ 1092.753519]  tipc_sk_mcast_rcv+0x23c/0x3a0 [tipc]
      [ 1092.758224]  tipc_rcv+0x57a/0xf20 [tipc]
      [ 1092.762154]  ? ktime_get_real_ts64+0x40/0xe0
      [ 1092.766432]  ? tpacket_rcv+0x50/0x9f0
      [ 1092.770098]  tipc_l2_rcv_msg+0x4a/0x70 [tipc]
      [ 1092.774452]  __netif_receive_skb_core+0xb62/0xbd0
      [ 1092.779164]  ? enqueue_entity+0xf6/0x630
      [ 1092.783084]  ? kmem_cache_alloc+0x158/0x1c0
      [ 1092.787272]  ? __build_skb+0x25/0xd0
      [ 1092.790849]  netif_receive_skb_internal+0x42/0xf0
      [ 1092.795557]  napi_gro_receive+0xba/0xe0
      [ 1092.799417]  mlx5e_handle_rx_cqe+0x83/0xd0 [mlx5_core]
      [ 1092.804564]  mlx5e_poll_rx_cq+0xd5/0x920 [mlx5_core]
      [ 1092.809536]  mlx5e_napi_poll+0xb2/0xce0 [mlx5_core]
      [ 1092.814415]  ? __wake_up_common_lock+0x89/0xc0
      [ 1092.818861]  net_rx_action+0x149/0x3b0
      [ 1092.822616]  __do_softirq+0xe3/0x30a
      [ 1092.826193]  irq_exit+0x100/0x110
      [ 1092.829512]  do_IRQ+0x85/0xd0
      [ 1092.832483]  common_interrupt+0xf/0xf
      [ 1092.836147]  </IRQ>
      [ 1092.838255] RIP: 0010:cpuidle_enter_state+0xb7/0x2a0
      [ 1092.843221] Code: e8 3e 79 a5 ff 80 7c 24 03 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d7 01 00 00 31 ff e8 a0 6b ab ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 f3 ba ff ff ff 7f 48 39 c3 7f
      [ 1092.861967] RSP: 0018:ffffaa5ec6533e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
      [ 1092.869530] RAX: ffff929e5faa3100 RBX: 000000fe63dd2092 RCX: 000000000000001f
      [ 1092.876665] RDX: 000000fe63dd2092 RSI: 000000003a518aaa RDI: 0000000000000000
      [ 1092.883795] RBP: 0000000000000003 R08: 0000000000000004 R09: 0000000000022940
      [ 1092.890929] R10: 0000040cb0666b56 R11: ffff929e5faa20a8 R12: ffff929e5faade78
      [ 1092.898060] R13: ffffffffb59258f8 R14: 000000fe60f3228d R15: 0000000000000000
      [ 1092.905196]  ? cpuidle_enter_state+0x92/0x2a0
      [ 1092.909555]  do_idle+0x236/0x280
      [ 1092.912785]  cpu_startup_entry+0x6f/0x80
      [ 1092.916715]  start_secondary+0x1a7/0x200
      [ 1092.920642]  secondary_startup_64+0xb7/0xc0
      [...]
      
      The reason is that the skb list tipc_socket::mc_method.deferredq is only
      initialized for connectionless sockets, while nothing stops arriving
      multicast messages from being filtered by connection oriented sockets,
      with subsequent access to the said list.
      
      We fix this by initializing the list unconditionally at socket creation.
      This eliminates the crash, while the message still is dropped further
      down in tipc_sk_filter_rcv() as it should be.
      
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
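      
      The fix described in this commit, initialising the list head for every
      socket type, can be illustrated with a minimal circular list. The
      toy_* types are hypothetical stand-ins, not the TIPC socket or the skb
      queue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular list head; hypothetical stand-in for an skb queue. */
struct toy_list {
	struct toy_list *next;
	struct toy_list *prev;
};

void toy_list_init(struct toy_list *head)
{
	head->next = head;
	head->prev = head;
}

bool toy_list_empty(const struct toy_list *head)
{
	assert(head->next != NULL);   /* a zeroed, uninitialised head trips here */
	return head->next == head;
}

enum toy_sock_type { TOY_CONNECTIONLESS, TOY_CONNECTION_ORIENTED };

struct toy_sock {
	enum toy_sock_type type;
	struct toy_list deferredq;
};

/* Initialise the queue for every socket type, not just connectionless ones. */
void toy_sock_create(struct toy_sock *sk, enum toy_sock_type type)
{
	sk->type = type;
	toy_list_init(&sk->deferredq);
}
```

      An initialised head points at itself, so code that walks the queue on
      a connection oriented socket sees an empty list instead of following a
      NULL pointer.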
    • Merge tag 'rxrpc-fixes-20190730' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · a17c42f9
      David S. Miller authored
      David Howells says:
      
      ====================
      Here are a couple of fixes for rxrpc:
      
       (1) Fix a potential deadlock in the peer keepalive dispatcher.
      
       (2) Fix a missing notification when a UDP sendmsg error occurs in rxrpc.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>