1. 10 May, 2022 1 commit
    • writeback: Avoid skipping inode writeback · 846a3351
      Jing Xia authored
      We have run into an issue where a task gets stuck in
      balance_dirty_pages_ratelimited() when performing I/O stress testing.
      The reason we observed is that an I_DIRTY_PAGES inode with lots
      of dirty pages is in the b_dirty_time list and standard background
      writeback cannot write back the inode.
      After studying the relevant code, the following scenario may lead
      to the issue:
      
      task1                                   task2
      -----                                   -----
      fuse_flush
       write_inode_now //in b_dirty_time
        writeback_single_inode
         __writeback_single_inode
                                       fuse_write_end
                                        filemap_dirty_folio
                                         __xa_set_mark:PAGECACHE_TAG_DIRTY
          lock inode->i_lock
          if mapping tagged PAGECACHE_TAG_DIRTY
          inode->i_state |= I_DIRTY_PAGES
          unlock inode->i_lock
                                         __mark_inode_dirty:I_DIRTY_PAGES
                                            lock inode->i_lock
                                            -was dirty,inode stays in
                                            -b_dirty_time
                                            unlock inode->i_lock
      
         if (!(inode->i_state & I_DIRTY_ALL))
            -not true, so nothing done
      
      This patch moves the dirty inode to b_dirty list when the inode
      currently is not queued in b_io or b_more_io list at the end of
      writeback_single_inode.
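The interleaving above can be replayed in a toy model. This is a minimal sketch, not kernel code: the `Inode` class, the flag values, and the string queue names are hypothetical stand-ins, and the `apply_fix` branch only mirrors the behaviour the patch description states (a still-dirty inode is moved to b_dirty unless it is queued on b_io or b_more_io).

```python
from dataclasses import dataclass

# Illustrative flag values; the kernel's actual bit assignments differ.
I_DIRTY_PAGES = 0x1
I_DIRTY_TIME = 0x2
I_DIRTY_ALL = I_DIRTY_PAGES | I_DIRTY_TIME


@dataclass
class Inode:
    state: int = I_DIRTY_TIME
    queue: str = "b_dirty_time"          # writeback list currently holding the inode
    mapping_tagged_dirty: bool = False   # stands in for PAGECACHE_TAG_DIRTY


def mark_inode_dirty(inode: Inode, flags: int) -> None:
    """Toy __mark_inode_dirty(): requeue only if the inode was not dirty yet."""
    was_dirty = bool(inode.state & I_DIRTY_PAGES)
    inode.state |= flags
    if not was_dirty:
        inode.queue = "b_dirty"


def replay_race(apply_fix: bool) -> str:
    """Replay the task1/task2 interleaving and return the final queue name."""
    inode = Inode()
    # task1: __writeback_single_inode() clears the dirty state before writing.
    inode.state &= ~I_DIRTY_ALL
    # task2: fuse_write_end() tags the mapping dirty.
    inode.mapping_tagged_dirty = True
    # task1: sees the tag under i_lock and sets I_DIRTY_PAGES again.
    if inode.mapping_tagged_dirty:
        inode.state |= I_DIRTY_PAGES
    # task2: __mark_inode_dirty() finds the inode already dirty, so it stays put.
    mark_inode_dirty(inode, I_DIRTY_PAGES)
    # task1: end of writeback_single_inode().
    if not (inode.state & I_DIRTY_ALL):
        inode.queue = "none"
    elif apply_fix and inode.queue not in ("b_io", "b_more_io"):
        inode.queue = "b_dirty"
    return inode.queue


print(replay_race(apply_fix=False))  # b_dirty_time
print(replay_race(apply_fix=True))   # b_dirty
```

Without the final requeue the inode with dirty pages stays on b_dirty_time, which is the list that standard background writeback does not service in the scenario described.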
      
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      CC: stable@vger.kernel.org
      Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option")
      Signed-off-by: Jing Xia <jing.xia@unisoc.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220510023514.27399-1-jing.xia@unisoc.com
  2. 09 May, 2022 1 commit
    • fanotify: do not allow setting dirent events in mask of non-dir · ceaf69f8
      Amir Goldstein authored
      Dirent events (create/delete/move) are only reported on watched
      directory inodes, but in fanotify as well as in legacy inotify, it was
      always allowed to set them on non-dir inode, which does not result in
      any meaningful outcome.
      
      Until kernel v5.17, dirent events in fanotify also differed from events
      "on child" (e.g. FAN_OPEN) in the information provided in the event.
      For example, FAN_OPEN could be set in the mask of a non-dir or the mask
      of its parent and event would report the fid of the child regardless of
      the marked object.
      By contrast, FAN_DELETE is not reported if the child is marked and the
      child fid was not reported in the events.
      
      Since kernel v5.17, with fanotify group flag FAN_REPORT_TARGET_FID, the
      fid of the child is reported with dirent events, like events "on child",
      which may create confusion for users expecting the same behavior as
      events "on child" when setting events in the mask on a child.
      
      The desired semantics of setting dirent events in the mask of a child
      are not clear, so for now, deny this action for a group initialized
      with flag FAN_REPORT_TARGET_FID and for the new event FAN_RENAME.
      We may relax this restriction in the future if we decide on the
      semantics and implement them.
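The restriction described above can be sketched as a mask check. This is an illustration only: `check_mark` is a hypothetical helper, the constant values are what I understand `<linux/fanotify.h>` to define and should be treated as assumptions, and returning `-ENOTDIR` is likewise an assumption about how the denial is reported.

```python
import errno

# Assumed values from <linux/fanotify.h>; verify against your kernel headers.
FAN_OPEN = 0x00000020
FAN_MOVED_FROM = 0x00000040
FAN_MOVED_TO = 0x00000080
FAN_CREATE = 0x00000100
FAN_DELETE = 0x00000200
FAN_RENAME = 0x10000000
FAN_REPORT_TARGET_FID = 0x00001000

# Dirent events: create/delete/move, plus the new FAN_RENAME.
DIRENT_EVENTS = FAN_MOVED_FROM | FAN_MOVED_TO | FAN_CREATE | FAN_DELETE | FAN_RENAME


def check_mark(group_flags: int, mask: int, target_is_dir: bool) -> int:
    """Return 0 if the mark would be accepted, or a negative errno if denied."""
    # Strict checking applies to FAN_REPORT_TARGET_FID groups and to FAN_RENAME.
    strict = bool(group_flags & FAN_REPORT_TARGET_FID) or bool(mask & FAN_RENAME)
    if strict and not target_is_dir and (mask & DIRENT_EVENTS):
        return -errno.ENOTDIR
    return 0


# Denied: dirent event on a regular file in a FAN_REPORT_TARGET_FID group.
print(check_mark(FAN_REPORT_TARGET_FID, FAN_DELETE, target_is_dir=False))
# Still accepted: the same mask in a group without the flag.
print(check_mark(0, FAN_DELETE, target_is_dir=False))
```

Groups that do not use the newer flag keep the old permissive behaviour in this sketch, matching the "for now" scope of the commit message.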
      
      Fixes: d61fd650 ("fanotify: introduce group flag FAN_REPORT_TARGET_FID")
      Fixes: 8cc3b1cc ("fanotify: wire up FAN_RENAME event")
      Link: https://lore.kernel.org/linux-fsdevel/20220505133057.zm5t6vumc4xdcnsg@quack3.lan/
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220507080028.219826-1-amir73il@gmail.com
  3. 05 May, 2022 24 commits
    • Merge tag 'folio-5.18f' of git://git.infradead.org/users/willy/pagecache · fe27d189
      Linus Torvalds authored
      Pull folio fixes from Matthew Wilcox:
       "Two folio fixes for 5.18.
      
        Darrick and Brian have done amazing work debugging the race I created
        in the folio BIO iterator. The readahead problem was deterministic, so
        easy to fix.
      
         - Fix a race when we were calling folio_next() in the BIO folio iter
           without holding a reference, meaning the folio could be split or
           freed, and we'd jump to the next page instead of the intended next
           folio.
      
         - Fix readahead creating single-page folios instead of the intended
           large folios when doing reads that are not a power of two in size"
      
      * tag 'folio-5.18f' of git://git.infradead.org/users/willy/pagecache:
        mm/readahead: Fix readahead with large folios
        block: Do not call folio_next() on an unreferenced folio
    • Merge tag 'devicetree-fixes-for-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · f47c960e
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Drop unused 'max-link-speed' in Apple PCIe
      
       - More redundant 'maxItems/minItems' schema fixes
      
       - Support values for pinctrl 'drive-push-pull' and 'drive-open-drain'
      
       - Fix redundant 'unevaluatedProperties' in MT6360 LEDs binding
      
       - Add missing 'power-domains' property to Cadence UFSHC
      
      * tag 'devicetree-fixes-for-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: pci: apple,pcie: Drop max-link-speed from example
        dt-bindings: Drop redundant 'maxItems/minItems' in if/then schemas
        dt-bindings: pinctrl: Allow values for drive-push-pull and drive-open-drain
        dt-bindings: leds-mt6360: Drop redundant 'unevaluatedProperties'
        dt-bindings: ufs: cdns,ufshc: Add power-domains
    • Merge tag 's390-5.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 0f5d752b
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Disable -Warray-bounds warning for gcc12, since the only known way to
         workaround false positive warnings on lowcore accesses would result
         in worse code on fast paths.
      
       - Avoid lockdep_assert_held() warning in kvm vm memop code.
      
       - Reduce overhead within gmap_rmap code to get rid of long latencies
         when e.g. shutting down 2nd level guests.
      
      * tag 's390-5.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        KVM: s390: vsie/gmap: reduce gmap_rmap overhead
        KVM: s390: Fix lockdep issue in vm memop
        s390: disable -Warray-bounds
    • Merge tag 'mips-fixes_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 905a6537
      Linus Torvalds authored
      Pull MIPS fix from Thomas Bogendoerfer:
       "Extend R4000/R4400 CPU erratum workaround to all revisions"
      
      * tag 'mips-fixes_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Fix CP0 counter erratum detection for R4k CPUs
    • Merge tag 'net-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 68533eb1
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, rxrpc and wireguard.
      
        Previous releases - regressions:
      
         - igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()
      
         - mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()
      
         - rds: acquire netns refcount on TCP sockets
      
         - rxrpc: enable IPv6 checksums on transport socket
      
         - nic: hinic: fix bug of wq out of bound access
      
         - nic: thunder: don't use pci_irq_vector() in atomic context
      
         - nic: bnxt_en: fix possible bnxt_open() failure caused by wrong RFS
           flag
      
         - nic: mlx5e:
            - lag, fix use-after-free in fib event handler
            - fix deadlock in sync reset flow
      
        Previous releases - always broken:
      
         - tcp: fix insufficient TCP source port randomness
      
         - can: grcan: grcan_close(): fix deadlock
      
         - nfc: reorder destructive operations to avoid bugs
      
        Misc:
      
         - wireguard: improve selftests reliability"
      
      * tag 'net-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
        NFC: netlink: fix sleep in atomic bug when firmware download timeout
        selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer
        tcp: drop the hash_32() part from the index calculation
        tcp: increase source port perturb table to 2^16
        tcp: dynamically allocate the perturb table used by source ports
        tcp: add small random increments to the source port
        tcp: resalt the secret every 10 seconds
        tcp: use different parts of the port_offset for index and offset
        secure_seq: use the 64 bits of the siphash for port offset calculation
        wireguard: selftests: set panic_on_warn=1 from cmdline
        wireguard: selftests: bump package deps
        wireguard: selftests: restore support for ccache
        wireguard: selftests: use newer toolchains to fill out architectures
        wireguard: selftests: limit parallelism to $(nproc) tests at once
        wireguard: selftests: make routing loop test non-fatal
        net/mlx5: Fix matching on inner TTC
        net/mlx5: Avoid double clear or set of sync reset requested
        net/mlx5: Fix deadlock in sync reset flow
        net/mlx5e: Fix trust state reset in reload
        net/mlx5e: Avoid checking offload capability in post_parse action
        ...
    • NFC: netlink: fix sleep in atomic bug when firmware download timeout · 4071bf12
      Duoming Zhou authored
      There is a sleep-in-atomic bug that could cause a kernel panic during
      the firmware download process. The root cause is that nlmsg_new() with
      the GFP_KERNEL parameter is called in fw_dnld_timeout(), which is a
      timer handler. The call trace is shown below:
      
      BUG: sleeping function called from invalid context at include/linux/sched/mm.h:265
      Call Trace:
      kmem_cache_alloc_node
      __alloc_skb
      nfc_genl_fw_download_done
      call_timer_fn
      __run_timers.part.0
      run_timer_softirq
      __do_softirq
      ...
      
      nlmsg_new() with the GFP_KERNEL parameter may sleep during memory
      allocation, and the timer handler runs as the result of a
      "software interrupt", which must not call any function that could
      sleep.
      
      This patch changes the allocation mode of the netlink message from
      GFP_KERNEL to GFP_ATOMIC in order to prevent the sleep-in-atomic bug.
      The GFP_ATOMIC flag allows the allocation to be used in atomic context.
      
      Fixes: 9674da87 ("NFC: Add firmware upload netlink command")
      Fixes: 9ea7187c ("NFC: netlink: Rename CMD_FW_UPLOAD to CMD_FW_DOWNLOAD")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20220504055847.38026-1-duoming@zju.edu.cn
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • mm/readahead: Fix readahead with large folios · b9ff43dd
      Matthew Wilcox (Oracle) authored
      Reading 100KB chunks from a big file (e.g. dd bs=100K) leads to poor
      readahead behaviour.  Studying the traces in detail, I noticed two
      problems.
      
      The first is that we were setting the readahead flag on the folio which
      contains the last byte read from the block.  This is wrong because we
      will trigger readahead at the end of the read without waiting to see
      if a subsequent read is going to use the pages we just read.  Instead,
      we need to set the readahead flag on the first folio _after_ the one
      which contains the last byte that we're reading.
      
      The second is that we were looking for the index of the folio with the
      readahead flag set to exactly match the start + size - async_size.
      If we've rounded this, either down (as previously) or up (as now),
      we'll think we hit a folio marked as readahead by a different read,
      and try to read the wrong pages.  So round the expected index to the
      order of the folio we hit.
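The second problem can be illustrated with a small index calculation. This is a sketch under assumptions: the helper names and the window numbers (a 50-page window with 25 async pages and order-2 folios) are made up for illustration, and rounding up is taken from the "(as now)" remark above.

```python
def round_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align (align must be a power of two)."""
    return (value + align - 1) & ~(align - 1)


def readahead_mark(start: int, size: int, async_size: int, order: int) -> int:
    """Index of the folio that receives the readahead flag, rounded to folio order."""
    return round_up(start + size - async_size, 1 << order)


def is_expected_hit(index: int, start: int, size: int, async_size: int,
                    order: int, round_expected: bool) -> bool:
    """Does a hit at `index` match the mark this readahead window set itself?"""
    expected = start + size - async_size
    if round_expected:
        expected = round_up(expected, 1 << order)
    return index == expected


# Hypothetical window: 50 pages starting at 0, 25 of them async, order-2 folios.
mark = readahead_mark(0, 50, 25, 2)
print(mark)                                                      # 28
print(is_expected_hit(mark, 0, 50, 25, 2, round_expected=False))  # False
print(is_expected_hit(mark, 0, 50, 25, 2, round_expected=True))   # True
```

With the unrounded comparison the reader would conclude that the flagged folio at index 28 belongs to some other read, which is the misdetection the commit describes.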
      
      Reported-by: Guo Xuenan <guoxuenan@huawei.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    • block: Do not call folio_next() on an unreferenced folio · 170f37d6
      Matthew Wilcox (Oracle) authored
      It is unsafe to call folio_next() on a folio unless you hold a reference
      on it that prevents it from being split or freed.  After returning
      from the iterator, iomap calls folio_end_writeback() which may drop
      the last reference to the page, or allow the page to be split.  If that
      happens, the iterator will not advance far enough through the bio_vec,
      leading to assertion failures like the BUG() in folio_end_writeback()
      that checks we're not trying to end writeback on a page not currently
      under writeback.  Other assertion failures were also seen, but they're
      all explained by this one bug.
      
      Fix the bug by remembering where the next folio starts before returning
      from the iterator.  There are other ways of fixing this bug, but this
      seems the simplest.
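The idea of remembering the next position before handing an element to the caller can be shown with a generic linked structure. This is an analogy in Python, not the block-layer code: `Node`, `iter_unsafe`, and `iter_safe` are hypothetical names, and clearing `next` stands in for a folio being split or freed once the consumer is done with it.

```python
from typing import Iterator, List, Optional


class Node:
    def __init__(self, value: int, next: Optional["Node"] = None) -> None:
        self.value = value
        self.next = next


def iter_unsafe(head: Optional[Node]) -> Iterator[Node]:
    """Looks up the successor only after the consumer has touched the node."""
    node = head
    while node is not None:
        yield node
        node = node.next  # too late: the consumer may have invalidated the node


def iter_safe(head: Optional[Node]) -> Iterator[Node]:
    """Remembers the successor before yielding, so later changes do not matter."""
    node = head
    while node is not None:
        successor = node.next
        yield node
        node = successor


def consume(iterator: Iterator[Node]) -> List[int]:
    """Record each value, then invalidate the node, as ending writeback might."""
    seen = []
    for node in iterator:
        seen.append(node.value)
        node.next = None
    return seen


def build() -> Node:
    return Node(1, Node(2, Node(3)))


print(consume(iter_unsafe(build())))  # [1]
print(consume(iter_safe(build())))    # [1, 2, 3]
```

The unsafe variant stops early here; in the kernel case described above the symptom was instead an iterator that did not advance far enough through the bio_vec.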
      
      Reported-by: Darrick J. Wong <djwong@kernel.org>
      Tested-by: Darrick J. Wong <djwong@kernel.org>
      Reported-by: Brian Foster <bfoster@redhat.com>
      Tested-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    • selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer · 5a7c5f70
      Vladimir Oltean authored
      As discussed here with Ido Schimmel:
      https://patchwork.kernel.org/project/netdevbpf/patch/20220224102908.5255-2-jianbol@nvidia.com/
      
      the default conform-exceed action is "reclassify", for a reason we don't
      really understand.
      
      The point is that hardware can't offload that police action, so not
      specifying "conform-exceed" was always wrong, even though the command
      used to work in hardware (but not in software) until the kernel started
      adding validation for it.
      
      Fix the command used by the selftest by making the policer drop on
      exceed, and pass the packet to the next action (goto) on conform.
      
      Fixes: 8cd6b020 ("selftests: ocelot: add some example VCAP IS1, IS2 and ES0 tc offloads")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://lore.kernel.org/r/20220503121428.842906-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'insufficient-tcp-source-port-randomness' · ef562489
      Jakub Kicinski authored
      Willy Tarreau says:
      
      ====================
      insufficient TCP source port randomness
      
      In a not-yet published paper, Moshe Kol, Amit Klein, and Yossi Gilad
      report being able to accurately identify a client by forcing it to emit
      only 40 times more connections than the number of entries in the
      table_perturb[] table, which is indexed by hashing the connection tuple.
      The current 2^8 setting allows them to perform that attack with only 10k
      connections, which is not hard to achieve in a few seconds.
      
      Eric, Amit and I have been working on this for a few weeks now, imagining,
      testing and eliminating a number of approaches that Amit and his team were
      still able to break or that were found to be too risky or too expensive.
      We ended up with the simple improvements in this series, which resist
      the attack, don't degrade performance, and preserve a reliable port
      selection algorithm to avoid connection failures, including the odd/even
      port selection preference that allows bind() to always find a port quickly
      even under strong connect() stress.
      
      The approach relies on several factors:
        - resalting the hash secret that's used to choose the table_perturb[]
          entry every 10 seconds to eliminate slow attacks and force the
          attacker to forget everything that was learned after this delay.
          This already eliminates most of the problem because if a client
          stays silent for more than 10 seconds there's no link between the
          previous and the next patterns, and 10s isn't yet frequent enough
          to cause too-frequent repetition of the same port, which could
          induce a connection failure;
      
        - adding small random increments to the source port. Previously, a
          random 0 or 1 was added every 16 ports. Now a random 0 to 7 is
          added after each port. This means that with the default 32768-60999
          range, a worst case rollover happens after 1764 connections, and
          an average of 3137. This doesn't stop statistical attacks but
          requires significantly more iterations of the same attack to
          confirm a guess.
      
        - increasing the table_perturb[] size from 2^8 to 2^16, which Amit
          says will require 2.6 million connections to be attacked with the
          changes above, making it pointless to get a fingerprint that will
          only last 10 seconds. Due to the size, the table was made dynamic.
      
        - a few minor improvements on the bits used from the hash, to eliminate
          some unfortunate correlations that may possibly have been exploited
          to design future attack models.
      
      These changes were tested under the most extreme conditions, up to
      1.1 million connections per second to one and a few targets, showing no
      performance regression, and only 2 connection failures within 13 billion,
      which is less than 2^-32 and perfectly within usual values.
      
      The series is split into small reviewable changes and was already reviewed
      by Amit and Eric.
      ====================
      
      Link: https://lore.kernel.org/r/20220502084614.24123-1-w@1wt.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: drop the hash_32() part from the index calculation · e8161345
      Willy Tarreau authored
      In commit 190cc824 ("tcp: change source port randomizarion at
      connect() time"), the table_perturb[] array was introduced and an
      index was taken from the port_offset via hash_32(). But it turns
      out that hash_32() performs a multiplication while the input here
      comes from the output of SipHash in secure_seq, that is well
      distributed enough to avoid the need for yet another hash.
      
      Suggested-by: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: increase source port perturb table to 2^16 · 4c2c8f03
      Willy Tarreau authored
      Moshe Kol, Amit Klein, and Yossi Gilad reported being able to accurately
      identify a client by forcing it to emit only 40 times more connections
      than there are entries in the table_perturb[] table. The previous two
      improvements consisting in resalting the secret every 10s and adding
      randomness to each port selection only slightly improved the situation,
      and the current value of 2^8 was too small as it's not very difficult
      to make a client emit 10k connections in less than 10 seconds.
      
      Thus we're increasing the perturb table from 2^8 to 2^16 so that the
      same precision now requires 2.6M connections, which is more difficult in
      this time frame and harder to hide as a background activity. The impact
      is that the table now uses 256 kB instead of 1 kB, which could mostly
      affect devices making frequent outgoing connections. However such
      components usually target a small set of destinations (load balancers,
      database clients, perf assessment tools), and in practice only a few
      entries will be visited, like before.
      
      A live test at 1 million connections per second showed no performance
      difference from the previous value.
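The figures quoted in this message follow from simple arithmetic, reproduced here as a check. The 4-byte entry size is an assumption inferred from the stated 1 kB for 2^8 entries; `table_cost` is a hypothetical helper.

```python
ENTRY_BYTES = 4              # assumption: one 32-bit value per table entry
CONNECTIONS_PER_ENTRY = 40   # "40 times more connections than there are entries"


def table_cost(bits: int) -> tuple:
    """Return (entries, table size in bytes, connections needed by the attack)."""
    entries = 1 << bits
    return entries, entries * ENTRY_BYTES, entries * CONNECTIONS_PER_ENTRY


for bits in (8, 16):
    entries, size_bytes, connections = table_cost(bits)
    print(f"2^{bits}: {entries} entries, {size_bytes // 1024} kB, "
          f"{connections} connections")
# 2^8: 256 entries, 1 kB, 10240 connections
# 2^16: 65536 entries, 256 kB, 2621440 connections
```

The two connection counts correspond to the "10k" and "2.6M" figures in the text.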
      
      Reported-by: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Reported-by: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Reported-by: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: dynamically allocate the perturb table used by source ports · e9261476
      Willy Tarreau authored
      We'll need to further increase the size of this table and it's likely
      that at some point its size will not be suitable anymore for a static
      table. Let's allocate it on boot from inet_hashinfo2_init(), which is
      called from tcp_init().
      
      Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Cc: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: add small random increments to the source port · ca7af040
      Willy Tarreau authored
      Here we're randomly adding between 0 and 7 random increments to the
      selected source port in order to add some noise in the source port
      selection that will make the next port less predictable.
      
      With the default port range of 32768-60999 this means a worst case
      reuse scenario of 14116/8=1764 connections between two consecutive
      uses of the same port, with an average of 14116/4.5=3137. This code
      was stressed at more than 800000 connections per second to a fixed
      target with all connections closed by the client using RSTs (worst
      condition) and only 2 connections failed among 13 billion, despite
      the hash being reseeded every 10 seconds, indicating a perfectly
      safe situation.
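The reuse figures above can be re-derived as follows. Two points are assumptions on my part: that 14116 is half of the 28232 ports in the default range (consistent with the odd/even port preference mentioned in the series cover letter), and that each selection advances by one slot plus a uniform random 0 to 7, giving a maximum step of 8 and a mean of 4.5.

```python
PORT_LOW, PORT_HIGH = 32768, 60999

total_ports = PORT_HIGH - PORT_LOW + 1   # 28232 ports in the default range
candidate_ports = total_ports // 2       # 14116, assuming only one parity is used

MAX_STEP = 1 + 7                          # one slot plus the largest random increment
MEAN_STEP = 1 + sum(range(8)) / 8         # one slot plus the mean of 0..7, i.e. 4.5

worst_case = candidate_ports // MAX_STEP       # connections before reuse, worst case
average = round(candidate_ports / MEAN_STEP)   # connections before reuse, on average

print(total_ports, candidate_ports)  # 28232 14116
print(worst_case, average)           # 1764 3137
```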
      
      Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Cc: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: resalt the secret every 10 seconds · 4dfa9b43
      Eric Dumazet authored
      In order to limit the ability for an observer to recognize the source
      ports sequence used to contact a set of destinations, we should
      periodically shuffle the secret. 10 seconds looks effective enough
      without causing particular issues.
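One way to get a value that changes every 10 seconds is to mix a time bucket into the keyed hash input. The sketch below illustrates that idea only and is not necessarily how the patch implements it: `port_offset` is a hypothetical function, and keyed BLAKE2b from Python's standard library stands in for the kernel's SipHash.

```python
import hashlib
import struct

RESALT_PERIOD_SECONDS = 10


def port_offset(secret: bytes, saddr: int, daddr: int, dport: int,
                now_seconds: float) -> int:
    """Keyed 64-bit hash of the connection tuple and the current 10-second bucket."""
    bucket = int(now_seconds) // RESALT_PERIOD_SECONDS
    # Pack two 32-bit addresses, a 16-bit port, and the 64-bit time bucket.
    message = struct.pack("!IIHQ", saddr, daddr, dport, bucket)
    digest = hashlib.blake2b(message, key=secret, digest_size=8).digest()
    return int.from_bytes(digest, "big")


secret = bytes(range(16))                # fixed demo key; a real secret is random
src, dst = 0x0A000001, 0x0A000002        # 10.0.0.1 -> 10.0.0.2

same_bucket = port_offset(secret, src, dst, 443, 100.0) == \
    port_offset(secret, src, dst, 443, 109.9)
next_bucket = port_offset(secret, src, dst, 443, 100.0) == \
    port_offset(secret, src, dst, 443, 110.0)
print(same_bucket, next_bucket)
```

Within one bucket the same tuple maps to the same value, so port selection stays consistent; once the bucket changes, what an observer learned about the previous sequence no longer applies.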
      
      Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Cc: Amit Klein <aksecurity@gmail.com>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: use different parts of the port_offset for index and offset · 9e9b70ae
      Willy Tarreau authored
      Amit Klein suggests that we use different parts of port_offset for the
      table's index and the port offset so that there is no direct relation
      between them.
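A minimal sketch of the idea, assuming a 64-bit `port_offset` whose low bits select the table entry and whose upper 32 bits supply the offset. Which bits the patch actually uses is an assumption here, and `split_port_offset` is a hypothetical helper.

```python
def split_port_offset(port_offset: int, table_bits: int) -> tuple:
    """Derive the table index and the port offset from disjoint bits of the hash."""
    index = port_offset & ((1 << table_bits) - 1)    # low bits pick the table entry
    offset = (port_offset >> 32) & 0xFFFFFFFF        # high 32 bits give the offset
    return index, offset


index, offset = split_port_offset(0x123456789ABCDEF0, table_bits=16)
print(hex(index), hex(offset))  # 0xdef0 0x12345678
```

Because the two values share no input bits, learning one tells an observer nothing directly about the other, provided the table index needs no more than 32 bits.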
      
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Cc: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • secure_seq: use the 64 bits of the siphash for port offset calculation · b2d05756
      Willy Tarreau authored
      SipHash replaced MD5 in secure_ipv{4,6}_port_ephemeral() via commit
      7cd23e53 ("secure_seq: use SipHash in place of MD5"), but the output
      remained truncated to 32-bit only. In order to exploit more bits from the
      hash, let's make the functions return the full 64-bit of siphash_3u32().
      We also make sure the port offset calculation in __inet_hash_connect()
      remains done on 32-bit to avoid the need for div_u64_rem() and an extra
      cost on 32-bit systems.
      
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Moshe Kol <moshe.kol@mail.huji.ac.il>
      Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il>
      Cc: Amit Klein <aksecurity@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'wireguard-patches-for-5-18-rc6' · 205557ba
      Jakub Kicinski authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard patches for 5.18-rc6
      
      In working on some other problems, I wound up leaning on the WireGuard
      CI more than usual and uncovered a few small issues with reliability.
      These are fairly low key changes, since they don't impact kernel code
      itself.
      
      One change does stick out in particular, though, which is the "make
      routing loop test non-fatal" commit. I'm not thrilled about doing this,
      but currently [1] remains unsolved, and I'm still working on a real
      solution to that (hopefully for 5.19 or 5.20 if I can come up with a
      good idea...), so for now that test just prints a big red warning
      instead.
      
      [1] https://lore.kernel.org/netdev/YmszSXueTxYOC41G@zx2c4.com/
      ====================
      
      Link: https://lore.kernel.org/r/20220504202920.72908-1-Jason@zx2c4.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: set panic_on_warn=1 from cmdline · 3fc1b11e
      Jason A. Donenfeld authored
      Rather than setting this once init is running, set panic_on_warn from
      the kernel command line, so that it catches splats from WireGuard
      initialization code and the various crypto selftests.
      
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: bump package deps · a6b8ea91
      Jason A. Donenfeld authored
      Use newer, more reliable package dependencies. These should hopefully
      reduce flakes. However, we keep the old iputils package, as newer
      versions accumulated bugs that result in flakes on slow machines.
      
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: restore support for ccache · d261ba6a
      Jason A. Donenfeld authored
      When moving to non-system toolchains, we inadvertently killed the
      ability to use ccache. So instead, build ccache support into the test
      harness directly.
      
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: use newer toolchains to fill out architectures · d5d9b29b
      Jason A. Donenfeld authored
      Rather than relying on the system to have cross toolchains available,
      simply download the ones from musl.cc and use that libc.so, and then use
      them to fill in a few missing platforms, such as riscv64, powerpc64,
      and s390x.
      
      Since riscv doesn't have a second serial port in its device description,
      we have to use virtio's vport. This is actually the same situation on
      ARM, but we were previously hacking QEMU up to work around this, which
      required a custom QEMU. Instead just do the vport trick on ARM too.
      
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: limit parallelism to $(nproc) tests at once · 39f02bf1
      Jason A. Donenfeld authored
      The parallel tests were added to catch queueing issues from multiple
      cores. But what happens in reality when testing tons of processes is
      that these separate threads wind up fighting with the scheduler, and we
      wind up with contention in places we don't care about, which decreases
      the chances of hitting a bug. So just do a test with the number of CPU
      cores, rather than trying to scale up arbitrarily.
      
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: make routing loop test non-fatal · ae2de669
      Jason A. Donenfeld authored
      I hate to do this, but I still do not have a good solution to actually
      fix this bug across architectures. So just disable it for now, so that
      the CI can still deliver actionable results. This commit adds a large
      red warning, so that at least the failure isn't lost forever, and
      hopefully this can be revisited down the line.
      
      Link: https://lore.kernel.org/netdev/CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@mail.gmail.com/
      Link: https://lore.kernel.org/netdev/YmszSXueTxYOC41G@zx2c4.com/
      Link: https://lore.kernel.org/wireguard/CAHmME9rNnBiNvBstb7MPwK-7AmAN0sOfnhdR=eeLrowWcKxaaQ@mail.gmail.com/
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. 04 May, 2022 14 commits