1. 29 Nov, 2021 35 commits
  2. 27 Nov, 2021 5 commits
    • Merge branch 'af_unix-replace-unix_table_lock-with-per-hash-locks' · d40ce48c
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      af_unix: Replace unix_table_lock with per-hash locks.
      
      The hash table of AF_UNIX sockets is protected by a single big lock,
      unix_table_lock.  This series replaces it with small per-hash locks.
      
      1st -  2nd : Misc refactoring
      3rd -  8th : Separate BSD/abstract address logics
      9th - 11th : Prep to save a hash in each socket
      12th       : Replace the big lock
      13th       : Speed up autobind()
      
      Note to maintainers:
      The 12th patch adds two kinds of Sparse warnings on patchwork:
      
        about unix_table_double_lock/unlock()
          We can avoid this by adding two apparent acquires/releases annotations,
          but there are the same kinds of warnings about unix_state_double_lock().
      
        about unix_next_socket() and unix_seq_stop() (/proc/net/unix)
          This is because Sparse does not understand logic in unix_next_socket(),
          which leaves a spin lock held until it returns NULL.
          Also, tcp_seq_stop() causes a warning for the same reason.
      
      These warnings seem reasonable, but let me know if there is any better way.
      Please see [0] for details.
      
      [0]: https://lore.kernel.org/netdev/20211117001611.74123-1-kuniyu@amazon.co.jp/
      ====================
      
      Link: https://lore.kernel.org/r/20211124021431.48956-1-kuniyu@amazon.co.jp
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Relax race in unix_autobind(). · 9acbc584
      Kuniyuki Iwashima authored
      When we bind an AF_UNIX socket without a name specified, the kernel selects
      an available one from 0x00000 to 0xFFFFF.  unix_autobind() starts searching
      from a number in the 'static' variable and increments it after acquiring
      two locks.
      
      If multiple processes autobind at the same time, they contend for the same
      lock and check whether a socket in the hash list already has that name.  If
      not, one process takes the name, and all the others end up retrying with the
      _next_ number (or a later one, since other processes may have incremented
      the variable in the meantime).  The more sockets we autobind in parallel,
      the longer the latency gets.  We can avoid such a race by starting the
      search from a random number.
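
The search described above can be sketched in user space as follows. This is a minimal illustration, not the kernel code: `struct name_space`, `name_in_use()`, `name_claim()`, `autobind_from()` and `autobind_random()` are hypothetical names, a bitmap stands in for the socket hash table, and `rand()` stands in for whatever random source the kernel uses. In concurrent code the check-and-claim step would have to run under the appropriate lock.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NAME_MASK 0xFFFFFu            /* names run from 0x00000 to 0xFFFFF */

/* Hypothetical stand-in for the socket hash table: one bit per name. */
struct name_space {
    uint8_t used[(NAME_MASK + 1u) / 8u];
};

bool name_in_use(const struct name_space *ns, uint32_t num)
{
    return (ns->used[num / 8u] >> (num % 8u)) & 1u;
}

void name_claim(struct name_space *ns, uint32_t num)
{
    ns->used[num / 8u] |= (uint8_t)(1u << (num % 8u));
}

/*
 * Try every name once, beginning with the one just after `start`.
 * On success, claim the name, write it to `out` as five hex digits
 * and return 0.  Return -1 if all names are taken.
 */
int autobind_from(struct name_space *ns, uint32_t start, char out[6])
{
    uint32_t last = start & NAME_MASK;
    uint32_t num = last;

    do {
        num = (num + 1u) & NAME_MASK;
        if (!name_in_use(ns, num)) {
            name_claim(ns, num);
            snprintf(out, 6, "%05x", (unsigned int)num);
            return 0;
        }
    } while (num != last);

    return -1;
}

/* Each caller picks its own starting point, so parallel callers
 * rarely probe the same name first. */
int autobind_random(struct name_space *ns, char out[6])
{
    return autobind_from(ns, (uint32_t)rand(), out);
}
```

With an empty `struct name_space`, `autobind_from(&ns, 0x00010, name)` yields "00011", and a second call with the same start yields "00012". That second probe is the retry a shared sequential counter forces on every concurrent caller.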
      
      The histograms below show latency in unix_autobind() while 64 CPUs each
      autobind 1024 sockets simultaneously.
      
        Without this patch:
      
           usec          : count     distribution
              0          : 1176     |***                                     |
              2          : 3655     |***********                             |
              4          : 4094     |*************                           |
              6          : 3831     |************                            |
              8          : 3829     |************                            |
              10         : 3844     |************                            |
              12         : 3638     |***********                             |
              14         : 2992     |*********                               |
              16         : 2485     |*******                                 |
              18         : 2230     |*******                                 |
              20         : 2095     |******                                  |
              22         : 1853     |*****                                   |
              24         : 1827     |*****                                   |
              26         : 1677     |*****                                   |
              28         : 1473     |****                                    |
              30         : 1573     |*****                                   |
              32         : 1417     |****                                    |
              34         : 1385     |****                                    |
              36         : 1345     |****                                    |
              38         : 1344     |****                                    |
              40         : 1200     |***                                     |
      
        With this patch:
      
           usec          : count     distribution
              0          : 1855     |******                                  |
              2          : 6464     |*********************                   |
              4          : 9936     |********************************        |
              6          : 12107    |****************************************|
              8          : 10441    |**********************************      |
              10         : 7264     |***********************                 |
              12         : 4254     |**************                          |
              14         : 2538     |********                                |
              16         : 1596     |*****                                   |
              18         : 1088     |***                                     |
              20         : 800      |**                                      |
              22         : 670      |**                                      |
              24         : 601      |*                                       |
              26         : 562      |*                                       |
              28         : 525      |*                                       |
              30         : 446      |*                                       |
              32         : 378      |*                                       |
              34         : 337      |*                                       |
              36         : 317      |*                                       |
              38         : 314      |*                                       |
              40         : 298      |                                        |
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Replace the big lock with small locks. · afd20b92
      Kuniyuki Iwashima authored
      The hash table of AF_UNIX sockets is protected by a single lock,
      unix_table_lock.  This patch replaces it with per-hash locks.
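
As a rough user-space illustration of the idea (not the kernel implementation), the sketch below gives each hash bucket its own spin lock. A node that moves between buckets takes both locks in ascending index order so that two movers cannot deadlock. All names here (`struct table`, `table_insert()`, `table_double_lock()`, `table_move()`) and the bucket count are assumptions made for the example, and C11 `atomic_flag` stands in for the kernel's spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>

#define TABLE_SIZE 256u   /* assumed bucket count, for illustration only */

struct node {
    struct node *next;
    unsigned int hash;    /* bucket index, saved when the node is linked */
};

struct bucket {
    atomic_flag lock;     /* one small lock per bucket */
    struct node *head;
};

struct table {
    struct bucket buckets[TABLE_SIZE];
};

void table_init(struct table *t)
{
    for (unsigned int i = 0; i < TABLE_SIZE; i++) {
        atomic_flag_clear(&t->buckets[i].lock);
        t->buckets[i].head = NULL;
    }
}

static void bucket_lock(struct bucket *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void bucket_unlock(struct bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Only the target bucket is locked; other buckets stay available. */
void table_insert(struct table *t, struct node *n, unsigned int hash)
{
    struct bucket *b = &t->buckets[hash % TABLE_SIZE];

    bucket_lock(b);
    n->hash = hash % TABLE_SIZE;
    n->next = b->head;
    b->head = n;
    bucket_unlock(b);
}

/* Lock two buckets in ascending index order; lock once if they match. */
void table_double_lock(struct table *t, unsigned int h1, unsigned int h2)
{
    if (h1 > h2) {
        unsigned int tmp = h1;
        h1 = h2;
        h2 = tmp;
    }
    bucket_lock(&t->buckets[h1]);
    if (h1 != h2)
        bucket_lock(&t->buckets[h2]);
}

void table_double_unlock(struct table *t, unsigned int h1, unsigned int h2)
{
    bucket_unlock(&t->buckets[h1]);
    if (h1 != h2)
        bucket_unlock(&t->buckets[h2]);
}

/* Move a linked node to another bucket.  Assumes the caller already
 * serializes moves of the same node, so n->hash is stable here. */
void table_move(struct table *t, struct node *n, unsigned int new_hash)
{
    unsigned int old_hash = n->hash;
    struct node **pp;

    new_hash %= TABLE_SIZE;
    table_double_lock(t, old_hash, new_hash);
    for (pp = &t->buckets[old_hash].head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;   /* unlink from the old bucket */
            break;
        }
    }
    n->hash = new_hash;
    n->next = t->buckets[new_hash].head;
    t->buckets[new_hash].head = n;
    table_double_unlock(t, old_hash, new_hash);
}
```

Inserts into different buckets no longer wait on each other, which is the kind of contention the measurements in this message are about.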
      
      The effect is noticeable when we handle multiple sockets simultaneously.
      Here is a test result on an EC2 c5.24xlarge instance.  It shows latency
      (under 10us only) in unix_insert_unbound_socket() while 64 CPUs each
      create 1024 sockets in parallel.
      
        Without this patch:
      
           nsec          : count     distribution
              0          : 179      |                                        |
              500        : 3021     |*********                               |
              1000       : 6271     |*******************                     |
              1500       : 6318     |*******************                     |
              2000       : 5828     |*****************                       |
              2500       : 5124     |***************                         |
              3000       : 4426     |*************                           |
              3500       : 3672     |***********                             |
              4000       : 3138     |*********                               |
              4500       : 2811     |********                                |
              5000       : 2384     |*******                                 |
              5500       : 2023     |******                                  |
              6000       : 1954     |*****                                   |
              6500       : 1737     |*****                                   |
              7000       : 1749     |*****                                   |
              7500       : 1520     |****                                    |
              8000       : 1469     |****                                    |
              8500       : 1394     |****                                    |
              9000       : 1232     |***                                     |
              9500       : 1138     |***                                     |
              10000      : 994      |***                                     |
      
        With this patch:
      
           nsec          : count     distribution
              0          : 1634     |****                                    |
              500        : 13170    |****************************************|
              1000       : 13156    |*************************************** |
              1500       : 9010     |***************************             |
              2000       : 6363     |*******************                     |
              2500       : 4443     |*************                           |
              3000       : 3240     |*********                               |
              3500       : 2549     |*******                                 |
              4000       : 1872     |*****                                   |
              4500       : 1504     |****                                    |
              5000       : 1247     |***                                     |
              5500       : 1035     |***                                     |
              6000       : 889      |**                                      |
              6500       : 744      |**                                      |
              7000       : 634      |*                                       |
              7500       : 498      |*                                       |
              8000       : 433      |*                                       |
              8500       : 355      |*                                       |
              9000       : 336      |*                                       |
              9500       : 284      |                                        |
              10000      : 243      |                                        |
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Save hash in sk_hash. · e6b4b873
      Kuniyuki Iwashima authored
      To replace unix_table_lock with per-hash locks in the next patch, we need
      to save a hash in each socket because /proc/net/unix and BPF programs
      iterate over sockets while holding a hash table lock and release it later
      in a different function.
      
      Currently, we store a real/pseudo hash in struct unix_address.  However, we
      do not allocate that struct for unbound sockets, nor should we allocate it
      just to hold a hash.  For this purpose, we can use sk_hash.  Then, we no
      longer use the hash field in struct unix_address and can remove it.
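
To illustrate why the hash has to live in the socket itself, the sketch below models an iterator that returns with a bucket lock held and a separate stop function that must later release it. Everything here is an assumption for the example: `struct sock_like`, `iter_start()`, `iter_next()`, `iter_stop()` and `table_add()` are hypothetical names, and a plain `bool` stands in for a real lock, so the code only shows which lock is held at each step and is not thread safe.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 4u               /* tiny table, for illustration only */

struct sock_like {
    struct sock_like *next;
    unsigned int hash;            /* saved bucket index (the sk_hash role) */
};

struct bucket {
    bool locked;                  /* stand-in for a per-bucket spin lock */
    struct sock_like *head;
};

static struct bucket table[NBUCKETS];

static void lock_bucket(unsigned int i)   { table[i].locked = true; }
static void unlock_bucket(unsigned int i) { table[i].locked = false; }

void table_add(struct sock_like *s, unsigned int hash)
{
    s->hash = hash % NBUCKETS;
    s->next = table[s->hash].head;
    table[s->hash].head = s;
}

/* Return the first socket at or after bucket i, leaving that bucket
 * locked.  Return NULL with no lock held if none is found. */
static struct sock_like *first_from(unsigned int i)
{
    for (; i < NBUCKETS; i++) {
        lock_bucket(i);
        if (table[i].head)
            return table[i].head;
        unlock_bucket(i);
    }
    return NULL;
}

struct sock_like *iter_start(void)
{
    return first_from(0);
}

/* Advance; the lock of the bucket holding the result stays held. */
struct sock_like *iter_next(struct sock_like *cur)
{
    unsigned int i = cur->hash;

    if (cur->next)
        return cur->next;         /* same bucket, same lock */
    unlock_bucket(i);
    return first_from(i + 1);
}

/* Called from a different place than iter_next(): the only way to
 * know which lock to drop is the hash saved in the socket. */
void iter_stop(struct sock_like *cur)
{
    if (cur)
        unlock_bucket(cur->hash);
}
```

With a single global lock, `iter_stop()` would not need to know anything about the current socket. With per-bucket locks it does, and an unbound socket has no address structure to carry that information.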
      
      This patch also
        - renames unix_insert_socket() to unix_insert_unbound_socket()
        - removes the redundant list argument from __unix_insert_socket() and
           unix_insert_unbound_socket()
        - uses 'unsigned int' instead of 'unsigned' in __unix_set_addr_hash()
        - removes 'inline' from unix_remove_socket() and
           unix_insert_unbound_socket().
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Add helpers to calculate hashes. · f452be49
      Kuniyuki Iwashima authored
      This patch adds three helper functions that calculate hashes for unbound
      sockets and bound sockets with BSD/abstract addresses.
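
A minimal sketch of what three such helpers might look like is given below. It is not the code added by this patch. The function names, the table size, the choice to place unbound sockets in a separate upper range, and the use of FNV-1a for abstract names are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_SIZE 256u   /* assumed size of one table half, illustration only */

/* Unbound socket: no name exists yet, so mix the object's address.
 * This sketch maps the result into [HASH_SIZE, 2 * HASH_SIZE). */
unsigned int unbound_hash(const void *sk, unsigned int type)
{
    uintptr_t h = (uintptr_t)sk;

    h ^= h >> 16;
    h ^= h >> 8;
    h ^= type;
    return HASH_SIZE + (unsigned int)(h & (HASH_SIZE - 1u));
}

/* BSD (pathname) address: derive the bucket from the inode number. */
unsigned int bsd_hash(uint64_t inode_number)
{
    return (unsigned int)(inode_number & (HASH_SIZE - 1u));
}

/* Abstract address: hash the name bytes.  FNV-1a is used here only
 * as a simple, well-known stand-in hash. */
unsigned int abstract_hash(const char *name, size_t len, unsigned int type)
{
    uint32_t h = 2166136261u;             /* FNV-1a offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)name[i];
        h *= 16777619u;                   /* FNV-1a prime */
    }
    h ^= type;
    return (unsigned int)(h & (HASH_SIZE - 1u));
}
```

Keeping the three cases in separate helpers lets each caller compute the bucket index up front, before taking any lock.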
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>