1. 10 Sep, 2024 1 commit
  2. 09 Sep, 2024 6 commits
  3. 06 Sep, 2024 5 commits
  4. 05 Sep, 2024 26 commits
  5. 04 Sep, 2024 2 commits
    • selftests/bpf: Add a selftest for x86 jit convergence issues · eff5b5ff
      Yonghong Song authored
      The core part of the selftest, i.e., the je <-> jmp cycle, mimics the
      original sched-ext bpf program. The test will fail without the
      previous patch.
      
      I tried to create test cases for the other potential cycles
      (je <-> je, jmp <-> je and jmp <-> jmp) with a pattern similar
      to the test in this patch, but failed. So this patch
      only contains one test, for the je <-> jmp cycle.
      
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/r/20240904221256.37389-1-yonghong.song@linux.dev
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf, x64: Fix a jit convergence issue · c8831bdb
      Yonghong Song authored
      Daniel Hodges reported a jit error when playing with a sched-ext program.
      The error message is:
        unexpected jmp_cond padding: -4 bytes
      
      But further investigation shows the error is actually due to failed
      convergence. The following is some analysis:
      
        ...
        pass4, final_proglen=4391:
          ...
          20e:    48 85 ff                test   rdi,rdi
          211:    74 7d                   je     0x290
          213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
          ...
          289:    48 85 ff                test   rdi,rdi
          28c:    74 17                   je     0x2a5
          28e:    e9 7f ff ff ff          jmp    0x212
          293:    bf 03 00 00 00          mov    edi,0x3
      
      Note that the insn at 0x211 is a 2-byte cond jump insn with offset 0x7d (125)
      and the insn at 0x28e is a 5-byte jmp insn with offset -129.
      
        pass5, final_proglen=4392:
          ...
          20e:    48 85 ff                test   rdi,rdi
          211:    0f 84 80 00 00 00       je     0x297
          217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
          ...
          28d:    48 85 ff                test   rdi,rdi
          290:    74 1a                   je     0x2ac
          292:    eb 84                   jmp    0x218
          294:    bf 03 00 00 00          mov    edi,0x3
      
      Note that the insn at 0x211 is now a 6-byte cond jump insn since its offset
      becomes 0x80 based on the previous round (0x293 - 0x213 = 0x80). At the same
      time, the insn at 0x292 is a 2-byte insn since its offset is -124.
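
The displacements quoted for pass4 and pass5 can be checked by hand. The sketch below decodes the rel8 and rel32 bytes shown in the listings and recomputes each jump target as "address of the next insn plus displacement". The function names are made up for this illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* rel8: a single signed byte follows the opcode (74 = je, eb = jmp). */
static int decode_rel8(uint8_t b)
{
    return (int8_t)b;
}

/* rel32: a little-endian signed 32-bit displacement follows the opcode. */
static int32_t decode_rel32(const uint8_t b[4])
{
    uint32_t v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                 ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return (int32_t)v;
}

/* Jump target = address of the insn after the jump + displacement. */
static uint32_t jump_target(uint32_t insn_addr, int insn_len, int32_t disp)
{
    return insn_addr + (uint32_t)insn_len + (uint32_t)disp;
}
```

For example, `74 7d` at 0x211 decodes to +125 and lands on 0x290, while `e9 7f ff ff ff` at 0x28e decodes to -129 and lands on 0x212.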
      
      pass6 will repeat the same code as in pass4. pass7 will repeat the same
      code as in pass5, and so on. This will prevent eventual convergence.
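
The oscillation can be reproduced with a toy model. This is a sketch under stated assumptions, not the kernel's JIT: the program has only three instructions (a forward je, a fixed 123-byte body, and a backward jmp), every instruction is initially estimated at 64 bytes, and "converged" means two consecutive passes produce the same length. As in the analysis above, the je offset is computed from previous-pass addresses only, while the jmp uses its target's current-pass address and its own previous-pass end address. The name `jit_toy_converge` and the parameter `imm8_max` are hypothetical.

```c
#define BODY_SIZE  123  /* bytes between the two jumps (assumed) */
#define MAX_PASSES 20

/*
 * Returns the pass number at which the program length stopped changing,
 * or 0 if it never did within MAX_PASSES. imm8_max is the largest
 * offset that is still encoded as a 2-byte (8-bit) jump.
 */
static int jit_toy_converge(int imm8_max)
{
    /* addrs[i] = end offset of insn i; pessimistic 64-byte estimate. */
    int addrs[4] = { 0, 64, 128, 192 };
    int oldlen = -1;

    for (int pass = 1; pass <= MAX_PASSES; pass++) {
        int len = 0;
        int off;

        /* insn 1: je to the insn after the jmp (previous-pass addresses). */
        off = addrs[3] - addrs[1];
        len += (off >= -128 && off <= imm8_max) ? 2 : 6;
        addrs[1] = len;

        /* insn 2: fixed-size body. */
        len += BODY_SIZE;
        addrs[2] = len;

        /* insn 3: jmp back to the insn after the je. The target address
         * was updated in this pass; the jmp's own end is from the last. */
        off = addrs[1] - addrs[3];
        len += (off >= -128 && off <= imm8_max) ? 2 : 5;
        addrs[3] = len;

        if (len == oldlen)
            return pass;
        oldlen = len;
    }
    return 0;
}
```

With `imm8_max = 127` the length alternates between 131 and 130 bytes and the offsets cycle through 125/-129 and 128/-124, the same values seen in pass4 and pass5. With `imm8_max = 123` the model settles on its third pass.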
      
      Passes 1-14 run with padding = 0. At pass15, padding is 1 and the related
      insns look like:
      
          211:    0f 84 80 00 00 00       je     0x297
          217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
          ...
          24d:    48 85 d2                test   rdx,rdx
      
      The similar code in pass14:
          211:    74 7d                   je     0x290
          213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
          ...
          249:    48 85 d2                test   rdx,rdx
          24c:    74 21                   je     0x26f
          24e:    48 01 f7                add    rdi,rsi
          ...
      
      Before generating the following insn,
        250:    74 21                   je     0x273
      "padding = 1" enables some checking to ensure nops is either 0 or 4
      where
        #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
        nops = INSN_SZ_DIFF - 2
      
      In this specific case,
        addrs[i] = 0x24e // from pass14
        addrs[i-1] = 0x24d // from pass15
        prog - temp = 3 // from 'test rdx,rdx' in pass15
      so
        nops = -4
      and this triggers the failure.
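
The arithmetic above can be replayed in isolation. The sketch below passes the three quantities in explicitly instead of reading them from the JIT state, so the function names and signatures are illustrative only; the formula itself is the INSN_SZ_DIFF macro quoted above.

```c
#include <stdbool.h>

/* (addrs[i] - addrs[i - 1]) - (prog - temp), with explicit arguments. */
static int insn_sz_diff(int addr_cur, int addr_prev, int emitted)
{
    return (addr_cur - addr_prev) - emitted;
}

/* nops for a conditional jump: INSN_SZ_DIFF - 2. */
static int jmp_cond_nops(int addr_cur, int addr_prev, int emitted)
{
    return insn_sz_diff(addr_cur, addr_prev, emitted) - 2;
}

/* With padding enabled, only 0 or 4 bytes of nops are acceptable. */
static bool jmp_cond_nops_valid(int nops)
{
    return nops == 0 || nops == 4;
}
```

Plugging in addrs[i] = 0x24e, addrs[i-1] = 0x24d and prog - temp = 3 gives (1 - 3) - 2 = -4, which is neither 0 nor 4.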
      
      To fix the issue, we need to break cycles of je <-> jmp. For example,
      in the above case, we have
        211:    74 7d                   je     0x290
      the offset is 0x7d. If a 2-byte je insn is generated only when
      the offset is less than 0x7d (<= 0x7c), the cycle can be
      broken and we can achieve convergence.
      
      I also studied other cases that may cause cycles: je <-> je,
      jmp <-> je and jmp <-> jmp. Those cases are not from actual
      reproducible cases since it is pretty hard to construct a test case
      for them. The results show that limiting the offset to <= 0x7b (0x7b = 123)
      should be enough to cover all cases. This patch adds a new helper to generate
      8-bit cond/uncond jmp insns only if the offset is in the range [-128, 123].
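
A minimal sketch of such a range check follows. The name `fits_imm8_jmp` is hypothetical; only the range [-128, 123] comes from the description above.

```c
#include <stdbool.h>

/*
 * True if a jump offset may be encoded as an 8-bit displacement.
 * The upper bound is 123 rather than 127, so offsets 124..127 are
 * pushed to the longer encoding to break size-oscillation cycles.
 */
static bool fits_imm8_jmp(int offset)
{
    return offset >= -128 && offset <= 123;
}
```

Under this check the offset 0x7d (125) from pass4 no longer qualifies for the 2-byte je, so the insn at 0x211 stays at 6 bytes in every pass.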
      
      Reported-by: Daniel Hodges <hodgesd@meta.com>
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>