1. 30 Jul, 2020 1 commit
    • Respect sidekiq timeout when hard-killing workers · d31730c3
      Craig Miskell authored
      As discovered in
      https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10930,
      the 5-second timeout can be too short because, during normal shutdowns,
      getppid returns "1" sooner than expected. But even in a "real" failure
      case where the sidekiq-cluster process is terminated hard, we still need
      to respect the sidekiq timeout so that sidekiq can wait for running jobs
      to complete (or terminate them and push them back into the queue) before
      being killed off. Otherwise we end up with orphaned jobs
      that are only picked up by the reliable fetcher cleanup, up to an hour
      later.
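
The behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `hard_kill_wait`, `terminate_worker`, the injected `signal`/`alive`/`sleeper` callables, and the 5-second grace margin are all hypothetical names and values chosen for the example. The idea is to send TERM first and escalate to KILL only after the sidekiq timeout (plus a margin) has elapsed.

```ruby
# Hypothetical sketch; names and values are illustrative assumptions,
# not the actual implementation from the commit.
HARD_KILL_GRACE_SECONDS = 5 # assumed extra margin on top of the sidekiq timeout

# Seconds to wait after TERM before escalating to KILL.
def hard_kill_wait(sidekiq_timeout, grace: HARD_KILL_GRACE_SECONDS)
  sidekiq_timeout + grace
end

# Ask a worker to stop, and hard-kill it only if it outlives the deadline.
# `signal`, `alive` and `sleeper` are injected so the logic can be exercised
# without real processes.
def terminate_worker(pid, sidekiq_timeout:, signal:, alive:,
                     sleeper: ->(seconds) { sleep(seconds) }, poll: 1)
  signal.call(:TERM, pid) # lets sidekiq finish or requeue running jobs
  deadline = hard_kill_wait(sidekiq_timeout)
  waited = 0

  while alive.call(pid)
    if waited >= deadline
      signal.call(:KILL, pid) # last resort once the timeout has passed
      return :killed
    end
    sleeper.call(poll)
    waited += poll
  end

  :exited
end
```

With real processes, `signal` could be `->(sig, pid) { Process.kill(sig, pid) }`, and the trigger for calling `terminate_worker` could be a check such as `Process.ppid == 1`, which on a typical Linux system suggests the parent has gone away. The timeout value passed in is whatever the sidekiq process was configured with.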
  2. 29 Jul, 2020 39 commits