    Fix LFS not working with S3 specific-storage settings · da419833
    Stan Hu authored
    https://gitlab.com/gitlab-org/gitlab/-/merge_requests/48269 enabled LFS
    clients to use chunked transfers via `Transfer-Encoding: chunked`.
    
    However, in some cases, this breaks uploads to AWS S3 if
    specific-storage settings are used.  We were able to reproduce this
    problem with Vagrant, but not with VMs on AWS or GCP.
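    For reference, the two configuration forms look roughly like this in
    `/etc/gitlab/gitlab.rb` (bucket name, region, and credentials below are
    illustrative placeholders):

    ```ruby
    # Specific-storage (legacy) form: each object type is configured on its
    # own, and direct uploads rely on pre-signed URLs.
    gitlab_rails['lfs_object_store_enabled'] = true
    gitlab_rails['lfs_object_store_direct_upload'] = true
    gitlab_rails['lfs_object_store_remote_directory'] = 'lfs-objects'
    gitlab_rails['lfs_object_store_connection'] = {
      'provider' => 'AWS',
      'region' => 'us-east-1',
      'aws_access_key_id' => 'AKIA...',
      'aws_secret_access_key' => 'SECRET'
    }

    # Consolidated form: one shared connection for all object types;
    # Workhorse uploads through the AWS SDK itself, so no pre-signed URLs
    # are needed.
    gitlab_rails['object_store']['enabled'] = true
    gitlab_rails['object_store']['connection'] = {
      'provider' => 'AWS',
      'region' => 'us-east-1',
      'aws_access_key_id' => 'AKIA...',
      'aws_secret_access_key' => 'SECRET'
    }
    gitlab_rails['object_store']['objects']['lfs']['bucket'] = 'lfs-objects'
    ```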
    
    When direct uploads are used, GitLab will only generate pre-signed,
    multipart URLs if the client sends the `Content-Length` header.
    Previously, when chunked transfers were not supported, this header was
    always present, so its availability was hard-coded into the LFS storage
    controller.
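
    The size dependency is easy to see outside GitLab. Here is a minimal
    sketch with the `aws-sdk-s3` gem (GitLab itself goes through
    fog/CarrierWave, and `presigned_multipart_urls` is a made-up name):
    the number of parts, and therefore the set of pre-signed part URLs,
    cannot be computed without the total length.

    ```ruby
    require 'aws-sdk-s3'

    # Hypothetical sketch, not GitLab's code: pre-sign one URL per 5 MB part.
    def presigned_multipart_urls(bucket:, key:, content_length:,
                                 part_size: 5 * 1024 * 1024)
      client = Aws::S3::Client.new
      upload = client.create_multipart_upload(bucket: bucket, key: key)
      presigner = Aws::S3::Presigner.new(client: client)

      # Without content_length there is no way to know how many parts to
      # pre-sign, which is why the controller needed Content-Length.
      part_count = (content_length.to_f / part_size).ceil
      (1..part_count).map do |part_number|
        presigner.presigned_url(:upload_part,
                                bucket: bucket,
                                key: key,
                                upload_id: upload.upload_id,
                                part_number: part_number,
                                expires_in: 3600)
      end
    end
    ```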
    
    When this header is not available, the Workhorse `BodyUploader`
    attempts to transfer the file with an S3 PutObject API call using
    `Transfer-Encoding: chunked`. S3 rejects this with a `501 Not
    Implemented` error because it does not support that header; S3 has its
    own mechanism for chunked transfers. Note that Google Cloud Storage
    accepts such requests without problems.
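
    The failure can be reproduced outside Workhorse with plain Ruby.
    Assuming `PRESIGNED_PUT_URL` holds an S3 pre-signed PutObject URL (a
    stand-in for what the Rails app hands to Workhorse), Net::HTTP streams
    the body with `Transfer-Encoding: chunked` and S3 answers 501:

    ```ruby
    require 'net/http'
    require 'uri'

    # Hypothetical reproduction, not Workhorse code: stream a file without
    # a Content-Length header by forcing chunked transfer encoding.
    uri = URI(ENV.fetch('PRESIGNED_PUT_URL'))
    request = Net::HTTP::Put.new(uri)
    request['Transfer-Encoding'] = 'chunked' # no Content-Length is set
    request.body_stream = File.open('large.bin')

    response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
      http.request(request)
    end
    puts response.code # "501" from S3; GCS accepts the same request
    ```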
    
    Now that `Content-Length` is not always available, we have a few
    options:
    
    1. Disable LFS chunked transfers.
    2. Re-enable request buffering in NGINX.
    3. Modify Workhorse to tell us whether `Content-Length` was sent.
    4. Be pessimistic: always generate multipart URLs and use the size of
    the file as the maximum length.
    
    Option 1 is not optimal because we still want to support LFS chunked
    transfers, especially for GitLab.com where Cloudflare will reject files
    over 5 GB.
    
    Option 2 is not desirable either because it causes NGINX to store a
    large temporary file and delays the transfer to object storage.
    
    Option 3 is slightly preferable, but it requires modifying Workhorse as
    well. We should consider it in a follow-up issue.
    
    To fix the immediate problem, we implement option 4. Note that using
    consolidated object storage settings avoids this problem because
    Workhorse handles the upload natively with the AWS SDK and doesn't need
    presigned URLs.
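
    A sketch of the pessimistic approach, reusing the hypothetical
    `presigned_multipart_urls` helper from above (the helper and method
    names are made up; the real change lives in the LFS storage
    controller): the LFS batch request always carries each object's size,
    so that size can serve as the upper bound even when the upload itself
    arrives chunked.

    ```ruby
    # Hypothetical sketch of option 4, not GitLab's actual code.
    def upload_authorization(lfs_object)
      {
        # Use the size from the LFS batch request as the maximum length
        # instead of requiring Content-Length on the upload request.
        multipart_urls: presigned_multipart_urls(
          bucket: 'lfs-objects',
          key: lfs_object.storage_key,
          content_length: lfs_object.size
        )
      }
    end
    ```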
    
    Relates to https://gitlab.com/gitlab-org/gitlab-workhorse/-/issues/292