Non-MD5 Etag returned for objects #479

Open
opened 2025-12-28 17:37:32 +00:00 by sami · 2 comments
Owner

Originally created by @roman-khimov on GitHub (Jan 10, 2025).

Current Behavior

Known and documented: we return a SHA256 hash in Etag.

Expected Behavior

Applications expect MD5 in some cases. See https://github.com/nspcc-dev/neofs-s3-gw/pull/1030#issuecomment-2540725432

Possible Solution

Probably this can be conditional. AWS docs mention that MD5 is used in a limited number of cases, and supporting just these cases can be OK for us. We can also consider calculating and storing MD5; search is pretty much the same for the payload hash and for some S3-specific attribute. Yeah, MD5 sucks, but who cares when we can't provide compatibility with real applications.

Steps to Reproduce

Run Nexus3 against S3 gateway.

Context

https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html

Regression

No.

Your Environment

  • Version of the product used: 0.34.0
Author
Owner

@roman-khimov commented on GitHub (Jan 10, 2025):

See also https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html

Author
Owner

@roman-khimov commented on GitHub (May 25, 2025):

The key problem is that Etag should be consistent: it's not sufficient to calculate MD5, return it and forget it. The next GET should return the same Etag, and a request can carry an Etag as well for us to compare against. So it needs to be stored somewhere. But it's a hash, so we can't store it in an attribute for split NeoFS objects; it needs to be a proper NeoFS-level hash (like https://github.com/nspcc-dev/neofs-api/blob/a5a1f32630537511f8faf07de9b8529915475622/object/types.proto#L161), and this means NeoFS modifications. The other option is to store it in an additional object, but this obviously affects performance.
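
The consistency requirement can be sketched roughly as follows. This assumes an in-memory map as a stand-in for whatever storage ends up holding the value (a NeoFS-level hash or an additional object); `etagStore`, `put` and `matches` are hypothetical names for illustration, not gateway APIs, and the comparison handles only a single quoted or unquoted value, not `*` or lists.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// etagStore is a hypothetical stand-in for persistent Etag storage,
// keyed by object name.
type etagStore map[string]string

// put computes the MD5 of the payload once and stores it, so that
// later requests see exactly the same Etag.
func (s etagStore) put(key string, payload []byte) string {
	sum := md5.Sum(payload)
	etag := hex.EncodeToString(sum[:])
	s[key] = etag
	return etag
}

// matches compares a client-supplied Etag (e.g. from an If-Match
// header) with the stored one. Surrounding quotes are stripped.
func (s etagStore) matches(key, clientEtag string) bool {
	stored, ok := s[key]
	if !ok {
		return false
	}
	return strings.Trim(clientEtag, `"`) == stored
}

func main() {
	store := etagStore{}
	etag := store.put("obj", []byte("hello"))
	fmt.Println("stored Etag:", etag)
	fmt.Println("same on next GET:", store["obj"] == etag)
	fmt.Println("If-Match ok:", store.matches("obj", `"`+etag+`"`))
}
```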

Reference
nspcc-dev/neofs-s3-gw#479