HADOOP-16319. S3A Etag tests fail with default encryption enabled on bucket. Contributed by Ben Roling.
ETag values are unpredictable with some S3 encryption algorithms.
Skip ITestS3AMiscOperations tests which make assertions about etags when default encryption on a bucket is enabled.
When testing with an AWS account which lacks the privilege for a call to getBucketEncryption(), we don't skip the tests. In the event of a failure, developers get to expand the permissions of the account or relax the default encryption settings.
HADOOP-16823. Large DeleteObject requests are their own Thundering Herd. Contributed by Steve Loughran.
During S3A rename() and delete() calls, the list of objects to delete is built up into batches of a thousand and then POSTed in a single large DeleteObjects request.
But as the IO capacity allowed on an S3 partition may be only 3500 writes per second, *and* each entry in that POST counts as a single write, one of those POSTs alone can trigger throttling on an already loaded S3 directory tree. That throttling can trigger backoff and retry with the same thousand-entry POST, recreating the exact same problem.
* Page size for DeleteObjects requests is set in fs.s3a.bulk.delete.page.size; the default is 250.
* The property fs.s3a.experimental.aws.s3.throttling (default=true) can be set to false to disable throttle retry logic in the AWS client SDK; it is then all handled in the S3A client. This gives more visibility into when operations are being throttled.
* Bulk delete throttling events are logged to the org.apache.hadoop.fs.s3a.throttled log at INFO; if this appears often, choose a smaller page size.
* The metric "store_io_throttled" adds the entire count of delete requests when a single DeleteObjects request is throttled.
* A new quantile, "store_io_throttle_rate", can track throttling load over time.
* DynamoDB metastore throttle resilience issues have also been identified and fixed. Note: the fs.s3a.experimental.aws.s3.throttling flag does not apply to DDB IO, precisely because there may still be lurking issues there and it is safest to rely on the DynamoDB client SDK.
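As a sketch of how the two tuning properties above might be set in core-site.xml (the property names come from this change; the values shown are illustrative choices, not recommendations):

```xml
<configuration>
  <!-- Shrink the DeleteObjects page size below the default of 250;
       illustrative value, useful if the throttled log appears often. -->
  <property>
    <name>fs.s3a.bulk.delete.page.size</name>
    <value>100</value>
  </property>

  <!-- Disable throttle retry logic in the AWS client SDK so that
       throttling is handled, and made visible, in the S3A client. -->
  <property>
    <name>fs.s3a.experimental.aws.s3.throttling</name>
    <value>false</value>
  </property>
</configuration>
```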