HADOOP-16357. TeraSort Job failing on S3 DirectoryStagingCommitter: destination path exists. Contributed by Steve Loughran.
* changes the default conflict mode for the staging committer to append, matching the classic FileOutputFormat committer
* adds a check for the dest path being a file, not a dir
* adds tests for this
* changes AbstractCommitTerasortIT to not use the simple parser, so it fails if the file is present
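The destination checks listed above can be sketched as a small decision function. This is a hypothetical model, not the committer's code: the class, enum, and method names are invented for illustration, and it assumes that a file at the destination is rejected whatever the conflict mode is.

```java
/** Hypothetical model of the staging committer's destination check; not the Hadoop source. */
final class StagingConflictSketch {

    /** Conflict modes for an existing destination; APPEND is the new default per this change. */
    enum ConflictMode { FAIL, APPEND, REPLACE }

    /** What is found at the job's destination path. */
    enum DestState { ABSENT, DIRECTORY, FILE }

    static final ConflictMode DEFAULT_MODE = ConflictMode.APPEND;

    /** Returns true if the job may start writing to the destination. */
    static boolean mayStartJob(DestState dest, ConflictMode mode) {
        if (dest == DestState.FILE) {
            // Assumption: a file where a directory is expected is always an error.
            return false;
        }
        if (dest == DestState.DIRECTORY && mode == ConflictMode.FAIL) {
            // An existing directory is only a conflict in FAIL mode.
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // With the default mode, an existing output directory no longer blocks the job.
        System.out.println(mayStartJob(DestState.DIRECTORY, DEFAULT_MODE));
        System.out.println(mayStartJob(DestState.FILE, DEFAULT_MODE));
    }
}
```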
HADOOP-16085. S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite. Contributed by Ben Roling.
S3Guard will now track the etag of uploaded files and, if an S3 bucket is versioned, the object version.
You can then control how to react to a mismatch between the data in the DynamoDB table and that in the store: warn, fail, or, when using versions, return the original value.
This adds two new columns to the table: etag and version. This is transparent to older S3A clients, but when such clients add or update data in the S3Guard table, they will not set these values. As a result, the etag/version checks will not work with files uploaded by older clients.
For a consistent experience, upgrade all clients to the latest Hadoop version.
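The mismatch handling described in this change can be modelled roughly as follows. This is a minimal sketch under stated assumptions, not the S3A implementation: all class, enum, and method names are hypothetical, the identifiers are plain strings standing in for an etag or object version, and an entry with no recorded identifier (as written by an older client) is assumed to skip the check.

```java
/** Hypothetical model of S3Guard etag/version mismatch handling; not the S3A source. */
final class ChangeDetectionSketch {

    /** The three reactions named in the commit message. */
    enum Reaction { WARN, FAIL, RETURN_ORIGINAL }

    /** What a read does in this model. */
    enum Outcome { READ_CURRENT, READ_CURRENT_WITH_WARNING, READ_RECORDED_VERSION }

    /**
     * @param recordedId    etag or version stored in the table; null if an older client wrote the entry
     * @param storeId       etag or version currently reported by the object store
     * @param reaction      configured reaction to a mismatch
     * @param usingVersions true if the bucket is versioned and versions are tracked
     */
    static Outcome resolve(String recordedId, String storeId,
                           Reaction reaction, boolean usingVersions) {
        // No recorded value means nothing to compare, so the check cannot apply.
        if (recordedId == null || recordedId.equals(storeId)) {
            return Outcome.READ_CURRENT;
        }
        switch (reaction) {
            case WARN:
                return Outcome.READ_CURRENT_WITH_WARNING;
            case RETURN_ORIGINAL:
                if (usingVersions) {
                    // Only a versioned bucket can still serve the recorded version.
                    return Outcome.READ_RECORDED_VERSION;
                }
                throw new IllegalArgumentException("RETURN_ORIGINAL requires object versions");
            default:
                throw new IllegalStateException(
                    "object changed: expected " + recordedId + " but found " + storeId);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("v1", "v2", Reaction.RETURN_ORIGINAL, true));
        System.out.println(resolve(null, "v2", Reaction.FAIL, true));
    }
}
```

The null case illustrates why mixed client versions give inconsistent protection: entries written without an etag or version are read without any comparison.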