Steve Loughran <stevel@cloudera.com> committed on 09 Jan
HADOOP-16697. Tune/audit S3A authoritative mode.

Contains:

HADOOP-16474. S3Guard ProgressiveRenameTracker to mark destination

             directory as authoritative on success.

HADOOP-16684. S3guard bucket info to list a bit more about

             authoritative paths.

HADOOP-16722. S3GuardTool to support FilterFileSystem.

This patch improves the marking of newly created/imported directory

trees in S3Guard DynamoDB tables as authoritative.

Specific changes:

* Renamed directories are marked as authoritative if the entire

  operation succeeded (HADOOP-16474).

* When updating parent table entries as part of any table write,

  there's no overwriting of their authoritative flag.
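
A minimal sketch of the parent-entry rule above, written in Python for
brevity. It is not the S3Guard code: the table is modelled as a plain
dict from path to authoritative flag, and put_with_parents is a
hypothetical name chosen for illustration.

```python
from typing import Dict


def put_with_parents(table: Dict[str, bool], path: str,
                     authoritative: bool = False) -> None:
    """Write an entry for `path`, creating any missing parent entries.

    `table` is a hypothetical stand-in for the metastore: it maps a
    directory path to its authoritative flag.
    """
    table[path] = authoritative
    parent = path.rsplit("/", 1)[0]
    while parent:
        # setdefault adds a missing parent as non-authoritative, but
        # leaves the flag of an existing parent entry untouched.
        table.setdefault(parent, False)
        parent = parent.rsplit("/", 1)[0]
```

Under these assumptions, a write below an authoritative directory leaves
that directory's flag as it was.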

s3guard import changes:

* new -verbose flag to print out what is going on.

* The "s3guard import" command lets you declare that a directory tree

  is to be marked as authoritative:

 hadoop s3guard import -authoritative -verbose s3a://bucket/path

When importing a listing and a file is found, the import tool queries

the metastore and only updates the entry if the file is different from

before, where different == new timestamp, etag, or length. S3Guard can get

timestamp differences due to clock skew in PUT operations.

As the recursive list performed by the import command doesn't retrieve the

versionID, the existing entry may in fact be more complete.

When updating an existing entry due to clock skew, the existing version ID

is propagated to the new entry (note: the etags must match; this is needed

to deal with inconsistent listings).
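
The import decision described above can be sketched roughly as follows.
This is an illustration in Python, not the actual import tool: Entry and
merge_for_import are hypothetical names, and the sketch assumes a listed
entry carries no version ID.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a metastore file entry."""
    length: int
    timestamp: int
    etag: str
    version_id: Optional[str] = None


def merge_for_import(listed: Entry,
                     existing: Optional[Entry]) -> Optional[Entry]:
    """Return the entry to write, or None when no update is needed."""
    if existing is None:
        return listed
    unchanged = (listed.timestamp == existing.timestamp
                 and listed.etag == existing.etag
                 and listed.length == existing.length)
    if unchanged:
        return None
    # The listing has no version ID. Carry the stored one over only
    # when the etags match, i.e. it is the same object and only the
    # timestamp differs.
    if listed.version_id is None and listed.etag == existing.etag:
        return replace(listed, version_id=existing.version_id)
    return listed
```

With a stored entry Entry(10, 100, "abc", "v1"), a listed entry that
differs only in timestamp keeps version ID "v1", while a listed entry
with a different etag is written without one.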

There is a new s3guard command to audit an S3Guard bucket/path's

authoritative state:

 hadoop s3guard authoritative -check-config s3a://bucket/path

This is primarily for testing/auditing.

The s3guard bucket-info command also provides some more details on the

authoritative state of a store (HADOOP-16684).

Change-Id: I58001341c04f6f3597fcb4fcb1581ccefeb77d91


trunk + 3 more