HADOOP-16424. S3Guard fsck: Check internal consistency of the MetadataStore (#1691). Contributed by Gabor Bota.

    • -2
    • +118
    ./fs/s3a/s3guard/ITestS3GuardFsck.java
  1. … 4 more files in changeset.
HADOOP-16709. S3Guard: Make authoritative mode exclusive for metadata - don't check for expiry for authoritative paths (#1721). Contributed by Gabor Bota.

    • -16
    • +49
    ./fs/s3a/ITestS3GuardOutOfBandOperations.java
  1. … 5 more files in changeset.
HADOOP-16632 Speculating & Partitioned S3A magic committers can leave pending files under __magic (#1599)

Contributed by Steve Loughran.

This downgrades the checks for leftover __magic entries from fail to warn now that the parallel

test runs make speculation more likely.

Change-Id: Ia4df2e90f82a06dbae69f3fdaadcbb0e0d713b38

HADOOP-16665. Filesystems to be closed if they failed during initialize().

Contributed by Steve Loughran.

This changes FileSystem instantiation so that if an IOException or RuntimeException is

raised in the invocation of FileSystem.initialize() then a best-effort

attempt is made to close the FS instance; exceptions raised there

are swallowed.

The S3AFileSystem is also modified to do its own cleanup if an

IOException is raised during its initialize() process, it being the

FS we know has the "potential" to leak threads, especially in

extension points (e.g. AWS Authenticators) which spawn threads.
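The close-on-failure behaviour described above can be sketched as follows. This is a minimal illustration, not the Hadoop code: `InitializableFs` and `SafeInit` are hypothetical names standing in for the real FileSystem classes.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for a filesystem client; not the Hadoop FileSystem class. */
interface InitializableFs extends Closeable {
    void initialize() throws IOException;
}

final class SafeInit {
    private SafeInit() {}

    /**
     * Initialize the instance. If initialization fails, make a best-effort
     * attempt to close it, swallow anything close() raises, and rethrow
     * the original initialization failure.
     */
    static <T extends InitializableFs> T initializeOrClose(T fs) throws IOException {
        try {
            fs.initialize();
            return fs;
        } catch (IOException | RuntimeException e) {
            try {
                fs.close();
            } catch (IOException | RuntimeException ignored) {
                // Best effort only: the initialization failure is the one to report.
            }
            throw e;
        }
    }
}
```

The caller sees only the initialization exception; a secondary failure in close() never masks it.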

Change-Id: Ib84073a606c9d53bf53cbfca4629876a03894f04

    • -2
    • +2
    ./fs/s3a/auth/ITestRestrictedReadAccess.java
  1. … 8 more files in changeset.
HADOOP-16477. S3A delegation token tests fail if fs.s3a.encryption.key set.

Contributed by Steve Loughran.

Change-Id: I843989f32472bbdefbd4fa504b26c7a614ab1cee

    • -13
    • +67
    ./fs/s3a/AbstractTestS3AEncryption.java
    • -4
    • +14
    ./fs/s3a/ITestS3AAWSCredentialsProvider.java
    • -25
    • +21
    ./fs/s3a/ITestS3AEncryptionSSEC.java
    • -3
    • +4
    ./fs/s3a/ITestS3AEncryptionSSEKMSDefaultKey.java
    • -7
    • +13
    ./fs/s3a/ITestS3AEncryptionSSEKMSUserDefinedKey.java
    • -1
    • +14
    ./fs/s3a/ITestS3AMiscOperations.java
    • -3
    • +5
    ./fs/s3a/impl/ITestPartialRenamesDeletes.java
  1. … 3 more files in changeset.
HADOOP-16484. S3A to warn or fail if S3Guard is disabled (#1661). Contributed by Gabor Bota.

  1. … 5 more files in changeset.
HADOOP-16653. S3Guard DDB overreacts to no tag access (#1660). Contributed by Gabor Bota.

  1. … 2 more files in changeset.
HADOOP-16658. S3A connector does not support including the token renewer in the token identifier.

Contributed by Phil Zampino.

Change-Id: Iea9d5028dcf58bda4da985604f5cd3ac283619bd

  1. … 10 more files in changeset.
HADOOP-16478. S3Guard bucket-info fails if the caller lacks s3:GetBucketLocation.

Contributed by Steve Loughran.

Includes HADOOP-16651. S3 getBucketLocation() can return "US" for us-east.
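A minimal sketch of the kind of normalization this implies, assuming that "us-east" here means the us-east-1 region and that an empty or missing location should be treated the same way. `BucketRegions.normalize` is a hypothetical helper, not the method added by the patch.

```java
final class BucketRegions {
    private BucketRegions() {}

    /**
     * Map the legacy "US" answer (and a missing or empty location, which is
     * assumed here to mean the same thing) to the region name "us-east-1";
     * any other value is returned unchanged.
     */
    static String normalize(String location) {
        if (location == null || location.isEmpty() || "US".equals(location)) {
            return "us-east-1";
        }
        return location;
    }
}
```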

Change-Id: Ifc0dca76e51495ed1a8fc0f077b86bf125deff40

    • -0
    • +61
    ./fs/s3a/impl/TestNeworkBinding.java
  1. … 5 more files in changeset.
HADOOP-16635. S3A "directories only" scan still does a HEAD.

Contributed by Steve Loughran.

Change-Id: I5e41d7f721364c392e1f4344db83dfa8c5aa06ce

    • -6
    • +134
    ./fs/s3a/ITestS3AFileOperationCost.java
  1. … 2 more files in changeset.
Revert "HADOOP-15870. S3AInputStream.remainingInFile should use nextReadPos."

This reverts commit 7a4b3d42c4e36e468c2a46fd48036a6fed547853.

The patch broke TestRouterWebHDFSContractSeek as it turns out that

WebHDFSInputStream.available() is always 0.

    • -1
    • +1
    ./fs/contract/s3a/ITestS3AContractSeek.java
  1. … 3 more files in changeset.
HADOOP-16520. Race condition in DDB table init and waiting threads. (#1576). Contributed by Gabor Bota.

Fixes HADOOP-16349. DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

Change-Id: Ia83e92b9039ccb780090c99c41b4f71ef7539d35

    • -53
    • +120
    ./fs/s3a/s3guard/ITestDynamoDBMetadataStore.java
  1. … 6 more files in changeset.
HADOOP-15870. S3AInputStream.remainingInFile should use nextReadPos.

Contributed by lqjacklee.

Change-Id: I32bb00a683102e7ff8ff8ce0b8d9c3195ca7381c

    • -1
    • +1
    ./fs/contract/s3a/ITestS3AContractSeek.java
  1. … 3 more files in changeset.
HADOOP-16650. ITestS3AClosedFS failing.

Contributed by Steve Loughran.

Change-Id: Ia9bb84bd6455e210a54cfe9eb944feeda8b58da9

HADOOP-16626. S3A ITestRestrictedReadAccess fails without S3Guard.

Contributed by Steve Loughran.

Change-Id: Ife730b80057ddd43e919438cb5b2abbda990e636

    • -90
    • +167
    ./fs/s3a/auth/ITestRestrictedReadAccess.java
HADOOP-16570. S3A committers encounter scale issues.

Contributed by Steve Loughran.

This addresses two scale issues which have surfaced in large scale benchmarks

of the S3A Committers.

* Thread pools are not cleaned up.

This now happens, with tests.

* OOM on job commit for jobs with many thousands of tasks,

each generating tens of (very large) files.

Instead of loading all pending commits into memory as a single list, the list

of files to load is the sole list which is passed around; .pendingset files are

loaded and processed in isolation -and reloaded if necessary for any

abort/rollback operation.

The parallel commit/abort/revert operations now work at the .pendingset level,

rather than that of individual pending commit files. The existing parallelized

Tasks API is still used to commit those files, but with a null thread pool, so

as to serialize the operations.
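The memory-saving idea can be sketched roughly as below: only the list of .pendingset file names is passed around, each file is loaded on demand, and the commits inside one file are processed serially. This sketch swaps in Java parallel streams for the Tasks API named above; `PendingSetProcessor`, the loader, and the committer callbacks are all hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

final class PendingSetProcessor {
    private PendingSetProcessor() {}

    /**
     * Commit everything listed in the given .pendingset files.
     * Files are processed in parallel; the commits inside one file are
     * loaded only when that file is reached and are committed one by one.
     * The committer callback must therefore be thread-safe.
     *
     * @return the total number of individual commits performed
     */
    static <C> int commitAll(List<String> pendingSetFiles,
                             Function<String, List<C>> loader,
                             Consumer<C> committer) {
        return pendingSetFiles.parallelStream()
            .mapToInt(file -> {
                // Only this file's commits are held in memory here.
                List<C> commits = loader.apply(file);
                for (C commit : commits) {
                    committer.accept(commit);
                }
                return commits.size();
            })
            .sum();
    }
}
```

An abort or rollback would call the same loader again for the affected file rather than keeping its contents around.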

Change-Id: I5c8240cd31800eaa83d112358770ca0eb2bca797

    • -10
    • +64
    ./fs/s3a/commit/AbstractITCommitProtocol.java
    • -15
    • +51
    ./fs/s3a/commit/staging/StagingTestBase.java
    • -0
    • +314
    ./fs/s3a/commit/staging/TestDirectoryCommitterScale.java
    • -15
    • +14
    ./fs/s3a/commit/staging/TestStagingCommitter.java
  1. … 11 more files in changeset.
HADOOP-16207 Improved S3A MR tests.

Contributed by Steve Loughran.

Replaces the committer-specific terasort and MR test jobs with parameterization

of the (now single) tests and use of file:// over hdfs:// as the cluster FS.

The parameterization ensures that only one of the specific committer tests

runs at a time; overloads of the test machines are less likely, and so the

suites can be pulled back into the parallel phase.

There's also more detailed validation of the stage outputs of the terasorting;

if one test fails the rest are all skipped. This and the fact that job

output is stored under target/yarn-${timestamp} means failures should

be more debuggable.

Change-Id: Iefa370ba73c6419496e6e69dd6673d00f37ff095

    • -4
    • +14
    ./fs/s3a/commit/AbstractCommitITest.java
    • -223
    • +0
    ./fs/s3a/commit/AbstractITCommitMRJob.java
    • -51
    • +145
    ./fs/s3a/commit/AbstractYarnClusterITest.java
    • -120
    • +0
    ./fs/s3a/commit/magic/ITestMagicCommitMRJob.java
    • -0
    • +377
    ./fs/s3a/commit/terasort/ITestTerasortOnS3A.java
  1. … 4 more files in changeset.
HADOOP-16599. Allow a SignerInitializer to be specified along with a Custom Signer

    • -0
    • +237
    ./fs/s3a/auth/ITestCustomSigner.java
    • -0
    • +590
    ./fs/s3a/auth/TestSignerManager.java
  1. … 7 more files in changeset.
HADOOP-16458. LocatedFileStatusFetcher.getFileStatuses failing intermittently with S3

Contributed by Steve Loughran.

Includes

-S3A glob scans don't bother trying to resolve symlinks

-stack traces don't get lost in getFileStatuses() when exceptions are wrapped

-debug level logging of what is up in Globber

-Contains HADOOP-13373. Add S3A implementation of FSMainOperationsBaseTest.

-ITestRestrictedReadAccess tests incomplete read access to files.

This adds a builder API for constructing globbers which other stores can use

so that they too can skip symlink resolution when not needed.

Change-Id: I23bcdb2783d6bd77cf168fdc165b1b4b334d91c7

    • -0
    • +40
    ./fs/s3a/ITestLocatedFileStatusFetcher.java
    • -0
    • +65
    ./fs/s3a/ITestS3AFSMainOperations.java
    • -0
    • +707
    ./fs/s3a/auth/ITestRestrictedReadAccess.java
  1. … 10 more files in changeset.
HADOOP-16600. StagingTestBase uses methods not available in Mockito 1.8.5 in branch-3.1

Signed-off-by: Steve Loughran <stevel@apache.org>

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>

Signed-off-by: stack <stack@apache.org>

    • -7
    • +23
    ./fs/s3a/commit/staging/StagingTestBase.java
HADOOP-15691 Add PathCapabilities to FileSystem and FileContext.

Contributed by Steve Loughran.

This complements the StreamCapabilities Interface by allowing applications to probe for a specific path on a specific instance of a FileSystem client

to offer a specific capability.

This is intended to allow applications to determine

* Whether a method is implemented before calling it and dealing with UnsupportedOperationException.

* Whether a specific feature is believed to be available in the remote store.

As well as a common set of capabilities defined in CommonPathCapabilities,

file systems are free to add their own capabilities, prefixed with

fs. + schema + .

The plan is to identify and document more capabilities, and for file systems which add new features to always declare the availability of each feature.

Note

* The remote store is not expected to be checked for the feature;

It is more a check of client API and the client's configuration/knowledge

of the state of the remote system.

* Permissions are not checked.
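The probing pattern described above might look like this in application code. This is a self-contained sketch: `CapabilityProbe` is a simplified, hypothetical stand-in that takes a String path, and the capability name `fs.example.append` is invented to illustrate the prefix convention.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified probe; not the Hadoop interface itself. */
interface CapabilityProbe {
    boolean hasPathCapability(String path, String capability);
}

/** Hypothetical store declaring one store-specific capability. */
final class ExampleStore implements CapabilityProbe {
    // Store-specific names follow the "fs." + scheme + "." prefix convention.
    static final String APPEND = "fs.example.append";

    private final Set<String> supported = new HashSet<>(Arrays.asList(APPEND));

    @Override
    public boolean hasPathCapability(String path, String capability) {
        // A client-side answer only: no remote call and no permission check.
        return supported.contains(capability);
    }
}

final class CapabilityDemo {
    private CapabilityDemo() {}

    /** Pick a write strategy by probing rather than catching UnsupportedOperationException. */
    static String chooseWriteMode(CapabilityProbe fs, String path) {
        return fs.hasPathCapability(path, ExampleStore.APPEND) ? "append" : "rewrite";
    }
}
```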

Change-Id: I80bfebe94f4a8bdad8f3ac055495735b824968f5

    • -4
    • +1
    ./fs/s3a/commit/ITestCommitOperations.java
  1. … 35 more files in changeset.
HADOOP-16591 Fix S3A ITest*MRjob failures.

Contributed by Siddharth Seth.

Change-Id: I7f08201c9f7c0551514049389b5b398a84855191

HADOOP-16445. Allow separate custom signing algorithms for S3 and DDB (#1332)

    • -0
    • +68
    ./fs/s3a/ITestS3AConfiguration.java
    • -1
    • +1
    ./fs/s3a/ITestS3ATemporaryCredentials.java
    • -0
    • +130
    ./fs/s3a/TestSignerManager.java
  1. … 8 more files in changeset.
HADOOP-16547. make sure that s3guard prune sets up the FS (#1402). Contributed by Steve Loughran.

Change-Id: Iaf71561cef6c797a3c66fed110faf08da6cac361

    • -8
    • +37
    ./fs/s3a/s3guard/AbstractS3GuardToolTestBase.java
  1. … 1 more file in changeset.
HADOOP-16371: Option to disable GCM for SSL connections when running on Java 8.

Contributed by Sahil Takiar.

This moves the SSLSocketFactoryEx class from hadoop-azure into hadoop-common

as the DelegatingSSLSocketFactory and binds the S3A connector to it so that

it can avoid using those HTTPS algorithms which are underperformant on Java 8.

Change-Id: Ie9e6ac24deac1aa05e136e08899620efa7d22abd

    • -5
    • +17
    ./fs/contract/s3a/ITestS3AContractSeek.java
  1. … 15 more files in changeset.
HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) (#1208). Contributed by Gabor Bota.

Change-Id: I6bbb331b6c0a41c61043e482b95504fda8a50596

    • -6
    • +5
    ./fs/s3a/ITestS3GuardOutOfBandOperations.java
    • -0
    • +504
    ./fs/s3a/s3guard/ITestS3GuardFsck.java
  1. … 4 more files in changeset.
HADOOP-16490. Avoid/handle cached 404s during S3A file creation.

Contributed by Steve Loughran.

This patch avoids issuing any HEAD path request when creating a file with overwrite=true,

so 404s will not end up in the S3 load balancers unless someone calls getFileStatus/exists/isFile

in their own code.

The Hadoop FsShell CommandWithDestination class is modified to not register uncreated files

for deleteOnExit(), because that calls exists() and so can place the 404 in the cache, even

after S3A is patched to not do it itself.

Because S3Guard knows when a file should be present, it adds a special FileNotFound retry policy

independently configurable from other retry policies; it is also exponential, but with

different parameters. This is because every HEAD request will refresh any 404 cached in

the S3 Load Balancers. It's not enough to retry: we have to have a suitable gap between

attempts to (hopefully) ensure any cached entry will be gone.

The options and values are:

fs.s3a.s3guard.consistency.retry.interval: 2s

fs.s3a.s3guard.consistency.retry.limit: 7
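To give a feel for what these two values mean, here is a sketch of a FileNotFoundException-only retry loop. It assumes plain doubling of the interval with no jitter, which is an assumption and not necessarily the formula the real policy uses; `NotFoundRetry` and the injected sleeper are hypothetical.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;
import java.util.function.LongConsumer;

final class NotFoundRetry {
    private NotFoundRetry() {}

    /**
     * Run the operation, retrying only on FileNotFoundException.
     * The pause before retry number n (counting from 0) is intervalMillis * 2^n;
     * after `limit` retries the last exception is rethrown.
     *
     * @param sleeper receives each pause in milliseconds (for example Thread.sleep wrapped)
     */
    static <T> T retry(Callable<T> op, long intervalMillis, int limit, LongConsumer sleeper)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (FileNotFoundException e) {
                if (attempt >= limit) {
                    throw e;
                }
                // Widening gaps give any cached 404 a chance to expire.
                sleeper.accept(intervalMillis << attempt);
            }
        }
    }
}
```

Under this doubling assumption, an interval of 2s and a limit of 7 give pauses of 2s, 4s, 8s, 16s, 32s, 64s and 128s before the operation finally fails.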

The S3A copy() method used during rename() raises a RemoteFileChangedException which is not caught

so not downgraded to false. Thus: when a rename is unrecoverable, this fact is propagated.

Copy operations without S3Guard lack the confidence that the file exists, so don't retry the same way:

it will fail fast with a different error message. However, because create(path, overwrite=true) no

longer does HEAD path, we can at least be confident that S3A itself is not creating those cached

404 markers.

Change-Id: Ia7807faad8b9a8546836cb19f816cccf17cca26d

    • -4
    • +16
    ./fs/s3a/ITestS3AFileOperationCost.java
    • -5
    • +144
    ./fs/s3a/ITestS3ARemoteFileChanged.java
    • -2
    • +5
    ./fs/s3a/ITestS3GuardOutOfBandOperations.java
    • -8
    • +17
    ./fs/s3a/commit/AbstractITCommitMRJob.java
  1. … 16 more files in changeset.
HADOOP-16430. S3AFilesystem.delete to incrementally update s3guard with deletions

Contributed by Steve Loughran.

This overlaps the scanning for directory entries with batched calls to S3 DELETE and updates of the S3Guard tables.

It also uses S3Guard to list the files to delete, so it finds newly created files even when S3 listings are not consistent.

For paths which the client considers S3Guard to be authoritative, we also do a recursive LIST of the store and delete files; this is to find unindexed files and to guarantee that the delete(path, true) call really does delete everything underneath.
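The incremental approach can be illustrated with a simple paging loop: entries are consumed from the listing as they arrive and handed to a bulk delete one page at a time, rather than collecting the whole tree first. This sketch is synchronous and leaves out the S3Guard table updates; `BatchedDelete` and the `bulkDelete` callback are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class BatchedDelete {
    private BatchedDelete() {}

    /**
     * Walk a (possibly lazy) listing and issue one bulk delete per full page,
     * plus a final delete for any remainder.
     *
     * @return the number of entries passed to bulkDelete
     */
    static int deleteInBatches(Iterator<String> listing, int pageSize,
                               Consumer<List<String>> bulkDelete) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> page = new ArrayList<>(pageSize);
        int deleted = 0;
        while (listing.hasNext()) {
            page.add(listing.next());
            if (page.size() == pageSize) {
                // Hand over a copy so the callback may keep it after the page is reused.
                bulkDelete.accept(new ArrayList<>(page));
                deleted += page.size();
                page.clear();
            }
        }
        if (!page.isEmpty()) {
            bulkDelete.accept(new ArrayList<>(page));
            deleted += page.size();
        }
        return deleted;
    }
}
```

Submitting each page to an executor instead of calling the callback inline would let the scan and the deletes overlap in time.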

Change-Id: Ice2f6e940c506e0b3a78fa534a99721b1698708e

    • -1
    • +2
    ./fs/contract/s3a/ITestS3AContractRootDir.java
    • -18
    • +24
    ./fs/s3a/ITestS3AMetadataPersistenceException.java
    • -29
    • +97
    ./fs/s3a/ITestS3GuardListConsistency.java
    • -4
    • +59
    ./fs/s3a/ITestS3GuardOutOfBandOperations.java
    • -2
    • +2
    ./fs/s3a/commit/ITestCommitOperations.java
    • -1
    • +3
    ./fs/s3a/impl/ITestPartialRenamesDeletes.java
    • -4
    • +12
    ./fs/s3a/impl/TestPartialDeleteFailures.java
    • -52
    • +42
    ./fs/s3a/scale/ITestS3ADeleteManyFiles.java
  1. … 28 more files in changeset.
Revert "HADOOP-16193. Add extra S3A MPU test to see what happens if a file is created during the MPU. Contributed by Steve Loughran"

This reverts commit 69ddb36876c0b3819e5409d83b27d18d1da89b22.

HADOOP-16193. Add extra S3A MPU test to see what happens if a file is created during the MPU. Contributed by Steve Loughran