steve loughran <stevel@cloudera.com> in hadoop

HADOOP-17107. hadoop-azure parallel tests not working on recent JDKs (#2118)

Contributed by Steve Loughran.

Change-Id: I972264aed36f384b7ae23e214326ef7870261cf5

    /hadoop-tools/hadoop-azure/pom.xml (+12, -69)
HDFS-13934. Multipart uploaders to be created through FileSystem/FileContext.

Contributed by Steve Loughran.

Change-Id: Iebd34140c1a0aa71f44a3f4d0fee85f6bdf123a3

  1. … 33 more files in changeset.
HADOOP-16798. S3A Committer thread pool shutdown problems. (#1963)

Contributed by Steve Loughran.

Fixes a condition which can cause job commit to fail if a task was
aborted less than 60s before the job commit commenced: the task abort
shuts down the thread pool with a hard exit after 60s; the
job commit POST requests would be scheduled through the same pool,
and so would be interrupted and fail. At present the access is synchronized,
but presumably the executor shutdown code is calling wait() and releasing
locks.

Task abort is triggered from the AM when task attempts succeed but
there are still active speculative task attempts running. Thus it
only surfaces when speculation is enabled and the final tasks are
speculating, which, given they are the stragglers, is not unheard of.

Note: this problem has never been seen in production; it has surfaced
in the hadoop-aws tests on a heavily overloaded desktop.

Change-Id: I3b433356d01fcc50d88b4353dbca018484984bc8
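
The failure mode described above can be reproduced in miniature with plain java.util.concurrent. The sketch below is not the S3A committer code: the class and method names are hypothetical, the 500ms sleep stands in for a slow POST request, and the use of separate pools is only one possible remedy, which may differ from what the patch actually does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the S3A committer.
class SharedPoolShutdownDemo {

  /**
   * Start a slow "commit" task on commitPool, then hard-stop abortPool.
   * Returns true if the commit task was interrupted.
   */
  static boolean commitInterrupted(ExecutorService commitPool,
      ExecutorService abortPool) {
    CountDownLatch started = new CountDownLatch(1);
    Future<Boolean> commit = commitPool.submit(() -> {
      started.countDown();
      try {
        Thread.sleep(500);   // stands in for a slow POST request
        return false;        // completed normally
      } catch (InterruptedException e) {
        return true;         // interrupted by shutdownNow() on this pool
      }
    });
    try {
      started.await();         // wait until the task is really running
      abortPool.shutdownNow(); // the "hard exit" of the abort path
      return commit.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // One pool shared by abort and commit: the commit work is interrupted.
    ExecutorService shared = Executors.newFixedThreadPool(2);
    System.out.println("shared pool, interrupted: "
        + commitInterrupted(shared, shared));

    // Separate pools: shutting one down leaves the other alone.
    ExecutorService commitPool = Executors.newFixedThreadPool(2);
    ExecutorService abortPool = Executors.newFixedThreadPool(2);
    System.out.println("separate pools, interrupted: "
        + commitInterrupted(commitPool, abortPool));
    commitPool.shutdown();
  }
}
```

The point of the sketch is that shutdownNow() interrupts whatever is running on the pool, regardless of which operation submitted it.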

HADOOP-17050. S3A to support additional token issuers

Contributed by Steve Loughran.

S3A delegation token providers will be asked for any additional
token issuers; an array can be returned, and each one will be
asked for tokens when DelegationTokenIssuer collects all the
tokens for a filesystem.

Change-Id: I1bd3035bbff98cbd8e1d1ac7fc615d937e6bb7bb
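
As a rough illustration of the collection pattern described above, the sketch below walks an issuer and any additional issuers it returns. The Issuer interface, its method names and the token strings are hypothetical stand-ins, not the Hadoop DelegationTokenIssuer API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; not the Hadoop interfaces themselves.
class TokenIssuerSketch {

  interface Issuer {
    /** Issue a token for the given renewer, or null if there is none. */
    String issueToken(String renewer);

    /** Extra issuers to consult; an empty array means none. */
    default Issuer[] additionalIssuers() {
      return new Issuer[0];
    }
  }

  /** Collect this issuer's token, then recurse into additional issuers. */
  static void collect(Issuer issuer, String renewer, List<String> tokens) {
    String token = issuer.issueToken(renewer);
    if (token != null) {
      tokens.add(token);
    }
    for (Issuer extra : issuer.additionalIssuers()) {
      collect(extra, renewer, tokens);
    }
  }

  /** A filesystem-like issuer which also hands back one extra issuer. */
  static List<String> demo() {
    Issuer extra = renewer -> "extra-token-for-" + renewer;
    Issuer filesystem = new Issuer() {
      @Override
      public String issueToken(String renewer) {
        return "fs-token-for-" + renewer;
      }

      @Override
      public Issuer[] additionalIssuers() {
        return new Issuer[] {extra};
      }
    };
    List<String> tokens = new ArrayList<>();
    collect(filesystem, "yarn", tokens);
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```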

HADOOP-16568. S3A FullCredentialsTokenBinding fails if local credentials are unset. (#1441)

Contributed by Steve Loughran.

Moves the loading of the local credentials to deployUnbonded (where they are required) and adds a safety check when a new DT is requested.

Change-Id: I03c69aa2e16accfccddca756b2771ff832e7dd58

Revert "HADOOP-14557. Document HADOOP-8143 (Change distcp to have -pb on by default)."

This reverts commit 44350fdf495f5cf1bb15b1fe6f6e9587d3de0a59.

It is related to the rollback of HADOOP-8143.

Change-Id: If48e3dd670c920ada702dc36461ff398fe9d35cc

Revert "HADOOP-8143. Change distcp to have -pb on by default."

This reverts commit dd65eea74b1f9dde858ff34df8111e5340115511.

Change-Id: I74180cf59d5bbad8c9f66cb331535addcbea863e

HADOOP-16953. tuning s3guard disabled warnings (#1962)

Contributed by Steve Loughran.

The S3Guard absence warning of HADOOP-16484 has been changed
so that by default the S3A connector only logs at debug
when the connection to the S3 Store does not have S3Guard
enabled.

The option to control this log level is now
fs.s3a.s3guard.disabled.warn.level
and can be one of: silent, inform, warn, fail.

On a failure, an ExitException is raised with exit code 49.

For details on this safety feature, consult the s3guard documentation.

Change-Id: If868671c9260977c2b03b3e475b9c9531c98ce79
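
A minimal sketch of how the four option values might be dispatched is shown below. Everything except the four level names and exit code 49 is an assumption: the class, the ExitLikeException stand-in for Hadoop's ExitException, the message text, and in particular the mapping of "silent" to a debug-level log line.

```java
import java.util.Locale;

// Hypothetical sketch; names here are illustrative, not the S3A classes.
class S3GuardWarnLevelSketch {

  /** The four values the commit message lists for the option. */
  enum Level { SILENT, INFORM, WARN, FAIL }

  /** Exit code quoted in the commit message. */
  static final int EXIT_CODE = 49;

  /** Local stand-in for Hadoop's ExitException, carrying an exit code. */
  static class ExitLikeException extends RuntimeException {
    final int status;

    ExitLikeException(int status, String message) {
      super(message);
      this.status = status;
    }
  }

  /** Parse the option value case-insensitively; unknown values throw. */
  static Level parse(String value) {
    return Level.valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  /**
   * Decide what to do when a bucket has no S3Guard.
   * Returns the log line that would be emitted, or throws on "fail".
   */
  static String onS3GuardDisabled(String optionValue, String bucket) {
    String text = "S3Guard is disabled on bucket " + bucket;
    switch (parse(optionValue)) {
      case SILENT:
        return "DEBUG " + text;   // assumed mapping of "silent"
      case INFORM:
        return "INFO " + text;
      case WARN:
        return "WARN " + text;
      default:
        throw new ExitLikeException(EXIT_CODE, text);
    }
  }

  public static void main(String[] args) {
    System.out.println(onS3GuardDisabled("silent", "example-bucket"));
    try {
      onS3GuardDisabled("fail", "example-bucket");
    } catch (ExitLikeException e) {
      System.out.println("exit code " + e.status);
    }
  }
}
```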

HADOOP-16986. S3A to not need wildfly on the classpath. (#1948)

Contributed by Steve Loughran.

This is a successor to HADOOP-16346, which enabled the S3A connector
to load the native openssl SSL libraries for better HTTPS performance.
That patch required wildfly.jar to be on the classpath. This
update:

* Makes wildfly.jar optional except in the special case that
  "fs.s3a.ssl.channel.mode" is set to "openssl"

* Retains the declaration of wildfly.jar as a compile-time
  dependency in the hadoop-aws POM. This means that unless
  explicitly excluded, applications importing that published
  maven artifact will, transitively, add the specified
  wildfly JAR into their classpath for compilation/testing/
  distribution.

This is done for packaging and to offer that optional
speedup. It is not mandatory: applications importing
the hadoop-aws POM can exclude it if they choose.

Change-Id: I7ed3e5948d1e10ce21276b3508871709347e113d
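
The "optional unless explicitly requested" behaviour can be sketched as a classpath probe that only runs when the openssl mode is selected. This is an illustration under assumptions, not the S3A implementation: the class and method names are hypothetical, the provider class name passed in main is assumed, and the real connector's fallback rules for other modes may differ.

```java
// Hypothetical sketch of optional loading of a native SSL provider.
class OptionalNativeSslSketch {

  /** True if the named class can be located on the classpath. */
  static boolean isPresent(String className) {
    try {
      // initialize=false: locate the class without running static initializers
      Class.forName(className, false,
          OptionalNativeSslSketch.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /**
   * Returns "openssl" or "jsse" to say which implementation would be used.
   * Fails only when openssl was explicitly requested and is unavailable.
   */
  static String chooseSsl(String channelMode, String providerClass) {
    if ("openssl".equalsIgnoreCase(channelMode)) {
      if (!isPresent(providerClass)) {
        throw new IllegalStateException("openssl mode requested but "
            + providerClass + " is not on the classpath");
      }
      return "openssl";
    }
    // Any other mode: the optional JAR is never looked for.
    return "jsse";
  }

  public static void main(String[] args) {
    // Assumed provider class name, used here only as an example value.
    String provider = "org.wildfly.openssl.OpenSSLProvider";
    System.out.println(chooseSsl("default", provider));
  }
}
```

With this shape, a missing JAR only becomes an error for deployments which asked for openssl by name.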

HADOOP-16941. ITestS3GuardOutOfBandOperations.testListingDelete failing on versioned bucket (#1919)

Contributed by Steve Loughran.

Removes the failing probe and replaces it with two probes which will fail
on both versioned and unversioned buckets.

HADOOP-16932. distcp copy calls getFileStatus() needlessly and can fail against S3 (#1936)

Contributed by Steve Loughran.

This strips out all the -p preservation options which have already been
processed when uploading a file before deciding whether or not to query
the far end for the status of the (existing/uploaded) file to see if any
other attributes need changing.

This will avoid 404 caching-related issues in S3, wherein a newly created
file can have a 404 entry in the S3 load balancer's cache from the
probes for the file's existence prior to the upload.

It partially addresses a regression caused by HADOOP-8143,
"Change distcp to have -pb on by default", that causes a resurfacing
of HADOOP-13145, "In DistCp, prevent unnecessary getFileStatus call when
not preserving metadata".

Change-Id: Ibc25d19e92548e6165eb8397157ebf89446333f7
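
The decision described above amounts to a set difference followed by an emptiness check. The sketch below is illustrative only: the Attr enum is a small stand-in for the attributes -p can preserve, and the choice of which attributes count as "already processed when uploading" is an assumption, not taken from the patch.

```java
import java.util.EnumSet;

// Hypothetical sketch; the enum and method names are illustrative only.
class PreserveAttributesSketch {

  /** A small stand-in for the attributes that -p can preserve. */
  enum Attr { BLOCKSIZE, REPLICATION, CHECKSUMTYPE, USER, GROUP, PERMISSION }

  /** Assumption: these were already applied when the file was created. */
  static final EnumSet<Attr> APPLIED_AT_UPLOAD =
      EnumSet.of(Attr.BLOCKSIZE, Attr.REPLICATION, Attr.CHECKSUMTYPE);

  /** Attributes that still need work once the upload has finished. */
  static EnumSet<Attr> remaining(EnumSet<Attr> requested) {
    EnumSet<Attr> left = EnumSet.copyOf(requested);
    left.removeAll(APPLIED_AT_UPLOAD);
    return left;
  }

  /** Only query the destination when something is left to preserve. */
  static boolean needsDestinationStatus(EnumSet<Attr> requested) {
    return !remaining(requested).isEmpty();
  }

  public static void main(String[] args) {
    // -pb alone: nothing left after the upload, so no status probe.
    System.out.println(needsDestinationStatus(EnumSet.of(Attr.BLOCKSIZE)));
    // -pbu: the user attribute still has to be set on the destination.
    System.out.println(
        needsDestinationStatus(EnumSet.of(Attr.BLOCKSIZE, Attr.USER)));
  }
}
```

Under these assumptions a copy run with only -pb never issues the extra status call against the destination.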
