HADOOP-16371: Option to disable GCM for SSL connections when running on Java 8.

Contributed by Sahil Takiar.

This moves the SSLSocketFactoryEx class from hadoop-azure into hadoop-common

as the DelegatingSSLSocketFactory and binds the S3A connector to it so that

it can avoid using those HTTPS algorithms which are underperformant on Java 8.

Change-Id: Ie9e6ac24deac1aa05e136e08899620efa7d22abd

  1. … 15 more files in changeset.
HADOOP-16490. Avoid/handle cached 404s during S3A file creation.

Contributed by Steve Loughran.

This patch avoids issuing any HEAD path request when creating a file with overwrite=true,

so 404s will not end up in the S3 load balancers unless someone calls getFileStatus/exists/isFile

in their own code.

The Hadoop FsShell CommandWithDestination class is modified to not register uncreated files

for deleteOnExit(), because that calls exists() and so can place the 404 in the cache, even

after S3A is patched to not do it itself.
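
The effect described above can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyStore`, its `head`/`put` methods, and the `create` helper are hypothetical names invented for illustration, and the "a cached 404 stays cached" behaviour is a simplification of what the S3 load balancers do, not a description of the S3A code.

```python
class ToyStore:
    """Hypothetical in-memory object store whose front end caches 404s."""

    def __init__(self):
        self.objects = {}
        self.cached_404 = set()
        self.head_calls = 0

    def head(self, path):
        """Probe for a path; a miss leaves a negative entry in the cache."""
        self.head_calls += 1
        if path in self.cached_404:
            # Stale negative answer, returned even if the object now exists.
            return False
        if path not in self.objects:
            self.cached_404.add(path)
            return False
        return True

    def put(self, path, data):
        """Write the object; this toy never invalidates the 404 cache."""
        self.objects[path] = data


def create(store, path, data, overwrite):
    """Only probe the destination when the caller forbids overwriting."""
    if not overwrite and store.head(path):
        raise FileExistsError(path)
    store.put(path, data)


probing = ToyStore()
create(probing, "out/part-0000", b"x", overwrite=False)  # issues a HEAD first

direct = ToyStore()
create(direct, "out/part-0000", b"x", overwrite=True)    # no HEAD at all
```

In this model a later `probing.head("out/part-0000")` still reports the file as missing, while `direct.head("out/part-0000")` finds it, which is the situation the patch is trying to avoid creating.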

Because S3Guard knows when a file should be present, it adds a special FileNotFound retry policy

independently configurable from other retry policies; it is also exponential, but with

different parameters. This is because every HEAD request will refresh any 404 cached in

the S3 Load Balancers. It's not enough to retry: we have to have a suitable gap between

attempts to (hopefully) ensure any cached entry will be gone.

The options and values are:

fs.s3a.s3guard.consistency.retry.interval: 2s

fs.s3a.s3guard.consistency.retry.limit: 7
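
To give a feel for what these two values mean in practice, here is a rough sketch of the resulting wait times. It assumes plain doubling from the base interval with no jitter and no upper bound; the real policy may randomize or cap its delays, and `backoff_schedule` is a hypothetical helper, not part of Hadoop.

```python
from typing import List


def backoff_schedule(interval_s: float, limit: int) -> List[float]:
    """Delay in seconds before each retry, doubling from interval_s.

    Assumption: pure exponential growth without jitter or a cap.
    """
    return [interval_s * (2 ** attempt) for attempt in range(limit)]


# Values taken from the two options listed above.
delays = backoff_schedule(2.0, 7)
total_wait = sum(delays)
```

Under that assumption the gaps grow from 2 seconds to 128 seconds, for a little over four minutes of waiting in total before the operation gives up.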

The S3A copy() method used during rename() raises a RemoteFileChangedException, which is not caught

and so is not downgraded to a return value of false. Thus, when a rename is unrecoverable, the failure is propagated.

Copy operations without S3Guard lack the confidence that the file exists, so they don't retry the same way:

they fail fast with a different error message. However, because create(path, overwrite=true) no

longer does HEAD path, we can at least be confident that S3A itself is not creating those cached

404 markers.

Change-Id: Ia7807faad8b9a8546836cb19f816cccf17cca26d

  1. … 24 more files in changeset.
HADOOP-16549. Remove Unsupported SSL/TLS Versions from Docs/Properties. Contributed by Daisuke Kobayashi.

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>

Reviewed-by: Akira Ajisaka <aajisaka@apache.org>

  1. … 2 more files in changeset.
HADOOP-16438. ADLS Gen1 OpenSSL config control.

Contributed by Sneha Vijayarajan.

Change-Id: Ib79ea6b4a90ad068033e175f3f59c5185868872d

  1. … 5 more files in changeset.
HADOOP-16527. Add a whitelist of endpoints to skip Kerberos authentication (#1336) Contributed by Akira Ajisaka.

  1. … 2 more files in changeset.
HADOOP-16470. Make last AWS credential provider in default auth chain EC2ContainerCredentialsProviderWrapper.

Contributed by Steve Loughran.

Contains HADOOP-16471. Restore (documented) fs.s3a.SharedInstanceProfileCredentialsProvider.

Change-Id: I06b99b57459cac80bf743c5c54f04e59bb54c2f8

  1. … 3 more files in changeset.
HADOOP-16504. Increase ipc.server.listen.queue.size default from 128 to 256. Contributed by Lisheng Sun.

  1. … 1 more file in changeset.
HADOOP-16499. S3A retry policy to be exponential (#1246). Contributed by Steve Loughran.

  1. … 11 more files in changeset.
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun.

  1. … 10 more files in changeset.
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun.

  1. … 10 more files in changeset.
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun.

(cherry picked from commit a5bb1e8ee871df1111ff77d0f6921b13c8ffb50e)

Conflicts:

hadoop-common-project/hadoop-common/src/main/resources/core-default.xml

  1. … 10 more files in changeset.
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun.

  1. … 10 more files in changeset.
HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du. Contributed by Lisheng Sun.

(cherry picked from commit a5bb1e8ee871df1111ff77d0f6921b13c8ffb50e)

Conflicts:

hadoop-common-project/hadoop-common/src/main/resources/core-default.xml

(cherry picked from commit c74027d9d34711c2c4baed7c98bc475d95097be0)

Conflicts:

hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java

  1. … 10 more files in changeset.
HDFS-14652. Addendum: HealthMonitor connection retry times should be configurable. Contributed by Chen Zhang.

HADOOP-16398. Exports Hadoop metrics to Prometheus (#1170)

  1. … 6 more files in changeset.
HADOOP-16452. Increase ipc.maximum.data.length default from 64MB to 128MB. Contributed by Siyao Meng.

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>

  1. … 1 more file in changeset.
HADOOP-13868. [s3a] New default for S3A multi-part configuration (#1125)

  1. … 2 more files in changeset.
HADOOP-16357. TeraSort Job failing on S3 DirectoryStagingCommitter: destination path exists.

Contributed by Steve Loughran.

This patch

* changes the default for the staging committer to append, as we get for the classic FileOutputFormat committer

* adds a check for the dest path being a file not a dir

* adds tests for this

* changes AbstractCommitTerasortIT to not use the simple parser, so it fails if the file is present.

Change-Id: Id53742958ed1cf321ff96c9063505d64f3254f53

  1. … 13 more files in changeset.
HADOOP-16350. Ability to tell HDFS client not to request KMS Information from NameNode. Contributed by Greg Senia, Ajay Kumar.

  1. … 3 more files in changeset.
HADOOP-16350. Ability to tell HDFS client not to request KMS Information from NameNode. Contributed by Greg Senia, Ajay Kumar.

  1. … 3 more files in changeset.
HADOOP-15183. S3Guard store becomes inconsistent after partial failure of rename.

Contributed by Steve Loughran.

Change-Id: I825b0bc36be960475d2d259b1cdab45ae1bb78eb

  1. … 70 more files in changeset.
HADOOP-16279. S3Guard: Implement time-based (TTL) expiry for entries (and tombstones).

Contributed by Gabor Bota.

Change-Id: I73a2d2861901dedfe7a0e783b310fbb95e7c1af9

  1. … 21 more files in changeset.
HADOOP-15563. S3Guard to support creating on-demand DDB tables.

Contributed by Steve Loughran

Change-Id: I2262b5b9f52e42ded8ed6f50fd39756f96e77087

  1. … 10 more files in changeset.
HADOOP-16085. S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite.

Contributed by Ben Roling.

S3Guard will now track the etag of uploaded files and, if an S3

bucket is versioned, the object version.

You can then control how to react to a mismatch between the data

in the DynamoDB table and that in the store: warn, fail, or, when

using versions, return the original value.

This adds two new columns to the table: etag and version.

This is transparent to older S3A clients, but when such clients

add/update data to the S3Guard table, they will not add these values.

As a result, the etag/version checks will not work with files uploaded by older clients.

For a consistent experience, upgrade all clients to use the latest Hadoop version.

  1. … 55 more files in changeset.
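
The mismatch handling described in the HADOOP-16085 message can be sketched roughly as follows. Everything here is hypothetical and for illustration only: `RecordedEntry`, `ChangeDetected`, `check_entry`, and the mode strings are invented names, not the S3A API. Only the warn and fail reactions are modelled; the version-based "return the original value" reaction is left out.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordedEntry:
    """Hypothetical view of one metadata table row."""
    etag: Optional[str]               # None when an older client wrote the row
    version_id: Optional[str] = None  # only set for versioned buckets


class ChangeDetected(Exception):
    """Hypothetical error for a store/table mismatch."""


def check_entry(recorded: RecordedEntry, observed_etag: str, mode: str = "fail") -> str:
    """Compare the recorded etag with the one the store returned."""
    if recorded.etag is None:
        # Rows written by older clients carry no etag, so nothing can be checked.
        return "unchecked"
    if recorded.etag == observed_etag:
        return "ok"
    if mode == "warn":
        warnings.warn(f"etag mismatch: {recorded.etag} != {observed_etag}")
        return "mismatch"
    raise ChangeDetected(f"expected {recorded.etag}, got {observed_etag}")
```

The "unchecked" branch mirrors the caveat in the message: entries added by older clients have no etag or version to compare against, so the check cannot apply to them.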
HADOOP-16238. Add the possibility to set SO_REUSEADDR in IPC Server Listener. Contributed by Peter Bacsko.

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>

  1. … 2 more files in changeset.
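
For readers unfamiliar with the socket option named in HADOOP-16238, the following is a generic sketch of an optional SO_REUSEADDR on a listening socket. It uses Python's standard `socket` module rather than the Hadoop IPC server, and `open_listener` with its `reuse_addr` and `backlog` parameters is a hypothetical helper, not the configuration introduced by the patch.

```python
import socket


def open_listener(reuse_addr: bool, backlog: int = 128) -> socket.socket:
    """Open a TCP listener on an ephemeral loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        # Must be set before bind() to influence address reuse.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", 0))
    sock.listen(backlog)
    return sock
```

A caller would close the returned socket when done; making the option a parameter keeps the default behaviour unchanged for callers that do not ask for it.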
HADOOP-16221. S3Guard: add option to fail operation on metadata write failure.

  1. … 10 more files in changeset.
HADOOP-16011. OsSecureRandom very slow compared to other SecureRandom implementations. Contributed by Siyao Meng.

Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>

  1. … 1 more file in changeset.
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.

Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.

  1. … 19 more files in changeset.
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.

Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.

  1. … 19 more files in changeset.
HADOOP-16125. Support multiple bind users in LdapGroupsMapping. Contributed by Lukas Majercak.

  1. … 4 more files in changeset.