Aman Sinha <asinha@maprtech.com> committed on 25 Mar 16
DRILL-4530: Optimize partition pruning with metadata caching for the single partition case.

- Enhance PruneScanRule to detect single partitions based on referenced dirs in the filter.

- Keep a new status of EXPANDED_PARTIAL for FileSelection.

- Create separate .directories metadata file to prune directories first before files.

- Introduce cacheFileRoot attribute to keep track of the parent directory of the cache file after partition pruning.

Check if prefix components are non-null the very first time single partition info is initialized.
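
For illustration only, a minimal sketch of the single-partition check described above. This is not the actual PruneScanRule code: the class name, method name, and array layout are hypothetical. It assumes the filter has already been reduced to the value each directory column (dir0, dir1, ...) is compared against with equality, with null meaning "not referenced". A single partition is recognized only when the referenced directories form a contiguous non-null prefix.

```java
import java.util.Optional;

// Hypothetical sketch, not Drill's implementation.
final class SinglePartitionSketch {

    /**
     * dirValues[i] is the value the filter equates dir<i> to, or null if
     * dir<i> is not referenced. Returns the directory that would hold the
     * metadata cache file for that single partition, or empty when the
     * filter does not pin down a contiguous prefix of directories.
     */
    static Optional<String> cacheFileRoot(String tableRoot, String[] dirValues) {
        StringBuilder path = new StringBuilder(tableRoot);
        boolean seenNull = false;
        int used = 0;
        for (String value : dirValues) {
            if (value == null) {
                seenNull = true;          // everything after this must also be null
                continue;
            }
            if (seenNull) {
                return Optional.empty();  // gap in the prefix, e.g. dir1 set but dir0 not
            }
            path.append('/').append(value);
            used++;
        }
        return used == 0 ? Optional.empty() : Optional.of(path.toString());
    }
}
```

With a table rooted at /t, a filter on dir0 = '2016' AND dir1 = 'Q1' would map to /t/2016/Q1 in this sketch, while a filter on dir1 alone yields no single partition.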

Add separate interface method to create scan using a cacheFileRoot.

Create filenames list with unique names using fileSet if available. Add several unit tests.

Populate only fileSet when expanding using the metadata cache.

Remove cacheFileRoot parameter from FileGroupScan's clone() method and instead leverage it from FileSelection.

Keep track of whether all partitions were previously pruned and process this state where needed.

close apache/drill#519

