Clone Tools
  • last updated 21 mins ago
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
[NO ISSUE][STO] Persist Bloom Filter Existence in Index Metadata

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- To clearly identify whether an index has a bloom filter or

not for BTree indexes, persist this information in the index's

metadata stored on each NC.

- For backward compatibility, when reading an index's metadata that

was created before adding the hasBloomFilter field, default its

value based on whether or not the index is a primary key index.

- Remove unused special readObject from LSMBTreeLocalResource.

Change-Id: Icec570d490987de401c036790ee9567238a60301

Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/3804

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -4
    • +2
    ./storage/am/lsm/btree/utils/LSMBTreeUtil.java
  1. … 14 more files in changeset.
[ASTERIXDB-2541][STO] Introduce GreedyScheduler

- user model changes: yes.

Add new option: storage.io.scheduler (async/greedy)

- storage format changes: no.

- interface changes: yes.

Introduce IIndexCursorStats

Details:

- Introduce GreedyScheduler that always executes the merge

operation with the smallest number of remaining pages to minimize

the number of disk components

- Introduce IIndexCursorStats to collect the statistics of index scans.

This allows GreedyScheduler to know the remaning pages of merge

operations.

- Extend AbstractIoOperation so that GreedyScheduler can pause/resume

merge operations if needed.

Change-Id: I38fe394d1180d4e3f6796064c0e6c6630b6ad303

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3284

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Michael Blow <mblow@apache.org>

    • -2
    • +5
    ./storage/am/lsm/btree/impls/ExternalBTree.java
    • -2
    • +5
    ./storage/am/lsm/btree/impls/LSMBTree.java
  1. … 62 more files in changeset.
[ASTERIXDB-2540][STO] Optimize Storage Disk I/O

- user model changes: yes. Add a new storage option:

storage.disk.force.bytes (default 16MB),

- storage format changes: no.

- interface changes: yes.

Introduced IPageWriteCallback to LSM indexes

Details:

- Bypass all queuing (from BufferCache and IOManager) for disk writes.

This queuing is unnecessary but destroys fairness among multiple

writers.

- Introduce IPageWriteCallback to control the behavior of disk page

writes. Currently, this interface is used to perform disk forces

regularly for each writer thread.

Change-Id: I1f618dc7c186623e860239b4d97640fe3528e75b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3285

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -5
    • +13
    ./storage/am/lsm/btree/impls/ExternalBTree.java
    • -11
    • +17
    ./storage/am/lsm/btree/impls/LSMBTree.java
    • -12
    • +16
    ./storage/am/lsm/btree/utils/LSMBTreeUtil.java
  1. … 133 more files in changeset.
[ASTERIXDB-2599][STO] Cleanup compression LAFs

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Cleanup compression LAFs after merge.

- Delete LAF files when merged components contain

previous components.

- Make sure the recovery after non-graceful shutdown

(in the middle of cleanup) deletes the component and its LAF.

Change-Id: I17adb6145f7bf77470fd82f04321faf7a4007bf7

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3498

Reviewed-by: Wail Alkowaileet <wael.y.k@gmail.com>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

  1. … 6 more files in changeset.
[ASTERIXDB-2310][STO]Enforce Key Uniquness using PKIndex

- user model changes: no

- storage format changes: yes. Primary key index

now has bloom filters.

- interface changes: no

Details:

- Add bloom filters to primary key index.

- Introduce LSMPrimaryInsertOperator to separate uniqueness check from

the primary index. When the primary key index is available, it will be

used for uniqueness check. This implementation of this operation is

similar to LSMPrimaryUpsertOperator.

Change-Id: I7a52bb75ee5b14521972999df2f45ba62adc5af1

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2453

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -4
    • +5
    ./storage/am/lsm/btree/impls/LSMBTree.java
    • -5
    • +5
    ./storage/am/lsm/btree/utils/LSMBTreeUtil.java
  1. … 51 more files in changeset.
[NO ISSUE][RT] Improve PreclusteredGroupWriter

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Modified PreclusteredGroupWriter to only save group fields

from a last tuple in a frame instead of the whole frame

- move PermutingFrameTupleReference and PermutingTupleReference

from 'hyracks-storage-am-common' to 'hyracks-dataflow-common'

Change-Id: Ic75de2e6b64d0aacaf48096ecc9d47fc8e95c9cf

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3351

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ali Alsuliman <ali.al.solaiman@gmail.com>

  1. … 35 more files in changeset.
[NO ISSUE] Apply / enforce java import order

The process-sources target will now sort imports as well as

format source code; the source-format job will likewise verify

import order in addition to source code format

Change-Id: I55d976c4df10d9919c6a25683be2a3e3304e65d9

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3288

Integration-Tests: Michael Blow <mblow@apache.org>

Tested-by: Michael Blow <mblow@apache.org>

Reviewed-by: Till Westmann <tillw@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 625 more files in changeset.
[NO ISSUE] Cleanup / refactor upgrade code

Change-Id: Ic81e87e70eecf49b71f9d96b1ac7c7180a314564

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3174

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

  1. … 5 more files in changeset.
[NO ISSUE][HYR] Binary compatibility enhancements

Infrastructure & changes to enable binary compatibility with 0.9.4

Change-Id: I77d4919be4853d9afe9b0137861cff3b1d751e20

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3128

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

  1. … 32 more files in changeset.
[ASTERIXDB-2422][STO] Introduce compressed storage

- user model changes: yes

- Add new configuration in the with-caluse to enable compression

- Add new nc configuration in the config file

- storage format changes: yes

- Pages of the primary index can be compressed

- Add a companion file (Look Aside File) with the compressed index

- Allow optional values in the LocalResource

- Add compression information in Metadata.Dataset

- interface changes: yes

- ICCApplicationContext:

- Add getCompressionManager()

- IBufferCache:

- Add getCompressedFileWriter(int fileId)

- ICachedPageInternal:

- Add setCompressedPageOffset(long offset)

- Add getCompressedPageOffset()

- Add setCompressedPageSize(int size)

- Add getCompressedPageSize()

Details:

- Add new integration test for this patch

- Fix ASTERIXDB-2464

- Add ddl-with-clause type validator

Additional details in the design document:

https://cwiki.apache.org/confluence/display/ASTERIXDB/Compression+in+AsterixDB

Change-Id: Idde6f37c810c30c7f1a5ee8bcbc1e3e5f4410031

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2857

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -5
    • +6
    ./storage/am/lsm/btree/utils/LSMBTreeUtil.java
  1. … 183 more files in changeset.
[ASTERIXDB-2444][STO] Avoid Using System Clock in Storage

- user model changes: no

- storage format changes: yes

- interface changes: yes

Details:

- Replace the usage of system clock timestamps in LSM

index components file names by a sequencer. The next

sequence id to use is determined by checking the list

of existing components on disk. Note that due to a

rollback, an index checkpoint file may have last valid

component sequence which is greater than what is on disk.

This should not cause any issues since only components

that have a sequence greater than that appears in the

checkpoint will be deleted.

- Replace the usage of system clock timestamps in LSM

index components ids by a monotonically increasing

sequencer. The sequencer is initialized after restarts

by the last valid component id that appears in the

index checkpoint.

- Refactor the logic to generate flush/merge file names.

- Refactor the logic to check invalid components.

- Adapt test cases to new naming format.

Change-Id: I9dff8ffb38ce8064a199d03b070ed1f5b924b8a4

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2927

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 22 more files in changeset.
[NO ISSUE][OTH] Remove Unused Imports

- user model changes: no

- storage format changes: no

- interface changes: no

Change-Id: Iafff39073d0fedaff74a26ef7e3260008a79ff0c

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2893

Reviewed-by: Michael Blow <mblow@apache.org>

Tested-by: Michael Blow <mblow@apache.org>

  1. … 67 more files in changeset.
[ASTERIXDB-2414][STO] Remove deleted component files from buffer cache

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- When activating an LSM index, we remove files of components

that were merged into a bigger component but not cleaned up yet.

- However, we sometimes leave a file reference mapped in the buffer

cache even when the file is removed from disk.

- This change ensures that all files are removed from the buffer

cache as well.

Change-Id: If0f11bc222662e4b50c1b47b1dfa6b30d1463b2e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2822

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 6 more files in changeset.
[ASTERIXDB-2414][STO] Fix name of merge files

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- A bug is found where merge file names are created incorrectly

where start and end components are reversed.

- The bug is fixed and an explicit check for the invariance was

added.

Change-Id: I861765bc0f293bdfdf0285f97884d536204fdb1e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2820

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Reviewed-by: Wail Alkowaileet <wael.y.k@gmail.com>

Reviewed-by: Ian Maxon <imaxon@apache.org>

    • -2
    • +2
    ./storage/am/lsm/btree/impls/ExternalBTree.java
  1. … 4 more files in changeset.
[NO ISSUE] Report all BufferCache write failures.

- user model changes: no

- storage format changes: no

- interface changes: yes

+ IPageWriteFailureCallback: used to notify async

IO caller when something goes wrong.

Details:

- Before this change, it is possible for failures to

be lost and for bulkload operations to not be

aware of failure to write some pages. This can be

dangerous.

- To avoid this, when sending a page to be written

a PageWriteFailureCallback is associated with the

page to notify the caller that a failure took place.

Change-Id: I97fd3dccff85dab84d644359be6f66b15ee708ef

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2787

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -3
    • +22
    ./storage/am/lsm/btree/impls/ExternalBTree.java
  1. … 46 more files in changeset.
[ASTERIXDB-1130][STO]: JSON serialization for persisted objects

- user model changes: no

- storage format changes:

This change replaces the use of Java serialization for persisted objects

such as dataset/index metadata, checkpoints, ect...

This will break backward compatibly with any existing AsterixDB instance.

However, the change is needed to enable future backward compatibility support

for persisted objects.

- interface changes:

IJsonSerializable: contains API to serialize a class as a JsonNode.

IPersistedResourceRegistry: contains a mapping between an IJsonSerializable

class and a unique type id. An IPersistedResourceRegistry is responsible

for generating the class identifier in the JSON output.

The class identifier will always contain the following attributes:

@type: a unique type id that identifies the object type.

@version: the version of the serialized class.

@class: the serialized class full name.

Any registered class with PersistedResourceRegistry must provide

a static fromJson(IPersistedResourceRegistry, JsonNode) method for

deserialization. This is ensured during the class registration process.

Change-Id: I5b103e06eab6627dbfe9d531caae1a3ac4b296da

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2752

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 126 more files in changeset.
[NO ISSUE][STO] Misc Storage Fixes and Improvements

- user model changes: no

- storage format changes: no

- interface changes: yes

Details:

- This change introduces some improvements to storage

operations.

- Local RecoveryManager is now extensible.

- Bulk loaders now call the IO callback similar to

Flushes, making them less special and creating a

unified lifecycle for adding an index component.

- As a result, The IndexCheckpointManager doesn't need

to have a special treatment for components loaded

through the bulk load operation.

- Component Id have been added to the index checkpoint

files.

- Cleanup for the code of local recovery for failed flush

operations.

- Ensure that after local recovery of flushes, primary

and secondary indexes have the same index for mutable

memory component.

- The use of WAIT logs to ensure in-flight flushes

are scheduled didn't work as expected. A new log type

WAIT_FOR_FLUSHES was introduced to acheive the expected

behavior.

- The local test framework was made Extensible to support

more use cases.

- Test cases were added for component ids in checkpoint files.

The following scenarios were covered:

- Primary and secondary both have values when a flush is

shceduled.

- Primary have values but not secondary when a flush is

scheduled.

- Primary is empty and an index is created through bulk

load.

- Primary has a single component and secondary is created

through bulk load.

- Primary has multiple components and secondary is created

through bulk load.

- Each primary opTracker now keeps a list of ongoing flushes.

- FlushDataset now waits only for flushes only and

not all io operations.

- Previously, we had many flushes scheduled on open datasets.

This was not detected but after this change, a failure

is thrown in such cases.

- Flush operations dont need to extend the comparable

interface anymore since they are FIFO per index.

Change-Id: If24c9baaac2b79e7d1acf47fa2601767388ce988

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2632

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -35
    • +46
    ./storage/am/lsm/btree/impls/ExternalBTree.java
  1. … 87 more files in changeset.
[NO ISSUE][STO] Eliminate S Lock for Disk Components

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Eliminate S lock for tuples returned from disk components, since LSM

disk components only contain committed data and S lock is not needed to

prevent from reading uncommitted data.

Change-Id: Id6ec999b131cd6609d588966d7ae7788f429ab9d

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2637

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 3 more files in changeset.
[NO ISSUE][STO] Fix component switch in LSMBTreeRangeSearchCursor

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- switchRequest array is used for switching memory components

with disk components. It has the size = maximum number of

memory components which can be greater than all the number of

components size in certain cases (0 disk component,

1 memory component for example). To avoid index out of bound,

we end the loop early in this corner case.

Change-Id: If20cc671974485bf73ad05e95d681424805611d3

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2633

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

[NO ISSUE][STO] Add consistency to flush lifecycle

- user model changes: no

- storage format changes: yes

- renamed AbstractLSMIOOperationCallbackFactory

to LSMIOOperationCallbackFactory

- useless classes have been removed.

- LSMBTreeIOOperationCallbackFactory

- LSMBTreeWithBuddyIOOperationCallbackFactory

- LSMInvertedIndexIOOperationCallbackFactory

- LSMRTreeIOOperationCallbackFactory

- interface changes: yes

Details:

- Previously, flushes have different lifecycle depending

on the memory component state

- not allocated

- allocated

- modified

- In certain cases, flush operations are skipped alltogether

- IO Operation callbacks became complicated and difficult

to maintain since calls are done differently in different

cases.

- In certain cases, afterFinalize is called on the IO

Operation callbacks even if beforeOperation was never

called.

- In this change, flushes go through the same lifecycle

events regardless of the state of the memory component.

- In addition, primary and secondary memory components

would reside in different virtual buffer caches due

to skipped flushes, or due to having the secondary

index created when the primary index's memory component

is residing on the virtual buffer cache with index !=0.

- Moreover, when flushes are lagging and all memory

components are being flushed, search operations assumes

the oldest of the memory component is the newest and

produces incorrect results.

- In addition, in case of a failed flush of a component,

the IO scheduler would skip it and flush the next

component. This would produce a bad state on disk.

- In this change, a failed flush can be retried. otherwise,

all future flushes of the component fail due to the failure

of the previously failed flush.

- Previously, when a component fails to modify an index due

to flush failures, it assumes disk is full.

- With this change, the modification failure reports the

original cause of the failed flush.

Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2584

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

    • -16
    • +10
    ./storage/am/lsm/btree/impls/ExternalBTree.java
    • -10
    • +7
    ./storage/am/lsm/btree/impls/LSMBTree.java
    • -5
    • +0
    ./storage/am/lsm/btree/utils/LSMBTreeUtil.java
  1. … 154 more files in changeset.
[ASTERIXDB-2374][RT] Index-only plan on B+Tree disk components

- user model changes: no

- storage format changes: no

- interface changes: no

Details: Let the index-only plan properly work on the disk components of B+Tree.

Currently, only the records from in-memory components has been applied because

searchCallback.proceed() is only called for those.

Change-Id: I655eacc5517352a382d1b61f7b630f0f307b7c7b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2623

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Dmitry Lychagin <dmitry.lychagin@couchbase.com>

  1. … 7 more files in changeset.
[NO ISSUE][STO] Only re-use compatible accessors in LSM Cursors

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Currently, when a search operation starts, using a previously

used LSMCursor, and the number of components didn't change, we

re-use cursors and accessors.

- However, we didn't check the type of the underlying component

and since disk cursors/accessors are incompatible with memory

component trees, we end up getting ClassCastException.

- This change checks for compatibility. If the component/cursor

pair are incompatible, the cursor and accessor are destroyed

and new ones are created.

Change-Id: I0fe224af1049b53c8cc37448ee8bfa2ccd780564

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2608

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

[ASTERIXDB-2339] Add a new inverted index merge cursor

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Implement a new inverted index merge cursor which uses two priority queues,

one for tokens and one for keys. For each token, we merge their inverted

lists using the key queue. After that, we fetch the next token and merge

their lists again. This reduces unnecessary token comparision a lot.

- Along this change, created a fast path for inverted index bulkloader.

Based on how the token+key pair is created, there is no need to copy

bulkloaded tuple and check whether it's a new token during merge.

Change-Id: I57d039cd7e08033884529a204bff9acffd96d9bb

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2519

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

    • -2
    • +2
    ./storage/am/lsm/btree/impls/ExternalBTree.java
    • -3
    • +5
    ./storage/am/lsm/btree/impls/LSMBTree.java
  1. … 19 more files in changeset.
[ASTERIXDB-2321][STO] Follow the contract in IIndexCursor.open calls

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- The index cursor contract says that an open call which returns

successfully, leaves the cursor in the open state, otherwise,

the cursor remains in the closed state.

- The LSM cursors have many cursors inside. In the

case where one of the cursors fails to open, and an exception

is about to be thrown, we must close all previously open cursors

since the LSM cursor will be in the closed state

and close will not be called.

Change-Id: I19db2afd2d6ca4a2ca1056cd95ae504b2be69813

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2501

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 7 more files in changeset.
[ASTERIXDB-1952][TX][IDX] Filter logs pt.2

- user model changes: no

- storage format changes: yes

- interface changes: yes

Details:

- Add a log type specifically for filters

- Only log change when filter actually widens

- Stop logging of index + filter tuple during modification

- Redo index and filter tuples separately via their logs

Change-Id: Ie9e7795d9c8c212e8610dcb9bb5d26ec9fbbee8a

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1857

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

    • -5
    • +9
    ./storage/am/lsm/btree/impls/LSMBTree.java
  1. … 44 more files in changeset.
[NO ISSUE][STO] Improve the LSMIOOperationCallback interface.

- user model changes: no

- storage format changes: no

- interface changes: yes

+ ILSMIndexOperationContext.getIoOperationType()

+ ILSMIndexOperationContext.getNewComponent()

* before, after, and finalize

calls of ILSMIOOperationCallback now take

ILSMIndexOperationContext as a parameter

Details:

- Before, some calls to ILSMIOOperationCallback

take just an enum LSMIOOperationType, some of them

take an enum and a component object. These sometimes don't

provide enough information to different implementations of

the callback that might be interested in more than that.

- Having the operation context object passed allow for

better exchange of information between different callers

and callees throughout the IO operation.

Change-Id: Ib7120c40a1a2256ed528dfd2e5853db9dba247c6

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2455

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -3
    • +3
    ./storage/am/lsm/btree/impls/ExternalBTree.java
  1. … 17 more files in changeset.
[ASTERIXDB-2083][COMP][RT][IDX][SITE] Budget-Constrained Inverted index search

- user-model changes: add text.searchmemory parameter

- storage format changes: no

- interface changes: IInvertedIndexSearcher, IInPlaceInvertedIndex,

IInvertedIndexAccessor, IInvertedListCursor

IObjectFactory, IPartitionedInvertedIndex,

IIndexAccessor

Details:

- Introduce text.searchmemory parameter in the configuration

to conduct budget-constrained inverted index search to prevent

a possible OOM exception

- Remove non-standard hyracks task context from the inverted-index-search

Change-Id: Ib2b2ef7c0b8c55ef66a5322be5d97ebbbf287bf5

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2251

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -7
    • +4
    ./storage/am/lsm/btree/impls/LSMBTree.java
  1. … 98 more files in changeset.
[NO ISSUE][RT] follow IFrameWriter protocol in SplitOperatorDescriptor

- user model changes: no

- storage format changes: no

- interface changes: no

details:

- Previously, the SplitOperatorDescriptor didn't follow the

IFrameWriter protocol in case of failure which lead to having

some open resources after the job.

- This caused so many failures in Cancellation tests.

- This change also increases the rate of cancellation during the

cancellation tests to ensure that similar problems are found.

Change-Id: I3166895589e1ab7355d689397f676f7da5c9809f

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2399

Reviewed-by: Taewoo Kim <wangsaeu@gmail.com>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 11 more files in changeset.
[ASTERIXDB-1972][COMP][RT][TX] index-only plan

- user model changes: no

- storage format changes: no

- interface changes: IAccessMethod, ILSMIndexOperationContext,

IIndexAccessor

Details:

- Implement an index-only plan

- Add a SET option that disables the index-only plan

Change-Id: Ifd5c9ab1cf2e4bedb7d8db582441919875e74d51

Reviewed-on: https://asterix-gerrit.ics.uci.edu/1866

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Taewoo Kim <wangsaeu@gmail.com>

  1. … 421 more files in changeset.
[ASTERIXDB-2204][STO] Fix implementations and usages of IIndexCursor

- user model changes: no

- storage format changes: no

- interface changes: yes

- IIndexCursor.close() is now idempotent and can be called on

a closed cursor.

- IIndexCursor.destroy() is now idempotent and can be called

on a destroyed cursor.

- Add IIndexAccessor.destroy() letting the accessor know it is

safe to destroy its reusable cursors and operation contexts.

- Add IIndexOperationContext.destroy() letting the context

know that the user is done with it and allow it to release

resources

details:

- Previously, implementations of the IIndexCursor interface

didn't enforce the interface contract. This change enforces

the contract for all the implementations.

- With the enforcement of the contract, all the users of the

cursors are expected to follow and enforce the expected lifecycle.

- Test cases were added.

Change-Id: I98a7a8b931eb24dbe11bf2bdc61b754ca28ebdf9

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2324

Reviewed-by: Michael Blow <mblow@apache.org>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -71
    • +104
    ./storage/am/lsm/btree/impls/LSMBTree.java
  1. … 119 more files in changeset.