[NO ISSUE][STO] Persist Bloom Filter Existence in Index Metadata

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- To clearly identify whether an index has a bloom filter or

not for BTree indexes, persist this information in the index's

metadata stored on each NC.

- For backward compatibility, when reading an index's metadata that

was created before adding the hasBloomFilter field, default its

value based on whether or not the index is a primary key index.

- Remove unused special readObject from LSMBTreeLocalResource.

Change-Id: Icec570d490987de401c036790ee9567238a60301

Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/3804

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -3
    • +3
    ./TestLsmBtreeLocalResourceFactory.java
  1. … 19 more files in changeset.
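The backward-compatibility rule in this change can be sketched as follows. This is a minimal illustration, not the actual LSMBTreeLocalResource code: the class name, the Map used as a stand-in for the JSON metadata, and the direction of the default (legacy primary key indexes are assumed to have a bloom filter) are all assumptions. Only the field name hasBloomFilter comes from the commit message.

```java
import java.util.Map;

// Hypothetical helper; the Map stands in for the index metadata JSON.
final class IndexMetadataSketch {
    static final String HAS_BLOOM_FILTER = "hasBloomFilter";

    // Returns the persisted value when present; otherwise falls back to a
    // default derived from the index kind (assumed: primary key index -> true).
    static boolean readHasBloomFilter(Map<String, Object> json, boolean isPrimaryKeyIndex) {
        Object value = json.get(HAS_BLOOM_FILTER);
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        // Metadata written before the field existed.
        return isPrimaryKeyIndex;
    }
}
```

A persisted value always wins over the default, so metadata written after this change is never reinterpreted.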
[ASTERIXDB-2540][STO] Optimize Storage Disk I/O

- user model changes: yes. Add a new storage option:

storage.disk.force.bytes (default 16MB).

- storage format changes: no.

- interface changes: yes.

Introduced IPageWriteCallback to LSM indexes

Details:

- Bypass all queuing (from BufferCache and IOManager) for disk writes.

This queuing is unnecessary and destroys fairness among multiple

writers.

- Introduce IPageWriteCallback to control the behavior of disk page

writes. Currently, this interface is used to perform disk forces

regularly for each writer thread.

Change-Id: I1f618dc7c186623e860239b4d97640fe3528e75b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/3285

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Michael Blow <mblow@apache.org>

    • -5
    • +8
    ./TestLsmBtreeLocalResourceFactory.java
  1. … 138 more files in changeset.
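The idea of forcing the disk regularly per writer can be sketched as a byte-counting callback. This is only an illustration under stated assumptions: the class PeriodicForceSketch and its afterPageWrite method are hypothetical and are not the IPageWriteCallback signature; the force action is passed in as a Runnable, where a real writer would typically call something like FileChannel.force.

```java
// Hypothetical per-writer callback: counts bytes written since the last
// force and triggers a force once the configured threshold is reached.
final class PeriodicForceSketch {
    private final long forceBytes;   // e.g. the value of storage.disk.force.bytes
    private final Runnable force;    // stand-in for the actual disk force
    private long unforcedBytes;

    PeriodicForceSketch(long forceBytes, Runnable force) {
        if (forceBytes <= 0) {
            throw new IllegalArgumentException("forceBytes must be positive");
        }
        this.forceBytes = forceBytes;
        this.force = force;
    }

    // Called by the owning writer after each page write.
    void afterPageWrite(int pageSizeBytes) {
        unforcedBytes += pageSizeBytes;
        if (unforcedBytes >= forceBytes) {
            force.run();
            unforcedBytes = 0;
        }
    }
}
```

Each writer thread would own its own instance, so the counter needs no synchronization in this sketch.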
[ASTERIXDB-2310][STO] Enforce Key Uniqueness using PKIndex

- user model changes: no

- storage format changes: yes. Primary key index

now has bloom filters.

- interface changes: no

Details:

- Add bloom filters to primary key index.

- Introduce LSMPrimaryInsertOperator to separate uniqueness check from

the primary index. When the primary key index is available, it will be

used for uniqueness check. The implementation of this operator is

similar to LSMPrimaryUpsertOperator.

Change-Id: I7a52bb75ee5b14521972999df2f45ba62adc5af1

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2453

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

  1. … 51 more files in changeset.
[ASTERIXDB-2422][STO] Introduce compressed storage

- user model changes: yes

- Add new configuration in the with-clause to enable compression

- Add new nc configuration in the config file

- storage format changes: yes

- Pages of the primary index can be compressed

- Add a companion file (Look Aside File) with the compressed index

- Allow optional values in the LocalResource

- Add compression information in Metadata.Dataset

- interface changes: yes

- ICCApplicationContext:

- Add getCompressionManager()

- IBufferCache:

- Add getCompressedFileWriter(int fileId)

- ICachedPageInternal:

- Add setCompressedPageOffset(long offset)

- Add getCompressedPageOffset()

- Add setCompressedPageSize(int size)

- Add getCompressedPageSize()

Details:

- Add new integration test for this patch

- Fix ASTERIXDB-2464

- Add ddl-with-clause type validator

Additional details in the design document:

https://cwiki.apache.org/confluence/display/ASTERIXDB/Compression+in+AsterixDB

Change-Id: Idde6f37c810c30c7f1a5ee8bcbc1e3e5f4410031

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2857

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

    • -1
    • +2
    ./TestLsmBtreeLocalResourceFactory.java
  1. … 190 more files in changeset.
[ASTERIXDB-1130][STO]: JSON serialization for persisted objects

- user model changes: no

- storage format changes:

This change replaces the use of Java serialization for persisted objects

such as dataset/index metadata, checkpoints, etc.

This will break backward compatibility with any existing AsterixDB instance.

However, the change is needed to enable future backward compatibility support

for persisted objects.

- interface changes:

IJsonSerializable: contains API to serialize a class as a JsonNode.

IPersistedResourceRegistry: contains a mapping between an IJsonSerializable

class and a unique type id. An IPersistedResourceRegistry is responsible

for generating the class identifier in the JSON output.

The class identifier will always contain the following attributes:

@type: a unique type id that identifies the object type.

@version: the version of the serialized class.

@class: the serialized class full name.

Any registered class with PersistedResourceRegistry must provide

a static fromJson(IPersistedResourceRegistry, JsonNode) method for

deserialization. This is ensured during the class registration process.

Change-Id: I5b103e06eab6627dbfe9d531caae1a3ac4b296da

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2752

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 128 more files in changeset.
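The registry contract described in this change (a class identifier carrying @type, @version, and @class, plus a static fromJson checked at registration time) can be sketched as below. This is not the AsterixDB implementation: RegistrySketch and CheckpointSketch are hypothetical names, a Map stands in for JsonNode so the example has no external dependencies, and the reflection-based check is one assumed way to enforce the fromJson requirement.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry mapping serializable classes to unique type ids.
final class RegistrySketch {
    private final Map<String, Class<?>> byTypeId = new HashMap<>();
    private final Map<Class<?>, String> byClass = new HashMap<>();

    // Rejects classes that lack a static fromJson(RegistrySketch, Map).
    void register(String typeId, Class<?> clazz) {
        try {
            Method m = clazz.getDeclaredMethod("fromJson", RegistrySketch.class, Map.class);
            if (!Modifier.isStatic(m.getModifiers())) {
                throw new IllegalArgumentException(clazz.getName() + ".fromJson must be static");
            }
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(clazz.getName() + " lacks fromJson", e);
        }
        byTypeId.put(typeId, clazz);
        byClass.put(clazz, typeId);
    }

    // Builds the class identifier that prefixes every serialized object.
    Map<String, Object> classIdentifier(Class<?> clazz, long version) {
        String typeId = byClass.get(clazz);
        if (typeId == null) {
            throw new IllegalStateException("unregistered class " + clazz.getName());
        }
        Map<String, Object> json = new LinkedHashMap<>();
        json.put("@type", typeId);
        json.put("@version", version);
        json.put("@class", clazz.getName());
        return json;
    }

    // Looks up the class by @type and delegates to its static fromJson.
    Object deserialize(Map<String, Object> json) {
        Object typeId = json.get("@type");
        Class<?> clazz = byTypeId.get(typeId);
        if (clazz == null) {
            throw new IllegalArgumentException("unknown @type: " + typeId);
        }
        try {
            Method m = clazz.getDeclaredMethod("fromJson", RegistrySketch.class, Map.class);
            return m.invoke(null, this, json);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical persisted object with a single field.
final class CheckpointSketch {
    final long lsn;

    CheckpointSketch(long lsn) {
        this.lsn = lsn;
    }

    Map<String, Object> toJson(RegistrySketch registry) {
        Map<String, Object> json = registry.classIdentifier(CheckpointSketch.class, 1L);
        json.put("lsn", lsn);
        return json;
    }

    static CheckpointSketch fromJson(RegistrySketch registry, Map<String, Object> json) {
        return new CheckpointSketch((Long) json.get("lsn"));
    }
}
```

Dispatching on @type rather than on the class name is what would let a class be renamed later without invalidating already persisted objects.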
[NO ISSUE][STO] Add consistency to flush lifecycle

- user model changes: no

- storage format changes: yes

- renamed AbstractLSMIOOperationCallbackFactory

to LSMIOOperationCallbackFactory

- useless classes have been removed.

- LSMBTreeIOOperationCallbackFactory

- LSMBTreeWithBuddyIOOperationCallbackFactory

- LSMInvertedIndexIOOperationCallbackFactory

- LSMRTreeIOOperationCallbackFactory

- interface changes: yes

Details:

- Previously, flushes had different lifecycles depending

on the memory component state

- not allocated

- allocated

- modified

- In certain cases, flush operations are skipped altogether

- IO Operation callbacks became complicated and difficult

to maintain since calls are done differently in different

cases.

- In certain cases, afterFinalize is called on the IO

Operation callbacks even if beforeOperation was never

called.

- In this change, flushes go through the same lifecycle

events regardless of the state of the memory component.

- In addition, primary and secondary memory components

would reside in different virtual buffer caches due

to skipped flushes, or due to having the secondary

index created when the primary index's memory component

is residing on the virtual buffer cache with index !=0.

- Moreover, when flushes are lagging and all memory

components are being flushed, search operations assume

the oldest memory component is the newest and

produce incorrect results.

- In addition, in case of a failed flush of a component,

the IO scheduler would skip it and flush the next

component. This would produce a bad state on disk.

- In this change, a failed flush can be retried. Otherwise,

all future flushes of the component fail due to the failure

of the previously failed flush.

- Previously, when a component fails to modify an index due

to flush failures, it assumes disk is full.

- With this change, the modification failure reports the

original cause of the failed flush.

Change-Id: I29f7992ec6c0f71c5b63d45800b2fb590d651e4b

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2584

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Tested-by: Murtadha Hubail <mhubail@apache.org>

    • -0
    • +101
    ./CountingIoOperationCallback.java
    • -0
    • +48
    ./CountingIoOperationCallbackFactory.java
    • -0
    • +45
    ./NoOpTestCallback.java
  1. … 155 more files in changeset.
[ASTERIXDB-2188] Ensure recovery of component ids

- user model changes: no

- storage format changes: yes.

Flush log record format changes.

- interface changes: no

Details:

- Add flush component ids to the flush log record. Upon

seeing a flush log record during recovery, schedule

a flush to all indexes in this partition s.t. LSN>maxDiskLSN

to ensure component ids are properly maintained upon

failed flushes.

- Add a test case to ensure the correctness of the recovery logic

of component ids

Change-Id: I8c1fc2b209cfb9d3dafa216771d2b7032eb99e75

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2408

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

    • -0
    • +41
    ./AllowTestOpCallback.java
  1. … 16 more files in changeset.
[ASTERIXDB-2204][STO] Fix implementations and usages of IIndexCursor

- user model changes: no

- storage format changes: no

- interface changes: yes

- IIndexCursor.close() is now idempotent and can be called on

a closed cursor.

- IIndexCursor.destroy() is now idempotent and can be called

on a destroyed cursor.

- Add IIndexAccessor.destroy() letting the accessor know it is

safe to destroy its reusable cursors and operation contexts.

- Add IIndexOperationContext.destroy() letting the context

know that the user is done with it and allow it to release

resources

details:

- Previously, implementations of the IIndexCursor interface

didn't enforce the interface contract. This change enforces

the contract for all the implementations.

- With the enforcement of the contract, all the users of the

cursors are expected to follow and enforce the expected lifecycle.

- Test cases were added.

Change-Id: I98a7a8b931eb24dbe11bf2bdc61b754ca28ebdf9

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2324

Reviewed-by: Michael Blow <mblow@apache.org>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 132 more files in changeset.
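The idempotent close/destroy contract from this change can be illustrated with a small state machine. This is a sketch, not the actual cursor base class: the class name and counters are hypothetical, and two behaviors are assumptions not stated in the commit message, namely that destroying an open cursor is an error and that closing a destroyed cursor is a no-op.

```java
// Hypothetical cursor lifecycle: CLOSED -> OPENED -> CLOSED -> DESTROYED.
final class CursorLifecycleSketch {
    enum State { CLOSED, OPENED, DESTROYED }

    private State state = State.CLOSED;
    private int closeCount;    // how many times resources were actually released
    private int destroyCount;  // how many times the cursor was actually destroyed

    void open() {
        if (state != State.CLOSED) {
            throw new IllegalStateException("cannot open a cursor in state " + state);
        }
        state = State.OPENED;
    }

    // Idempotent: calling close() on a closed cursor does nothing.
    void close() {
        if (state != State.OPENED) {
            return;
        }
        closeCount++; // stand-in for releasing pages and latches
        state = State.CLOSED;
    }

    // Idempotent: calling destroy() on a destroyed cursor does nothing.
    void destroy() {
        if (state == State.DESTROYED) {
            return;
        }
        if (state == State.OPENED) {
            throw new IllegalStateException("close the cursor before destroying it");
        }
        destroyCount++; // stand-in for releasing reusable buffers
        state = State.DESTROYED;
    }

    int closeCount() {
        return closeCount;
    }

    int destroyCount() {
        return destroyCount;
    }
}
```

With this shape, callers can put close() and destroy() in finally blocks without tracking whether an earlier code path already called them.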
[ASTERIXDB-2231][STO] Separate primary op tracker for each partition

- user model changes: no

- storage format changes: no.

- interface changes: yes.

Details:

- Separate primary index operation tracker for each partition, instead

of having a global one on each NC to achieve better scalability.

- As a coordinated change, separate component id generator for each

partition as well.

- Add partition to transaction context so that transaction operations

can operate on proper op tracker.

- Fixes [ASTERIXDB-2232] to calculate dataset partitions correctly.

Change-Id: I9eb3854d2343e45beeccb87b0d434e5f4efd69c9

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2263

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 62 more files in changeset.
[NO ISSUE][STO] Recover from failure in memory allocation callback

- user model changes: no

- storage format changes: no

- interface changes: no

details:

- Previously, if an exception is thrown in the

ILSMIOOperationCallback.allocated call, then the memory component

is allocated but the flag memoryComponentsAllocated is false.

- Any subsequent attempt to modify the index will try to allocate

the component but since it has already been allocated, it will fail

with the exception: File is already mapped.

- In this change, if an exception is thrown from the callback, then

the component is de-allocated before throwing the exception.

- A test case is added.

Change-Id: I80e605461df18c7f6d7785cd7504ca3acb4f45b1

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2336

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 11 more files in changeset.
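The failure mode and the fix described in this change can be sketched as follows. The class MemoryComponentSketch is hypothetical, the boolean flag stands in for the real allocation of the component, and the callback is modeled as a Runnable rather than ILSMIOOperationCallback.allocated.

```java
// Hypothetical memory component whose allocation is undone when the
// "allocated" callback throws, so that a later attempt can succeed.
final class MemoryComponentSketch {
    private boolean allocated;

    boolean isAllocated() {
        return allocated;
    }

    void allocate(Runnable allocatedCallback) {
        if (allocated) {
            // Mirrors the error seen before the fix on a second attempt.
            throw new IllegalStateException("File is already mapped");
        }
        allocated = true; // stand-in for the actual allocation
        try {
            allocatedCallback.run();
        } catch (RuntimeException e) {
            allocated = false; // de-allocate before propagating the failure
            throw e;
        }
    }
}
```

Without the catch block, the first failed callback would leave the component allocated and every later allocate call would fail with the "already mapped" error.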
[ASTERIXDB-2184] Add Immutable DiskBTree

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- Add an immutable DiskBTree for LSM disk components. This DiskBTree

only supports two operations, i.e., search and bulkload. No concurrency

control is performed at all, since it's immutable.

- Change LSMBTree/InvertedIndex to use this DiskBTree

- Add a DiskBTree point search cursor to optimize point lookups

Change-Id: I8f2a9281478c4b8665589dc695769d0497af9961

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2193

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 21 more files in changeset.
[ASTERIXDB-2161][TEST] Add indexes to MultiPartitionTest

- user model changes: no

- storage format changes: no

- interface changes: no

Details:

- This change adds secondary indexes to multi partition

LSM indexes tests.

- This enables testing of specific concurrency scenarios

and ensuring properties stored in primary and secondary

indexes are consistent.

- In addition, the call for flushDataset in

DatasetLifecycleManager now throws an

IllegalStateException if the number of active

operations is not 0. Some tests used to call this

function when there are ongoing operations and that

is expected to never be the case in the actual system.

Change-Id: I5aea71a87f149b01f6c7310867fc15b5a340b93c

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2173

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -0
    • +23
    ./IVirtualBufferCacheCallback.java
    • -0
    • +215
    ./TestVirtualBufferCache.java
  1. … 14 more files in changeset.
[ASTERIXDB-2169][STO][TX] Unblock modifications during full scan

- user model changes: no

- storage format changes: no

- interface changes: yes

- added ILSMHarness.replaceMemoryComponentsWithDiskComponents

details:

- During a long running query aka full scan, two things block

incoming modifications:

1) Memory component gets full, is flushed but can't be recycled

because of the search operation inside the component.

2) Read latches on the memory component not being released and

the memory component search cursor is not advancing.

The two cases are addressed in this change for the LSMBTree but

not yet addressed for other indexes.

The proposed solution for case (1) is to poll memory component

states every n records during the search operation. If a memory

component was found to have been flushed, its cursor is moved

to the corresponding disk component allowing the memory

component to be recycled.

The proposed solution for case (2) is to check memory component

cursor every n records. If the cursor has not advanced and the

component has writers, then the latches over the leaf page are

released, and the cursor redoes the operation, entering from the

tree root.

- Added a test case.

- Added performance traces for enter and exit components.

Change-Id: I37ba52f6324ed1c5a78465c3a8cbcd351f1ed5bc

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2166

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Luo Chen <cluo8@uci.edu>

  1. … 51 more files in changeset.
[ASTERIXDB-2115] Add Component Ids to LSM Indexes

- user model changes: no

- storage format changes: no

- interface changes: yes

Details:

- Add LSMComponentId to all LSM components. Component Ids are managed

through IO operation callbacks.

- For memory component, its ID is reset every time it's recycled.

- For disk component, its ID is copied from the source component(s)

during flush/merge

- For indexes of a dataset, we need to guarantee all their memory

components receive the same ID. This is achieved using a shared

component Id generator.

- Fix memory component recycled callback, make sure it's called only

when we've indeed recycled the memory component

A design wiki for this patch: https://cwiki.apache.org/confluence/display/

ASTERIXDB/Component+Id-based+secondary-to-primary+index+acceleration

Change-Id: I8aec6261a84a0729ce35f4b1cb708be299ddb98d

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2025

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 62 more files in changeset.
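The ID rules listed in this change can be sketched with a shared generator. All names here are hypothetical, and representing an ID as a [minId, maxId] range (so that a merged disk component covers the IDs of its sources) is an assumption of this sketch; the commit message only says the disk component's ID is copied from the source component(s).

```java
import java.util.List;

// Hypothetical component ID modeled as an inclusive range.
final class ComponentIdSketch {
    final long minId;
    final long maxId;

    ComponentIdSketch(long minId, long maxId) {
        this.minId = minId;
        this.maxId = maxId;
    }

    // ID of a disk component produced from the given source components.
    static ComponentIdSketch fromSources(List<ComponentIdSketch> sources) {
        if (sources.isEmpty()) {
            throw new IllegalArgumentException("at least one source component is required");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (ComponentIdSketch id : sources) {
            min = Math.min(min, id.minId);
            max = Math.max(max, id.maxId);
        }
        return new ComponentIdSketch(min, max);
    }
}

// Hypothetical generator shared by all indexes of a dataset, so their
// current memory components observe the same ID.
final class SharedIdGeneratorSketch {
    private long nextId;
    private ComponentIdSketch current = new ComponentIdSketch(0, 0);

    synchronized ComponentIdSketch currentId() {
        return current;
    }

    // Called when memory components are recycled and need a fresh ID.
    synchronized void refresh() {
        nextId++;
        current = new ComponentIdSketch(nextId, nextId);
    }
}
```

Because the primary and secondary indexes read from the same generator instance, they cannot drift apart as long as refresh is called once per recycle round.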
[NO ISSUE][STO] Add a callback on recycling of memory components

- user model changes: no

- storage format changes: no

- interface change: yes

- ILSMIOOperationCallbackFactory.createIoOpCallback now takes

the ILSMIndex as a parameter.

- Remove ILSMIOOperationCallback.setNumOfMutableComponents

The callback can find out the number of mutable components

on instantiation since the lsm index is now passed.

- ILSMIOOperationCallback.allocated was added.

It gets called whenever a memory component is allocated.

- ILSMIOOperationCallback.recycled was added.

It gets called whenever a memory component is recycled.

- ILSMIndex.hasMemoryComponent is replaced with

ILSMIndex.getNumberOfMemoryComponents

Change-Id: I578ffd7ef17784034c94f3c0d23cd5094e39f6e0

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2126

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

  1. … 103 more files in changeset.
[NO ISSUE][STO][IDX] LSM storage cleanup

- user model changes: no

- storage format changes: no

- interface changes: yes

- Replaced component lifecycle-related factory methods in AbstractLSMIndex

with direct method calls of lsmComponent functions.

- Extracted common lifecycle-related functionality from index-specific

disk/memory lsmComponents to interfaces.

- Introduced composable disk component bulkloader design which assembles the

proper bulkload pipeline from individual elements populating lsmFilters,

bloomFilters, buddyBTrees\deletedKeysBTrees, bTrees\rTrees\invIndexes.

- Changed methods to return index-specific versions of objects (accessors,

components, index instances) to avoid nasty downcasting.

Change-Id: I6739d751b990e7a28e03e32a5de6e2b670d37a1e

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2014

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>

  1. … 92 more files in changeset.
Revert "[ASTERIXDB-2103][STO] Too many disk components for CorrelatedPolicy"

This reverts commit 21ed0f72681a20ccb6a654f9aa4d54b8d0ea9c5c.

Change-Id: I670545acd09c678f21be25313353ab306be86202

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2063

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Ian Maxon <imaxon@apache.org>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 54 more files in changeset.
[ASTERIXDB-2103][STO] Too many disk components for CorrelatedPolicy

- user model changes: no

- storage format changes: no

- interface changes: yes

Details:

Currently CorrelatedMergePolicy uses component Ids to ensure disk

components of primary and secondary indexes are merged together,

but without synchronization. However, this results in too many disk

components for secondary InvertedIndex. The reason is that secondary

index could miss some round of merges, if the merge policy finds out

the corresponding secondary components are not available (either being

merged or being flushed). Even though flow-control on secondary indexes

can guarantee the secondary index would catch up the next time, it is

still possible that the primary component is finalized, in which case

the secondary components that missed this round of merges are never merged

again.

This patch fixes this bug by:

- Add the mechanism of depending operations to LSM IO operation. An

operation finishes only after all depending operations have finished.

- For correlated merge policy, the flush/merge of the primary index depends

on all flushes/merges of secondary indexes. This ensures when the

correlated policy schedules merge, all related components of all indexes

are available to merge.

Change-Id: Ib6c06ee23f3bfd16b758802388389c00e29780b1

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2018

Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Jianfeng Jia <jianfeng.jia@gmail.com>

  1. … 54 more files in changeset.
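The "depending operations" mechanism described in this patch can be sketched with futures: an operation counts as finished only once its own work and all operations it depends on have finished. This is a swapped-in illustration using java.util.concurrent.CompletableFuture, not the LSM IO operation code; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper: combines an operation's own completion with the
// completion of all operations it depends on.
final class DependingOpsSketch {
    static CompletableFuture<Void> completion(CompletableFuture<Void> own,
            List<CompletableFuture<Void>> dependencies) {
        List<CompletableFuture<?>> all = new ArrayList<>(dependencies);
        all.add(own);
        // Completes only after every future in the list has completed.
        return CompletableFuture.allOf(all.toArray(new CompletableFuture<?>[0]));
    }
}
```

For the correlated policy, `own` would be the primary index's flush or merge and `dependencies` the corresponding operations of the secondary indexes, so the primary is only reported as finished once every secondary component is available to merge.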
Introduce ITracer

Change-Id: I1d41d9cf74f481ba26882cf2ca318d0d2b9607f7

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2050

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

  1. … 23 more files in changeset.
[NO ISSUE][STO] Component Deletes Through flushes and merges

- user model changes: no

- storage format changes: no

- interface changes: yes

- moved validation of component from the index:

- ILSMIndex and all of its implementations

to the component:

- ILSMDiskComponent and all of its implementations

details:

- This change enables component level deletes.

Change-Id: I178656207bfa1d15e6ae5ff2403a16df33940773

Reviewed-on: https://asterix-gerrit.ics.uci.edu/2017

Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

Reviewed-by: Murtadha Hubail <mhubail@apache.org>

Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>

    • -0
    • +25
    ./ITestOpCallback.java
    • -0
    • +260
    ./TestLsmBtree.java
    • -0
    • +70
    ./TestLsmBtreeLocalResource.java
    • -0
    • +61
    ./TestLsmBtreeLocalResourceFactory.java
    • -0
    • +51
    ./TestLsmBtreeSearchCursor.java
    • -0
    • +105
    ./TestLsmBtreeUtil.java
  1. … 73 more files in changeset.