Integrated bloom filters with LSM-BTree during flushes, merges, and bulkload. All tests pass except the merge test, apparently because of a bug in the cleanup after merges when no search threads are accessing the disk components. Next steps: use bloom filters during search, and with the other LSM indexes.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree_bloom_filter@2706 123451ca-8445-de46-9d55-352943316053

    • -6
    • +7
    ./impls/LSMInvertedIndexFileManager.java
  1. … 29 more files in changeset.
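As a rough illustration of the idea in the commit above, here is a minimal Bloom filter sketch of the kind that could be populated while a disk component is written (flush, merge, or bulkload) and later consulted to skip components that cannot contain a key. The class name `SimpleBloomFilter`, the `long` key type, and the hash mixing are assumptions made for this example; this is not the Hyracks implementation.

```java
import java.util.BitSet;

// Hypothetical, simplified Bloom filter; names and hashing are illustrative only.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // 64-bit finalizer-style mixing function used as the base hash.
    private static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    // Derives the i-th probe position from two base hashes (double hashing).
    private int position(long key, int i) {
        long h1 = mix(key);
        long h2 = mix(key ^ 0x9E3779B97F4A7C15L) | 1L;
        return (int) Math.floorMod(h1 + i * h2, (long) numBits);
    }

    // Called for every key written into the disk component.
    void add(long key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    // false means the key is definitely absent; true means it may be present.
    boolean mightContain(long key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A search would call `mightContain` first and probe the component's B-tree only when it returns true. False positives are possible, false negatives are not.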
added option to conditionally flush an LSM index when it is being deactivated; added missing file from previous commit

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2702 123451ca-8445-de46-9d55-352943316053

  1. … 6 more files in changeset.
added proper IO Opcallback for when LSM indexes are deactivated

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2701 123451ca-8445-de46-9d55-352943316053

    • -5
    • +8
    ./dataflow/LSMInvertedIndexDataflowHelper.java
    • -3
    • +5
    ./dataflow/LSMInvertedIndexDataflowHelperFactory.java
    • -5
    • +8
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelper.java
    • -3
    • +6
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelperFactory.java
    • -2
    • +4
    ./impls/PartitionedLSMInvertedIndex.java
  1. … 59 more files in changeset.
Fixed a performance bug that might have caused significant slowdown for inverted-index construction.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_experiments@2690 123451ca-8445-de46-9d55-352943316053

    • -4
    • +3
    ./dataflow/BinaryTokenizerOperatorNodePushable.java
major reworking of all lsm indexes with respect to synchronization and interfacing with the lsmharness

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2681 123451ca-8445-de46-9d55-352943316053

    • -178
    • +188
    ./impls/LSMInvertedIndex.java
    • -23
    • +12
    ./impls/LSMInvertedIndexAccessor.java
    • -56
    • +0
    ./impls/LSMInvertedIndexComponent.java
    • -3
    • +2
    ./impls/LSMInvertedIndexComponentFactory.java
    • -1
    • +8
    ./impls/LSMInvertedIndexFlushOperation.java
    • -0
    • +34
    ./impls/LSMInvertedIndexImmutableComponent.java
    • -1
    • +1
    ./impls/LSMInvertedIndexMergeOperation.java
    • -0
    • +55
    ./impls/LSMInvertedIndexMutableComponent.java
    • -4
    • +17
    ./impls/LSMInvertedIndexOpContext.java
    • -1
    • +1
    ./impls/LSMInvertedIndexSearchCursor.java
  1. … 57 more files in changeset.
Hopefully fixed a performance bug; still need to test on the cluster.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_experiments@2675 123451ca-8445-de46-9d55-352943316053

    • -3
    • +5
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -0
    • +7
    ./ondisk/FixedSizeElementInvertedListCursor.java
    • -2
    • +5
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -4
    • +27
    ./search/PartitionedTOccurrenceSearcher.java
minor cleanup: fixed typo; swapped sync objects for sync on 'this'

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2612 123451ca-8445-de46-9d55-352943316053

    • -1
    • +1
    ./impls/LSMInvertedIndexFlushOperation.java
    • -4
    • +4
    ./impls/LSMInvertedIndexMergeOperation.java
  1. … 8 more files in changeset.
getWrite/ReadDevices returns Set instead of List; RTree IO operations now also return buddy btree devices; hid merge/flush behind internal interface

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2611 123451ca-8445-de46-9d55-352943316053

    • -13
    • +9
    ./impls/LSMInvertedIndexFlushOperation.java
    • -14
    • +11
    ./impls/LSMInvertedIndexMergeOperation.java
  1. … 11 more files in changeset.
merged the creation and scheduling of flush and merge IO operations into a single call

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2610 123451ca-8445-de46-9d55-352943316053

  1. … 17 more files in changeset.
removed flush controller; ILSMIndex replaces the functionality

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2609 123451ca-8445-de46-9d55-352943316053

    • -10
    • +8
    ./dataflow/LSMInvertedIndexDataflowHelper.java
    • -7
    • +4
    ./dataflow/LSMInvertedIndexDataflowHelperFactory.java
    • -12
    • +9
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelper.java
    • -7
    • +4
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelperFactory.java
    • -4
    • +3
    ./impls/PartitionedLSMInvertedIndex.java
  1. … 64 more files in changeset.
Changed operator memory settings. Made some minor perf improvements to inverted index searches.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_experiments@2601 123451ca-8445-de46-9d55-352943316053

    • -2
    • +15
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -1
    • +9
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -10
    • +21
    ./search/PartitionedTOccurrenceSearcher.java
  1. … 2 more files in changeset.
Allowing LSM file managers to be initialized with a starting IO device index. Using that mechanism, the first disk component of the i-th partition of an Asterix LSM index is written on the i-th IO device (further components are assigned to IO devices in a round robin fashion).

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2585 123451ca-8445-de46-9d55-352943316053

    • -1
    • +1
    ./dataflow/LSMInvertedIndexDataflowHelper.java
    • -1
    • +1
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelper.java
    • -2
    • +2
    ./impls/LSMInvertedIndexFileManager.java
  1. … 11 more files in changeset.
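The device assignment described in the commit above can be sketched as follows: a chooser is initialized with a starting IO device index (for partition i, the index i), hands out that device for the first disk component, and then cycles through the remaining devices. `RoundRobinDeviceChooser` and its method names are hypothetical and only illustrate the policy; they are not the LSM file manager API.

```java
// Hypothetical helper illustrating the round-robin policy; not the real file manager.
final class RoundRobinDeviceChooser {
    private final int numDevices;
    private int next;

    RoundRobinDeviceChooser(int numDevices, int startDeviceIndex) {
        if (numDevices <= 0) {
            throw new IllegalArgumentException("numDevices must be positive");
        }
        this.numDevices = numDevices;
        // Wrap the starting index so partition i maps to device (i mod numDevices).
        this.next = Math.floorMod(startDeviceIndex, numDevices);
    }

    // Returns the device index for the next disk component, then advances.
    synchronized int nextDevice() {
        int device = next;
        next = (next + 1) % numDevices;
        return device;
    }
}
```

With three devices and a starting index of 2, successive components land on devices 2, 0, 1, 2, and so on.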
* The cleanup of merged components in the LSM indexes is now the responsibility of either the last remaining search thread (when the merge has finished but search threads are still accessing the merged components) or the merge thread itself (when the merge has finished and no search threads are accessing the merged components).
* Multiple merges can now run concurrently on an LSM index; the old design allowed only one merge process at a time per index.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2573 123451ca-8445-de46-9d55-352943316053

    • -33
    • +13
    ./impls/LSMInvertedIndexComponent.java
    • -8
    • +8
    ./impls/LSMInvertedIndexRangeSearchCursor.java
    • -8
    • +9
    ./impls/LSMInvertedIndexRangeSearchCursorInitialState.java
    • -4
    • +5
    ./impls/LSMInvertedIndexSearchCursor.java
    • -8
    • +9
    ./impls/LSMInvertedIndexSearchCursorInitialState.java
  1. … 25 more files in changeset.
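One way to picture the cleanup rule from the commit above is a small guard that counts active searchers and records when the merge has finished; whichever party observes "merge done and no searchers left" runs the cleanup, exactly once. The class `MergedComponentGuard` and its methods are assumptions made for this sketch and do not reflect the actual Hyracks classes.

```java
// Hypothetical sketch of "last one out cleans up"; not the actual LSM harness code.
final class MergedComponentGuard {
    private final Runnable cleanup;
    private int activeSearchers = 0;
    private boolean mergeDone = false;
    private boolean cleaned = false;

    MergedComponentGuard(Runnable cleanup) {
        this.cleanup = cleanup;
    }

    // A search thread registers before reading the old components.
    // Returns false once the merge is done, so new searches use the merged result.
    synchronized boolean searchEnter() {
        if (mergeDone) {
            return false;
        }
        activeSearchers++;
        return true;
    }

    // The last search thread to leave after the merge performs the cleanup.
    synchronized void searchExit() {
        activeSearchers--;
        cleanupIfUnused();
    }

    // The merge thread cleans up itself if no search thread is active.
    synchronized void mergeFinished() {
        mergeDone = true;
        cleanupIfUnused();
    }

    private void cleanupIfUnused() {
        if (mergeDone && activeSearchers == 0 && !cleaned) {
            cleaned = true;
            cleanup.run();
        }
    }
}
```

Running the cleanup while holding the lock keeps the sketch short; a real implementation would likely release the lock before deleting files.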
Merged hyracks_asterix_stabilization r2462:r2562.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2563 123451ca-8445-de46-9d55-352943316053

    • -4
    • +4
    ./dataflow/BinaryTokenizerOperatorNodePushable.java
    • -1
    • +1
    ./search/AbstractTOccurrenceSearcher.java
    • -9
    • +13
    ./tokenizers/AbstractUTF8StringBinaryTokenizer.java
    • -12
    • +17
    ./tokenizers/AbstractUTF8Token.java
    • -9
    • +13
    ./tokenizers/AbstractUTF8TokenFactory.java
    • -9
    • +13
    ./tokenizers/DelimitedUTF8StringBinaryTokenizer.java
    • -24
    • +30
    ./tokenizers/DelimitedUTF8StringBinaryTokenizerFactory.java
    • -13
    • +17
    ./tokenizers/HashedUTF8NGramToken.java
    • -9
    • +13
    ./tokenizers/HashedUTF8NGramTokenFactory.java
    • -13
    • +17
    ./tokenizers/HashedUTF8WordToken.java
    • -10
    • +13
    ./tokenizers/HashedUTF8WordTokenFactory.java
    • -9
    • +13
    ./tokenizers/IBinaryTokenizerFactory.java
  1. … 20 more files in changeset.
Refactored the LSM indexes to use a common abstract class. Added a new ILSMComponent interface to represent the LSM components instead of passing Objects around. Removed the component finalizer classes and cleaned up the file manager API. Fixed a couple of bugs. Cleaned up the code and renamed many methods.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2552 123451ca-8445-de46-9d55-352943316053

    • -213
    • +141
    ./impls/LSMInvertedIndex.java
    • -0
    • +76
    ./impls/LSMInvertedIndexComponent.java
    • -0
    • +48
    ./impls/LSMInvertedIndexComponentFactory.java
    • -61
    • +0
    ./impls/LSMInvertedIndexComponentFinalizer.java
    • -42
    • +27
    ./impls/LSMInvertedIndexFileManager.java
    • -5
    • +6
    ./impls/LSMInvertedIndexMergeOperation.java
  1. … 43 more files in changeset.
Minor amendment to my multicomparator changes.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2538 123451ca-8445-de46-9d55-352943316053

    • -2
    • +2
    ./inmemory/InMemoryInvertedIndexOpContext.java
    • -1
    • +1
    ./ondisk/OnDiskInvertedIndexOpContext.java
  1. … 6 more files in changeset.
Finished implementing performance-optimized MultiComparators.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_multicomparator_opt@2506 123451ca-8445-de46-9d55-352943316053

  1. … 5 more files in changeset.
Added search modifiers that were moved from Asterix.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2496 123451ca-8445-de46-9d55-352943316053

    • -0
    • +55
    ./search/ListEditDistanceSearchModifier.java
    • -0
    • +35
    ./search/ListEditDistanceSearchModifierFactory.java
Small beauty fixes.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_tree@2493 123451ca-8445-de46-9d55-352943316053

    • -2
    • +2
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -2
    • +2
    ./search/PartitionedTOccurrenceSearcher.java
Minor fix.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2488 123451ca-8445-de46-9d55-352943316053

Implemented dataflow components for length-partitioned inverted indexes. Added integration test.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2487 123451ca-8445-de46-9d55-352943316053

    • -5
    • +10
    ./dataflow/BinaryTokenizerOperatorDescriptor.java
    • -28
    • +41
    ./dataflow/BinaryTokenizerOperatorNodePushable.java
    • -0
    • +79
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelper.java
    • -0
    • +45
    ./dataflow/PartitionedLSMInvertedIndexDataflowHelperFactory.java
  1. … 5 more files in changeset.
Changed partitioning field in length-partitioned inverted indexes from integer to short.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2486 123451ca-8445-de46-9d55-352943316053

    • -5
    • +4
    ./api/IInvertedIndexSearchModifier.java
    • -14
    • +14
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -3
    • +3
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -2
    • +2
    ./search/ConjunctiveSearchModifier.java
    • -6
    • +6
    ./search/EditDistanceSearchModifier.java
    • -9
    • +9
    ./search/PartitionedTOccurrenceSearcher.java
    • -4
    • +4
    ./util/PartitionedInvertedIndexTokenizingTupleIterator.java
  1. … 3 more files in changeset.
Changed the search algorithm for in-memory length-partitioned inverted indexes to only latch one inverted list at a time.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2485 123451ca-8445-de46-9d55-352943316053

    • -12
    • +14
    ./inmemory/InMemoryInvertedListCursor.java
    • -10
    • +0
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -7
    • +0
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -1
    • +0
    ./search/PartitionedTOccurrenceSearcher.java
  1. … 1 more file in changeset.
Implemented length-partitioned LSM inverted index. Still some cleanup needed.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2467 123451ca-8445-de46-9d55-352943316053

    • -3
    • +4
    ./impls/LSMInvertedIndexOpContext.java
    • -0
    • +56
    ./impls/PartitionedLSMInvertedIndex.java
    • -4
    • +0
    ./inmemory/InMemoryInvertedListCursor.java
    • -11
    • +16
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -0
    • +48
    ./ondisk/PartitionedOnDiskInvertedIndexFactory.java
  1. … 9 more files in changeset.
Implemented in-memory component for length-partitioned inverted indexes.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2466 123451ca-8445-de46-9d55-352943316053

    • -0
    • +31
    ./api/IPartitionedInvertedIndex.java
    • -4
    • +8
    ./inmemory/InMemoryInvertedIndexAccessor.java
    • -6
    • +9
    ./inmemory/InMemoryInvertedIndexOpContext.java
    • -9
    • +16
    ./inmemory/InMemoryInvertedListCursor.java
    • -0
    • +136
    ./inmemory/PartitionedInMemoryInvertedIndex.java
    • -0
    • +31
    ./inmemory/PartitionedInMemoryInvertedIndexAccessor.java
    • -0
    • +36
    ./inmemory/PartitionedInMemoryInvertedIndexOpContext.java
    • -6
    • +31
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -18
    • +10
    ./search/InvertedListPartitions.java
    • -53
    • +63
    ./search/PartitionedTOccurrenceSearcher.java
    • -68
    • +0
    ./util/InvertedIndexTokenizingNumTokensTupleIterator.java
  1. … 7 more files in changeset.
Refactored code for better sharing. Added new test for on-disk component of length-partitioned inverted index.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2465 123451ca-8445-de46-9d55-352943316053

    • -85
    • +5
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -0
    • +154
    ./search/AbstractTOccurrenceSearcher.java
    • -0
    • +102
    ./search/InvertedListPartitions.java
    • -129
    • +9
    ./search/PartitionedTOccurrenceSearcher.java
    • -133
    • +9
    ./search/TOccurrenceSearcher.java
  1. … 1 more file in changeset.
Implemented bulk loading and basic search for the on-disk components of length-partitioned inverted indexes.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2462 123451ca-8445-de46-9d55-352943316053

    • -1
    • +1
    ./api/IInvertedIndexSearchModifier.java
    • -4
    • +26
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -1
    • +1
    ./search/ConjunctiveSearchModifier.java
    • -2
    • +2
    ./search/EditDistanceSearchModifier.java
    • -0
    • +330
    ./search/InvertedListMerger.java
    • -385
    • +37
    ./search/PartitionedTOccurrenceSearcher.java
    • -298
    • +21
    ./search/TOccurrenceSearcher.java
  1. … 2 more files in changeset.
Checkpointing progress on implementing a length-partitioned inverted index.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2455 123451ca-8445-de46-9d55-352943316053

    • -0
    • +20
    ./api/IObjectFactory.java
    • -35
    • +41
    ./ondisk/OnDiskInvertedIndex.java
    • -0
    • +5
    ./ondisk/OnDiskInvertedIndexOpContext.java
    • -0
    • +145
    ./ondisk/PartitionedOnDiskInvertedIndex.java
    • -0
    • +27
    ./search/ArrayListFactory.java
    • -0
    • +34
    ./search/InvertedListCursorFactory.java
    • -0
    • +607
    ./search/PartitionedTOccurrenceSearcher.java
    • -0
    • +139
    ./search/SearchResult.java
    • -140
    • +68
    ./search/TOccurrenceSearcher.java
    • -95
    • +0
    ./search/TOccurrenceSearcherSuffixProbeOnly.java
    • -134
    • +0
    ./search/TOccurrenceSearcherSuffixScanOnly.java
    • -0
    • +51
    ./util/ObjectCache.java
  1. … 3 more files in changeset.
First steps in preparing the inverted-index testing framework to deal with length partitioning.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2454 123451ca-8445-de46-9d55-352943316053

    • -0
    • +4
    ./api/IInvertedIndexSearchModifier.java
    • -0
    • +10
    ./search/ConjunctiveSearchModifier.java
    • -0
    • +10
    ./search/EditDistanceSearchModifier.java
    • -0
    • +10
    ./search/JaccardSearchModifier.java
    • -0
    • +68
    ./util/InvertedIndexTokenizingNumTokensTupleIterator.java
    • -6
    • +6
    ./util/InvertedIndexTokenizingTupleIterator.java
  1. … 2 more files in changeset.
Some generalizations to support length filtering.

git-svn-id: https://hyracks.googlecode.com/svn/branches/hyracks_lsm_length_filter@2453 123451ca-8445-de46-9d55-352943316053

    • -24
    • +40
    ./ondisk/OnDiskInvertedIndex.java