fsfs: Use the `WITHOUT ROWID` optimization for rep-cache.db in format 8.

This optimization, introduced in SQLite 3.8.2, works well for tables that

have non-integer primary keys, such as

hash TEXT NOT NULL PRIMARY KEY

in the rep-cache.db. (See the https://sqlite.org/withoutrowid.html article

for additional details.)

A quick experiment showed a reduction of the on-disk size of the database

by ~1.75x. The lookups should also be faster, both due to the reduced

database size and due to the smaller number of internal binary searches. This

should improve the times of new commits and `svnadmin load`, especially

for large repositories that also have large rep-cache.db files.

In order to maintain compatibility, since SQLite versions prior to 3.8.2

do not support this statement, we only start using it for fsfs format 8

repositories and simultaneously bump the minimal required SQLite version

from 3.7.12 (May 2012) to 3.8.2 (December 2013). The last step ensures that

all binaries compiled to support format 8 can work with the tables with

this optimization. Also, as the various scripts have both the minimal

and recommended (3.7.15.1) SQLite versions, we bump the recommended

version to the last 3.8.x patch version, which is 3.8.11.1.
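
The effect of the clause can be tried outside Subversion. The following is a minimal Python sketch, not the actual rep-cache.db schema: the table names and the reduced column set are illustrative assumptions, while `WITHOUT ROWID` and the 3.8.2 floor come from the log message above. It also shows the kind of version probe a test script could use before touching such a table.

```python
import sqlite3

# WITHOUT ROWID needs SQLite 3.8.2+, the same floor this change introduces.
MIN_SQLITE = (3, 8, 2)

# Illustrative schemas only; the column set merely approximates rep-cache.db.
SCHEMA_V1 = """
CREATE TABLE rep_cache_v1 (
  hash TEXT NOT NULL PRIMARY KEY,
  revision INTEGER NOT NULL
)"""
SCHEMA_V2 = """
CREATE TABLE rep_cache_v2 (
  hash TEXT NOT NULL PRIMARY KEY,
  revision INTEGER NOT NULL
) WITHOUT ROWID"""


def supports_without_rowid():
    """True if the SQLite library linked into Python is new enough."""
    return sqlite3.sqlite_version_info >= MIN_SQLITE


def has_rowid(conn, table):
    """Probe whether TABLE still carries the implicit rowid column."""
    try:
        conn.execute("SELECT rowid FROM %s LIMIT 1" % table)
        return True
    except sqlite3.OperationalError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA_V1)
if supports_without_rowid():
    conn.execute(SCHEMA_V2)
    # The same INSERT works against both schemas.
    for table in ("rep_cache_v1", "rep_cache_v2"):
        conn.execute("INSERT INTO %s VALUES (?, ?)" % table, ("da39a3ee", 1))
    print(has_rowid(conn, "rep_cache_v1"), has_rowid(conn, "rep_cache_v2"))
```

The V2 table stores its rows directly in the primary-key index, which is why the implicit rowid column disappears.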

* subversion/libsvn_fs_fs/rep-cache-db.sql

(STMT_CREATE_SCHEMA): Rename this ...

(STMT_CREATE_SCHEMA_V1): ...to this.

(STMT_CREATE_SCHEMA_V2): New, enables `WITHOUT ROWID` optimization.

(STMT_GET_REP, STMT_SET_REP, STMT_GET_REPS_FOR_RANGE,

STMT_GET_MAX_REV, STMT_DEL_REPS_YOUNGER_THAN_REV,

STMT_LOCK_REP, STMT_UNLOCK_REP):

Note that these statements work for both V1 and V2 schemas.

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__MIN_REP_CACHE_SCHEMA_V2_FORMAT): New.

* subversion/libsvn_fs_fs/rep-cache.c

(REP_CACHE_SCHEMA_FORMAT): Remove.

(open_rep_cache): Select between creating a V1 or a V2 schema based

on the format of the filesystem.

* subversion/libsvn_subr/sqlite.c

(): Bump minimum required SQLite version to 3.8.2.

* subversion/tests/cmdline/svnadmin_tests.py

(check_hotcopy_fsfs_fsx): Check if Python's built-in SQLite version

is new enough to interpret the schema of rep-cache.db, and skip the check

if it's not.

* build/generator/gen_win_dependencies.py

(_find_sqlite): Bump minimum required SQLite version to 3.8.2.

* configure.ac

(SQLITE_MINIMUM_VER): Bump to 3.8.2.

(SQLITE_RECOMMENDED_VER): Bump to 3.8.11.1.

(SQLITE_RECOMMENDED_VER_REL_YEAR): New, required to construct the

download URL which includes the release year for the newer SQLite

amalgamation versions.

(SQLITE_URL): Update the download URL.

* get-deps.sh

(SQLITE_VERSION): Bump to 3.8.11.1.

(SQLITE_VERSION_REL_YEAR): New.

(get_sqlite): Update the download URL that includes the release year

for the newer SQLite amalgamation versions.

* INSTALL

(C.12.SQLite): Bump minimum required SQLite version to 3.8.2.

(E.1.Prerequisites): Bump the minimum and recommended SQLite versions.

  1. … 8 more files in changeset.
fsfs: Lay the groundwork for an extended fix for issues #4623 and #4700.

As per r1813898, we now store both the SHA1 and the uniquifier in the

on-disk property representation strings. The SHA1 value is not required,

but has to be stored due to an existing dependency in the serializer

where the resulting strings can either have both the SHA1 value *and*

the uniquifier, or have neither.

Untie this dependency by introducing a new notation ("-") for such

optional values, which would be supported by the new filesystem format 8.

This would allow us to skip writing SHA1, and only store the uniquifier

in the representation strings for the new formats.

See https://lists.apache.org/thread.html/d282f27c1260c620fe5deb7c9976f4c05bfb34d5156dee1fa6dad644@%3Cdev.subversion.apache.org%3E
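
The idea of a placeholder for optional trailing values can be sketched as follows. This is a simplified Python model, not the code in low_level.c: the function names are hypothetical, only the two optional trailing fields are modelled, and emitting nothing when both values are absent is an assumption.

```python
ABSENT = "-"  # placeholder for an omitted optional value (new format only)


def unparse_optional_tail(sha1, uniquifier, supports_placeholder):
    """Return the optional trailing fields of a representation string.

    SHA1 and UNIQUIFIER are strings, or None when absent.
    """
    if sha1 is None and uniquifier is None:
        return []  # assumption: nothing is written when both are absent
    if supports_placeholder:
        return [sha1 or ABSENT, uniquifier or ABSENT]
    # Older formats: both values must be present, or neither.
    if sha1 is None or uniquifier is None:
        raise ValueError("older formats need both SHA1 and uniquifier")
    return [sha1, uniquifier]


def parse_optional_tail(fields):
    """Inverse of unparse_optional_tail(); returns (sha1, uniquifier)."""
    if not fields:
        return None, None
    sha1, uniquifier = fields
    return (None if sha1 == ABSENT else sha1,
            None if uniquifier == ABSENT else uniquifier)


# Hypothetical uniquifier value, for illustration only.
print(" ".join(unparse_optional_tail(None, "1-0/_4", True)))
```

With the placeholder available, the uniquifier can be stored on its own; without it, the serializer has to carry a SHA1 value it does not otherwise need.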

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__MIN_REP_STRING_OPTIONAL_VALUES_FORMAT): New.

* subversion/libsvn_fs_fs/structure

(Revision file format): Describe the format extension.

* subversion/libsvn_fs_fs/low_level.c

(format_uniquifier): New helper function, factored out from ...

(svn_fs_fs__unparse_representation): ...here. Support the "-"

notation for the absent SHA1 and uniquifier values in the new format.

Tweak the code to handle the older formats one by one. Keep the code

for the newest format at the end to simplify extending it in the future.

(svn_fs_fs__parse_representation): Handle the new "-" notation when

parsing SHA1 and uniquifier values.

  1. … 2 more files in changeset.
* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__MIN_CONFIG_FILE): Fix the ordering of capability-related

definitions by moving this define into the appropriate place.

fsfs: Introduce new 'compression' config option.

This option allows explicitly specifying the compression algorithm for

format 8 repositories. It deprecates the previously used 'compression-level'

option. The syntax of the new option is:

compression = none | lz4 | zlib | zlib-1 ... zlib-9

See the related discussion in:

https://lists.apache.org/thread.html/40650a309d8ff041adbb62e8ffe19cc3990b9098a032db932fabd170@%3Cdev.subversion.apache.org%3E
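
A parser for the option syntax above might look like the following Python sketch. It is not the `parse_compression_option` helper from fs_fs.c: the return shape, the use of `None` for "no explicit level", and the svndiff version table are assumptions made for illustration.

```python
import re

# Assumed mapping from compression kind to the svndiff version that
# carries it (svndiff0: uncompressed, svndiff1: zlib, svndiff2: LZ4).
SVNDIFF_VERSION = {"none": 0, "zlib": 1, "lz4": 2}


def parse_compression_option(value):
    """Parse a 'compression' value into (kind, level).

    LEVEL is an int for 'zlib-N' and None otherwise, meaning "not
    applicable" for none/lz4 and "use the default" for plain zlib.
    """
    value = value.strip()
    if value in ("none", "lz4", "zlib"):
        return value, None
    match = re.fullmatch(r"zlib-([1-9])", value)
    if match:
        return "zlib", int(match.group(1))
    raise ValueError("invalid 'compression' value: %r" % value)


for text in ("none", "lz4", "zlib", "zlib-9"):
    kind, level = parse_compression_option(text)
    print(text, "->", kind, level, "svndiff%d" % SVNDIFF_VERSION[kind])
```

Keeping the algorithm and the level in one value avoids the ambiguity of the older 'compression-level' option, where a number alone had to imply the algorithm.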

* subversion/libsvn_fs_fs/fs.h

(CONFIG_OPTION_COMPRESSION): New.

(compression_type_t): New.

(fs_fs_data_t): Add field to store the delta compression type.

* subversion/libsvn_fs_fs/fs_fs.c

(write_config): Revamp the section describing delta compression.

(parse_compression_option): New helper function.

(read_config): Parse the new 'compression' option when working with

newer formats, with a possible fall back to 'compression-level' in

case it's specified explicitly. In order to always have appropriate

and usable compression settings in ffd, move the part of the code that

disables compression when only svndiff0 is supported from ...

* subversion/libsvn_fs_fs/transaction.c

(txdelta_to_svndiff): ...this function. Adjust this function to select

the appropriate svndiff version, depending on the options.

* win-tests.py

(): Rename 'fsfs_compression_level' to 'fsfs_compression'.

(_usage_exit): Adjust usage text.

* build/run_tests.py

(): Update usage comment.

(_init_py_tests): Pass the option as string, not int.

(create_parser): Parse the option as string, not int.

* subversion/tests/cmdline/svntest/main.py

(parse_options): Only allow using the --fsfs-compression option with

--server-minor-version >= 10.

(_create_parser): Parse the option as string, not int.

(run_one): Pass the option as string, not int.

(_post_create_repos): Update the code that adjusts fsfs.conf.

  1. … 5 more files in changeset.
fsfs: Add initial support for LZ4 compression.

This can significantly (up to 3 times) improve the speed of commits and

other operations with large binary or incompressible files, while still

maintaining a decent compression ratio.

Our current use of zlib compression — which, depending on the protocol,

can be used multiple times — heavily affects the speed of commits with

large binary or incompressible files. According to the Squash benchmark

(https://quixdb.github.io/squash-benchmark/) and to my measurements, the

zlib compression speed with the default level is about 30-40 MiB/s, and

it doesn't matter if the file is incompressible or not.

This patch provides an alternative in the form of the LZ4 compression.

While still providing a decent compression ratio, LZ4 offers much faster

compression even than zlib with level=1, and can skip incompressible data

chunks. Presumably, LZ4 is used for on-the-fly compression in different

file systems for these reasons.
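
The size side of this argument can be checked with the Python standard library alone. The sketch below does not measure speed and does not use LZ4 (which is not in the standard library); it only illustrates that zlib gains nothing on incompressible input, which it nevertheless processes in full. The sample data and sizes are arbitrary choices.

```python
import random
import zlib

SIZE = 64 * 1024

# Deterministic pseudo-random bytes stand in for an incompressible file.
rng = random.Random(0)
incompressible = bytes(rng.getrandbits(8) for _ in range(SIZE))

# Highly repetitive bytes stand in for a well-compressible file.
compressible = b"subversion " * (SIZE // 11)


def ratio(data, level=zlib.Z_DEFAULT_COMPRESSION):
    """Compressed size divided by original size at the given zlib level."""
    return len(zlib.compress(data, level)) / len(data)


for name, data in (("incompressible", incompressible),
                   ("compressible", compressible)):
    print(name, round(ratio(data), 3))
```

A compressor that detects and skips such chunks early avoids paying the full cost for no benefit, which is the property attributed to LZ4 above.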

With this patch, LZ4 compression will be enabled for fsfs repositories which

specify compression-level=1 in fsfs.conf. The interoperability is implemented

by bumping the format of svndiff to 2 and the repository file system format

to 8. From the client perspective, the patch starts using LZ4 compression

only for file:// protocol, and the support/negotiation of the use of svndiff2

with LZ4 compression for http:// and svn:// can be added later.

The tests for LZ4 compression can be run with one of the following commands:

win-tests.py --fsfs-compression=1

make check FSFS_COMPRESSION=1

* subversion/include/svn_delta.h

(svn_txdelta_to_svndiff3): Update docstring.

* subversion/include/svn_error_codes.h

(SVN_ERR_LZ4_COMPRESSION_FAILED,

SVN_ERR_LZ4_DECOMPRESSION_FAILED): New error codes.

* subversion/include/private/svn_subr_private.h

(svn__compress, svn__decompress): Rename to ...

(svn__compress_zlib, svn__decompress_zlib): ...this.

(svn__compress_lz4, svn__decompress_lz4): Declare new functions.

* subversion/libsvn_subr/compress.c

(): Include LZ4 library header.

(svn__compress, svn__decompress): Rename to ...

(svn__compress_zlib, svn__decompress_zlib): ...this.

(svn__compress_lz4, svn__decompress_lz4): Implement new functions.

* subversion/libsvn_subr/packed_data.c

(write_stream_data, read_stream_data): Update usages of svn__compress()

and svn__decompress().

* subversion/libsvn_delta/svndiff.c

(SVNDIFF_V2): New.

(get_svndiff_header): Update to support svndiff2 headers.

(encode_window, decode_window, write_handler): Support svndiff2 with

LZ4 compression. Tweak the relevant comments.

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__FORMAT_NUMBER): Bump to 8.

(SVN_FS_FS__MIN_SVNDIFF2_FORMAT): New define.

* subversion/libsvn_fs_fs/fs_fs.c

(write_config): Tweak the compression-level option description.

(svn_fs_fs__create, svn_fs_fs__info_format): Update to handle the

format bump.

* subversion/libsvn_fs_fs/transaction.c

(txdelta_to_svndiff): New helper to call svn_txdelta_to_svndiff3() with

appropriate svndiff version and compression level, depending on the

file system configuration.

(rep_write_get_baton, write_container_delta_rep): Use new helper.

* subversion/libsvn_fs_fs/revprops.c

(parse_packed_revprops, repack_revprops, svn_fs_fs__copy_revprops):

Update usages of svn__compress() and svn__decompress().

* subversion/libsvn_fs_fs/structure

(Filesystem formats): Update to describe usage of svndiff2.

* subversion/tests/libsvn_subr/compress-test.c: New.

* subversion/tests/libsvn_delta/random_test.c

(DEFAULT_ITERATIONS): Increase to 60.

(do_random_test, do_random_combine_test): Test different svndiff versions

and compression levels.

* build.conf

(libsvn_subr): Build LZ4 library sources.

(compress-test): Add new section.

* notes/svndiff: Describe svndiff2.

* NOTICE, LICENSE: Include license for LZ4.

  1. … 16 more files in changeset.
Make verify-before-commit a configurable option in FSFS.

The option causes FSFS to verify each new revision immediately before

finalizing the commit (bumping the 'current' revision number).

It is useful to be able to enable this after repository corruption has been

observed that is suspected to be bug-induced.

The option is disabled by default in release-mode builds, and enabled by

default in debug-mode builds. Previously, the verification was permanently

disabled in release-mode builds and enabled in debug-mode builds.
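
The default described above (off in release builds, on in debug builds, overridable from the configuration file) can be modelled in a few lines of Python. The option name comes from this log message; the `[debug]` section name and the helper function are assumptions made for this sketch and should be checked against the fsfs.conf template.

```python
import configparser


def verify_before_commit_enabled(conf_text, debug_build):
    """Return the effective setting for a given fsfs.conf-like text.

    The build type supplies the default; an explicit entry overrides it.
    The 'debug' section name is an assumption of this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean("debug", "verify-before-commit",
                             fallback=debug_build)


explicit = "[debug]\nverify-before-commit = true\n"
print(verify_before_commit_enabled("", debug_build=False))
print(verify_before_commit_enabled(explicit, debug_build=False))
```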

* subversion/libsvn_fs_fs/fs_fs.c

(read_config): Read the option.

(write_config): Write a template for the option.

* subversion/libsvn_fs_fs/fs.h

(CONFIG_OPTION_VERIFY_BEFORE_COMMIT): New.

(fs_fs_data_t): Store the option value.

* subversion/libsvn_fs_fs/transaction.c

(verify_as_revision_before_current_plus_plus): Rename to ...

(verify_before_commit): ... this and compile unconditionally,

not just in debug builds.

(commit_body): Call the function only if the option is enabled.

Patch by: julianfoad (with minor tweaks by me)

  1. … 2 more files in changeset.
Corrected spelling mistakes in comments.

* subversion/include/private/svn_utf_private.h

(svn_utf__glob): as above

* subversion/include/svn_fs.h

(svn_fs_refresh_revision_props): as above

* subversion/libsvn_fs_fs/fs.h

(): as above

* subversion/libsvn_fs_fs/pack.c

(tweak_path_for_ordering): as above

* subversion/libsvn_fs_fs/temp_serializer.h

(): as above

* subversion/libsvn_fs_x/pack.c

(): as above

* subversion/libsvn_subr/prefix_string.c

(): as above

* subversion/libsvn_wc/wc_db.h

(svn_wc__db_wclock_find_root): as above

* subversion/svn/conflict-callbacks.c

(find_option_by_id): as above

* tools/dev/fsfs-access-map.c

(): as above

Obvious fix.

  1. … 9 more files in changeset.
In FSFS, complete the chunked read support for changed paths lists.

This patch does two things. Instead of full changes lists, it caches only

blocks of up to 100 entries - possibly more than one per revision. Secondly,

it needs to be able to seek to specific blocks of changes if they should not

be found in the cache. So, we need to remember their offsets within the

on-disk representation.

This ports r1744987 and r1744991 from FSX to FSFS. Properties and

directories are now the only data structures left with unbounded memory

usage.
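
The block-wise scheme can be illustrated with a toy model in Python. This is not the FSFS implementation: the class and function names are hypothetical, the "disk" is an in-memory list, and offsets are list indexes rather than byte positions. What it does take from the text above is the block size of 100, the (revision, block) cache key, and the need to remember where the next block starts.

```python
BLOCK_SIZE = 100  # maximum number of entries per cached block


class ChangesReader:
    """Toy model: 'disk' maps a revision to its full list of changes."""

    def __init__(self, on_disk):
        self.on_disk = on_disk
        self.cache = {}      # (rev, block) -> (entries, next_offset, eol)
        self.disk_reads = 0

    def get_block(self, rev, block, offset):
        """Return one block, reading from 'disk' at OFFSET on a cache miss."""
        key = (rev, block)
        if key not in self.cache:
            self.disk_reads += 1
            entries = self.on_disk[rev][offset:offset + BLOCK_SIZE]
            next_offset = offset + len(entries)
            eol = next_offset >= len(self.on_disk[rev])
            self.cache[key] = (entries, next_offset, eol)
        return self.cache[key]


def iter_changes(reader, rev):
    """Yield all changes of REV block by block; memory use stays bounded."""
    block, offset, done = 0, 0, False
    while not done:
        entries, offset, done = reader.get_block(rev, block, offset)
        yield from entries
        block += 1


reader = ChangesReader({7: ["change-%d" % i for i in range(250)]})
print(sum(1 for _ in iter_changes(reader, 7)), reader.disk_reads)
```

Because each cached block records the offset of its successor, a reader that misses the cache for a later block can seek straight to it instead of re-parsing the list from the start.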

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__CHANGES_BLOCK_SIZE): New define.

(svn_fs_fs__data_t.changes_cache): Update docstring.

(svn_fs_fs__changes_context): Add the NEXT_OFFSET element, so we know where

to read from disk when we need to fall back

to that.

* subversion/libsvn_fs_fs/temp_serializer.h

(svn_fs_fs__changes_list_t): New data structure that decorates the plain

APR array of changes with per-block info such

as relative position on disk etc.

(svn_fs_fs__serialize_changes,

svn_fs_fs__deserialize_changes): Update docstrings as we expect the changes

list to be of the above type now.

* subversion/libsvn_fs_fs/temp_serializer.c

(changes_data_t): Superseded by svn_fs_fs__changes_list_t.

(svn_fs_fs__serialize_changes,

svn_fs_fs__deserialize_changes): Update. The new data type is similar

to the one used before internally.

* subversion/libsvn_fs_fs/caching.c

(svn_fs_fs__initialize_caches): Changed paths lists are now keyed by

rev,block pairs.

* subversion/libsvn_fs_fs/cached_data.c

(svn_fs_fs__get_changes): Fetch only the next block of list entries from

cache, or read the next block from disk and

cache it, respectively. Be sure to provide all

relative location info we may need to restart

the parser for this or the next block.

(block_read_changes): Provide the additional information now required per

entries block. Cache only lists that fit into a

single entries block. Update docstring.

  1. … 4 more files in changeset.
In FSFS, introduce the concept of a get_changes context.

Everything above svn_fs_fs__get_changes is now "final".

This replicates most of r1733804 and its follow-up r1733830.

* subversion/libsvn_fs_fs/fs.h

(svn_fs_fs__changes_context_t): Define new internal data type.

* subversion/libsvn_fs_fs/cached_data.h

(svn_fs_fs__create_changes_context): Declare new constructor.

(svn_fs_fs__get_changes): Take inputs from the CONTEXT now and use the

two-pool paradigm.

* subversion/libsvn_fs_fs/cached_data.c

(svn_fs_fs__create_changes_context): Implement.

(svn_fs_fs__get_changes): CONTEXT provides most of the parameters and

variables now. SCRATCH_POOL is provided by

the caller. Be sure to update the CONTEXT

according to the *CHANGES returned and make

sure to close the rev file in the last call.

* subversion/libsvn_fs_fs/stats.c

(get_phys_change_count): Update caller. We can now use a normal ITERPOOL

instead of a SUBPOOL.

* subversion/libsvn_fs_fs/transaction.c

(svn_fs_fs__paths_changed): Update the backward compat code.

* subversion/libsvn_fs_fs/tree.c

(fs_revision_changes_iterator_data_t,

fs_revision_changes_iterator_get,

fs_report_changes): Fetch the changes iteratively now - even though the

underlying layer will return them in one block, ATM.

  1. … 5 more files in changeset.
Introduce `--no-flush-to-disk' option for `svnadmin load'.

The option can be used to dramatically speed up the load process when

there's no need to ensure that the resulting data survives a system crash

or power loss — e.g., when loading a dump into a fresh new repository.

This is one of the ideas in http://svn.haxx.se/dev/archive-2015-09/0187.shtml

(Subject: "Whiteboard -- topics list on the white board").
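
The trade-off behind the option can be sketched in Python: write to a temporary file, optionally force it to stable storage, then rename it into place. The function below is a hypothetical stand-in loosely modelled on the `flush_to_disk` argument described in this log message; it is not a port of svn_fs_fs__move_into_place().

```python
import os
import tempfile


def move_into_place(data, final_path, flush_to_disk=True):
    """Write DATA next to FINAL_PATH, then atomically rename it into place.

    With flush_to_disk=False the fsync is skipped: faster, but the data
    may not survive a system crash or power loss.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if flush_to_disk:
                tmp.flush()
                os.fsync(tmp.fileno())
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "current")
move_into_place(b"1\n", target, flush_to_disk=False)
print(open(target, "rb").read())
```

Skipping the flush only makes sense when the whole operation can simply be repeated after a crash, as with loading a dump into a fresh repository.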

* subversion/include/svn_fs.h

(SVN_FS_CONFIG_NO_FLUSH_TO_DISK): New option.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Add `flush_to_disk' boolean field.

* subversion/libsvn_fs_fs/fs.c

(initialize_fs_struct): Initialize the new field.

* subversion/libsvn_fs_fs/fs_fs.c

(read_global_config): Set the new field based on what's in fs->config.

* subversion/libsvn_fs_fs/util.h

(svn_fs_fs__move_into_place): Accept a new `flush_to_disk' argument.

* subversion/libsvn_fs_fs/util.c

(svn_fs_fs__move_into_place): Make the flush optional based on the

new argument.

* subversion/libsvn_fs_fs/transaction.c

(get_and_increment_txn_key_body): Don't flush to disk if that's allowed.

(write_final_revprop): Accept a new `flush_to_disk' argument. Make the

flush optional based on the new argument.

(commit_body): Don't flush to disk if that's allowed. Adjust calls to

write_final_revprop() and svn_fs_fs__move_into_place().

* subversion/libsvn_fs_fs/revprops.c

(switch_to_new_revprop): Adjust the call to svn_fs_fs__move_into_place().

Keep the existing behavior and always flush to disk.

* subversion/svnadmin/svnadmin.c

(svnadmin__no_flush_to_disk): New enum value.

(options_table): Define --no-flush-to-disk option.

(cmd_table): Allow `load' to accept --no-flush-to-disk.

(svnadmin_opt_state): Add `no_flush_to_disk' member.

(open_repos): Move below the definition of svnadmin_opt_state. Accept

an svnadmin_opt_state structure as one of the arguments and initialize

the new SVN_FS_CONFIG_NO_FLUSH_TO_DISK option based on it.

(subcommand_crashtest, subcommand_deltify, subcommand_dump,

subcommand_dump_revprops, subcommand_load, subcommand_load_revprops,

subcommand_lstxns, subcommand_recover, subcommand_rmtxns, set_revprop,

subcommand_setuuid, subcommand_pack, subcommand_verify, subcommand_info,

subcommand_lock, subcommand_lslocks, subcommand_rmlocks,

subcommand_unlock): Adjust these callers of open_repos().

(main): Handle --no-flush-to-disk option.

* subversion/tests/cmdline/svnadmin_tests.py

(load_no_flush_to_disk): New test.

(test_list): Add reference to new test.

* tools/client-side/bash_completion

(_svnadmin): Add new option to `load'.

  1. … 10 more files in changeset.
* subversion/libsvn_fs_fs/fs.h

(struct node_revision_t): Followup to r1727028, update comment.

Do not read TXN props on every svn_fs_txn_open() in libsvn_fs_fs: FSFS doesn't

use transaction_t.proplist (and never has). This code seems to have been inherited

from BDB.

FWIW this is responsible for about 5% of I/O operations when running the test suite

over the http:// protocol on Windows, because mod_dav_svn opens a TXN for every

request against a transaction.

* subversion/libsvn_fs_fs/fs.h

(transaction_t): Remove PROPLIST member.

* subversion/libsvn_fs_fs/transaction.c

(svn_fs_fs__get_txn): Remove call to get_txn_proplist().

  1. … 1 more file in changeset.
Instead of a UUID, produce a unique 64 bit number as a key prefix to FSFS'

temporary revprop cache.

This introduces a svn_atomic__unique_counter(), which uses a thread-safe

64 bit counter implementation to produce unique values. Switching FSFS

from a UUID string to an integer key element is straightforward.

The rationale behind this change is that the UUID generation may be very

expensive on some systems.

Suggested by: rhuijben

* subversion/include/private/svn_atomic.h

(svn_atomic__unique_counter): Declare the new private API.

* subversion/libsvn_subr/atomic.c

(unique_counter,

counter_status,

counter_mutex): New static objects for the counter itself and its

access serialization support.

(init_unique_counter,

read_unique_counter,

svn_atomic__unique_counter): New functions implementing the new API.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Change the prefix type to a ui64.

* subversion/libsvn_fs_fs/fs.c

(initialize_fs_struct): Update initialization for that struct.

* subversion/libsvn_fs_fs/caching.c

(svn_fs_fs__initialize_caches): The cache key is now a pair_cache_key_t.

* subversion/libsvn_fs_fs/revprops.c

(svn_fs_fs__reset_revprop_cache): Update prefix reset code.

(prepare_revprop_cache): Call the new API to generate the prefix and

handle errors.

(cache_revprops,

svn_fs_fs__get_revision_proplist): Update cache key construction.

  1. … 5 more files in changeset.
Restore the 1.8 behavior of svn_fs_contents_changed() and _props_changed()

API. Switch all calling sites of the new API, svn_fs_contents_different()

and _props_different(), back to using the old functions.

There are no user-visible problems associated with the old code. The new

API doesn't improve any real use cases in the current code, but is causing

problems:

- We had a problem with misbehaving svn blame -g

(http://svn.haxx.se/dev/archive-2015-06/0069.shtml, "Blame behaviour

change in 1.9").

- We have an issue with repositories behaving differently in client-side

operations like 'svn log' after dump/load

(http://svn.haxx.se/dev/archive-2015-09/0269.shtml, "No-op changes no

longer dumped by 'svnadmin dump' in 1.9"; also see issue #4598 in

https://issues.apache.org/jira/browse/SVN-4598).

- We could experience the same problems originating from other callers of the

new API, because the low level behavior change associated with switching

to it propagates up to higher levels like svn_repos or svn_ra and alters

the behavior of many different callers like svn_ra_get_file_revs2() or

the update reporter. Third-party API callers might not be ready for it

either, because public API functions like svn_ra_get_file_revs2() didn't

receive an erratum or a bump.

See the discussion in http://svn.haxx.se/dev/archive-2015-10/0022.shtml

("Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9").

* subversion/libsvn_fs_base/dag.c

(svn_fs_base__things_different): Compare the uniquifiers, as we did in 1.8.

* subversion/libsvn_fs_base/fs.h

(node_revision_t.data_key_uniquifier): Remove the comment about not using

this field.

* subversion/libsvn_fs_fs/fs_fs.c

(svn_fs_fs__noderev_same_rep_key): Reintroduce this helper function.

(svn_fs_fs__file_text_rep_equal, svn_fs_fs__prop_rep_equal): Always

assume the strict mode in these helpers.

* subversion/libsvn_fs_fs/fs_fs.h

(svn_fs_fs__noderev_same_rep_key): Declare this re-added helper.

(svn_fs_fs__file_text_rep_equal, svn_fs_fs__prop_rep_equal): Update the

docstrings for these helper functions.

* subversion/libsvn_fs_fs/dag.c

(svn_fs_fs__dag_things_different): Preserve the current comparison behavior

in strict mode. Restore the 1.8 way of comparing the representation keys

in non-strict mode.

* subversion/libsvn_fs_fs/tree.c

(merge): Restore the 1.8 way of comparing property lists.

* subversion/libsvn_fs_fs/fs.h

(representation_t.uniquifier): Remove the comment about not using this

field.

* subversion/libsvn_repos/delta.c

(delta_proplists): Switch back to using svn_fs_props_changed().

(svn_repos__compare_files): Restore this function to its 1.8 state.

(delta_files): Restore the 1.8 way of comparing files.

* subversion/libsvn_repos/dump.c

(dump_node): Switch back to using svn_fs_contents_changed() and

svn_fs_props_changed().

* subversion/libsvn_repos/reporter.c

(delta_proplists): Switch back to using svn_fs_props_changed().

* subversion/libsvn_repos/rev_hunt.c

(send_path_revision): Switch back to using svn_fs_contents_changed().

Remove the no longer necessary hack for svn blame -g with older clients.

* subversion/include/svn_ra.h

(svn_ra_get_file_revs2): Update @note in the docstring.

* subversion/include/svn_repos.h

(svn_repos_get_file_revs2): Update @note in the docstring.

* subversion/tests/cmdline/svnadmin_tests.py

(dump_no_op_change): No longer fails with fsfs and bdb.

  1. … 13 more files in changeset.
In FSFS, add the revprop cache object and its invalidation logic.

Do not write to the cache, yet.

Note that in an attempt to minimize the CPU overhead when reading

whole revprop pack files, we don't fully parse them before caching

them. Instead, the cache expects the plain serialized hash in set()

and only parses it during get(). The overhead for the getter is very

small while the setter gets significantly faster.

Also note that svn_fs_fs__get_revision_proplist will be the only

place where we may ever read cached revprops. Therefore, functions

like svn_fs_fs__set_revision_proplist will never write (partially)

outdated revprops.
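
The asymmetric set()/get() design can be sketched in Python. The class below is a hypothetical model, not the FSFS cache: set() stores the serialized bytes untouched and get() does the parsing. The parser handles a simplified form of the `K <len>` / `V <len>` / `END` hash dump layout and performs little validation.

```python
def parse_hash_dump(data):
    """Parse 'K <len>\\n<key>\\nV <len>\\n<value>\\n ... END\\n' into a dict."""
    props = {}
    pos = 0

    def read_line():
        nonlocal pos
        end = data.index(b"\n", pos)
        line, pos = data[pos:end], end + 1
        return line

    def read_counted(header, tag):
        nonlocal pos
        kind, length = header.split()
        if kind != tag:
            raise ValueError("malformed hash dump")
        value = data[pos:pos + int(length)]
        pos += int(length) + 1  # also skip the newline after the payload
        return value

    while True:
        header = read_line()
        if header == b"END":
            return props
        key = read_counted(header, b"K")
        props[key] = read_counted(read_line(), b"V")


class RevpropCache:
    """Stores raw serialized revprops; parsing is deferred to get()."""

    def __init__(self):
        self._raw = {}

    def set(self, key, serialized):
        self._raw[key] = bytes(serialized)  # cheap: no parsing here

    def get(self, key):
        raw = self._raw.get(key)
        return None if raw is None else parse_hash_dump(raw)

    def reset(self):
        self._raw.clear()  # invalidate everything, e.g. after a write


cache = RevpropCache()
cache.set((1, 42), b"K 10\nsvn:author\nV 5\nalice\nEND\n")
print(cache.get((1, 42)))
```

When a whole pack file is cached at once, most entries may never be read back, so deferring the parse to the reader avoids work that would otherwise be wasted.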

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Add the revprop cache and its key prefix.

* subversion/libsvn_fs_fs/caching.c

(svn_fs_fs__initialize_caches): Construct the new cache as well.

* subversion/libsvn_fs_fs/fs.c

(fs_refresh_revprops): This is now "invalidate revprop cache".

(initialize_fs_struct): Initialize the new struct element.

* subversion/libsvn_fs_fs/temp_serializer.h

(svn_fs_fs__serialize_revprops,

svn_fs_fs__deserialize_revprops): Declare (de-)serialization functions

for the new cache.

* subversion/libsvn_fs_fs/temp_serializer.c

(svn_fs_fs__serialize_revprops,

svn_fs_fs__deserialize_revprops): Implement them asymmetrically.

* subversion/libsvn_fs_fs/revprops.h

(svn_fs_fs__reset_revprop_cache): Declare our new cache invalidation

function.

* subversion/libsvn_fs_fs/revprops.c

(svn_fs_fs__reset_revprop_cache): Implement.

(prepare_revprop_cache): New utility function.

(svn_fs_fs__get_revision_proplist): Actually evaluate the REFRESH option/

sync barrier now and return cached

data if available - which currently

is never the case.

(svn_fs_fs__set_revision_proplist): After we changed revprops, we *know*

our cache is invalid. IOW, write

implies a barrier.

* subversion/libsvn_fs_fs/fs_fs.c

(change_rev_prop_body): Document why we need the "refresh" option here.

  1. … 7 more files in changeset.
Remove code that has been unused since r1687061.

* subversion/libsvn_fs_fs/fs.h

(PATH_TXN_PROPS_FINAL): Remove.

* subversion/libsvn_fs_fs/transaction.c

(path_txn_props_final): Remove.

(set_txn_proplist): Remove FINAL argument and relevant code.

(svn_fs_fs__change_txn_props, svn_fs_fs__begin_txn): Adapt calls to

set_txn_proplist().

  1. … 1 more file in changeset.
Begin working on the fs-test 44 issue with FSFS:

Add in-txn file size info to cached directories. For that,

we simply wrap the entries array into a new struct at the cache

interface.

This patch only introduces the struct and updates the cache

access functions. The filesize value is neither being set nor

checked at this point.

* subversion/libsvn_fs_fs/fs.h

(svn_fs_fs__dir_data_t): New data structure.

* subversion/libsvn_fs_fs/cached_data.c

(get_dir_contents): Return the new struct instead of the plain

entries array.

(svn_fs_fs__rep_contents_dir): Update caller. Use the new struct

at the cache interface.

* subversion/libsvn_fs_fs/temp_serializer.c

(dir_data_t): Add TXN_FILESIZE element such that we can store

all parts of the new svn_fs_fs__dir_data_t.

(serialize_dir,

deserialize_dir): Expect and return the new struct instead of

a plain dir entries array.

(svn_fs_fs__serialize_dir_entries,

slowly_replace_dir_entry): Update caller.

* subversion/libsvn_fs_fs/temp_serializer.h

(svn_fs_fs__serialize_dir_entries,

svn_fs_fs__deserialize_dir_entries): Update type in docstrings.

  1. … 3 more files in changeset.
Docstring improvement. No functional change.

* subversion/libsvn_fs_fs/fs.h

(representation_t): Clarify the EXPANDED_SIZE==0 case.

Found by: julianfoad.

Follow-up to r1634880: Cleanup unused revision properties caching code.

* subversion/libsvn_fs_fs/caching.c

(read_config): Remove CACHE_REVPROPS output parameter - it was always

initialized to FALSE.

(svn_fs_fs__initialize_caches): Remove revprop cache initialization.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Remove REVPROP_CACHE member.

* subversion/libsvn_fs_fs/revprops.c

(has_revprop_cache): Remove unused function.

(switch_to_new_revprop): Remove unused argument BUMP_GENERATION.

(parse_revprop, svn_fs_fs__set_revision_proplist): Remove unused code.

  1. … 2 more files in changeset.
Make FSFSv7 repositories always use a consistent addressing mode, instead of

saving the revision number from which logical addressing was enabled.

From the performance point of view, there is little benefit in enabling

log addressing for an existing repository, because the existing old part

of the repository will continue to be addressed physically. So those who

want to benefit from the performance improvements related to the log

addressing feature will be required to dump/load their repositories anyway.

On the other hand, consistent addressing allows us to omit some tricky code.

This also fixes problems with long-lived svn_fs_t instances during a hot

repository upgrade in the background.
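
With a single repository-wide mode, the addressing entry in the format file reduces to a two-word line. The Python sketch below models reading and writing such a line. The two keywords come from this log message; the function names are hypothetical, and silently ignoring any trailing words (such as a revision number left by an earlier layout) is an assumption.

```python
def format_addressing_line(use_log_addressing):
    """Return the format-file line for the repository-wide addressing mode."""
    mode = "logical" if use_log_addressing else "physical"
    return "addressing %s\n" % mode


def parse_addressing_line(line):
    """Return True for logical addressing, False for physical addressing."""
    words = line.split()
    if len(words) < 2 or words[0] != "addressing":
        raise ValueError("not an addressing line: %r" % line)
    if words[1] == "logical":
        return True   # assumption: any trailing words are ignored
    if words[1] == "physical":
        return False
    raise ValueError("unknown addressing mode: %r" % words[1])


print(parse_addressing_line(format_addressing_line(True)))
```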

* subversion/libsvn_fs_fs/fs.h

(PATH_TXNS_LA_DIR): Remove.

(fs_fs_data_t): Replace MIN_LOG_ADDRESSING_REV member with simple

USE_LOG_ADDRESSING flag.

* subversion/libsvn_fs_fs/fs.c

(initialize_fs_struct): Initialize USE_LOG_ADDRESSING field instead of

MIN_LOG_ADDRESSING_REV.

* subversion/libsvn_fs_fs/util.c

* subversion/libsvn_fs_fs/util.h

(svn_fs_fs__use_log_addressing): Remove REV argument.

* subversion/libsvn_fs_fs/cached_data.c

(dbg_log_access, use_block_read, svn_fs_fs__rev_get_root,

svn_fs_fs__check_rep, svn_fs_fs__get_changes): Adapt calls to

svn_fs_fs__use_log_addressing().

* subversion/libsvn_fs_fs/dump-index.c

(svn_fs_fs__dump_index): Adapt calls to svn_fs_fs__use_log_addressing().

* subversion/libsvn_fs_fs/fs_fs.c

(read_format): Replace MIN_LOG_ADDRESSING_REV output parameter with

USE_LOG_ADDRESSING. Do not parse revision number if repository uses

logical addressing mode.

(svn_fs_fs__write_format): Simply write "addressing physical" or

"addressing logical" to the format file, depending on the repository addressing mode.

(svn_fs_fs__read_format_file): Adapt calls to read_format() function.

(upgrade_body): Replace local variable MIN_LOG_ADDRESSING_REV with

USE_LOG_ADDRESSING flag. Adapt calls to read_format() and remove

code that renames transaction folder during upgrade.

(write_revision_zero): Adapt calls to svn_fs_fs__use_log_addressing().

(svn_fs_fs__create_file_tree): Replace MIN_LOG_ADDRESSING_REV argument with

USE_LOG_ADDRESSING flag and use it for FFD->USE_LOG_ADDRESSING

initialization.

(svn_fs_fs__create): Pass TRUE for USE_LOG_ADDRESSING parameter in call to

svn_fs_fs__create_file_tree().

* subversion/libsvn_fs_fs/fs_fs.h

(svn_fs_fs__create_file_tree): Update function declaration and docstring.

* subversion/libsvn_fs_fs/hotcopy.c

(hotcopy_create_empty_dest): Adapt call to svn_fs_fs__create_file_tree().

* subversion/libsvn_fs_fs/load-index.c

* subversion/libsvn_fs_fs/index.c

(svn_fs_fs__item_offset, svn_fs_fs__load_index): Adapt calls

to svn_fs_fs__use_log_addressing().

* subversion/libsvn_fs_fs/pack.c

(svn_fs_fs__order_dir_entries, pack_rev_shard): Adapt calls

to svn_fs_fs__use_log_addressing().

* subversion/libsvn_fs_fs/stats.c

(parse_representation, read_noderev, read_revisions): Adapt calls

to svn_fs_fs__use_log_addressing().

* subversion/libsvn_fs_fs/transaction.c

(auto_truncate_proto_rev, store_p2l_index_entry, allocate_item_index,

write_final_rev): Adapt calls to svn_fs_fs__use_log_addressing().

(store_l2p_index_entry): Remove FINAL_REVISION argument. Adapt calls

to svn_fs_fs__use_log_addressing().

(compare_sort_p2l_entry, using_log_addressing, upgrade_transaction):

Remove.

(commit_body): Remove call to upgrade_transaction().

* subversion/libsvn_fs_fs/verify.c

(svn_fs_fs__verify): Adapt calls to svn_fs_fs__use_log_addressing().

* subversion/tests/libsvn_fs/fs-test.c

(upgrade_while_committing): Expect that begin_txn and commit don't fail

after upgrade to fsfs7.

  1. … 15 more files in changeset.
* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__USE_LOCK_MUTEX): Remove outdated reference from docstring.

No functional change.

Remove the SHM-dependent revprop caching logic from /trunk such that we

may remove the named_atomics code entirely. Keep the actual cache access

code (although it is not called at the moment) to reduce the diff against

the revprop-caching-ng branch.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Remove the named_atomics fields.

* subversion/libsvn_fs_fs/revprops.h

(svn_fs_fs__write_revprop_generation_file,

svn_fs_fs__cleanup_revprop_namespace): Remove revprop caching

management API declaration.

* subversion/libsvn_fs_fs/revprops.c

(ATOMIC_REVPROP_GENERATION,

ATOMIC_REVPROP_TIMEOUT,

ATOMIC_REVPROP_NAMESPACE): No longer needed.

(Revprop caching management): Remove doc section.

(read_revprop_generation_file,

svn_fs_fs__write_revprop_generation_file,

ensure_revprop_namespace,

svn_fs_fs__cleanup_revprop_namespace,

ensure_revprop_generation,

ensure_revprop_timeout,

log_revprop_cache_init_warning): Remove revprop caching management code.

(has_revprop_cache): Always return FALSE.

(revprop_generation_fixup_t,

revprop_generation_fixup,

read_revprop_generation,

begin_revprop_change,

end_revprop_change): Remove revprop caching management code.

(read_pack_revprop,

svn_fs_fs__get_revision_proplist,

switch_to_new_revprop,

write_packed_revprop): Remove invocations of the former revprop

caching management functions.

* subversion/libsvn_fs_fs/hotcopy.c

(hotcopy_body): No revprop generation file to take care of anymore.

* subversion/libsvn_fs_fs/recovery.c

(recover_body): No SHM namespace needed for revprop caching anymore.

* subversion/tests/cmdline/svnadmin_tests.py

(check_hotcopy_fsfs_fsx): Remove the check for the SHM-related files.

  1. … 5 more files in changeset.
Rename a minimum format constant for consistency with others.

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__MIN_MERGEINFO_IN_CHANGES_FORMAT): Rename to ...

(SVN_FS_FS__MIN_MERGEINFO_IN_CHANGED_FORMAT): ... this.

* subversion/libsvn_fs_fs/low_level.c

(svn_fs_fs__write_changes): Update usage.

Suggested by: julianfoad

  1. … 1 more file in changeset.
Make FSFS format 7 use a different name for the 'transactions' folder.

This is somewhat similar to what happened during the f1/f2->f3 upgrade

which made old code either succeed with a commit or fail in various

places when the format upgrade already had an impact on the contents.

With this change, however, old servers will not be able to even access

txns after the upgrade and will consistently see ENOENT errors. Hence,

they can't write to the repo. New servers will only be able to continue

transactions after a hot upgrade when they refresh their format info

as done in e.g. r1627949.
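The format-dependent folder selection described above could look roughly like the following. This is a minimal sketch, not the actual FSFS code: the `example_` names, the string values and the format threshold are assumptions made for illustration only, and the real constants are the ones in subversion/libsvn_fs_fs/fs.h.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical values, for illustration only. The real constants
   (PATH_TXNS_DIR, PATH_TXNS_LA_DIR) are defined in fs.h. */
#define EXAMPLE_PATH_TXNS_DIR     "transactions"
#define EXAMPLE_PATH_TXNS_LA_DIR  "transactions-la"
#define EXAMPLE_MIN_LA_FORMAT     7

/* Return the name of the txns folder for a repository of the given
   FORMAT. Pre-f7 repositories keep the old name; f7+ use the new one,
   so code that still assumes the old name finds no such folder. */
const char *
example_txns_dir_name(int format)
{
  assert(format >= 1);
  return format >= EXAMPLE_MIN_LA_FORMAT
         ? EXAMPLE_PATH_TXNS_LA_DIR
         : EXAMPLE_PATH_TXNS_DIR;
}
```

A caller holding stale format info (say, 6 after a hot upgrade to 7) would compute the old folder name, which is why such access is expected to fail with ENOENT until the format info is refreshed.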

* subversion/libsvn_fs_fs/fs.h

(PATH_TXNS_DIR): Document that this is only valid for pre-f7 repos.

(PATH_TXNS_LA_DIR): New path constant.

* subversion/libsvn_fs_fs/fs_fs.c

(upgrade_body): Rename the txns dir if the repo was pre-f7.

Be sure to revert that if the upgrade process fails.

* subversion/libsvn_fs_fs/util.c

(svn_fs_fs__path_txns_dir): Return the now format-dependent txns folder.

* subversion/libsvn_fs_fs/structure

(Layout of the FS directory): Document how the txns folder name depends

on the repo format.

* subversion/tests/libsvn_fs/fs-test.c

(upgrade_while_committing): Adapt test case. Any txn access with an

outdated svn_fs_t will now fail with ENOENT.

  1. … 4 more files in changeset.
Revert r1593015. Resolve text conflicts.

Suggested by: kotkov

  1. … 1 more file in changeset.
Revert r1620909 as requested by zhakov.
  1. … 12 more files in changeset.
[Reverted in r1620928]

Make FSFS export the private APIs that svnfsfs consumes.

Not much going on here, mainly moving lots of declarations

and definitions to the new svn_fs_fs_private.h header.

* build.conf

(libsvn_fs_fs): Tell msvc what to export.

* subversion/include/private/svn_fs_fs_private.h

(): New header file. Contents taken from the following headers.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_shared_txn_data_t,

fs_fs_shared_data_t,

fs_fs_dag_cache_t,

fs_fs_data_t): Moved to the new header.

* subversion/libsvn_fs_fs/id.h

(svn_fs_fs__id_part_t): Same.

* subversion/libsvn_fs_fs/index.h

(SVN_FS_FS__ITEM_INDEX_*,

SVN_FS_FS__ITEM_TYPE_*,

svn_fs_fs__p2l_entry_t,

svn_fs_fs__p2l_index_lookup,

svn_fs_fs__p2l_get_max_offset,

svn_fs_fs__l2p_index_from_p2l_entries,

svn_fs_fs__p2l_index_from_p2l_entries): Same.

* subversion/libsvn_fs_fs/pack.h

(svn_fs_fs__get_packed_offset): Same.

* subversion/libsvn_fs_fs/rev_file.h

(svn_fs_fs__packed_number_stream_t,

svn_fs_fs__revision_file_t,

svn_fs_fs__open_pack_or_rev_file,

svn_fs_fs__open_pack_or_rev_file_writable,

svn_fs_fs__auto_read_footer,

svn_fs_fs__close_revision_file): Same.

* subversion/libsvn_fs_fs/transaction.h

(svn_fs_fs__add_index_data): Same.

* subversion/libsvn_fs_fs/util.h

(svn_fs_fs__use_log_addressing): Same.

* subversion/libsvn_fs_fs/cached_data.c

(): Add (now) missing #include.

* subversion/svnfsfs/dump-index-cmd.c,

subversion/svnfsfs/load-index-cmd.c,

subversion/svnfsfs/stats-cmd.c:

(): #include the new header instead of the lib-internal ones.

  1. … 12 more files in changeset.
Revert r1619413. Resolve a text conflict in fs-fs-pack-test.c .
  1. … 8 more files in changeset.
[Reverted in r1619769]

Enforce the "everybody or nobody" restriction on revprop caching, i.e.

either all processes (usually servers) or none must use revprop caching.

This patch makes sure that the first process to use revprop caching will

create the revprop generation file before even reading and caching any

revprops - instead of auto-creating it upon the first write. This file

is then used as an indicator that the repo has been accessed at

least once using revprop caching. FS instances without that feature

enabled will then be banned from writing any revprops.

To make this workable, applications that may write revprops but are not

servers themselves (the latter having explicit revprop caching options)

need to engage the revprop caching infrastructure automatically if the

repo that they are accessing requires it. svnadmin, svnlook and ra_local

enable this new mode.
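The "everybody or nobody" rule can be illustrated with a small decision function. This is only a sketch under stated assumptions: the enum, the function names and the way the three modes combine are hypothetical and are not taken from the patch. It assumes that the existence of the revprop generation file is the sole indicator that caching has been used on the repo.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical caching modes: off, on, and "enable on demand". */
typedef enum {
  EXAMPLE_CACHE_OFF,
  EXAMPLE_CACHE_ON,
  EXAMPLE_CACHE_ON_DEMAND
} example_cache_mode_t;

/* Does this FS instance actually use revprop caching?  In on-demand
   mode, caching is engaged only if the repo already requires it. */
bool
example_uses_caching(example_cache_mode_t mode, bool generation_file_exists)
{
  assert(mode >= EXAMPLE_CACHE_OFF && mode <= EXAMPLE_CACHE_ON_DEMAND);
  if (mode == EXAMPLE_CACHE_ON_DEMAND)
    return generation_file_exists;
  return mode == EXAMPLE_CACHE_ON;
}

/* May this FS instance write revprops?  Instances that do not use
   caching are banned once the generation file exists. */
bool
example_may_write_revprops(example_cache_mode_t mode,
                           bool generation_file_exists)
{
  return example_uses_caching(mode, generation_file_exists)
         || !generation_file_exists;
}
```

Under these assumptions, only the combination "caching off, generation file present" is refused; tools running in on-demand mode follow whatever the repo already requires.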

* subversion/include/svn_fs.h

(SVN_FS_CONFIG_FSFS_CACHE_REVPROPS): Document that there is now yet

another mode.

* subversion/libsvn_fs_fs/fs.h

(fs_fs_data_t): Add the flag allowing on-demand activation of revprop

caching.

* subversion/libsvn_fs_fs/fs_fs.h

(svn_fs_fs__initialize_revprop_caches): Declare new function to init

the revprop caches only.

* subversion/libsvn_fs_fs/caching.c

(read_config): Handle the new revprop caching mode.

(cache_key_prefix): Common functionality factored out from

svn_fs_fs__initialize_caches.

(svn_fs_fs__initialize_revprop_caches): Implement, mainly taking out

of ...

(svn_fs_fs__initialize_caches): ... this. Update after refactoring.

* subversion/libsvn_fs_fs/revprops.c

(init_generation_baton_t,

init_revprop_generation_file): New code to create the revprop generation

file before the first cached revprop

access (read or write).

(read_revprop_generation_file): Trigger auto-creation of that file.

(enforce_consistent_caching): New function containing the actual

cache settings consistency check.

(svn_fs_fs__set_revision_proplist): Trigger the new check to make sure

we don't modify revprops w/o telling

caching applications.

* subversion/libsvn_ra_local/split_url.c

(svn_ra_local__split_URL): Set "enable revprop caching on demand" feature

instead of "no revprop caching" on open repos.

* subversion/svnadmin/svnadmin.c

(open_repos): Use the same mode here.

* subversion/svnlook/svnlook.c

(get_ctxt_baton): And here.

* subversion/tests/libsvn_fs_fs/fs-fs-pack-test.c

(enforce_consistent_revprop_caching): New test covering the new mode as

well as the new consistency check.

(test_funcs): Register new test.

  1. … 8 more files in changeset.
Avoid shared data clashes and false cache key aliasing between repositories

duplicated using 'hotcopy' or created as a result of dump / load cycles.

See the discussion in http://svn.haxx.se/dev/archive-2014-04/0245.shtml

and http://svn.haxx.se/dev/archive-2014-08/0093.shtml

This is not a "scalability issue" (as stated in the first of the referenced

threads), but rather a full-fledged problem. We have an ability to share

data between different objects pointing to the same filesystem. This sharing

works within a single process boundary; currently we share locks

(svn_mutex__t objects) and certain transaction data. Accessing this kind of

shared data requires some sort of a key and we used to use a filesystem UUID

for this purpose. However, this is *not* good enough for at least a couple

of cases.

Filesystem UUIDs aren't really unique for every filesystem an end user might

have, because they get duplicated during hotcopy, naive copy (copy-paste) or

dump / load cycles. Whenever we have two filesystems with the same UUID

open within a single process, the shared data starts clashing and things can

get pretty ugly. For example, one can experience random errors with parallel

commits to 2 repositories with the same UUID (hosted by Apache HTTP Server).

Another example was recently mitigated by http://svn.apache.org/r1589653 — we

did encounter a deadlock within nested 'svnadmin freeze' commands executed

for two repositories with the same UUID.

Errors that I witnessed include (but might not be limited to):

- Cannot write to the prototype revision file of transaction '392-ax'

because a previous representation is currently being written by this

process (SVN_ERR_FS_CORRUPT)

- Can't unlock unknown transaction '392-ax' (SVN_ERR_FS_CORRUPT)

- Recursive locks are not supported (SVN_ERR_RECURSIVE_LOCK)

# This used to be a deadlock prior to http://svn.apache.org/r1591919

Fix the issue by introducing a concept of "instance IDs" on the FS layer.

Basically, this gives us an ability to distinguish filesystem duplicates or

near-duplicates produced via our API. We can now have different filesystems

with the same "original" UUID, but with different instance IDs. With this

concept, it is rather easy to get rid of the shared data clashes described

above. While doing this, also prevent false aliasing for our cache keys by

throwing in the instance ID there as well. This kind of aliasing is no

better than the shared data clashes — we might encounter a deadly situation

when two entirely different filesystems access each other's cached data due

to the UUID aliasing.

[ Note from the future: We stopped using instance IDs in the cache keys in

r1623402, see http://svn.haxx.se/dev/archive-2014-08/0239.shtml ]
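To illustrate why adding an instance ID removes the clashes, here is a minimal sketch of a shared-data key built from both identifiers. The function name, the key prefix and the separator are assumptions for illustration; they are not the strings used by fs_serialized_init(). The sketch covers only the shared-data key, not the cache keys.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a per-process shared-data key from UUID and INSTANCE_ID into
   BUF of SIZE bytes.  INSTANCE_ID may be NULL for formats that do not
   support it, in which case the key falls back to the UUID alone.
   Returns 0 on success, -1 if BUF is too small.
   The "example-fs-shared" prefix is hypothetical. */
int
example_shared_data_key(char *buf, size_t size,
                        const char *uuid, const char *instance_id)
{
  int n;

  assert(buf != NULL && uuid != NULL);
  if (instance_id != NULL)
    n = snprintf(buf, size, "example-fs-shared:%s:%s", uuid, instance_id);
  else
    n = snprintf(buf, size, "example-fs-shared:%s", uuid);

  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this scheme, a repository and its hotcopy share the UUID part of the key but differ in the instance ID part, so they no longer map to the same mutexes and transaction data. Old-format filesystems without an instance ID still alias on the UUID, as before.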

Patch by: stefan2

me

* subversion/libsvn_fs_fs/fs.h

(SVN_FS_FS__MIN_INSTANCE_ID_FORMAT): New.

(fs_fs_data_t.instance_id): New.

* subversion/libsvn_fs_fs/fs_fs.h

(svn_fs_fs__set_uuid): Add the instance ID parameter.

* subversion/libsvn_fs_fs/fs.c

(fs_serialized_init): Use an instance ID as a part of the shared data key.

(fs_set_uuid): New adapter wrapping the svn_fs_fs__set_uuid() function.

Whenever we set a new UUID, imply that the filesystem will be a different

instance, i.e. have a new unique instance ID. Strictly speaking, our

approach should work fine even if we choose to preserve the instance ID

upon the UUID bump. However, we stick to the other option — it doesn't

make any real difference, but is a bit simpler to implement and

(arguably) fits the concept better. Resetting a filesystem UUID probably

implies that the user wants to recreate all identification markers for

that filesystem, so we might as well generate a new instance ID.

(fs_vtable): Use the new fs_set_uuid() adapter here.

* subversion/libsvn_fs_fs/fs_fs.c

(svn_fs_fs__open): Read the instance ID when it is supported by the format.

(svn_fs_fs__set_uuid): Rework this routine in order to support writing and

generating instance IDs when required by the filesystem format.

(upgrade_body, svn_fs_fs__create): Generate a new instance ID when

necessary.

* subversion/libsvn_fs_fs/hotcopy.c

(hotcopy_create_empty_dest): Unconditionally generate a new instance ID.

* subversion/libsvn_fs_fs/caching.c

(svn_fs_fs__initialize_caches, svn_fs_fs__initialize_txn_caches): Use an

instance ID as a part of the cache key.

* subversion/tests/cmdline/svnadmin_tests.py

(check_hotcopy_fsfs_fsx): Allow different instance IDs when comparing the

'db/uuid' file contents.

(freeze_freeze): Do not change UUID of hotcopy for new formats supporting

instance IDs. For new formats, 'svnadmin freeze A (svnadmin freeze B)'

should not deadlock or error out with SVN_ERR_RECURSIVE_LOCK even if 'A'

and 'B' share the same UUID.

(freeze_same_uuid): New (fails without the core change).

(test_list): Reference the new test.

* subversion/libsvn_fs_fs/structure

(Layout of the FS directory): Tweak wording for the 'db/uuid' file.

(Filesystem formats): Briefly list the format specifics of that file.

  1. … 7 more files in changeset.