Checkout Tools
  • last updated 6 hours ago
Constraints: committers
Constraints: files
Constraints: dates
Fix an issue with the readline implementation for file streams that could

cause excessive memory usage for inputs containing one or multiple \0 bytes.

This is the likely cause of the OOM reported in

(I think that the problem itself is a regression introduced by me in 1.10.)

Note: one thing I noticed while preparing the fix is that our `readline_fn`

functions for different streams have inconsistent behavior if the input data

contains \0 bytes. More specifically, they may return different `line` values,

that may either be truncated at \0 or actually contain the whole data between

EOLs, including \0 bytes. For now, this patch only fixes the excessive memory

usage problem, and I noted this related problem in the test and left it for

future work.

* subversion/libsvn_subr/stream.c

(readline_apr_lf, readline_apr_generic): Reallocate the buffer based on its

current size, instead of calculating the new size based on the already

prealloc'd size. There are no actual benefits in reallocating based on

`blocksize`, and in the described case with \0 bytes doing so also backfires

and may cause excessive allocations due to the actual size of the string

being less than we expect it to. A degenerate case of the erroneous

behavior is ...

* subversion/tests/libsvn_subr/stream-test.c

(test_stream_readline_file_nul): ...exploited in this new test.

(test_funcs): Run new test.

* subversion/tests/libsvn_subr

(): Adjust svn:ignore.

  1. … 2 more files in changeset.
* subversion/libsvn_subr/stream.c

(seek_handler_checksum): Don't ignore error return from svn_checksum_ctx_reset().

Found by: danielsh

using tools/dev/

Stream svndiff deltas without creating temporary files during 'svn commit'

over HTTP.

Creating the temporary files has a certain overhead (time to write them

and disk space), but what is more important, it prevents the client and

server from simultaneously processing the data, i.e., checksumming and

compressing it. See the log message to r1803144 for additional details

on this.

Technically, in order to do this, we use the new apply_textdelta_stream()

delta editor callback for the wc commit. In order to do that, we need

to make some of the used streams (mostly, with associated temporary files

on disk) properly support svn_stream_reset().

* subversion/include/private/svn_io_private.h

(svn_stream__from_aprfile): Declare this new function.

* subversion/libsvn_subr/stream.c

(struct baton_apr): Add new 'truncate_on_seek' field.

(make_stream_from_apr_file): Accept a new 'truncate_on_seek' argument,

which we'd need to properly reset streams over temporary files.

(seek_handler_apr): Honor the new 'truncate_on_seek' field.

(svn_stream__from_aprfile): Implement by forwarding the arguments to


(svn_stream_from_aprfile2, svn_stream_for_stdin2, svn_stream_for_stdout

svn_stream_for_stderr): Update calls to make_stream_from_apr_file(),

don't change the existing behavior.

(svn_stream__create_for_install): Set the temporary file to be truncated

on seeks.

* subversion/libsvn_wc/adm_crawler.c

(seek_handler_copy): New function.

(copying_stream): Implement reset support in this stream.

(open_txdelta_stream_baton_t): New.

(open_txdelta_stream): New, implements svn_txdelta_stream_open_func_t.

Note that this function must be restartable, as it can be called

more than once, for example, if the HTTP server decides to do a

renegotiation, and we would be forced to resend the whole request

body. If this is the case, we reset the streams and continue.

(svn_wc__internal_transmit_text_deltas): Use the new apply_textdelta_stream()

delta editor callback. Calculate the MD5 checksum with the help of

svn_stream_checksummed2() stream. Pass the disowned streams to the open

txdelta callback.

* subversion/libsvn_wc/deprecated.c

(): Include svn_io_private.h.

(svn_wc_transmit_text_deltas2): Set the temporary file to be truncated

on seeks.

  1. … 3 more files in changeset.
Return resettable streams from svn_stream_checksummed2().

This lays the groundwork required to stream commits over HTTP without

creating temporary files. To switch to the new apply_textdelta_stream()

delta editor callback, we need to make the callback that opens a txdelta

stream restartable, and this is where a resettable checksummed stream

would be handy to have.

Please note that currently the ability to reset a stream, and the new

associated svn_stream_supports_reset() API is tied to the existence of

a non-default seek_fn implementation. The implementation is free to

_only_ support seek_fn(NULL) requests. It would probably be slightly

better to have a separate reset_fn callback. However, we already had it

at some point, and then reverted it in r966156, so let's keep everything

without a separate callback and keep using seek_fn(NULL) for now.

* subversion/include/svn_io.h

(svn_stream_checksummed2): Note that now this API returns a

resettable stream.

(svn_stream_supports_reset): Declare this new function.

* subversion/include/svn_checksum.h

(svn_checksum_ctx_reset): Declare this new function.

* subversion/libsvn_subr/checksum.c

(svn_checksum_ctx_reset): Implement this new function. Forward FNV1-a

implementations to ...

* subversion/libsvn_subr/fnv1a.h

(svn_fnv1a_32__context_reset, svn_fnv1a_32x4__context_reset): ...

* subversion/libsvn_subr/fnv1a.c

(svn_fnv1a_32__context_reset, svn_fnv1a_32x4__context_reset): ...these

new helper functions.

* subversion/libsvn_subr/stream.c

(svn_stream_supports_reset): Implement this new function.

(seek_handler_checksum): Implement the seek handler for checksummed

streams, only support reset requests for now.

(svn_stream_checksummed2): Optionally install the new seek handler.

* subversion/tests/libsvn_subr/checksum-test.c


test_checksummed_stream_reset): New tests.

(test_funcs): Add new tests.

  1. … 6 more files in changeset.
* subversion/libsvn_subr/stream.c

(stream_readline_bytewise): Tweak the outdated docstring.

Fix the excessive amount of read and seek syscalls in various fsfs operations

that read from disk.

The idea behind the fix is that we provide a special svn_stream_readline()

implementation for APR file-based streams that handles a common case of

LF ("\n") end-of-line sequence with apr_file_gets().

For example, this patch turns the following sequence ...

open("/repo/db/revs/22.pack/manifest", O_RDONLY|O_CLOEXEC) = 4

read(4, "0\n3304\n7139\n12288\n13981\n37151\n59"..., 4096) = 4096

read(4, "5140\n20998974\n21002732\n21004670\n"..., 4096) = 4096

lseek(4, 4020, SEEK_SET) = 4020

read(4, "18102627\n18139220\n18141921\n18147"..., 4096) = 4096

read(4, "2535219\n52537641\n52550290\n525536"..., 4096) = 584

lseek(4, 8043, SEEK_SET) = 8043

read(4, "51452518\n51454318\n51456275\n51457"..., 4096) = 657

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

read(4, "", 4096) = 0

close(4) = 0

...into this:

open("/repo/db/revs/22.pack/manifest", O_RDONLY|O_CLOEXEC) = 4

read(4, "0\n3304\n7139\n12288\n13981\n37151\n59"..., 4096) = 4096

read(4, "5140\n20998974\n21002732\n21004670\n"..., 4096) = 4096

read(4, "131\n52730772\n52744939\n52748273\n5"..., 4096) = 508

read(4, "", 4096) = 0


* subversion/libsvn_subr/stream.c

(readline_apr_lf): New.

(readline_apr_generic): New, factored out from ...

(readline_apr_handler): Call the apr_file_gets()-based routine

when we're looking for an LF end-of-line sequence to avoid unnecessary


* subversion/tests/libsvn_subr/stream-test.c

(test_stream_readline_file): New helper for ...

(test_stream_readline_file_lf, test_stream_readline_file_crlf): ...these

new tests that cover svn_stream_readline() behavior for LF and CRLF

terminated lines.

(test_funcs): Add new tests.

  1. … 1 more file in changeset.
Remove the "is buffered" (svn_stream__is_buffered_fn_t) handler for streams.

This handler allowed us to distinguish (in an inverted manner) APR file

or string-based streams that benefit from the chunky implementation of


Since we now have a custom readline handler and these streams have their

specific readline implementations, the svn_stream__is_buffered_fn_t isn't

required anymore.

* subversion/include/svn_io.h

(svn_stream__is_buffered_fn_t, svn_stream__set_is_buffered,

svn_stream__is_buffered): Remove.

* subversion/libsvn_subr/stream.c

(svn_stream_t): Remove the 'is_buffered_fn' field.

(svn_stream__set_is_buffered, svn_stream__is_buffered,

is_buffered_handler_empty, is_buffered_handler_disown,

is_buffered_handler_apr, is_buffered_handler_stringbuf,

is_buffered_handler_string, is_buffered_lazyopen,

stream_readline_chunky): Remove these functions.

(svn_stream_empty, svn_stream_disown, make_stream_from_apr_file,

svn_stream_from_stringbuf, svn_stream_from_string,

svn_stream_lazyopen_create): Don't install the is_buffered handler.

(stream_readline): Remove the special handling for buffered and seekable

streams. Then, inline this function ...


(stream_readline_bytewise): Tweak the docstring to note that this is

now the default (fallback) implementation.

* subversion/libsvn_subr/subst.c

(translated_stream_is_buffered): Remove.

(stream_translated): Don't install the is_buffered handler. We don't

provide a specific readline implementation (although it is possible) for

these streams, because they already have internal buffering, and have an

optimized code path for reading a single byte.

  1. … 2 more files in changeset.
Provide readline implementation for APR file streams.

Currently, this is the same as what we do in stream_readline_chunky(),

but using svn_io_file_seek() directly instead of emulating the same behavior

with stream seek and skip functions. We can do better for a common case

of reading a line with the "\n" EOL by using apr_file_gets(), but I'll do

it separately.

* subversion/libsvn_subr/stream.c

(readline_handler_apr): New.

(make_stream_from_apr_file): Set the new handler for files that support

seeking, i.e., for everything except wrapped stdin, stdout and stderr


Implement readline handlers for a couple of our existing streams.

* subversion/libsvn_subr/stream.c

(readline_handler_disown, readline_handler_stringbuf,

readline_handler_string, readline_handler_lazyopen): New.

(svn_stream_disown, svn_stream_from_stringbuf,

svn_stream_from_string, svn_stream_lazyopen_create): Set new handlers.

Lay some groundwork for fixing an excessive amount of read syscalls in

various fsfs operations that read from disk.

A typical pattern for such calls now looks like this (note the amount of

ReadFile operations ending with an EOF):

ReadFile txns\63-qp5.txn\node.0.0 SUCCESS Offset: 0, Length: 237

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

ReadFile txns\63-qp5.txn\node.0.0 END OF FILE Offset: 237, Length: 4,096

This changeset introduces a new svn_stream_readline_fn_t handler for our

streams. I'll use it to provide custom svn_stream_readline() handling for

our APR file streams.

* subversion/include/svn_io.h

(svn_stream_readline_fn_t, svn_stream_set_readline): New.

* subversion/libsvn_subr/stream.c

(svn_stream_t): Add new handler.

(svn_stream_set_readline): Implement new function.

(svn_stream_readline): Use the specific readline handler if it's available.

(stream_readline): Tweak the docstring.

  1. … 1 more file in changeset.
Follow-up to r1755486: Rename svn_stream_checksum() to


Suggested by: danielsh

* subversion/include/svn_io.h

* subversion/libsvn_subr/stream.c

(svn_stream_checksum): Rename to svn_stream_contents_checksum().

(compute_stream_checksum): Update docstring.

* subversion/libsvn_fs/fs-loader.c

* subversion/libsvn_subr/io.c

* subversion/tests/libsvn_fs/fs-test.c

* subversion/tests/libsvn_subr/stream-test.c

(svn_fs_file_checksum, svn_io_file_checksum2, get_file_checksum,

test_stream_checksum): Adapt callers.

  1. … 5 more files in changeset.
Introduce svn_stream_checksum() function to calculate checksum of specified

stream contents. Use new API where it makes sense.

* subversion/include/svn_io.h

(svn_stream_checksum): New.

* subversion/libsvn_subr/stream.c

(compute_stream_checksum): New. Helper for svn_stream_checksum().

(svn_stream_checksum): New.

* subversion/tests/libsvn_subr/stream-test.c

(test_stream_checksum): New. Simple test for svn_stream_checksum().

(test_funcs): Add test_stream_checksum to test list.

* subversion/libsvn_fs/fs-loader.c

* subversion/libsvn_repos/config_pool.c

* subversion/libsvn_subr/io.c

* subversion/tests/libsvn_fs/fs-test.c

(svn_fs_file_checksum, auto_parse, svn_io_file_checksum2,

get_file_checksum): Use svn_stream_checksum() instead of

svn_stream_checksummed2(READ_ALL=TRUE) + svn_stream_close().

  1. … 6 more files in changeset.
Use svn_io_file_get_offset() instead of svn_io_file_seek(APR_CUR) where it

makes sense.

* subversion/libsvn_client/patch.c

* subversion/libsvn_diff/parse-diff.c

* subversion/libsvn_repos/dump.c

* subversion/libsvn_subr/io.c

* subversion/libsvn_subr/stream.c

* subversion/tests/libsvn_subr/io-test.c

(tell_file, read_handler_base85, hunk_readline_original_or_modified,

svn_diff_hunk_readline_diff_text, parse_next_hunk, parse_binary_patch,

svn_diff_parse_next_patch, store_delta, svn_io_file_readline,

mark_handler_apr, test_file_readline, aligned_seek): Use

svn_io_file_get_offset() instead of svn_io_file_seek(0, APR_CUR).

  1. … 5 more files in changeset.
Continue work started in r1714372:

Fix a number of potential overflow conditions on platforms where pointers

may be allocated very close to the end of address space, such as WoW64.

Same for systems with 32 bit file offsets.

There is no direct way to trigger the respective overflow conditions but

this patch makes the code more resilient against repository corruption and

failures higher up in the call stack.

* subversion/libsvn_fs_fs/cached_data.c

(block_read): Instead of "base + x < max", we must check "max - base > x"

to prevent overflows under all circumstances.

* subversion/libsvn_fs_fs/index.c

(svn_fs_fs__l2p_get_max_ids): Same.

* subversion/libsvn_fs_fs/pack.c

(pack_log_addressed): Same.

* subversion/libsvn_fs_fs/revprops.c

(parse_packed_revprops): Same.

* subversion/libsvn_subr/base64.c


decode_bytes): Same.

* subversion/libsvn_subr/stream.c

(svn_stringbuf_from_stream): Same.

* subversion/libsvn_subr/string.c

(svn_cstring__match_length): Same.

  1. … 6 more files in changeset.
Remove the scratch pool from svn_string_from_stream2(), for consistency with


* subversion/include/svn_io.h,


(svn_string_from_stream2): Remove the scratch scratch pool parameter. (It

wasn't used.)

* subversion/libsvn_fs_x/reps.c

(svn_fs_x__reps_add_base): Update a caller.

* subversion/libsvn_subr/deprecated.c

(svn_string_from_stream): Update a caller.

* subversion/libsvn_wc/old-and-busted.c

(svn_wc__read_entries_old): Update a caller.

* tools/dev/x509-parser.c

(get_der_cert_from_stream): Update a caller.

  1. … 5 more files in changeset.
Improve the documentation of svn_string[buf]_from_stream(), especially in

describing the length hint parameter and in documenting the two similar

functions the same way.

Also rename the output parameter of one of them for consistency with the


* subversion/include/svn_io.h

(svn_stringbuf_from_stream): Rename the output parameter 'str' to 'result'.

Improve the documentation.

(svn_string_from_stream2): Improve the documentation.

* subversion/libsvn_subr/stream.c

(svn_stringbuf_from_stream): Rename the output parameter 'str' to 'result'.

Document the performance characteristics.

  1. … 1 more file in changeset.
* subversion/libsvn_subr/stream.c

(): Include svn_sorts.h.

(svn_stringbuf_from_stream): Optimize memory usage a bit and avoid

svn_stream_read_full() call after we got partial read.

Revv svn_string_from_stream() function and share implementation with


Suggested by: julianf

* subversion/include/svn_io.h

(svn_string_from_stream2): New.

(svn_string_from_stream): Deprecate.

* subversion/libsvn_subr/stream.c

(svn_string_from_stream2): Revv from svn_string_from_stream(): add LEN_HINT

argument. Use svn_stringbuf_from_stream() as implementation.

* subversion/libsvn_subr/deprecated.c

(svn_string_from_stream): Call svn_string_from_stream2() with LEN_HINT=0.

* subversion/libsvn_fs_x/reps.c

* subversion/libsvn_wc/old-and-busted.c

* tools/dev/x509-parser.c

(svn_fs_x__reps_add_base, svn_wc__read_entries_old,

get_der_cert_from_stream): Use svn_string_from_stream2() with

LEN_HINT=SVN__STREAM_CHUNK_SIZE. It doesn't increase memory usage because

we use same pool for SCRATCH and RESULT pool.

  1. … 5 more files in changeset.
Slightly optimize svn_stringbuf_from_stream() to avoid allocating twice

more memory and unnecessary memcpy() when LEN_HINT is equal to final stringbuf


* subversion/libsvn_subr/stream.c

(svn_stringbuf_from_stream): Always preallocate LEN_HINT + MIN_READ_SIZE

bytes to be able perform final read without stringbuf reallocation.

Remove the now unused svn_stream_wrap_buffered_read() functionality.

* subversion/include/svn_io.h

(svn_stream_wrap_buffered_read): Drop.

* subversion/libsvn_subr/stream.c





svn_stream_wrap_buffered_read): Drop.

* subversion/tests/libsvn_subr/stream-test.c






test_stream_buffered_wrapper): Drop.

(test_funcs): Unregister test.

  1. … 2 more files in changeset.
Introduce svn_stream_for_stdin2 that supports optional APR buffering.

Update all callers.

* subversion/include/svn_io.h

(svn_stream_for_stdin2): Declare new, bumped API version.

(svn_stream_for_stdin): Depcreate.

* subversion/libsvn_subr/deprecated.c

(svn_stream_for_stdin): Implement in terms of the new API version.

* subversion/libsvn_subr/stream.c

(svn_stream_for_stdin2): Implement.

* subversion/svnadmin/svnadmin.c

(subcommand_load): Update caller.

(subcommand_load_revprops): Same. Also, don't use the buffering

read stream wrapper anymore.

* subversion/svnfsfs/load-index-cmd.c

(subcommand__load_index): Ditto.

* subversion/svnmucc/svnmucc.c

(execute): Update caller.

* subversion/svnserve/svnserve.c

(sub_main): Same.

* tools/dev/x509-parser.c

(main): Same.

  1. … 7 more files in changeset.
Revert r1702305 due to issues with pipes in APR.
[Reverted in 1702410]

APR file based streams shall not superficially support read or write

functions when we know that the APR file itself does not. Instead,

they will now return a "not supported" stream error.

Note that this check is not perfect for arbitrary APR file handles

(may enable more functions than the handle actually supports) but

works correctly for our STD* streams and the files opened through

our svn_io_* API.

* subversion/libsvn_subr/stream.c

(make_stream_from_apr_file): Set read and write functions conditionally.

(svn_stream_from_aprfile2): Determine r/w capabilities to the degree

we can based on the APR API.



svn_stream_for_stderr): Update constructor callers.

Don't make svn_stream_skip segfault on write-only streams.

* subversion/libsvn_subr/stream.c

(svn_stream_skip): The fallback only applies when we can read data at all.

Following up on r1701792, remove a misplaced svn_error_trace() call.

* subversion/libsvn_subr/stream.c

(svn_stream_from_aprfile2): Don't call svn_error_trace() on a function

that returns a svn_stream_t object.

Following up on r1701633, do not use a custom svn_stream_skip() handler in

streams that wrap STDIN, STDOUT and STDERR.

This handler, skip_handler_apr(), performs a seek internally. We cannot rely

on its behavior when wrapping standard I/O streams like STDIN, because they

do not generally support seeking.

Discussion can be found in

(Subject: "Re: svn commit: r1701633 - ...").

* subversion/libsvn_subr/stream.c

(svn_stream_from_aprfile2): Extract the core of this function ...

(make_stream_from_apr_file): ...into this new helper. Add a new boolean

argument that specifies whether we should install callbacks that depend

on the ability to perform a seek().

(svn_stream_for_stdin, svn_stream_for_stdout, svn_stream_for_stderr):

Call the new helper while specifying that the handle does not support


Fix svn_stream_for_stdin() and related functions for STDOUT and STDERR

that were returning streams with mark() and seek() capabilities.

STDIN, STDOUT and STDERR don't provide general support for positioning

requests. This behavior is implementation-specific and depends on what's

passed as the corresponding handle. For example, on Linux, apr_file_seek()

that calls lseek() internally fails with ESPIPE [1] when the descriptor is

associated with a terminal device. As we cannot safely advertise mark()

and seek() support for these streams, don't do that.


* subversion/libsvn_subr/stream.c

(svn_stream_for_stdin, svn_stream_for_stdout, svn_stream_for_stderr):

Don't install mark and seek handlers.

* subversion/libsvn_subr/stream.c

(svn_stream__create_for_install): Update comment with rationale why we

use buffered mode for temporary file.

Reduce difference between Windows and non-Windows implementation of


* subversion/libsvn_subr/stream.c

(install_close): Remove #ifdef WIN32

(svn_stream__create_for_install): Always setup flush on close stream


(svn_stream__install_stream): Close temporary file before rename on all


(svn_stream__install_get_info): Obtain file info from file handle on all


(svn_stream__install_delete): Close temporary file before delete on all


Fix spurious 'Access Denied' errors during checkout or commit on Windows

reported on TortoiseSVN users mailing list [1].

The issue can be reproduced locally even though known problems reports are

with working copies stored on network share. The reproduction script is:

1. Checkout working copy with a file 'foo'

2. Run 'for /L %I in (1,1,50000000) do type foo' command in background to

emulate indexers/antivirus scanners

3. Modify file 'foo' in repository via another working copy.

4. Update the original working copy with 'type foo' running

The real fix is to add retry loop for SetFileInformationByHandle() call. All

other changes are refactoring to move existing platform specific code from

libsvn_subr/stream.c to libsvn_subr/io.c to use WIN32_RETRY_LOOP macro. We

already have retry loops for almost all IO operations on Windows.


* subversion/include/private/svn_io_private.h

(svn_io__win_delete_file_on_close, svn_io__win_rename_open_file): Declare

new library private functions.

* subversion/libsvn_subr/io.c


FileDispositionInfo, SetFileInformationByHandle_t,

set_file_information_by_handle_proc): Move it here from


(win_init_dynamic_imports): Obtain pointer to SetFileInformationByHandle().

(win32_set_file_information_by_handle): New helper.

(svn_io__win_delete_file_on_close): New. Implementation extracted from


(svn_io__win_rename_open_file): New. Implementation extracted from

svn_stream__install_stream(). Retry operation on Access Denied errors.

* subversion/libsvn_subr/stream.c


Move to libsvn_subr/io.c.

(SetFileInformationByHandle, SetFileInformationByHandle_a,

find_SetFileInformationByHandle): Remove.

(svn_stream__install_stream): Use svn_io__win_rename_open_file().

(svn_stream__install_delete): Use svn_io__win_delete_file_on_close().

  1. … 2 more files in changeset.