Checkout Tools
  • last updated 6 hours ago
Constraints
Constraints: committers
 
Constraints: files
Constraints: dates
event, worker: runtime pool.

MPMs event and worker both need a dedicated pool to handle the creation of

the threads (listener, workers) and synchronization objects (queues, pollset,

mutexes...) in the start_threads() thread, with at least the lifetime of

the connections they handle, and thus survive pchild destruction (notably

in ONE_PROCCESS mode, but SIG_UNGRACEFUL is concerned too).

For instance, without this fix, the below backtrace can happen in ONE_PROCCESS

mode and a signal/^C is received (with active connections):

Thread 1 "httpd" received signal SIGSEGV, Segmentation fault.

(gdb) bt

#0 <BOOM>

#1 0x00007ffff7c7e016 in apr_file_write (thefile=0x0, ...)

^ NULL (cleared)

at file_io/unix/readwrite.c:230

#2 0x00007ffff7c7e4a7 in apr_file_putc (ch=1 '\001', thefile=0x0)

^ NULL (cleared)

at file_io/unix/readwrite.c:377

#3 0x00007ffff7c8da4a in apr_pollset_wakeup (pollset=0x55555568b870)

^ already destroyed by pchild

at poll/unix/pollset.c:224

#4 0x00007ffff7fc16c7 in decrement_connection_count (cs_=0x7fff08000ea0)

at event.c:811

#5 0x00007ffff7c83e15 in run_cleanups (cref=0x7fffe4002b78)

at memory/unix/apr_pools.c:2672

#6 0x00007ffff7c82c2f in apr_pool_destroy (pool=0x7fffe4002b58)

^ master_conn

at memory/unix/apr_pools.c:1007

#7 0x00007ffff7c82c12 in apr_pool_destroy (pool=0x7fff08000c28)

^ ptrans

at memory/unix/apr_pools.c:1004

#8 0x00007ffff7c82c12 in apr_pool_destroy (pool=0x555555638698)

^ pconf

at memory/unix/apr_pools.c:1004

#9 0x00007ffff7c82c12 in apr_pool_destroy (pool=0x555555636688)

^ pglobal

at memory/unix/apr_pools.c:1004

#10 0x00005555555f4709 in ap_terminate ()

at unixd.c:522

#11 0x00007ffff6dbc8f1 in __run_exit_handlers (...)

at exit.c:108

#12 0x00007ffff6dbc9ea in __GI_exit (status=<optimized out>)

at exit.c:139

#13 0x00007ffff7fc1616 in clean_child_exit (code=0)

at event.c:774

^ pchild already destroyed here

#14 0x00007ffff7fc5ae4 in child_main (child_num_arg=0, child_bucket=0)

at event.c:2869

...

While at it, add comments about the lifetimes of MPMs pools and their objects,

and give each pool a tag (e.g. "pchild" accordingly to other MPMs).

(follow up for event_pollset in r1835846).

  1. … 3 more files in changeset.
mpm_winnt: Do not redefine the standard CONTAINING_RECORD() macro

in child.c.

This definition has been added in https://svn.apache.org/r88498 — perhaps,

because not every versions of SDK contained it at that time.

But since then, the macro has been available starting from Windows 2000

(https://msdn.microsoft.com/en-us/library/windows/hardware/ff542043),

and any available version of Windows SDK now should also contain it.

mpm_winnt: Remove an obsolete comment in child.c explaining why the

declarations of the structures and functions to access the completion

contexts reside in a header file.

This no longer holds, as all the necessary functions and structures are

located in the single .c file (child.c).

mpm_winnt: Tweak the names of the variables in child.c which are used to

represent a queue of the completion contexts.

Starting from r1801655, the "queue" isn't really a queue, as all the

access happens with a LIFO order. So, instead of that, call it a "pool

of completion contexts", adjust names of all relevant variables and

tweak the comments.

This patch changes

- qlock to ctxpool_lock,

- qhead to ctxpool_head, and

- qwait_event to ctxpool_wait_event.

mpm_winnt: Tweak the listener shutdown code to use a separate event

instead of the global variable (shutdown_in_progress).

This change has two purposes. First of all, it makes the listener threads

which are blocked waiting for a completion context exit immediately during

shutdown. Previously, such threads would only check for exit every second.

The second reason for this change is to put the child_main() function in

charge of controlling the listeners life cycle. Previously, such relation

was circumvented by the fact that the listeners were also waiting for the

global child exit_event. With the new separate listener_shutdown_event,

only the child_main() function is responsible for shutting down the

listeners, and I think that this makes the code a bit clearer.

All the original behavior, including the special APLOG_DEBUG diagnostic

message when we fail to acquire a free completion context in 1 second,

is kept unchanged.

  1. … 1 more file in changeset.
mpm_winnt: Following up on r1801655, add a comment that explains the

reason to choose the LIFO processing order for completion contexts.

It would be better to keep this important information in the code, instead

of just having it in the log message.

mpm_winnt: Remove unused values of the io_state_e enum.

Submitted By: Ivan Zhakov <ivan {at} visualsvn.com>

mpm_winnt: Remove a duplicated comment in the child_main() function.

mpm_winnt: Use a LIFO stack instead of a FIFO queue to hold unused

completion contexts, as that may significantly reduce the memory usage.

This simple change can have a noticeable impact on the amount of memory

consumed by the child process in various cases. Every completion context

in the queue has an associated allocator, and every allocator has it's

ap_max_mem_free memory limit which is not given back to the operating

system. Once the queue grows, it cannot shrink back, and every allocator

in each of the queued completion contexts keeps up to its max_free amount

of memory. The queue can only grow when a server has to serve multiple

concurrent connections at once.

With that in mind, consider a case with a server that doesn't encounter many

concurrent connections most of the time, but has occasional spikes when

it has to serve multiple concurrent connections. During such spikes, the

size of the completion context queue grows.

The actual difference between using LIFO and FIFO orders shows up after

such spikes, when the server is back to light load and doesn't see a lot

of concurrency. With FIFO order, every completion context in the queue

will be used in a round-robin manner, thus using *every* available allocator

one by one and ultimately claiming up to (N * ap_max_mem_free memory) from

the OS. With LIFO order, only the completion contexts that are close to

the top of the stack will be used and reused for subsequent connections.

Hence, only a small part of the allocators will be used, and this can

prevent all other allocators from unnecessarily acquiring memory from

the OS (and keeping it), and this reduces the overall memory footprint.

Please note that this change doesn't affect the worst case behavior, as

it's still (N * ap_max_mem_free memory), but tends to behave better in

practice, for the reasons described above.

Another thing worth considering is the new behavior when the OS decides

to swap out pages of the child process, for example, in a close-to-OOM

condition. Handling every new connection after the swap requires the OS

to load the memory pages for the allocator from the completion context that

is used for this connection. With FIFO order, the completion contexts are

used one by one, and this would cause page loads for every new connection.

With LIFO order, there will be almost no swapping, since the same completion

context is going to be reused for subsequent new connections.

mpm_winnt: Drop the APLOG_DEBUG diagnostic saying how many thread

are blocked on the I/O completion port during the shutdown.

Prior to r1801635, the shutdown code required to know the amount of blocked

threads, as it has been dispatching the same amount of completion packets.

But this no longer holds, and the only reason why we maintain the

corresponding g_blocked_threads variable is because of this debug

diagnostic message.

Drop it in order to reduce complexity of the quite critical code in the

winnt_get_connection() function and to reduce the amount of global

variables.

mpm_winnt: Remove an unnecessary Sleep() in the winnt_accept() function.

This sleep occured in a situation when:

- We don't have a free completion context in the queue

- We can't add one, as doing so would exceed the max_num_completion_contexts

limit (all worker threads are busy)

- We have exceeded a 1 second timeout while waiting for it

In this case, the Sleep() call is unnecessary, as there is no intermittent

failure that can be waited out, but rather than that, it's an ordinary

situation with all workers being busy. Presumably, calling Sleep() here

can be even considered harmful, as it affects the fairness between the

listeners that are blocked waiting for the completion context.

So, instead of calling Sleep() just check for the possible shutdown and

immediately retry acquiring a completion context. If all worker threads

are still busy, the retry will block in the same WaitForSingleObject() call,

which is fine.

mpm_winnt: Simplify the shutdown code that was waiting for multiple worker

thread handles in batches.

Starting from r1801636, there is no difference between ending the wait with

one or multiple remaining threads. This is because we terminate the process

if at least one thread is still active when we hit a timeout.

Therefore, instead of making an effort to evenly distribute and batch the

handles with WaitForMultipleObjects(), we could just start from one end,

and wait for one thread handle at a time.

mpm_winnt: Avoid using TerminateThread() in case the shutdown routine

hits a timeout while waiting for the worker threads to exit.

Using TerminateThread() can have dangerous consequences such as deadlocks —

say, if the the thread is terminated while holding a lock or a heap lock

in the middle of HeapAlloc(), as these locks would not be released.

Or it can corrupt the application state and cause a crash.

(See https://msdn.microsoft.com/en-us/library/windows/desktop/ms686717)

Rework the code to call TerminateProcess() in the described circumstances

and leave the cleanup to the operating system.

mpm_winnt: Make the shutdown faster by avoiding unnecessary Sleep()'s

when shutting down the worker threads.

Previously, the shutdown code was posting an amount of I/O completion

packets equal to the amount of the threads blocked on the I/O completion

port. Then it would Sleep() until all these threads "acknowledge" the

completion packets by decrementing the global amount of blocked threads.

A better way would be to send the number of IOCP_SHUTDOWN completion

packets equal to the total amount of threads and immediately proceed to

the next step. There is no need to block until the threads actually receive

the completion, as the shutdown process includes a separate step that waits

until the threads exit, and the new approach avoids an unnecessary delay.

mpm_winnt: Following up on r1801144, use the new accept_filter_e enum

values in a couple of missed places in winnt_accept().

mpm_winnt: Fix typo in the logged message in winnt_get_connection().

mpm_winnt: Refactor the mpm_get_completion_context() function so that it

would return a proper apr_status_t instead of yielding the result via the

*timeout out variable.

This makes the calling side easier to follow by avoiding an additional

layer of if's.

mpm_winnt: Remove an unnecessary retry after receiving a non-timeout failure

from the mpm_get_completion_context() function.

Currently, the only possible reasons why mpm_get_completion_context() could

fail are real errors such as being unable to WaitForSingleObject(), allocate

memory or create an event. Retrying under such circumstances doesn't make

sense, and could be as well considered harmful.

mpm_winnt: Factor out a helper function to parse the type of an accept

filter and use an appropriate enum for it.

This makes the code in winnt_accept() a bit easier to follow. As a minor

side effect, it also fixes a small bug where the "unrecognized AcceptFilter

'%s'" log entry would always contain "none" instead of the actually

unrecognized kind of the accept filter.

mpm_winnt: Don't forget to close the I/O completion port as part of the

cleanup in the child process.

If the lingering close does not leave the socket in a disconnected state,

do not recycle the socket.

On the trunk:

mpm_winnt: always invoke ap_lingering_close() at connection end.

mpm_winnt: clear OVERLAPPED structs before reuse

MSDN documentation states that

Any unused members of [an OVERLAPPED] structure should always be

initialized to zero before the structure is used in a function call.

Otherwise, the function may fail and return ERROR_INVALID_PARAMETER.

Prior to this patch, the internal state left over from previous

overlapped I/O was passed into the next call. It's unclear what effect

this might have, if any. (I have not personally witnessed an

ERROR_INVALID_PARAMETER myself.)

mpm_winnt: remove duplication of ap_process_connection

Further follow-up to the previous commit: now that we no longer patch a

network bucket into the brigade, we can revert to calling

ap_process_connection() directly instead of duplicating its logic.

mpm_winnt: remove the AcceptEx data network bucket

Follow-up to the prior commit: without an incoming data buffer, the

custom network bucket code is now orphaned and we can remove it

entirely. This has the added benefit that we are no longer using the

internal OVERLAPPED.Pointer field, which is discouraged by the MSDN

docs.

  1. … 2 more files in changeset.
mpm_winnt: remove 'data' AcceptFilter in favor of 'connect'

The 'data' AcceptFilter optimization instructs Windows to wait until

data is received on a connection before completing the AcceptEx

operation. Unfortunately, it seems this isn't performed atomically --

AcceptEx "partially" accepts the incoming connection during the wait for

data, leaving all other incoming connections in the accept queue. This

opens the server to a denial of service.

Since the fix for this requires a substantial rearchitecture (likely

involving multiple outstanding calls to AcceptEx), disable the 'data'

filter for now and replace it with 'connect', which uses the AcceptEx

interface but does not wait for data.

Users running prior releases of httpd on Windows should explicitly move

to a 'connect' AcceptFilter in their configurations if they are

currently using the default 'data' filter.

Many thanks to mludha, Arthur Ramsey, Paul Spangler, and many others for

their assistance in tracking down and diagnosing this issue.

PR: 59970

  1. … 4 more files in changeset.

Ensure http2 follows http in the meaning of

status WRITE (meaning 'in the request processing

phase' even if still consuming the request body,

not literally in a 'now writing' state).

Ensure a number of MPMs and the h2 connection io

no longer clobber the request status line during

state-only changes. While at it, clean up some

very ugly formatting and unnecessary decoration,

and avoid the wordy _from_conn() flavor when we

are not passing a connection_rec.

Ensure the useragent_ip is only used in the case

where it has been initialized, fall back on the

connection's remote_ip if the status is accidently

updated from an uninitialized request_rec.

  1. … 8 more files in changeset.
SECURITY (CVE-2014-3523): Fix a memory consumption denial of

service in the WinNT MPM used in all Windows installations.

Workaround: AcceptFilter <protocol> {none|connect}

Submitted by: trawick

Reviewed by: jorton, covener, jim

clarify a comment
Follow-up to r1606368: HANDLE is PVOID which is void * (fix format string)