Allows throttling of certain requests, i.e. limiting the number of requests
per minute that match some path criteria.
The primary goal of this implementation is to prevent some requests from
eating up all available CPUs while other (more important) requests are
impacted by this behavior. The implementation activates itself by detecting
such situations and then starts throttling requests. A simple use case is to
limit the number of incoming replication requests so that more "other"
requests can be handled.
Requests that are supposed to be handled by this implementation must match
one of the regular expressions configured in filtered_paths. All other
requests are not considered at all by the throttling implementation.
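To illustrate the path matching, here is a minimal sketch in Python. The pattern values are hypothetical examples; the actual filtered_paths entries are deployment-specific configuration, and the real implementation is not written in Python.

```python
import re

# Hypothetical example patterns for filtered_paths; actual values are
# whatever regular expressions the deployment configures.
filtered_paths = [re.compile(p) for p in [r"^/replication/.*", r"^/bulk/.*"]]

def is_throttling_candidate(path: str) -> bool:
    """A request is only considered for throttling if its path matches
    one of the configured filtered_paths expressions; all other
    requests bypass the throttle entirely."""
    return any(p.match(path) for p in filtered_paths)

print(is_throttling_candidate("/replication/project-x"))  # True (matched)
print(is_throttling_candidate("/changes/42/detail"))      # False (ignored)
```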
The throttling algorithm is loosely based on the leaky bucket approach, but
it additionally adjusts the throttling based on CPU load. This means:
The number of requests permitted to pass the throttle is determined solely by
CPU load: this number starts at the configured maximum
(max_requests_per_minute) and decreases linearly as CPU usage increases. At
100% usage the number of permitted requests is 0 (zero) and all requests
matching the expressions in filtered_paths are throttled.
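The linear scaling described above can be sketched as follows (an illustrative formula, not the actual implementation; the function name is made up for this example):

```python
def permitted_requests_per_minute(max_requests_per_minute: int,
                                  cpu_load: float) -> int:
    """Full budget at 0% CPU load, linearly down to zero at 100% load."""
    cpu_load = min(max(cpu_load, 0.0), 1.0)  # clamp to [0, 1]
    return int(max_requests_per_minute * (1.0 - cpu_load))

print(permitted_requests_per_minute(600, 0.0))  # 600 (no throttling)
print(permitted_requests_per_minute(600, 0.5))  # 300 (half the budget)
print(permitted_requests_per_minute(600, 1.0))  # 0   (everything throttled)
```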
This implementation supports two modes of throttling:
- rejecting the request with a configurable HTTP status code; this should be
used when the client is able to handle that case.
- blocking the request until it can be handled. This is transparent for the
client (the request might time out, though!), but it occupies a thread for
the entire waiting time, which might lead to a shortage of threads.
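The trade-off between the two modes can be sketched with a small permit-based model (illustrative only; class and method names are made up, and the real implementation may differ):

```python
import threading

class Throttle:
    """Sketch of the two throttling modes: 'reject' answers immediately
    with a configurable HTTP status code, while 'block' waits until a
    permit becomes available, occupying the calling thread meanwhile."""

    def __init__(self, permits, mode="reject", reject_status=429):
        self._sem = threading.Semaphore(permits)
        self.mode = mode
        self.reject_status = reject_status

    def acquire(self, timeout=None):
        if self.mode == "reject":
            # Fail fast: the client must understand the rejection status.
            if self._sem.acquire(blocking=False):
                return 200
            return self.reject_status
        # 'block' mode: transparent to the client, but ties up this thread
        # (and the client may time out while we wait for a permit).
        if self._sem.acquire(timeout=timeout):
            return 200
        return 503

    def release(self):
        self._sem.release()

t = Throttle(permits=1, mode="reject", reject_status=429)
print(t.acquire())  # 200, permit granted
print(t.acquire())  # 429, budget exhausted and rejected immediately
```

The blocking variant avoids client-side error handling at the cost of tying up one server thread per waiting request, which is why the text warns about thread shortage.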