Add docs for rate limit aware load balancing #2126
Changes from all commits
4d63367
a5d4a92
0c965fb
841e7a4
22230ff
ba022b7
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -37,24 +37,54 @@ including endpoints made unavailable by failure accrual. | |||||||||||||||
|
|
||||||||||||||||
| A _failure accrual policy_ determines how failures are tracked for endpoints, | ||||||||||||||||
| and what criteria result in an endpoint becoming unavailable ("tripping the | ||||||||||||||||
| circuit breaker"). Currently, the Linkerd proxy implements one failure accrual | ||||||||||||||||
| policy, _consecutive failures_. Additional failure accrual policies may be added | ||||||||||||||||
| in the future. | ||||||||||||||||
|
|
||||||||||||||||
| {{< note >}} | ||||||||||||||||
|
|
||||||||||||||||
| HTTP responses are classified as _failures_ if their status code is a [5xx | ||||||||||||||||
| server error]. Future Linkerd releases may add support for configuring what | ||||||||||||||||
| status codes are classified as failures. | ||||||||||||||||
|
|
||||||||||||||||
| {{< /note >}} | ||||||||||||||||
| circuit breaker"). | ||||||||||||||||
|
|
||||||||||||||||
| ### Consecutive Failures | ||||||||||||||||
|
|
||||||||||||||||
| In this failure accrual policy, an endpoint is marked as failing after a | ||||||||||||||||
| configurable number of failures occur _consecutively_ (i.e., without any | ||||||||||||||||
| successes). For example, if the maximum number of failures is 7, the endpoint is | ||||||||||||||||
| made unavailable once 7 failures occur in a row with no successes. | ||||||||||||||||
| made unavailable once 7 failures occur in a row with no successes. For the | ||||||||||||||||
| purpose of this failure accrual policy, a _failure_ is an HTTP response with | ||||||||||||||||
| a [5xx server error] status code or a gRPC response with one of the following | ||||||||||||||||
| gRPC status codes: | ||||||||||||||||
|
|
||||||||||||||||
| - DATA_LOSS | ||||||||||||||||
| - DEADLINE_EXCEEDED | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
| - PERMISSION_DENIED | ||||||||||||||||
| - UNAVAILABLE | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
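The consecutive-failures rule described above can be sketched as a simple counter. This is a minimal illustration, not the proxy's implementation; the class and method names are hypothetical, and the default of 7 simply mirrors the example in the text.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailures:
    """Illustrative counter: trips after `max_failures` failures in a row."""

    max_failures: int = 7  # mirrors the example value used in the text
    streak: int = 0

    def record(self, failed: bool) -> bool:
        """Record one response; return True when the breaker should trip."""
        # Any success resets the streak, so only uninterrupted failures count.
        self.streak = self.streak + 1 if failed else 0
        return self.streak >= self.max_failures
```

A single success anywhere in the sequence resets the count, which is why this policy alone does not react to an endpoint that alternates between failures and successes.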
|
Comment on lines
+54
to
+57
Member
Suggested change
|
||||||||||||||||
|
|
||||||||||||||||
| ### Unified | ||||||||||||||||
|
|
||||||||||||||||
| In this failure accrual policy, an endpoint is marked as failing after _either_ | ||||||||||||||||
| of the following conditions is met: | ||||||||||||||||
|
Member
Suggested change
|
||||||||||||||||
|
|
||||||||||||||||
| - Success rate drops below a configured threshold. For the purposes of | ||||||||||||||||
| calculating success rate, a failure is any HTTP response with a | ||||||||||||||||
| [5xx server error] or 429 status code or a gRPC response with one of the | ||||||||||||||||
| following gRPC status codes: | ||||||||||||||||
| - DATA_LOSS | ||||||||||||||||
| - DEADLINE_EXCEEDED | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
| - PERMISSION_DENIED | ||||||||||||||||
| - UNAVAILABLE | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
|
Comment on lines
+70
to
+73
Member
Suggested change
|
||||||||||||||||
| - RESOURCE_EXHAUSTED | ||||||||||||||||
| - A configured number of failures occur _consecutively_. For the purpose of | ||||||||||||||||
| tracking consecutive failures, a _failure_ is an HTTP response with a | ||||||||||||||||
| [5xx server error] status code or a gRPC response with one of the following | ||||||||||||||||
| gRPC status codes: | ||||||||||||||||
| - DATA_LOSS | ||||||||||||||||
| - DEADLINE_EXCEEDED | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
| - PERMISSION_DENIED | ||||||||||||||||
| - UNAVAILABLE | ||||||||||||||||
| - INTERNAL | ||||||||||||||||
|
Comment on lines
+81
to
+84
Member
Suggested change
|
||||||||||||||||
|
|
||||||||||||||||
| For more information on the Unified failure | ||||||||||||||||
| accrual, see [Rate Limit Aware Load Balancing](../tasks/rate-limit-aware-load-balancing.md). | ||||||||||||||||
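The two failure definitions used by the Unified policy differ only in how rate-limit responses are treated. The sketch below restates the lists above as two predicates; the function and constant names are hypothetical and the code is not taken from the proxy.

```python
from typing import Optional

# gRPC status names listed above as failures for consecutive-failure tracking.
GRPC_SERVER_FAILURES = frozenset(
    {"DATA_LOSS", "DEADLINE_EXCEEDED", "INTERNAL", "PERMISSION_DENIED", "UNAVAILABLE"}
)


def is_consecutive_failure(
    http_status: Optional[int] = None, grpc_status: Optional[str] = None
) -> bool:
    """True for HTTP 5xx or one of the listed gRPC statuses."""
    http_failed = http_status is not None and 500 <= http_status <= 599
    grpc_failed = grpc_status in GRPC_SERVER_FAILURES
    return http_failed or grpc_failed


def is_success_rate_failure(
    http_status: Optional[int] = None, grpc_status: Optional[str] = None
) -> bool:
    """As above, plus rate-limit responses (HTTP 429, gRPC RESOURCE_EXHAUSTED)."""
    rate_limited = http_status == 429 or grpc_status == "RESOURCE_EXHAUSTED"
    return rate_limited or is_consecutive_failure(http_status, grpc_status)
```

Under these definitions, a 429 lowers the success rate but does not extend a consecutive-failure streak.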
|
|
||||||||||||||||
| ## Probation and Backoffs | ||||||||||||||||
|
|
||||||||||||||||
|
|
@@ -123,8 +153,7 @@ breaking when sending traffic to that Service: | |||||||||||||||
| - `balancer.linkerd.io/failure-accrual`: Selects the | ||||||||||||||||
| [failure accrual policy](#failure-accrual-policies) used when communicating | ||||||||||||||||
| with this Service. If this is not present, no failure accrual is performed. | ||||||||||||||||
| Currently, the only supported value for this annotation is `"consecutive"`, to | ||||||||||||||||
| perform [consecutive failures failure accrual](#consecutive-failures). | ||||||||||||||||
| Supported values for this annotation are `consecutive` and `unified`. | ||||||||||||||||
|
|
||||||||||||||||
| When the failure accrual mode is `"consecutive"`, the following annotations | ||||||||||||||||
| configure parameters for the consecutive-failures failure accrual policy: | ||||||||||||||||
|
|
@@ -150,6 +179,29 @@ configure parameters for the consecutive-failures failure accrual policy: | |||||||||||||||
| floating-point number, and must be between 0.0 and 100.0. If this annotation | ||||||||||||||||
| is not present, the default value is 0.5. | ||||||||||||||||
|
|
||||||||||||||||
| When the failure accrual mode is `"unified"`, the following annotations | ||||||||||||||||
| configure parameters for the unified failure accrual policy: | ||||||||||||||||
|
|
||||||||||||||||
| - `balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold`: If the | ||||||||||||||||
| success rate of responses in the window drops below this threshold, then the | ||||||||||||||||
| endpoint will be made unavailable. Must be between `0.0` and `1.0`. | ||||||||||||||||
| Rate-limited responses such as HTTP 429 and gRPC RESOURCE_EXHAUSTED count as | ||||||||||||||||
| failures for this calculation. If this annotation is not present, the default | ||||||||||||||||
| value is `0.8` (80% success rate). | ||||||||||||||||
| - `balancer.alpha.linkerd.io/failure-accrual-success-rate-window`: The window of | ||||||||||||||||
| time over which success rate is calculated. If this annotation is not present, | ||||||||||||||||
| the default value is `10s`. | ||||||||||||||||
| - `balancer.alpha.linkerd.io/failure-accrual-success-rate-min-requests`: The | ||||||||||||||||
| minimum number of responses which must be in the window before this breaker | ||||||||||||||||
| can trip. This acts as a "cold start" protection to ensure we have a | ||||||||||||||||
| sufficient number of responses for the success rate calculation to be | ||||||||||||||||
| meaningful before tripping. If this annotation is not present, the default | ||||||||||||||||
| value is `5`. | ||||||||||||||||
| - `balancer.linkerd.io/failure-accrual-consecutive-max-failures`: See above. | ||||||||||||||||
| - `balancer.linkerd.io/failure-accrual-consecutive-min-penalty`: See above. | ||||||||||||||||
| - `balancer.linkerd.io/failure-accrual-consecutive-max-penalty`: See above. | ||||||||||||||||
| - `balancer.linkerd.io/failure-accrual-consecutive-jitter-ratio`: See above. | ||||||||||||||||
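As an illustration of how these annotations fit together, the sketch below builds a Service manifest as a Python dictionary and prints it as JSON. The Service name, selector, and port are hypothetical placeholders; the annotation keys are the ones listed above, and the values shown are examples only. Annotation values must be strings in Kubernetes, so numbers are quoted.

```python
import json

# Hypothetical Service; only the annotation keys come from this document.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "my-service",  # placeholder name
        "annotations": {
            "balancer.linkerd.io/failure-accrual": "unified",
            "balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold": "0.8",
            "balancer.alpha.linkerd.io/failure-accrual-success-rate-window": "10s",
            "balancer.alpha.linkerd.io/failure-accrual-success-rate-min-requests": "5",
            "balancer.linkerd.io/failure-accrual-consecutive-max-failures": "7",
        },
    },
    "spec": {
        "selector": {"app": "my-service"},  # placeholder selector
        "ports": [{"port": 8080}],  # placeholder port
    },
}

manifest = json.dumps(service, indent=2)
print(manifest)
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -`.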
|
|
||||||||||||||||
| [^1]: | ||||||||||||||||
| The part of the proxy which handles connections from within the pod to the | ||||||||||||||||
| rest of the cluster. | ||||||||||||||||
|
|
||||||||||||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,102 @@ | ||||||
| --- | ||||||
| title: Handling Rate-Limited Endpoints | ||||||
| description: Automatically route traffic away from rate-limited endpoints | ||||||
| --- | ||||||
|
|
||||||
| When backends implement rate limiting and return | ||||||
| [HTTP 429](https://www.rfc-editor.org/rfc/rfc6585.html#page-3) or | ||||||
| [gRPC RESOURCE_EXHAUSTED](https://grpc.github.io/grpc/core/md_doc_statuscodes.html), | ||||||
| the proxy by default treats these as successful responses from a load | ||||||
| balancing perspective. Since these types of responses are typically very fast, | ||||||
| Linkerd's [EWMA load balancing](../features/load-balancing.md) may actually | ||||||
| send _more_ traffic to these rate-limited endpoints. This can create a feedback | ||||||
| loop where clients experience high 429 or RESOURCE_EXHAUSTED rates. | ||||||
|
|
||||||
| Linkerd has two experimental features to help route traffic away from endpoints | ||||||
| which are in a rate-limited state. | ||||||
|
|
||||||
| {{< docs/production-note >}} | ||||||
|
|
||||||
| {{< warning >}} | ||||||
|
|
||||||
| Rate Limit Aware Load Balancing is an experimental, opt-in feature. | ||||||
|
|
||||||
| {{< /warning >}} | ||||||
|
|
||||||
| ## Load Biaser | ||||||
|
|
||||||
| Linkerd can be configured to use a more sophisticated version of the EWMA | ||||||
| load balancing algorithm which takes rate-limit responses (HTTP 429 or gRPC | ||||||
| RESOURCE_EXHAUSTED) into account. This algorithm is called the Load Biaser | ||||||
|
Comment on lines
+29
to
+30
Member
This also takes failures into account -- some gRPC (server error ones) and all HTTP 5xx responses.
||||||
| because it biases traffic away from endpoints which have returned rate-limit | ||||||
| responses recently. | ||||||
|
|
||||||
| The Load Biaser works exactly the same as [EWMA](../features/load-balancing.md) | ||||||
| except that when it receives a rate-limited response, it substitutes a fixed | ||||||
| penalty value for the response's actual latency (unless the latency is | ||||||
| higher). For example, if the penalty is configured to be `5s` and the Load | ||||||
| Biaser receives a 429 response in `10ms`, it will treat the latency of that | ||||||
| response as `5s` for load balancing purposes. | ||||||
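The substitution described above amounts to taking the larger of the observed latency and the penalty. A minimal sketch follows; the function name is hypothetical and the real Load Biaser feeds this value into EWMA rather than returning it directly.

```python
def effective_latency(
    actual_s: float, rate_limited: bool, penalty_s: float = 5.0
) -> float:
    """Latency (in seconds) the balancer would record for one response."""
    if rate_limited:
        # The penalty replaces the observed latency unless the latter is higher.
        return max(actual_s, penalty_s)
    return actual_s
```

With the values from the example, `effective_latency(0.010, rate_limited=True)` yields `5.0`, while the same response without rate limiting is recorded as `0.010`.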
|
|
||||||
| In this way, the load balancer will not favor endpoints which return | ||||||
| rate-limited responses quickly. | ||||||
|
|
||||||
| The penalty value can be further refined if the server sets the `Retry-After` | ||||||
| HTTP response header or the `grpc-retry-pushback-ms` gRPC trailer. If one of | ||||||
| these values is present and is higher than the configured penalty, it will be | ||||||
| used in place of the penalty. This allows servers to exert a higher or lower | ||||||
| amount of pushback. | ||||||
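One way to read the rule above is sketched here. The function name is hypothetical, and the treatment of the maximum is an assumption: this sketch clamps an oversized server hint to the `load-biaser-max-retry-after` value, whereas the proxy may handle such values differently (for example by ignoring them).

```python
from typing import Optional


def penalty_with_pushback(
    penalty_s: float = 5.0,
    retry_after_s: Optional[float] = None,
    max_retry_after_s: float = 300.0,
) -> float:
    """Penalty (in seconds) after considering a server-provided hint.

    `retry_after_s` stands for a parsed `Retry-After` header or
    `grpc-retry-pushback-ms` trailer, already converted to seconds.
    """
    if retry_after_s is None:
        return penalty_s
    # Assumption: hints above the configured maximum are clamped to it.
    capped = min(retry_after_s, max_retry_after_s)
    # The hint is only used when it exceeds the configured penalty.
    return max(penalty_s, capped)
```

For example, a hint of 30 seconds replaces a 5 second penalty, while a hint of 2 seconds leaves the penalty unchanged.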
|
|
||||||
| To enable Linkerd to use the Load Biaser for a Service, set the following | ||||||
| annotation on the Service resource: | ||||||
|
|
||||||
| | Annotation | Type | Default | Notes | | ||||||
| |-----------------------------------------------|------|---------|------------------------------------------| | ||||||
| | `balancer.alpha.linkerd.io/penalize-failures` | bool | `false` | Enables the Load Biaser for this Service | | ||||||
|
|
||||||
| The Load Biaser can be further configured with these annotations on the Service | ||||||
| resource: | ||||||
|
|
||||||
| | Annotation | Type | Default | Notes | | ||||||
| |---------------------------------------------------------------|----------|---------|----------------------------------------------------------------------------------------| | ||||||
| | `balancer.alpha.linkerd.io/load-biaser-penalty` | duration | `5s` | The latency value to inject for rate-limited responses and failures | | ||||||
| | `balancer.alpha.linkerd.io/load-biaser-max-retry-after` | duration | `300s` | The maximum allowed value of a Retry-After header | | ||||||
|
|
||||||
| ## Unified Circuit Breaker | ||||||
|
|
||||||
| Linkerd can be configured to use a more sophisticated version of | ||||||
| [consecutive failures failure accrual](../tasks/circuit-breakers.md) called | ||||||
| Unified failure accrual. | ||||||
|
|
||||||
| The Unified failure accrual can be configured with a success rate threshold. | ||||||
| If the percentage of successful responses within a fixed time window drops below this | ||||||
| threshold, the circuit breaker will trip, temporarily cutting off traffic to | ||||||
| this endpoint and giving it time to recover. Critically, any rate-limited | ||||||
| responses will count as failures for this success rate calculation. | ||||||
|
|
||||||
| The Unified failure accrual will ALSO trip if it encounters a configured | ||||||
| number of consecutive failures, just like the consecutive failures accrual. | ||||||
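The two trip conditions can be combined as in the following sketch. It is an illustration under stated assumptions, not the proxy's implementation: the class name is hypothetical, the window is modelled as a sliding window over caller-supplied timestamps, and the defaults mirror the documented annotation defaults.

```python
from collections import deque


class UnifiedBreaker:
    """Illustrative breaker: trips on low success rate OR a failure streak."""

    def __init__(self, threshold=0.8, window_s=10.0, min_requests=5, max_failures=7):
        self.threshold = threshold
        self.window_s = window_s
        self.min_requests = min_requests
        self.max_failures = max_failures
        self.responses = deque()  # (timestamp, counted_as_failure_for_rate)
        self.streak = 0

    def record(self, now: float, rate_failure: bool, consecutive_failure: bool) -> bool:
        """Record one response; return True if the breaker should trip.

        `rate_failure` includes rate-limited responses; `consecutive_failure`
        does not, matching the two failure definitions.
        """
        self.responses.append((now, rate_failure))
        # Assumption: drop responses that have aged out of the sliding window.
        while self.responses and self.responses[0][0] <= now - self.window_s:
            self.responses.popleft()
        self.streak = self.streak + 1 if consecutive_failure else 0

        total = len(self.responses)
        successes = sum(1 for _, failed in self.responses if not failed)
        rate_tripped = (
            total >= self.min_requests and successes / total < self.threshold
        )
        return rate_tripped or self.streak >= self.max_failures
```

In this sketch a burst of 429 responses can trip the breaker through the success-rate condition even though none of them counts toward the consecutive-failure streak.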
|
|
||||||
| To enable the Unified failure accrual circuit breaker on a Service, set the | ||||||
| following annotation to `"unified"` on the Service resource: | ||||||
|
|
||||||
| | Annotation | Type | Default | Notes | | ||||||
| |---------------------------------------|----------|---------|------------------------------------------------------------------------------| | ||||||
| | `balancer.linkerd.io/failure-accrual` | string | None | The failure-accrual mode. Set to `unified` to enable Unified failure accrual | | ||||||
|
|
||||||
| The Unified failure accrual can be further configured with these annotations on | ||||||
| the Service resource: | ||||||
|
Member
Suggested change
|
||||||
|
|
||||||
| | Annotation | Type | Default | Notes | | ||||||
| |-----------------------------------------------------------------------|------------------------------|---------|------------------------------------------------------------------| | ||||||
| | `balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold` | number between 0 and 1 | `0.8` | The success rate threshold at which to trip the breaker | | ||||||
| | `balancer.alpha.linkerd.io/failure-accrual-success-rate-window` | duration | `10s` | The window over which the success rate is calculated | | ||||||
| | `balancer.alpha.linkerd.io/failure-accrual-success-rate-min-requests` | number | `5` | Only trip if there are at least this many requests in the window | | ||||||
| | `balancer.linkerd.io/failure-accrual-consecutive-max-failures` | number | `7` | Trip if we encounter this many consecutive failures | | ||||||
| | `balancer.linkerd.io/failure-accrual-consecutive-min-penalty` | duration | `1s` | The minimum duration for which to cut off traffic | | ||||||
| | `balancer.linkerd.io/failure-accrual-consecutive-max-penalty` | duration | `1m` | The maximum duration for which to cut off traffic | | ||||||
| | `balancer.linkerd.io/failure-accrual-consecutive-jitter-ratio` | number between 0.0 and 100.0 | `0.5` | The amount of randomness to inject into the backoff | | ||||||
|
Instead of:
Member
Author
100.0 is actually the bound: https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/exp-backoff/src/lib.rs#L83-L85
||||||
|
|
||||||
| See the | ||||||
| [reference documentation](../reference/circuit-breaking/#configuring-failure-accrual) | ||||||
| for details on failure accrual configuration. | ||||||
`INTERNAL` is duplicated. Unfortunately `PERMISSION_DENIED` is part of this set due to how the classifier has always treated it, but it's arguably not an endpoint failure, so perhaps something to consider removing from this set in a future release.