Polly is a .NET resilience and transient-fault-handling library that allows developers to express policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback in a fluent and thread-safe manner.
CircuitBreaker
We start our journey with a very useful resiliency pattern known as Circuit Breaker:
Handle faults that might take a variable amount of time to recover from, when connecting to a remote service or resource. This can improve the stability and resiliency of an application.
This quote doesn't make it clear what a circuit breaker actually does, so I want to describe how it works using a simple example. Suppose we need to integrate with an API that provides weather information, and this API is not under our ownership. Working with third-party APIs forces us to think about what happens when that API goes down. We don't know why it went down or when it will become available again. For our flow it is better to skip requests to a non-working API than to wait for each one to time out. This is where a circuit breaker fits perfectly: it breaks the flow and blocks any request to the failing API for some time.
Let's dive into examples.
    // Required namespaces: System, System.Threading.Tasks, Polly
    public static async Task BasicAsync()
    {
        // 2 consecutive errors break the circuit for 1 second
        var circuitBreakerPolicy = Policy
            .Handle<Exception>()
            .CircuitBreakerAsync(2, TimeSpan.FromSeconds(1));

        for (int i = 0; i < 10; ++i)
        {
            try
            {
                Console.WriteLine($"Execution {i}");
                await circuitBreakerPolicy.ExecuteAsync(async () =>
                {
                    Console.WriteLine($"before throw exception {i}");
                    throw new Exception($"Error {i}");
                });
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Catch ex {ex.Message}");
            }
            await Task.Delay(500);
        }
    }
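In this example the circuit breaker opens after 2 consecutive exceptions and stays open for 1 second. While it is open, calls fail immediately with a BrokenCircuitException and our delegate is not invoked. After that second the breaker lets a call through again, and since our code always throws, we see the original error again and the circuit reopens. It's a pretty clear example of how it works: the other system gets a 1-second break to become available again.
Here is a more complex configuration for the circuit breaker: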
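To make the state changes visible, the breaker can be given callbacks. The following is only a minimal sketch, assuming Polly 7.x (where the policy type is `AsyncCircuitBreakerPolicy`); the class name `CircuitStateDemo` and the log strings are hypothetical and not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

public static class CircuitStateDemo
{
    // Runs three always-failing calls back to back and records what happened.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        AsyncCircuitBreakerPolicy breaker = Policy
            .Handle<Exception>()
            .CircuitBreakerAsync(
                2,                                       // exceptions allowed before breaking
                TimeSpan.FromSeconds(1),                 // duration of the break
                (exception, breakDelay) => log.Add("onBreak"),
                () => log.Add("onReset"),
                () => log.Add("onHalfOpen"));

        for (int i = 0; i < 3; i++)
        {
            try
            {
                // The delegate always fails, like the call in the example above.
                await breaker.ExecuteAsync(
                    () => Task.FromException(new InvalidOperationException("boom")));
            }
            catch (BrokenCircuitException)
            {
                // The circuit is open: the delegate was not invoked at all.
                log.Add("rejected");
            }
            catch (InvalidOperationException)
            {
                // The delegate ran and its own exception was rethrown.
                log.Add("failed");
            }
        }

        log.Add($"state: {breaker.CircuitState}");
        return log;
    }
}
```

Because the three calls run without any delay, the third one arrives while the circuit is still open and is rejected instead of reaching the failing delegate.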
    public static async Task AdvancedAsync()
    {
        var advancedCircuitBreaker = Policy
            .Handle<Exception>()
            .AdvancedCircuitBreakerAsync(0.5, TimeSpan.FromSeconds(2),
                3, TimeSpan.FromSeconds(1));

        for (int i = 0; i < 10; i++)
        {
            try
            {
                Console.WriteLine($"Execution {i}");
                await advancedCircuitBreaker.ExecuteAsync(async () =>
                {
                    Console.WriteLine($"before throw exception {i}");
                    throw new Exception($"Error {i}");
                });
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Catch ex {ex.Message}");
            }
            await Task.Delay(500);
        }
    }
We can configure the circuit breaker in a more sophisticated way. Instead of just a number of errors, we specify that it should open if 50% of requests throw an error within a 2-second window, provided that at least 3 requests were made in that window. Once open, it stays open for 1 second.
Both examples are valid, and it's our responsibility to decide which configuration suits us better.
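The four positional arguments are easier to read with named parameters. This is a small sketch assuming Polly 7.x parameter names; the `BreakerFactory` class is a hypothetical wrapper added only for illustration.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

public static class BreakerFactory
{
    // Same settings as the advanced example, with each argument named.
    public static AsyncCircuitBreakerPolicy Create() =>
        Policy
            .Handle<Exception>()
            .AdvancedCircuitBreakerAsync(
                failureThreshold: 0.5,                      // share of failed calls that trips the breaker
                samplingDuration: TimeSpan.FromSeconds(2),  // window over which failures are measured
                minimumThroughput: 3,                       // calls required in the window before it can trip
                durationOfBreak: TimeSpan.FromSeconds(1));  // how long the circuit stays open
}
```

A freshly created policy starts in the closed state, so calls flow through until the threshold is reached.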
Timeout policy and policy wrap
Another useful policy provided by Polly is Timeout. It defines how long an operation is allowed to run; if the operation exceeds that threshold, the timeout policy stops waiting for it and throws a TimeoutRejectedException.
In the next example I combine a timeout policy, which throws an error when the timeout is exceeded, with a circuit breaker policy, which prevents execution once 100% of requests have failed during the last 3 seconds.
    // Required namespaces: System, System.Threading.Tasks, Polly, Polly.Timeout
    public static async Task TimeoutConsequenceAsync()
    {
        var advancedCircuitBreaker = Policy
            .Handle<Exception>()
            .AdvancedCircuitBreakerAsync(1, TimeSpan.FromSeconds(3),
                2, TimeSpan.FromSeconds(1));

        // note: the Optimistic strategy cancels the operation via a cancellation token
        var timeoutPolicy = Policy.TimeoutAsync(
            TimeSpan.FromMilliseconds(1000),
            TimeoutStrategy.Pessimistic);

        // the rightmost policy is the one closest to the executed code
        var wrapPolicy = Policy.WrapAsync(advancedCircuitBreaker, timeoutPolicy);

        for (int i = 0; i < 10; i++)
        {
            try
            {
                Console.WriteLine($"Execution {i}");
                await wrapPolicy.ExecuteAsync(async () =>
                {
                    Console.WriteLine($"before delay {i}");
                    await Task.Delay(TimeSpan.FromMilliseconds(1000));
                    Console.WriteLine($"after delay {i}");
                });
                Console.WriteLine($"Execution {i} after actual call");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Catch ex {ex.Message}");
            }
            await Task.Delay(100);
        }
    }
We created a timeout policy that allows the code to run for up to 1 second. With the Pessimistic strategy the timeout policy throws an exception on its own, even if the executed code knows nothing about cancellation. If we choose the Optimistic strategy, we have to write our code so that it honors the CancellationToken provided by the timeout policy.
We also combined 2 policies into one complex policy using Policy.WrapAsync(…policies).
Remember that the policy that should trigger closest to your code must be placed furthest to the right in the wrap invocation. In the example above the timeout policy wraps the code and the circuit breaker policy wraps the timeout policy.
If you run the example above, the code is executed twice and both calls are failed by the timeout policy. The circuit breaker then decides that the threshold has been reached and moves to the open state for 1 second. After that the timeout policy triggers again and the circuit goes back to the open state.
Let's take a look at what happens when we start tasks at the same time:
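For comparison, here is a minimal sketch of the Optimistic strategy, assuming Polly 7.x. The class name `OptimisticTimeoutDemo`, the 200 ms timeout and the 5-second delay are arbitrary values chosen for illustration. The delegate receives the policy's token and passes it on to `Task.Delay`, so the work is really cancelled when the timeout fires.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

public static class OptimisticTimeoutDemo
{
    public static async Task<string> RunAsync()
    {
        var timeoutPolicy = Policy.TimeoutAsync(
            TimeSpan.FromMilliseconds(200),
            TimeoutStrategy.Optimistic);

        try
        {
            await timeoutPolicy.ExecuteAsync(
                async cancellationToken =>
                {
                    // Cooperative cancellation: the delay observes the policy's token.
                    await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
                },
                CancellationToken.None);
            return "completed";
        }
        catch (TimeoutRejectedException)
        {
            // Thrown by the timeout policy once the 200 ms budget is used up.
            return "timed out";
        }
    }
}
```

If the delegate ignored the token, the Optimistic strategy could not interrupt it, which is the case the Pessimistic strategy is meant for.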
    // Required namespaces: System, System.Collections.Generic, System.Linq,
    // System.Threading.Tasks, Polly, Polly.Timeout
    public static async Task TimeoutRandomParallelAsync()
    {
        var advancedCircuitBreaker = Policy
            .Handle<Exception>()
            .AdvancedCircuitBreakerAsync(1, TimeSpan.FromSeconds(3),
                2, TimeSpan.FromSeconds(1));

        // note: the Optimistic strategy cancels the operation via a cancellation token
        var timeoutPolicy = Policy.TimeoutAsync(
            TimeSpan.FromMilliseconds(1000),
            TimeoutStrategy.Pessimistic);

        var wrapPolicy = Policy.WrapAsync(advancedCircuitBreaker, timeoutPolicy);

        var tasks = new List<Task>();
        for (int i = 0; i < 10; i++)
        {
            try
            {
                // not awaited here: the task is only started and collected
                tasks.Add(wrapPolicy.ExecuteAsync(async () =>
                {
                    Console.WriteLine($"before delay {i}");
                    await Task.Delay(TimeSpan.FromMilliseconds(3500));
                }));
            }
            catch (Exception ex)
            {
                // never come here
                Console.WriteLine($"Catch ex {ex.Message}");
            }
            // without a delay all tasks start their invocation
            // and the circuit breaker doesn't know about failures
            await Task.Delay(100);
        }

        try
        {
            await Task.WhenAll(tasks);
        }
        catch (Exception ex)
        {
            // here ex contains the first error thrown by the list of tasks
            var errors = tasks
                .Where(t => t.Exception != null)
                .Select(t => t.Exception);
            foreach (var error in errors)
            {
                Console.WriteLine($"HERE WE COME {error.Message}");
            }
        }
    }
In this example we create 10 tasks and wait for all of them to complete. Here the circuit breaker never comes into play and we get 10 timeout exceptions: the circuit breaker lets each request through before it learns that another request has thrown an error.
This example demonstrates what happens in the real world when a lot of requests hit our application at once: the circuit breaker allows all of them to reach the broken part, and only after it detects that the threshold has been crossed do the next requests face an open circuit.
What to do when circuit is open?
Polly provides a convenient way to handle errors with the Fallback policy. Let's enrich the previous example with a fallback policy:
    // Required namespaces: System, System.Collections.Generic, System.Linq,
    // System.Threading.Tasks, Polly, Polly.CircuitBreaker, Polly.Timeout
    public static async Task FallbackWithTimeoutAsync()
    {
        var advancedCircuitBreaker = Policy
            .Handle<Exception>()
            .AdvancedCircuitBreakerAsync(0.5, TimeSpan.FromSeconds(2),
                3, TimeSpan.FromSeconds(1));

        var timeoutPolicy = Policy.TimeoutAsync(
            TimeSpan.FromMilliseconds(1000),
            TimeoutStrategy.Pessimistic);

        var fallback = Policy
            .Handle<BrokenCircuitException>()
            .Or<TimeoutException>()
            .Or<AggregateException>()
            .Or<TimeoutRejectedException>()
            .FallbackAsync((cancellation) =>
            {
                Console.WriteLine("Fallback action");
                return Task.CompletedTask;
            });

        // fallback is the outermost policy, timeout is the closest to the code
        var wrapPolicy = Policy.WrapAsync(fallback,
            advancedCircuitBreaker, timeoutPolicy);

        var tasks = new List<Task>();
        for (int i = 0; i < 10; i++)
        {
            // copy the counter: the delegate keeps running after a pessimistic
            // timeout, and by then the loop has already moved on
            var attempt = i;
            try
            {
                tasks.Add(wrapPolicy.ExecuteAsync(async () =>
                {
                    Console.WriteLine($"before wait {attempt}");
                    await Task.Delay(TimeSpan.FromMilliseconds(3500));
                    Console.WriteLine($"after wait {attempt}");
                }));
            }
            catch (Exception ex)
            {
                // never come here
                Console.WriteLine($"Catch ex {ex.Message}");
            }
            await Task.Delay(500);
        }

        try
        {
            await Task.WhenAll(tasks);
        }
        catch (Exception)
        {
            // awaiting WhenAll rethrows only the first error, so inspect every task
            var errors = tasks
                .Where(t => t.Exception != null)
                .Select(t => t.Exception);
            foreach (var error in errors)
            {
                Console.WriteLine($"ERROR is {error.Message} {error.GetType()}");
            }
        }
    }
We created a fallback policy which can handle a list of exception types and provides a fallback action in those cases. The fallback should be specified as the leftmost policy in the Policy.WrapAsync call.
What about retries?
As developers we should be careful with retry policies: a poor configuration can cause a cost spike and even kill your cluster. Polly has a good list of examples of how to configure retries. I want to highlight that if you need a retry policy, it's better to configure it with exponential delays and a limited number of attempts.
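As an illustration of that advice, here is a minimal sketch of a bounded retry with exponential delays, assuming Polly 7.x. The `RetryDemo` class, the `Backoff` helper, the choice of `HttpRequestException` and the limit of 3 retries are all assumptions made for this example, not recommendations from the Polly documentation.

```csharp
using System;
using System.Net.Http;
using Polly;

public static class RetryDemo
{
    // Hypothetical helper: waits 1s, 2s and 4s before retries 1, 2 and 3.
    public static TimeSpan Backoff(int retryAttempt) =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt - 1));

    // Retries at most 3 times, and only for HTTP request failures.
    public static IAsyncPolicy Create() =>
        Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(
                retryCount: 3,
                sleepDurationProvider: retryAttempt => Backoff(retryAttempt));
}
```

Capping the retry count keeps the worst-case extra load predictable, and handling a narrow exception type avoids retrying errors that will never succeed.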
In conclusion
Circuit breaker is a must-have pattern if you deal with distributed systems. As a developer you should take care of the case when some part of the system doesn't work, and this is where a fallback policy fits perfectly. Also try to limit external calls with a timeout; no one likes to wait.
Source: Medium - Andrew Kulta
The Tech Platform