"After all, the engineers only needed to refuse to fix anything, and modern industry would grind to a halt." -Michael Lewis
Oct 2020
The source code for this post can be found on Github.
Intermittent network flapping, or any one downstream host of several clones responding slowly, is a not uncommon thing that happens in a microservices architecture, especially if you're using java applications, where the JIT compiler can often make initial requests slower than they ought to be.
Depending on the request that you're making, it can often be retried effectively to smooth out these effects to your consumer. Doing so in a straightforward and declarative way will be the subject of this post.
I'm going to build off of some work in a previous blog post about fallbacks. You'll recall that we had setup a WebClient like so:
@Configuration
public class Config {
@Bean("service-a-web-client")
public WebClient serviceAWebClient() {
HttpClient httpClient = HttpClient.create().tcpConfiguration(tcpClient ->
tcpClient.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 1000)
.doOnConnected(connection -> connection.addHandlerLast(new ReadTimeoutHandler(1000, TimeUnit.MILLISECONDS)))
);
return WebClient.builder()
.baseUrl("http://your-base-url.com")
.clientConnector(new ReactorClientHttpConnector(httpClient))
.build();
}
}
This WebClient already has a timeout of 1 second configured, which in many cases is quite conservative [well written, performance focused services usually respond much faster than that].
I'll also steal our DTO from the last post:
public class WelcomeMessage {
private String message;
public WelcomeMessage() {}
public WelcomeMessage(String message) {
this.message = message;
}
public String getMessage() {
return message;
}
public void setMessage(String message) {
this.message = message;
}
}
With this, let's set up a barebones services that will soon contain the code we're looking for:
@Service
public class RetryService {
private final WebClient serviceAWebClient;
public RetryService(@Qualifier("service-a-web-client") WebClient serviceAWebClient) {
this.serviceAWebClient = serviceAWebClient;
}
public Mono<WelcomeMessage> getWelcomeMessageAndHandleTimeout(String locale) {
return Mono.empty();
}
}
This code doesn't do anything yet. Now let's make a test class, configured with the familiar MockServer setup that we've leveraged before:
@ExtendWith(MockServerExtension.class)
public class RetryServiceIT {
public static final int WEBCLIENT_TIMEOUT = 50;
private final ClientAndServer clientAndServer;
private RetryService retryService;
private WebClient mockWebClient;
public RetryServiceIT(ClientAndServer clientAndServer) {
this.clientAndServer = clientAndServer;
HttpClient httpClient = HttpClient.create()
.tcpConfiguration(tcpClient ->
tcpClient.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, WEBCLIENT_TIMEOUT)
.doOnConnected(connection -> connection.addHandlerLast(
new ReadTimeoutHandler(WEBCLIENT_TIMEOUT, TimeUnit.MILLISECONDS))
)
);
this.mockWebClient = WebClient.builder()
.baseUrl("http://localhost:" + this.clientAndServer.getPort())
.clientConnector(new ReactorClientHttpConnector(httpClient))
.build();
}
@BeforeEach
public void setup() {
this.retryService = new RetryService(mockWebClient);
}
@AfterEach
public void clearExpectations() {
this.clientAndServer.reset();
}
@Test
public void retryOnTimeout() {
AtomicInteger counter = new AtomicInteger();
HttpRequest expectedRequest = request()
.withPath("/locale/en_US/message")
.withMethod(HttpMethod.GET.name());
this.clientAndServer.when(
expectedRequest
).respond(
httpRequest -> {
if (counter.incrementAndGet() < 2) {
Thread.sleep(WEBCLIENT_TIMEOUT + 10);
}
return HttpResponse.response()
.withBody("{\"message\": \"hello\"}")
.withContentType(MediaType.APPLICATION_JSON);
}
);
StepVerifier.create(retryService.getWelcomeMessageAndHandleTimeout("en_US"))
.expectNextMatches(welcomeMessage -> "hello".equals(welcomeMessage.getMessage()))
.verifyComplete();
this.clientAndServer.verify(expectedRequest, VerificationTimes.exactly(3));
}
}
This code:
Now, following TDD, let's write code that passes this test:
public Mono<WelcomeMessage> getWelcomeMessageAndHandleTimeout(String locale) {
return this.serviceAWebClient.get()
.uri(uriBuilder -> uriBuilder.path("/locale/{locale}/message").build(locale))
.retrieve()
.bodyToMono(WelcomeMessage.class)
.retryWhen(
Retry.backoff(2, Duration.ofMillis(25))
.filter(throwable -> throwable instanceof TimeoutException)
);
}
This code:
If you now run the test, it will pass. Remember to check out the source code on Github!
Nick Fisher is a software engineer in the Pacific Northwest. He focuses on building highly scalable and maintainable backend systems.