Evaluating Claude's abilities in concurrent programming.
After Anthropic released its newest model last week, the feedback was mostly positive. Some compared it to GPT-4, while others even suspected that Claude was self-aware. With that in mind, I decided to test it in one of the most challenging areas of programming: concurrency.
The problem
There are 3 services that capture temperature: analog, infrared, and digital:
private final AnalogSensor analogSensor;
private final DigitalSensor digitalSensor;
private final InfraredSensor infraredSensor;
Each sensor can return a temperature by calling the getTemp method. This method is a blocking call, which means that data retrieval may take some time (you can imagine that it's a network request to some online temperature service).
The task is to implement readTemperature() with a total timeout of 500ms. The method must return a result if at least two of the three services respond within that timeout.
The entire code:
public class TemperatureService {
    private final AnalogSensor analogSensor;
    private final DigitalSensor digitalSensor;
    private final InfraredSensor infraredSensor;
    public TemperatureService(AnalogSensor analogSensor,
                              DigitalSensor digitalSensor,
                              InfraredSensor infraredSensor) {
        this.analogSensor = analogSensor;
        this.digitalSensor = digitalSensor;
        this.infraredSensor = infraredSensor;
    }
    /**
     * This method calls three underlying services to get temperature snapshot. It should return within 500ms, otherwise
     * there's no point in waiting for the response and {@link TimeoutException} should be thrown.
     * <p>
     * TemperatureSnapshot should contain at least two internal responses but more is better.
     * E.g. analog + digital or analog + infrared or digital + infrared is a minimum.
     */
    TemperatureResult readTemperature() throws TimeoutException {
        BigDecimal analog = analogSensor.getTemp();
        BigDecimal digital = digitalSensor.getTemp();
        BigDecimal infrared = infraredSensor.getTemp();
        return new TemperatureResult(analog, digital, infrared);
    }
    static class TemperatureResult {
        final BigDecimal analog;
        final BigDecimal digital;
        final BigDecimal infrared;
        TemperatureResult(BigDecimal analog, BigDecimal digital, BigDecimal infrared) {
            this.analog = analog;
            this.digital = digital;
            this.infrared = infrared;
        }
    }
    interface AnalogSensor extends TempSensor {}
    interface DigitalSensor extends TempSensor {}
    interface InfraredSensor extends TempSensor {}
    
    interface TempSensor {
        BigDecimal getTemp();
    }
}
Another requirement was to use Java 8, to see how well Claude can distinguish between different API versions.
First prompt
For the evaluation, I used the claude-3-opus-20240229 model, which was the most advanced one available at the time of writing.
System prompt
You are a professional Java developer with deep knowledge of  
concurrent programming. You have been given some code.
[THE INSERTED CODE]
User
Implement readTemperature method following the discription from  
Java Doc
Output
1. Call the three underlying services concurrently to get the temperature  
results.
2. Wait for a maximum of 500ms for the responses.
3. Ensure that at least two internal responses are available before  
returning the `TemperatureResult`.
4. Throw a `TimeoutException` if the method takes longer than 500ms  
to complete.
The explanation is correct but a bit abstract. The generated code was more interesting:
TemperatureResult readTemperature() throws TimeoutException {
    long startTime = System.currentTimeMillis();
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(infraredSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        try {
            analog = analogFuture.get(500, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        try {
            digital = digitalFuture.get(500 -
                            (System.currentTimeMillis() - startTime),
                    TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        try {
            infrared = infraredFuture.get(500 -
                            (System.currentTimeMillis() - startTime),
                    TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Ignore and proceed with other sensors
        }
        if (
            (analog != null && digital != null) ||
            (analog != null && infrared != null) ||
            (digital != null && infrared != null)
        ) {
            return new TemperatureResult(analog, digital, infrared);
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    } finally {
        executor.shutdownNow();
    }
}
The interesting aspect here is that the services are called in separate threads, but the results are retrieved sequentially through blocking get calls. This is incorrect: if the first service takes 3 seconds to return, the first get alone blocks for the entire 500ms budget, so the method cannot return as soon as the other two results are ready. I clicked the Add to Conversation button and tried to be more specific this time.
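To make the effect of sequential blocking get calls concrete, here is a minimal sketch. The class SequentialGetDemo and its method are hypothetical and only simulate two calls with Thread.sleep; they are not part of the original service. The sketch measures how long it takes before a fast result can even be looked at when a slow future is waited on first.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of the original TemperatureService.
class SequentialGetDemo {

    // Returns the elapsed milliseconds until the fast result is read,
    // given that the slow future is waited on first.
    static long millisUntilFastResultIsRead(long slowDelayMs, long fastDelayMs, long budgetMs)
            throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            long start = System.nanoTime();
            Future<String> slow = executor.submit(() -> {
                Thread.sleep(slowDelayMs);
                return "slow";
            });
            Future<String> fast = executor.submit(() -> {
                Thread.sleep(fastDelayMs);
                return "fast";
            });
            try {
                // Blocks the calling thread for the whole budget.
                slow.get(budgetMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Expected here: the slow task needs longer than the budget.
            }
            // The fast task finished long ago, but it is only read now.
            fast.get();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            executor.shutdownNow(); // interrupts the task that is still sleeping
        }
    }

    public static void main(String[] args) throws Exception {
        long elapsed = millisUntilFastResultIsRead(3000, 100, 500);
        System.out.println("fast result read after ~" + elapsed + " ms");
    }
}
```

With a 3000ms slow task, a 100ms fast task, and a 500ms budget, the fast result should only be read after roughly the full budget has passed, even though it was ready much earlier.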
Second prompt
System prompt
[COPIED FROM ABOVE]
User
It works for the majority of cases but fails for one specific use  
case: when two services return in 100ms, but one takes 3s. This is a  
valid use case by definition and should return the values of those two  
results without throwing a timeout exception. Please adjust the code  
accordingly.
Output
Apologies for the oversight. You're right, the previous implementation would throw a TimeoutException even if two services return within the specified time limit. Here's the adjusted code to handle the case where two services return quickly, and one takes longer:
TemperatureResult readTemperature() throws TimeoutException {
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(digitalSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        long startTime = System.currentTimeMillis();
        long remainingTime = 500;
        while (remainingTime > 0 && hasPendingResults(analog, digital, infrared)) {
            if (analog == null) {
                try {
                    analog = analogFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (digital == null) {
                try {
                    digital = digitalFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (infrared == null) {
                try {
                    infrared = infraredFuture.get(remainingTime, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            remainingTime = 500 - (System.currentTimeMillis() - startTime);
        }
        if ((analog != null && digital != null) 
                || (analog != null && infrared != null) 
                || (digital != null && infrared != null)) {
            return new TemperatureResult(analog, digital, infrared);
        } else {
            throw new TimeoutException("Could not retrieve at least two temperature " +
                    "results within 500ms");
        }
    } finally {
        executor.shutdownNow();
    }
}
In this iteration, Claude introduced a while loop bounded by the 500ms budget, but every get call inside it is still blocking: if one of the service calls takes more than 500ms, the get on it blocks for up to the full 500ms. This time, I tried to be more precise about why this approach wouldn't work.
Third prompt. We got it
System prompt
[COPIED FROM ABOVE]
User
The code still doesn't work because the `get` method is a blocking call, and you receive results sequentially. If the first call returns in 3 seconds, you will automatically fail the result, but this is incorrect. Please fix the code.
Output
You're absolutely right. I apologize for the confusion. Here's the corrected implementation that avoids the blocking get calls and waits for the futures to complete concurrently:
TemperatureResult readTemperature() throws TimeoutException {
    ExecutorService executor = Executors.newFixedThreadPool(3);
    Future<BigDecimal> analogFuture = executor.submit(analogSensor::getTemp);
    Future<BigDecimal> digitalFuture = executor.submit(digitalSensor::getTemp);
    Future<BigDecimal> infraredFuture = executor.submit(infraredSensor::getTemp);
    try {
        BigDecimal analog = null;
        BigDecimal digital = null;
        BigDecimal infrared = null;
        long startTime = System.currentTimeMillis();
        long remainingTime = 500;
        while (remainingTime > 0 &&
                (analog == null || digital == null || infrared == null)) {
            if (analog == null && analogFuture.isDone()) {
                try {
                    analog = analogFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (digital == null && digitalFuture.isDone()) {
                try {
                    digital = digitalFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if (infrared == null && infraredFuture.isDone()) {
                try {
                    infrared = infraredFuture.get();
                } catch (InterruptedException | ExecutionException e) {
                    // Ignore and proceed with other sensors
                }
            }
            if ((analog != null && digital != null)
                    || (analog != null && infrared != null)
                    || (digital != null && infrared != null)) {
                return new TemperatureResult(analog, digital, infrared);
            }
            remainingTime = 500 - (System.currentTimeMillis() - startTime);
            if (remainingTime > 0) {
                Thread.sleep(10); // Sleep for a short duration to avoid busy waiting
            }
        }
        throw new TimeoutException("Could not retrieve at least " +
                "two temperature results within 500ms");
    } finally {
        executor.shutdownNow();
    }
}
Even though the code didn't compile (for one, Thread.sleep throws the checked InterruptedException, which the method neither catches nor declares), it did what was expected of it: it checks the result of each service call until two results are available or the 500ms run out, and then prepares the result. A neat addition was the use of Thread.sleep to avoid busy-waiting between checks.
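One way the polling approach could be made to compile is to catch the InterruptedException around Thread.sleep and restore the interrupt flag. The sketch below is my own generalization, not Claude's output: PollingSketch, awaitAtLeast, readIfDone, sensor, and demo are hypothetical names, and the sensors are simulated with Thread.sleep.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, not part of the original TemperatureService.
class PollingSketch {

    // Polls until at least `required` futures hold a value or the budget is used up.
    // The returned list has one slot per future; a slot is null if no value was available.
    static <T> List<T> awaitAtLeast(List<Future<T>> futures, int required, long budgetMs)
            throws TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMs);
        while (true) {
            List<T> results = new ArrayList<>();
            int available = 0;
            for (Future<T> future : futures) {
                T value = readIfDone(future);
                results.add(value);
                if (value != null) {
                    available++;
                }
            }
            if (available >= required) {
                return results;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                        "Fewer than " + required + " results within " + budgetMs + "ms");
            }
            try {
                Thread.sleep(10); // short pause between polls to avoid busy-waiting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                throw new TimeoutException("Interrupted while waiting for results");
            }
        }
    }

    // Returns the value of a finished future, or null if it is pending, failed, or cancelled.
    private static <T> T readIfDone(Future<T> future) {
        if (!future.isDone()) {
            return null;
        }
        try {
            return future.get(); // returns immediately because the future is done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException | CancellationException e) {
            return null;
        }
    }

    // Stub task standing in for a sensor call that takes delayMs to answer.
    static Callable<BigDecimal> sensor(String value, long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return new BigDecimal(value);
        };
    }

    // Runs three stub sensors with the given delays and waits for at least two values.
    static List<BigDecimal> demo(long budgetMs, long d1, long d2, long d3)
            throws TimeoutException {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            List<Future<BigDecimal>> futures = Arrays.asList(
                    executor.submit(sensor("20.1", d1)),
                    executor.submit(sensor("20.3", d2)),
                    executor.submit(sensor("20.2", d3)));
            return awaitAtLeast(futures, 2, budgetMs);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws TimeoutException {
        System.out.println(demo(500, 100, 100, 3000));
    }
}
```

Treating an interrupt as a timeout is a design choice made for this sketch; declaring InterruptedException on the method would be an equally reasonable option if the signature were allowed to change.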
Next prompt. Further improvements
At this point I had a working result, but I didn't like its style. For the next iteration, I wanted to see more of the concurrency API available in Java 8.
System prompt
[COPIED FROM ABOVE]
User
This code is correct, thanks. Could you simplify or rework it using  
only Java 8 functionality?
Output
Certainly! Here's a simplified version of the code using Java 8 features, specifically the CompletableFuture class:
TemperatureResult readTemperature() throws TimeoutException {
    CompletableFuture<BigDecimal> analogFuture = 
            CompletableFuture.supplyAsync(analogSensor::getTemp);
    
    CompletableFuture<BigDecimal> digitalFuture = 
            CompletableFuture.supplyAsync(digitalSensor::getTemp);
    
    CompletableFuture<BigDecimal> infraredFuture = 
            CompletableFuture.supplyAsync(infraredSensor::getTemp);
    CompletableFuture<TemperatureResult> temperatureResultFuture = CompletableFuture
            .allOf(analogFuture, digitalFuture, infraredFuture)
            .thenApply(v -> {
                BigDecimal analog = analogFuture.getNow(null);
                BigDecimal digital = digitalFuture.getNow(null);
                BigDecimal infrared = infraredFuture.getNow(null);
                return new TemperatureResult(analog, digital, infrared);
            });
    try {
        return temperatureResultFuture.get(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException("Error retrieving temperature result", e);
    } catch (TimeoutException e) {
        TemperatureResult result = temperatureResultFuture.getNow(null);
        
        if (result != null && 
                atLeastTwoNonNull(result.analog, result.digital, result.infrared)) {
            return result;
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    }
}
This solution again fails with the notorious 3s service call. In that situation allOf does not complete within the timeout, so thenApply is never reached and the method ends up throwing a TimeoutException. I'm not sure why Claude forgot about this use case, but I decided to mention it again.
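The behavior of allOf with one slow input can be checked in isolation. The following is a small sketch with simulated delays; AllOfDemo, combinedResultAfter, and sleepQuietly are hypothetical names, and the dedicated executor is my own choice rather than something taken from the generated code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of the original TemperatureService.
class AllOfDemo {

    // Starts one simulated call per delay, combines them with allOf, and reports
    // what the combined future holds once the budget has passed.
    static String combinedResultAfter(long budgetMs, long... delaysMs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(delaysMs.length);
        try {
            CompletableFuture<?>[] calls = new CompletableFuture<?>[delaysMs.length];
            for (int i = 0; i < delaysMs.length; i++) {
                long delay = delaysMs[i];
                calls[i] = CompletableFuture.runAsync(() -> sleepQuietly(delay), executor);
            }
            // The thenApply stage only runs after every input has completed.
            CompletableFuture<String> combined =
                    CompletableFuture.allOf(calls).thenApply(ignored -> "all done");
            try {
                return combined.get(budgetMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // getNow returns the fallback because the combined future is still pending.
                return combined.getNow("not completed");
            }
        } finally {
            executor.shutdownNow(); // interrupts calls that are still sleeping
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(combinedResultAfter(500, 100, 100, 100));
        System.out.println(combinedResultAfter(500, 100, 100, 3000));
    }
}
```

With three fast calls the combined stage should finish inside the budget; with one 3000ms call the combined future should still be pending after 500ms, so a getNow(null) in a timeout handler has nothing to return.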
Last prompt. We got it a second time
System prompt
[COPIED FROM ABOVE]
User
This solution fails for the case when the first service returns in 3 seconds.
Output
Certainly! Here's a simplified version of the code using Java 8 features, specifically the CompletableFuture class:
TemperatureResult readTemperature() throws TimeoutException {
    CompletableFuture<BigDecimal> analogFuture = CompletableFuture.supplyAsync(analogSensor::getTemp);
    CompletableFuture<BigDecimal> digitalFuture = CompletableFuture.supplyAsync(digitalSensor::getTemp);
    CompletableFuture<BigDecimal> infraredFuture = CompletableFuture.supplyAsync(infraredSensor::getTemp);
    CompletableFuture<TemperatureResult> temperatureResultFuture = CompletableFuture
            .anyOf(
                CompletableFuture.allOf(analogFuture, digitalFuture),
                CompletableFuture.allOf(analogFuture, infraredFuture),
                CompletableFuture.allOf(digitalFuture, infraredFuture))
            .thenApply(v -> {
        
        BigDecimal analog = analogFuture.getNow(null);
        BigDecimal digital = digitalFuture.getNow(null);
        BigDecimal infrared = infraredFuture.getNow(null);
        return new TemperatureResult(analog, digital, infrared);
    });
    try {
        return temperatureResultFuture.get(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException("Error retrieving temperature result", e);
    } catch (TimeoutException e) {
        TemperatureResult result = temperatureResultFuture.getNow(null);
        
        if (result != null && 
                atLeastTwoNonNull(result.analog, result.digital, result.infrared)) {
            
            return result;
        } else {
            throw new TimeoutException("Could not retrieve " +
                    "at least two temperature results within 500ms");
        }
    }
}
Here, I was surprised for the second time. Beyond its general knowledge of the Java concurrency package, Claude composed anyOf with a list of allOf calls, something I probably wouldn't have come up with myself, and I really liked it.
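For readers who want to run the anyOf over allOf idea on its own, here is a self-contained sketch. It is not the generated code: AtLeastTwoSketch, readAtLeastTwo, and sensor are hypothetical names, the sensors are simulated with Thread.sleep, the result is a plain array instead of TemperatureResult, and the dedicated executor is an assumption of mine (the generated code relies on the default executor of supplyAsync). Failing sensors are not handled specially.

```java
import java.math.BigDecimal;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original TemperatureService.
class AtLeastTwoSketch {

    // Returns {analog, digital, infrared}; a slot is null if that sensor had not answered
    // by the time the first pair was complete. Throws TimeoutException if no pair
    // completes within budgetMs.
    static BigDecimal[] readAtLeastTwo(Supplier<BigDecimal> analog,
                                       Supplier<BigDecimal> digital,
                                       Supplier<BigDecimal> infrared,
                                       long budgetMs) throws TimeoutException {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            CompletableFuture<BigDecimal> analogFuture =
                    CompletableFuture.supplyAsync(analog, executor);
            CompletableFuture<BigDecimal> digitalFuture =
                    CompletableFuture.supplyAsync(digital, executor);
            CompletableFuture<BigDecimal> infraredFuture =
                    CompletableFuture.supplyAsync(infrared, executor);

            // Each allOf completes when its pair is done; anyOf completes with the first pair.
            CompletableFuture<BigDecimal[]> firstPair = CompletableFuture
                    .anyOf(CompletableFuture.allOf(analogFuture, digitalFuture),
                           CompletableFuture.allOf(analogFuture, infraredFuture),
                           CompletableFuture.allOf(digitalFuture, infraredFuture))
                    .thenApply(ignored -> new BigDecimal[] {
                            analogFuture.getNow(null),
                            digitalFuture.getNow(null),
                            infraredFuture.getNow(null)});

            return firstPair.get(budgetMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for sensors", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A sensor call failed", e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts sensor calls that are still running
        }
    }

    // Stub sensor that answers with `value` after delayMs.
    static Supplier<BigDecimal> sensor(String value, long delayMs) {
        return () -> {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return new BigDecimal(value);
        };
    }

    public static void main(String[] args) throws TimeoutException {
        BigDecimal[] result = readAtLeastTwo(
                sensor("20.1", 3000), sensor("20.3", 100), sensor("20.2", 100), 500);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

In the main method the analog stub takes 3000ms, so the array should contain null in the first slot and the two fast values in the others. Because get either returns a completed pair or throws TimeoutException, no extra check is needed in a timeout handler.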
Conclusion
Even though the generated code was not the most idiomatic Java, and Claude wasn't able to produce a correct solution on the first try, I still came away with a really good impression of working with Claude. After trying Phind and IntelliJ IDEA AI Assistant on the same problem, I can say that Claude is probably the most capable programming model available on the market. It's not ideal, but at this pace, we will be there really soon.