Testing · Swift Code Composer

Question 3

Testing asynchronous & concurrent code

How do you reliably test async/await code and concurrency-heavy features to avoid flakiness and race conditions?

Answer outline

Flaky async tests almost always come from nondeterminism — real timers, real concurrency, or shared mutable state. The fix is to eliminate it rather than try to wait long enough for correctness.

There are five techniques that cover the majority of cases:

1.Inject a controllable clock — replace Task.sleep and real timers with a TestClock or ImmediateClock so the test controls time instead of waiting for it.
2.Use async test functions with direct await — prefer func test() async throws over XCTestExpectation; always await the exact work being tested so assertions run after completion, not during it.
3.Control task lifetimes — avoid fire-and-forget Task {} in production code; use withTaskGroup so callers (and tests) can await all spawned work to guaranteed completion.
4.Isolate shared mutable state — use actor-isolated types or thread-safe fakes; repeat the scenario 1 000× and run with Thread Sanitizer to surface data races early.
5.Inject every async dependency — networking, databases, queues, and clocks should all be swappable for deterministic fakes, letting tests simulate failures and force ordering.

Principles

Inject a TestClock instead of sleeping — your test controls time, not the scheduler.
Prefer async test functions with direct await; avoid mixing XCTestExpectation with structured concurrency.
Use Thread Sanitizer and repeat scenarios to surface race conditions reliably.
Test cancellation explicitly — cancel a task and assert the system under test respected it.
One concern per async test — every extra layer multiplies race surfaces.

A RetryHandler waits 2 seconds between attempts. Injecting a no-op TestClock means the test runs instantly — no real waiting:

Clock injection: retry with delay

protocol ClockType {
    func sleep(for duration: Duration) async throws
}

struct SystemClock: ClockType {
    func sleep(for duration: Duration) async throws {
        try await Task.sleep(for: duration)
    }
}

struct TestClock: ClockType {
    func sleep(for duration: Duration) async throws { /* no-op */ }
}

// Production type
struct RetryHandler {
    let clock: any ClockType

    func fetch(maxAttempts: Int, work: () async throws -> Data) async throws -> Data {
        var lastError: Error?
        for attempt in 1...maxAttempts {
            do {
                return try await work()
            } catch {
                lastError = error
                if attempt < maxAttempts {
                    try await clock.sleep(for: .seconds(2)) // skipped in tests
                }
            }
        }
        throw lastError!
    }
}

// Test — completes immediately despite the 2-second retry delays
func testRetriesOnFailure() async throws {
    var callCount = 0
    let sut = RetryHandler(clock: TestClock())

    _ = try await sut.fetch(maxAttempts: 3) {
        callCount += 1
        if callCount < 3 { throw URLError(.notConnectedToInternet) }
        return Data()
    }

    XCTAssertEqual(callCount, 3)
}

Cancel a task and assert the SUT observed it — don’t assume cleanup happens automatically:

Testing cancellation explicitly

func testCancellationIsRespected() async {
    let task = Task {
        await sut.longRunningWork()
    }
    task.cancel()
    await Task.yield()
    XCTAssertTrue(sut.didCancel)
}