v0.2.9 · MIT · JDK 11+ · JDK 21 native

Callback-based async control flow for Java that plays nicely with Loom.

A Java port of the Node.js async library. Compose Parallel, Series, Waterfall, Race, Map, Reduce, Queue, and Lock into pipelines. ~50 µs of overhead per orchestration. Backed by virtual threads when you want them.


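// fan out two enrichment lookups, score, serialize
// `var` cannot infer a lambda's type, so the list is declared explicitly.
// Enriched is a placeholder for whatever lookupA/lookupB return.
final List<Asyncc.Task<Enriched>> tasks = List.of(
  c -> exec.submit(() -> c.success(lookupA(req))),
  c -> exec.submit(() -> c.success(lookupB(req)))
);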

Asyncc.Parallel(tasks, wrap(results -> {
  var scored = score(req, results.get(0), results.get(1));
  reply.send(serialize(scored));
}));

Why async.java

Three properties you can rely on.

Most async-coordination libraries on the JVM grew out of pre-Loom assumptions: they own their thread pool, they assume long-running flows, and they layer many frames between you and your code. async.java picks a different point in the design space.

// near-zero overhead

~50 µs per orchestration

No actor mailbox, no graph materialisation, no per-call scheduling layer. An Asyncc.Parallel call costs one heap allocation, a couple of atomic increments, and your callback. The library never gets in the way.

// virtual-thread native

Loom is a co-processor, not a replacement

Pass Executors.newVirtualThreadPerTaskExecutor() to NeoQueue or your tasks and every fan-out spawns on a virtual thread. The library handles the orchestration; Loom handles the threads.

// predictable

At-most-once final callback

Hardened in v0.2.x with dedup guards, atomic counters, slot-write-before-counter-increment ordering, and a v0.2.4 fix for the ArrayList resize race under high-throughput fan-out. Adversarial fuzz tests pin the at-most-once contract across all combinators.
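The guard ordering named above (dedup guard, slot write, then counter increment) can be illustrated with a small standalone sketch. This is not the library's source: the class name, the use of `Consumer<BiConsumer<Throwable, T>>` as the task shape, and the fixed-size slot array are all assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Illustrative stand-in only; names and signatures are hypothetical, not the library's API.
final class AtMostOnceSketch {

  /** Runs every task and invokes `done` at most once with (error, results). */
  static <T> void parallel(List<Consumer<BiConsumer<Throwable, T>>> tasks,
                           BiConsumer<Throwable, List<T>> done) {
    final int n = tasks.size();
    if (n == 0) {
      done.accept(null, new ArrayList<>());
      return;
    }
    final Object[] slots = new Object[n];                 // pre-sized, so nothing resizes under contention
    final AtomicInteger completed = new AtomicInteger();
    final AtomicBoolean fired = new AtomicBoolean();      // guards the final callback

    for (int i = 0; i < n; i++) {
      final int index = i;
      final AtomicBoolean taskDone = new AtomicBoolean(); // guards this task's own callback
      tasks.get(i).accept((err, value) -> {
        if (!taskDone.compareAndSet(false, true)) {
          return;                                         // duplicate completion: ignore it
        }
        if (err != null) {
          if (fired.compareAndSet(false, true)) {
            done.accept(err, null);
          }
          return;
        }
        slots[index] = value;                             // 1. write the slot first
        if (completed.incrementAndGet() == n              // 2. then publish via the atomic counter
            && fired.compareAndSet(false, true)) {
          done.accept(null, collect(slots));
        }
      });
    }
  }

  @SuppressWarnings("unchecked")
  private static <T> List<T> collect(Object[] slots) {
    List<T> out = new ArrayList<>(slots.length);
    for (Object slot : slots) {
      out.add((T) slot);
    }
    return out;
  }

  /** The first task misbehaves and completes twice; the final callback still runs once. */
  static String demo() {
    List<Consumer<BiConsumer<Throwable, String>>> tasks = List.of(
        c -> { c.accept(null, "a"); c.accept(null, "duplicate"); },
        c -> c.accept(null, "b"));
    StringBuilder log = new StringBuilder();
    parallel(tasks, (err, results) -> log.append(results).append(';'));
    return log.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Writing the slot before incrementing the counter matters because the task that observes the final count reads every slot; the atomic increment is what makes the earlier plain writes visible to it.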

// v0.2.4 ergonomics

`c.success(v)` / `c.fail(e)`

Shorthand for c.done(null, v) and c.done(e, null). The continuation parameter is named c — short for continuation — everywhere in the docs.
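As a rough illustration of that equivalence, the shorthand can be modelled as two default methods on an error-first callback interface. The interface name `ErrFirstCallback` and the class `ShorthandSketch` are hypothetical; only the `done` / `success` / `fail` relationship is taken from the description above.

```java
// Hypothetical stand-in for the library's callback type.
interface ErrFirstCallback<T> {
  void done(Throwable e, T v);

  // success(v) forwards to done(null, v)
  default void success(T v) {
    done(null, v);
  }

  // fail(e) forwards to done(e, null)
  default void fail(Throwable e) {
    done(e, null);
  }
}

final class ShorthandSketch {

  /** Completes a callback through one of the shorthands and reports what done(...) received. */
  static String describe(boolean ok) {
    StringBuilder out = new StringBuilder();
    ErrFirstCallback<String> c = (e, v) ->
        out.append(e == null ? "value=" + v : "error=" + e.getMessage());
    if (ok) {
      c.success("42");
    } else {
      c.fail(new IllegalStateException("boom"));
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe(true));
    System.out.println(describe(false));
  }
}
```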

// no boilerplate

`WrapErrFirst.wrap(...)`

Wrap a value-only consumer into an error-first callback and skip the if (err != null)... preamble. Throws on unhandled errors; pair with an explicit error consumer if you want both branches.
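A minimal sketch of the idea, assuming the error-first callback is modelled as a `BiConsumer<Throwable, T>`. The class `WrapSketch`, the choice of `RuntimeException` for the unhandled case, and the two-argument overload's signature are all assumptions made for illustration; the real `WrapErrFirst` signatures may differ.

```java
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical re-implementation for illustration; not the library's WrapErrFirst.
final class WrapSketch {

  /** Value-only form: an error has nowhere to go, so it is rethrown. */
  static <T> BiConsumer<Throwable, T> wrap(Consumer<T> onValue) {
    return (err, value) -> {
      if (err != null) {
        throw new RuntimeException("unhandled error in wrapped callback", err);
      }
      onValue.accept(value);
    };
  }

  /** Two-branch form: the error consumer receives failures instead of a throw. */
  static <T> BiConsumer<Throwable, T> wrap(Consumer<T> onValue, Consumer<Throwable> onError) {
    return (err, value) -> {
      if (err != null) {
        onError.accept(err);
      } else {
        onValue.accept(value);
      }
    };
  }

  static String demo() {
    StringBuilder log = new StringBuilder();
    // Success path through the value-only wrapper.
    wrap((String v) -> log.append("ok:").append(v)).accept(null, "x");
    // Failure routed to an explicit error consumer.
    WrapSketch.<String>wrap(
        v -> log.append("unreachable"),
        e -> log.append("|handled:").append(e.getMessage()))
        .accept(new IllegalStateException("boom"), null);
    // Failure through the value-only wrapper is thrown to the caller.
    try {
      wrap((String v) -> log.append("unreachable"))
          .accept(new IllegalStateException("boom"), null);
    } catch (RuntimeException ex) {
      log.append("|threw:").append(ex.getCause().getMessage());
    }
    return log.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```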

// composability

Combinators nest cleanly

Waterfall wrapping a Map wrapping a Parallel wrapping a Race is a perfectly normal pipeline — they all use the same error-first callback shape. See the composability showcase.
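To see why a shared callback shape makes nesting cheap, here is a small sketch in which each combinator returns a task of the same shape it consumes. `Task`, `series`, and `race` below are hypothetical stand-ins written for this example, and the tasks complete synchronously to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

// Hypothetical mini-combinators; not the library's Series or Race.
final class NestingSketch {

  /** A task receives an error-first callback and eventually calls it. */
  interface Task<T> {
    void run(BiConsumer<Throwable, T> c);
  }

  /** Runs tasks one after another and collects their values in order. */
  static <T> Task<List<T>> series(List<Task<T>> tasks) {
    return c -> step(tasks, 0, new ArrayList<>(), c);
  }

  private static <T> void step(List<Task<T>> tasks, int i, List<T> acc,
                               BiConsumer<Throwable, List<T>> c) {
    if (i == tasks.size()) {
      c.accept(null, acc);
      return;
    }
    tasks.get(i).run((err, v) -> {
      if (err != null) {
        c.accept(err, null);
        return;
      }
      acc.add(v);
      step(tasks, i + 1, acc, c);
    });
  }

  /** Starts every task; the first completion wins and later ones are dropped. */
  static <T> Task<T> race(List<Task<T>> tasks) {
    return c -> {
      AtomicBoolean settled = new AtomicBoolean();
      for (Task<T> t : tasks) {
        t.run((err, v) -> {
          if (settled.compareAndSet(false, true)) {
            c.accept(err, v);
          }
        });
      }
    };
  }

  static String demo() {
    Task<String> a = c -> c.accept(null, "a");
    Task<String> b = c -> c.accept(null, "b");
    Task<String> x = c -> c.accept(null, "c");
    Task<String> y = c -> c.accept(null, "d");

    // A race is itself a Task, so it can be an element of a series.
    Task<String> first = race(List.of(a, b));
    Task<String> second = race(List.of(x, y));
    Task<List<String>> pipeline = series(List.of(first, second));

    StringBuilder out = new StringBuilder();
    pipeline.run((err, winners) -> out.append(winners));
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```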

What’s new in v0.2.9

Two changes, both diagnosed from AsyncFut.Whilst’s production behavior:

NeoWhilst.RunMap race fixed. A sync-completing body (AsyncFut.Whilst with an already-completed CompletableFuture — common in tests and cache-hit paths) was double-dispatching one extra body call past short-circuit. The truth-test ran in two places: inside the per-task done callback (which already recurses if the loop should continue) AND in a post-m.run block intended for async-body fan-out at limit > 1. For sync-completing bodies the post-m.run test would re-fire after the chain had already settled. Now gated on s.isShortCircuited() || taskRunner.isFinished() — the async-body fan-out path is unchanged.
Concat/ConcatSeries/ConcatLimit/ConcatDeep/ConcatDeepSeries/ConcatDeepLimit task-list variants widened to List<? extends AsyncTask<T, E>>. Same ? extends treatment we applied to Parallel/Series/ParallelLimit in v0.2.8-rc2. A List<Asyncc.Task<T>> (the Throwable-fixed shorthand) now flows into all nine Concat overloads without an explicit cast or defensive copy. Internal NeoParallel/NeoSeries methods widened too, so the public-API defensive ArrayList copy could be elided — one fewer allocation per Asyncc.Parallel/Series/ParallelLimit call.
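The effect of the `? extends` widening is ordinary Java generics variance, and can be reproduced without the library. In the sketch below, `AsyncTask`, `Task`, and `Callback` are hypothetical interfaces that only mirror the roles the text gives `AsyncTask<T, E>` and `Asyncc.Task<T>`; the two `run…` methods stand in for an overload before and after widening.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types that mirror the shape described above; not the library's declarations.
final class VarianceSketch {

  interface Callback<T, E> {
    void done(E err, T value);
  }

  /** General task: T is the value type, E the error type. */
  interface AsyncTask<T, E> {
    void run(Callback<T, E> c);
  }

  /** Shorthand that fixes the error type to Throwable. */
  interface Task<T> extends AsyncTask<T, Throwable> {}

  /** Before widening: the parameter is invariant in its element type. */
  static <T, E> List<T> runInvariant(List<AsyncTask<T, E>> tasks) {
    return runWidened(tasks);
  }

  /** After widening: any list whose elements are some subtype of AsyncTask<T, E>. */
  static <T, E> List<T> runWidened(List<? extends AsyncTask<T, E>> tasks) {
    List<T> values = new ArrayList<>();
    for (AsyncTask<T, E> task : tasks) {
      task.run((err, value) -> {
        if (err == null) {
          values.add(value);
        }
      });
    }
    return values;
  }

  static String demo() {
    List<Task<String>> shorthand = List.of(
        c -> c.done(null, "x"),
        c -> c.done(null, "y"));

    // runInvariant(shorthand);  // does not compile: List<Task<String>> is not a
    //                           // List<AsyncTask<String, Throwable>>

    // The invariant signature forces a defensive copy into the exact element type.
    List<String> viaCopy =
        runInvariant(new ArrayList<AsyncTask<String, Throwable>>(shorthand));

    // The widened signature accepts the shorthand list directly.
    List<String> direct = runWidened(shorthand);

    return viaCopy + " " + direct;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The copy in the invariant path is the kind of allocation a widened signature lets a caller skip.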

Read the full deep-dive: Tracking down a Whilst race.

192 tests, 0 failures, 2 JDK 21-gated skips.

Install

JitPack (live within minutes of a git tag).

Add the JitPack repository and pin the version. Releases are signed git tags on the main repo; see the releases page for the latest.

<!-- pom.xml -->
<repositories>
  <repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
  </repository>
</repositories>

<dependency>
  <groupId>com.github.async-java</groupId>
  <artifactId>async.java</artifactId>
  <version>v0.2.9</version>
</dependency>

For Gradle, see the JitPack page for v0.2.9. The library targets JDK 11 and is tested on 11, 17, and 21.

Combinators

A small, composable surface.

Every combinator takes tasks (or values) and an error-first final callback. Compose them freely — they nest without surprises because they all honor the same at-most-once final-callback contract.

Asyncc.Parallel Asyncc.ParallelLimit Asyncc.Series Asyncc.Waterfall Asyncc.Race Asyncc.Times Asyncc.Each Asyncc.Map Asyncc.FilterMap Asyncc.Reduce Asyncc.GroupBy Asyncc.Concat Asyncc.Inject Asyncc.Whilst Asyncc.DoWhilst NeoQueue NeoLock NeoRwLock WrapFuture AsyncFut

Examples for each →

Benchmark

async.java vs Akka Streams under load.

Same 5-stage pipeline, both orchestrators, 60-second sustained WebSocket runs from a Rust load tester. Numbers are end-to-end round-trip latency (parse → validate → enrich ∥ → score → serialize) on JDK 21 with a virtual-thread executor. Full methodology in the load-curve post.

offered load	library	p50	p99	max	drops
500 msg/s (50 × 10)	async.java	5.7 ms	14.3 ms	46 ms	0
500 msg/s (50 × 10)	akka-streams	17.8 ms	30.7 ms	55 ms	0
1 000 msg/s (200 × 5)	async.java	5.1 ms	14.8 ms	21 ms	0
1 000 msg/s (200 × 5)	akka-streams	5.9 ms	54.3 ms	100 ms	0
2 500 msg/s (50 × 50)	async.java	5.0 ms	11.5 ms	18 ms	0
2 500 msg/s (50 × 50)	akka-streams	2 017 ms	5 230 ms	6 258 ms	~14 %

The gap is dispatcher queue-wait. async.java's per-call overhead doesn't enqueue anything onto a shared contended structure, so it stays flat as load grows. Akka Streams' per-call runWith queues a fresh actor mailbox; under saturation the queue depth itself becomes the tail latency. Read the full breakdown.

Project Loom

Callbacks are continuations now.

Loom changed what "blocking" costs. It didn't change what coordinating a fan-out costs. async.java handles the coordination; Loom handles the threads. The two compose cleanly.

// One executor for the whole app. VT spawn is ~250 ns, so it is essentially free.
final var vt = Executors.newVirtualThreadPerTaskExecutor();

// Optional: route NeoQueue defaults through VTs too.
NeoQueue.setExecutor(vt);

// Now every task is a virtual thread. Blocking I/O inside a task is a continuation
// park, not a kernel thread block. The orchestration is still callbacks.
Asyncc.ParallelLimit(8, fetchTasks, (err, results) -> {
  // ...
});

For the full Loom-integration story — structured concurrency vs. callbacks, ThreadLocal vs ScopedValue, why NeoLock is still relevant — see the README's Project Loom section.
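The split between coordination and threads can be sketched without the library: a limit-bounded fan-out whose tasks may block, with the executor injected from outside. Everything below (`ParallelLimitSketch`, its `run` signature, the use of `Callable` for tasks) is an assumption made for illustration and is not how Asyncc.ParallelLimit is implemented. The demo uses a cached thread pool so it compiles on JDK 11; on JDK 21 the same code should accept `Executors.newVirtualThreadPerTaskExecutor()` unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

// Hypothetical sketch of a limit-bounded fan-out; not the library's ParallelLimit.
final class ParallelLimitSketch<T> {
  private final List<Callable<T>> jobs;
  private final Executor exec;
  private final BiConsumer<Throwable, List<T>> done;
  private final Object[] slots;
  private final AtomicInteger next = new AtomicInteger();      // index of the next job to start
  private final AtomicInteger finished = new AtomicInteger();
  private final AtomicBoolean fired = new AtomicBoolean();     // final callback guard

  private ParallelLimitSketch(List<Callable<T>> jobs, Executor exec,
                              BiConsumer<Throwable, List<T>> done) {
    this.jobs = jobs;
    this.exec = exec;
    this.done = done;
    this.slots = new Object[jobs.size()];
  }

  static <R> void run(int limit, List<Callable<R>> jobs, Executor exec,
                      BiConsumer<Throwable, List<R>> done) {
    if (limit < 1) throw new IllegalArgumentException("limit must be >= 1");
    if (jobs.isEmpty()) { done.accept(null, new ArrayList<>()); return; }
    ParallelLimitSketch<R> state = new ParallelLimitSketch<>(jobs, exec, done);
    for (int k = 0; k < Math.min(limit, jobs.size()); k++) state.startNext();
  }

  private void startNext() {
    final int i = next.getAndIncrement();
    if (i >= jobs.size() || fired.get()) return;
    exec.execute(() -> {
      try {
        slots[i] = jobs.get(i).call();          // may block; the executor decides what that costs
      } catch (Exception e) {
        if (fired.compareAndSet(false, true)) done.accept(e, null);
        return;
      }
      if (finished.incrementAndGet() == jobs.size()) {
        if (fired.compareAndSet(false, true)) done.accept(null, results());
      } else {
        startNext();                            // reuse the freed concurrency slot
      }
    });
  }

  @SuppressWarnings("unchecked")
  private List<T> results() {
    List<T> out = new ArrayList<>(slots.length);
    for (Object o : slots) out.add((T) o);
    return out;
  }

  /** Squares 0..4 with blocking jobs and records the peak number of jobs in flight. */
  static List<Integer> demo(int limit, AtomicInteger maxInFlight) {
    // On JDK 21, Executors.newVirtualThreadPerTaskExecutor() can be swapped in here.
    ExecutorService pool = Executors.newCachedThreadPool();
    AtomicInteger inFlight = new AtomicInteger();
    List<Callable<Integer>> jobs = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
      final int x = i;
      jobs.add(() -> {
        maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        Thread.sleep(20);                       // stand-in for blocking I/O
        inFlight.decrementAndGet();
        return x * x;
      });
    }
    CompletableFuture<List<Integer>> result = new CompletableFuture<>();
    run(limit, jobs, pool, (err, values) -> {
      if (err != null) result.completeExceptionally(err); else result.complete(values);
    });
    try {
      return result.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The coordination state is a slot array, two counters, and a flag; none of it depends on which kind of thread runs the jobs, which is the sense in which the two concerns compose.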

From the engineering log.

May 18, 2026
Tracking down a Whilst race, and finishing the variance pass
v0.2.9 fixes a sync-body race in NeoWhilst.RunMap and widens the Concat family to accept the Asyncc.Task shorthand. The race story is the interesting half.
May 17, 2026
async.java vs Akka Streams: a load-curve story
Where the 3-4× tail-latency gap comes from, why it isn't magic, and what it means for picking a JVM async coordination library in 2026.

All posts →