~50 µs per orchestration
No actor mailbox, no graph materialisation, no per-call scheduling layer. Asyncc.Parallel is a heap allocation + a couple of atomic increments + your callback. The library never gets in the way.
v0.2.9 · MIT · JDK 11+ · JDK 21 native
A Java port of the Node.js async library. Compose Parallel, Series, Waterfall, Race, Map, Reduce, Queue, and Lock into pipelines at roughly 50 µs of overhead per orchestration. Backed by virtual threads when you want them.
// fan out two enrichment lookups, score, serialize
// Enrichment: placeholder for the type lookupA/lookupB return.
final List<Asyncc.Task<Enrichment>> tasks = List.of(
    c -> exec.submit(() -> c.success(lookupA(req))),
    c -> exec.submit(() -> c.success(lookupB(req)))
);
Asyncc.Parallel(tasks, wrap(results -> {
    var scored = score(req, results.get(0), results.get(1));
    reply.send(serialize(scored));
}));
Why async.java
Most async-coordination libraries on the JVM grew out of pre-Loom assumptions: they own their thread pool, they assume long-running flows, and they layer many frames between you and your code. async.java picks a different point in the design space.
Pass Executors.newVirtualThreadPerTaskExecutor() to NeoQueue or your tasks and every fan-out spawns on a virtual thread. The library handles the orchestration; Loom handles the threads.
Hardened in v0.2.x with dedup guards, atomic counters, slot-write-before-counter-increment ordering, and a v0.2.4 fix for the ArrayList resize race under high-throughput fan-out. Adversarial fuzz tests pin the at-most-once contract across all combinators.
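The guarantees named above (dedup guards, atomic counters, slot write before counter increment) can be pictured with a small sketch. This is not the library's source: `AtMostOnceParallelSketch`, `Cont`, and `parallel` are hypothetical names, and the code only assumes the general shape of an error-first fan-out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Illustrative sketch only; every name here is hypothetical, not the library's API.
final class AtMostOnceParallelSketch {

    /** Error-first continuation handed to each task. */
    interface Cont<T> {
        void done(Throwable err, T value);
    }

    static <T> void parallel(List<Consumer<Cont<T>>> tasks,
                             BiConsumer<Throwable, List<T>> finalCb) {
        final int n = tasks.size();
        if (n == 0) {
            finalCb.accept(null, new ArrayList<>());
            return;
        }
        // Pre-sized slots: each task writes only its own index, so nothing resizes mid-flight.
        @SuppressWarnings("unchecked")
        final T[] slots = (T[]) new Object[n];
        final AtomicInteger completed = new AtomicInteger();
        final AtomicBoolean finalFired = new AtomicBoolean();

        for (int i = 0; i < n; i++) {
            final int index = i;
            final AtomicBoolean taskFired = new AtomicBoolean(); // per-task dedup guard
            tasks.get(i).accept((err, value) -> {
                if (!taskFired.compareAndSet(false, true)) {
                    return; // the task called its continuation twice; ignore the repeat
                }
                if (err != null) {
                    if (finalFired.compareAndSet(false, true)) {
                        finalCb.accept(err, null);
                    }
                    return;
                }
                slots[index] = value; // 1. write the slot first
                // 2. then bump the counter; whoever observes n also observes every slot
                if (completed.incrementAndGet() == n
                        && finalFired.compareAndSet(false, true)) {
                    finalCb.accept(null, new ArrayList<>(Arrays.asList(slots)));
                }
            });
        }
    }

    /** Runs one well-behaved task and one that calls its continuation twice. */
    static String demo() {
        final AtomicInteger calls = new AtomicInteger();
        final StringBuilder out = new StringBuilder();
        final List<Consumer<Cont<String>>> tasks = new ArrayList<>();
        tasks.add(c -> { c.done(null, "a"); c.done(null, "a-again"); });
        tasks.add(c -> c.done(null, "b"));
        parallel(tasks, (err, results) -> {
            calls.incrementAndGet();
            out.append(results);
        });
        return "calls=" + calls.get() + " results=" + out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Writing the slot before incrementing the counter matters because the atomic increment is the publication point: the thread that sees the count reach `n` is guaranteed to see every earlier slot write.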
c.success(v) / c.fail(e): Shorthand for c.done(null, v) and c.done(e, null). The continuation parameter is named c (short for continuation) everywhere in the docs.
WrapErrFirst.wrap(...): Wrap a value-only consumer into an error-first callback and skip the if (err != null)... preamble. Throws on unhandled errors; pair with an explicit error consumer if you want both branches.
Waterfall wrapping a Map wrapping a Parallel wrapping a Race is a perfectly normal pipeline — they all use the same error-first callback shape. See the composability showcase.
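To make the success/fail shorthand and the wrap helper concrete, here is a minimal sketch of how such an error-first callback could be shaped. `Callback` and both `wrap` overloads are hypothetical stand-ins written for this page; the real `WrapErrFirst` signatures and the exception type it throws may differ.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins; not the library's actual types.
final class ErrFirstSketch {

    @FunctionalInterface
    interface Callback<T> {
        void done(Throwable err, T value);

        /** Shorthand for done(null, value). */
        default void success(T value) {
            done(null, value);
        }

        /** Shorthand for done(err, null). */
        default void fail(Throwable err) {
            done(err, null);
        }
    }

    /** Value-only wrapper: throws if an error arrives, since nobody handles it. */
    static <T> Callback<T> wrap(Consumer<T> onValue) {
        return (err, value) -> {
            if (err != null) {
                throw new IllegalStateException("unhandled async error", err);
            }
            onValue.accept(value);
        };
    }

    /** Two-branch wrapper: routes errors to an explicit consumer instead of throwing. */
    static <T> Callback<T> wrap(Consumer<T> onValue, Consumer<Throwable> onError) {
        return (err, value) -> {
            if (err != null) {
                onError.accept(err);
            } else {
                onValue.accept(value);
            }
        };
    }

    static String demo() {
        final StringBuilder log = new StringBuilder();
        final Callback<String> both = wrap(
                v -> log.append("value=").append(v),
                e -> log.append(" error=").append(e.getMessage()));
        both.success("ok");
        both.fail(new RuntimeException("boom"));
        try {
            ErrFirstSketch.<String>wrap(v -> log.append(" unreachable"))
                    .fail(new RuntimeException("x"));
        } catch (IllegalStateException expected) {
            log.append(" threw");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```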
Two changes in v0.2.9, both diagnosed from AsyncFut.Whilst’s production behavior:
1. NeoWhilst.RunMap race fixed. A sync-completing body (AsyncFut.Whilst with an already-completed CompletableFuture, common in tests and cache-hit paths) dispatched one extra body call past short-circuit. The truth test ran in two places: inside the per-task done callback (which already recurses if the loop should continue) and in a post-m.run block intended for async-body fan-out at limit > 1. For sync-completing bodies the post-m.run test would re-fire after the chain had already settled. It is now gated on s.isShortCircuited() || taskRunner.isFinished(); the async-body fan-out path is unchanged.
2. Concat/ConcatSeries/ConcatLimit/ConcatDeep/ConcatDeepSeries/ConcatDeepLimit task-list variants widened to List<? extends AsyncTask<T, E>>. This is the same ? extends treatment we applied to Parallel/Series/ParallelLimit in v0.2.8-rc2. A List<Asyncc.Task<T>> (the Throwable-fixed shorthand) now flows into all nine Concat overloads without an explicit cast or defensive copy. The internal NeoParallel/NeoSeries methods were widened too, so the public-API defensive ArrayList copy could be elided: one fewer allocation per Asyncc.Parallel/Series/ParallelLimit call.
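Why the `? extends` widening removes the cast and the copy comes down to Java generics being invariant. The sketch below uses stand-in interfaces that borrow the names `AsyncTask` and `Task` purely for illustration; they are not the library's definitions, and `runAll` is a hypothetical method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Stand-in types for illustration; not the library's definitions.
final class WildcardWideningSketch {

    /** Two-parameter task: value type T, error type E. */
    interface AsyncTask<T, E> {
        void run(BiConsumer<E, T> cb);
    }

    /** Shorthand that fixes the error type to Throwable. */
    interface Task<T> extends AsyncTask<T, Throwable> {
    }

    // Invariant parameter: accepts exactly List<AsyncTask<T, E>> and nothing else.
    static <T, E> List<T> runAllInvariant(List<AsyncTask<T, E>> tasks) {
        return runAll(tasks);
    }

    // Widened parameter: also accepts List<Task<T>>, because Task<T> is a subtype of AsyncTask.
    static <T, E> List<T> runAll(List<? extends AsyncTask<T, E>> tasks) {
        final List<T> values = new ArrayList<>();
        for (AsyncTask<T, E> task : tasks) {
            task.run((err, value) -> {
                if (err == null) {
                    values.add(value);
                }
            });
        }
        return values;
    }

    static String demo() {
        final List<Task<String>> shorthand = new ArrayList<>();
        shorthand.add(cb -> cb.accept(null, "a"));
        shorthand.add(cb -> cb.accept(null, "b"));
        // runAllInvariant(shorthand); // would not compile: List<Task<String>> is not a
        //                             // List<AsyncTask<String, Throwable>>
        return runAll(shorthand).toString(); // compiles with no cast and no copy
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the invariant signature, a caller holding the shorthand list has to copy it into a list of the wider element type first. The wildcard signature reads the list in place, which is where the saved allocation comes from.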
Read the full deep-dive: Tracking down a Whilst race.
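The shape of the Whilst race can be reproduced in a deliberately simplified, single-threaded model. Everything below is hypothetical (`WhilstGuardSketch`, `runBody`, `onDone`, the `settled` flag); it is not the NeoWhilst source, only a sketch of a loop with two continuation points where the second one runs after a synchronous chain has already finished.

```java
// Simplified, single-threaded model; hypothetical names, not the NeoWhilst source.
final class WhilstGuardSketch {
    private final int stopAt;      // the body asks to short-circuit on this call
    private final boolean guarded; // whether the post-run block checks the settled flag
    private int bodyCalls;
    private boolean settled;

    WhilstGuardSketch(int stopAt, boolean guarded) {
        this.stopAt = stopAt;
        this.guarded = guarded;
    }

    /** Truth test: on its own it never says stop; only the short-circuit ends the loop. */
    private boolean test() {
        return true;
    }

    private void runBody() {
        bodyCalls++;
        final boolean shortCircuit = bodyCalls >= stopAt;
        // Sync-completing body: its done callback fires before runBody() returns.
        onDone(shortCircuit);
    }

    private void onDone(boolean shortCircuit) {
        if (settled) {
            return; // completion arriving after the loop already ended
        }
        if (shortCircuit || !test()) {
            settled = true; // the final callback would fire here, once
            return;
        }
        runBody(); // continuation point 1: recurse from the done callback
    }

    /** Returns how many times the body was dispatched. */
    int start() {
        runBody();
        // Continuation point 2: meant to top up in-flight bodies when the body is async.
        if ((!guarded || !settled) && test()) {
            runBody();
        }
        return bodyCalls;
    }

    public static void main(String[] args) {
        System.out.println(new WhilstGuardSketch(3, false).start()); // unguarded
        System.out.println(new WhilstGuardSketch(3, true).start());  // guarded
    }
}
```

In this model the unguarded run dispatches the body once more after the short-circuit, while the guarded run stops at the short-circuit. For a body that really is asynchronous, the chain has not settled when `start()` reaches the second continuation point, so the guard leaves that path alone.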
192 tests, 0 failures, 2 JDK 21-gated skips.
Install
Add the JitPack repository and pin the version. Releases are signed git tags on the main repo; see the releases page for the latest.
<!-- pom.xml -->
<repositories>
  <repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
  </repository>
</repositories>
<dependency>
  <groupId>com.github.async-java</groupId>
  <artifactId>async.java</artifactId>
  <version>v0.2.9</version>
</dependency>
For Gradle, see the JitPack page for v0.2.9. The library targets JDK 11 but is tested on 11, 17, and 21.
Combinators
Every combinator takes tasks (or values) and an error-first final callback. Compose them freely — they nest without surprises because they all honor the same at-most-once final-callback contract.
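The nesting claim follows from the shapes lining up: a combinator's final callback has the same form as a task's continuation, so a whole combinator call can be handed to an outer combinator as one task. A minimal sketch, using a hypothetical `series` written for this page rather than the library's Series:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature combinator; not the library's implementation.
final class NestingSketch {

    interface Cont<T> {
        void done(Throwable err, T value);
    }

    interface Task<T> {
        void run(Cont<T> c);
    }

    /** Runs tasks one after another and stops at the first error. */
    static <T> void series(List<Task<T>> tasks, Cont<List<T>> finalCb) {
        step(tasks, 0, new ArrayList<>(), finalCb);
    }

    private static <T> void step(List<Task<T>> tasks, int i, List<T> acc,
                                 Cont<List<T>> finalCb) {
        if (i == tasks.size()) {
            finalCb.done(null, acc);
            return;
        }
        tasks.get(i).run((err, value) -> {
            if (err != null) {
                finalCb.done(err, null);
                return;
            }
            acc.add(value);
            step(tasks, i + 1, acc, finalCb);
        });
    }

    /** Adapts a whole series call into a single task for an outer combinator. */
    static Task<String> joined(List<Task<String>> inner) {
        return c -> series(inner, (err, parts) ->
                c.done(err, err == null ? String.join("+", parts) : null));
    }

    static String demo() {
        final List<Task<String>> left = List.of(c -> c.done(null, "a"), c -> c.done(null, "b"));
        final List<Task<String>> right = List.of(c -> c.done(null, "c"));
        final List<Task<String>> outer = List.of(joined(left), joined(right));
        final StringBuilder out = new StringBuilder();
        series(outer, (err, results) -> out.append(results));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

An error in an inner series travels through `c.done(err, null)` into the outer series, which stops and reports it exactly as it would for any single failing task.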
Benchmark
Same 5-stage pipeline, both orchestrators, 60-second sustained WebSocket runs from a Rust load tester. Numbers are end-to-end round-trip latency (parse → validate → enrich ∥ → score → serialize) on JDK 21 with a virtual-thread executor. Full methodology in the load-curve post.
| offered load | library | p50 | p99 | max | drops |
|---|---|---|---|---|---|
| 500 msg/s (50 × 10) | async.java | 5.7 ms | 14.3 ms | 46 ms | 0 |
| 500 msg/s (50 × 10) | akka-streams | 17.8 ms | 30.7 ms | 55 ms | 0 |
| 1 000 msg/s (200 × 5) | async.java | 5.1 ms | 14.8 ms | 21 ms | 0 |
| 1 000 msg/s (200 × 5) | akka-streams | 5.9 ms | 54.3 ms | 100 ms | 0 |
| 2 500 msg/s (50 × 50) | async.java | 5.0 ms | 11.5 ms | 18 ms | 0 |
| 2 500 msg/s (50 × 50) | akka-streams | 2 017 ms | 5 230 ms | 6 258 ms | ~14 % |
The gap is dispatcher queue-wait. async.java's per-call overhead doesn't enqueue anything onto a shared
contended structure, so it stays flat as load grows. Akka Streams' per-call runWith queues a fresh actor
mailbox; under saturation the queue depth itself becomes the tail latency. Read
the full breakdown.
Project Loom
Loom changed what "blocking" costs. It didn't change what coordinating a fan-out costs. async.java handles the coordination; Loom handles the threads. The two compose cleanly.
// One executor for the whole app. VT spawn is ~250 ns; cost is essentially free.
final var vt = Executors.newVirtualThreadPerTaskExecutor();
// Optional: route NeoQueue defaults through VTs too.
NeoQueue.setExecutor(vt);
// Now every task is a virtual thread. Blocking I/O inside a task is a continuation
// park, not a kernel thread block. The orchestration is still callbacks.
Asyncc.ParallelLimit(8, fetchTasks, (err, results) -> {
// ...
});
For the full Loom-integration story (structured concurrency vs. callbacks, ThreadLocal vs. ScopedValue, why NeoLock is still relevant), see the README's Project Loom section.
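The split between coordination and threads can be sketched without the library. Below, a hypothetical `parallelLimit` keeps at most `limit` tasks in flight using only callbacks and atomics, while an executor supplies the threads. The names are illustrative, per-task dedup guards are omitted for brevity, and a fixed pool is used so the sketch also runs on JDK 11; on JDK 21 the pool could be swapped for `Executors.newVirtualThreadPerTaskExecutor()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

// Hypothetical sketch; not the library's ParallelLimit.
final class ParallelLimitSketch {

    interface Cont<T> {
        void done(Throwable err, T value);
    }

    interface Task<T> {
        void run(Cont<T> c);
    }

    static <T> void parallelLimit(int limit, List<Task<T>> tasks,
                                  BiConsumer<Throwable, List<T>> finalCb) {
        final int n = tasks.size();
        if (n == 0) {
            finalCb.accept(null, new ArrayList<>());
            return;
        }
        @SuppressWarnings("unchecked")
        final T[] slots = (T[]) new Object[n];
        final AtomicInteger next = new AtomicInteger();
        final AtomicInteger completed = new AtomicInteger();
        final AtomicBoolean finalFired = new AtomicBoolean();
        final Runnable[] launchNext = new Runnable[1];
        launchNext[0] = () -> {
            final int i = next.getAndIncrement();
            if (i >= n || finalFired.get()) {
                return; // nothing left to start, or the run already ended
            }
            tasks.get(i).run((err, value) -> {
                if (err != null) {
                    if (finalFired.compareAndSet(false, true)) {
                        finalCb.accept(err, null);
                    }
                    return;
                }
                slots[i] = value;
                if (completed.incrementAndGet() == n) {
                    if (finalFired.compareAndSet(false, true)) {
                        finalCb.accept(null, new ArrayList<>(Arrays.asList(slots)));
                    }
                } else {
                    launchNext[0].run(); // each completion starts one replacement
                }
            });
        };
        for (int k = 0; k < Math.min(limit, n); k++) {
            launchNext[0].run();
        }
    }

    static String demo() {
        // On JDK 21 this could be Executors.newVirtualThreadPerTaskExecutor() instead.
        final ExecutorService exec = Executors.newFixedThreadPool(16);
        final AtomicInteger inFlight = new AtomicInteger();
        final AtomicInteger maxInFlight = new AtomicInteger();
        final List<Task<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            tasks.add(c -> {
                maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                exec.submit(() -> {
                    try {
                        Thread.sleep(5); // stands in for blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    inFlight.decrementAndGet();
                    c.done(null, id * id);
                });
            });
        }
        final CountDownLatch finished = new CountDownLatch(1);
        final int[] sum = new int[1];
        parallelLimit(4, tasks, (err, results) -> {
            for (int v : results) {
                sum[0] += v;
            }
            finished.countDown();
        });
        try {
            finished.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        exec.shutdown();
        return "sum=" + sum[0] + " withinLimit=" + (maxInFlight.get() <= 4);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The orchestration code never blocks or creates threads; it only decides when the next task may start. That is why the same coordination logic works unchanged whether the executor hands out platform threads or virtual ones.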
Recent posts
v0.2.9 fixes a sync-body race in NeoWhilst.RunMap and widens the Concat family to accept the Asyncc.Task
Where the 3-4× tail-latency gap comes from, why it isn't magic, and what it means for picking a JVM async coordination library in 2026.