v0.2.6 · MIT · JDK 11+ · JDK 21 native

Callback-based async control flow for Java that plays nicely with Loom.

A Java port of the Node.js async library. Compose Parallel, Series, Waterfall, Race, Map, Reduce, Queue, and Lock into pipelines. Roughly 50 µs of overhead per orchestration. Backed by virtual threads when you want them.

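// fan out two enrichment lookups, score, serialize
// (abridged: in compiled code, declare `tasks` with the library's task type; `var` cannot infer a lambda's target type)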
final var tasks = List.of(
  c -> exec.submit(() -> c.success(lookupA(req))),
  c -> exec.submit(() -> c.success(lookupB(req)))
);

Asyncc.Parallel(tasks, wrap(results -> {
  var scored = score(req, results.get(0), results.get(1));
  reply.send(serialize(scored));
}));

Why async.java

Three properties you can rely on.

Most async-coordination libraries on the JVM grew out of pre-Loom assumptions: they own their thread pool, they assume long-running flows, and they layer many frames between you and your code. async.java picks a different point in the design space.

// near-zero overhead

~50 µs per orchestration

No actor mailbox, no graph materialisation, no per-call scheduling layer. Asyncc.Parallel is a heap allocation + a couple of atomic increments + your callback. The library never gets in the way.
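To make the "allocation plus a couple of atomic increments" claim concrete, here is a minimal sketch of a fan-out combinator built only on the JDK. It is not the library's source: `ParallelSketch`, `Cb`, `Task`, and `parallel` are hypothetical names, and the sketch assumes every task invokes its continuation exactly once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

final class ParallelSketch {

  /** Error-first continuation handed to each task (hypothetical shape). */
  interface Cb<T> { void done(Throwable err, T value); }

  /** A task receives a continuation and must eventually call it once. */
  interface Task<T> { void run(Cb<T> c); }

  static <T> void parallel(List<Task<T>> tasks, BiConsumer<Throwable, List<T>> fin) {
    final int n = tasks.size();
    if (n == 0) { fin.accept(null, List.of()); return; }

    // Pre-sized result list: slots are only overwritten, never appended, so no resize occurs.
    final List<T> slots = new ArrayList<>(Collections.<T>nCopies(n, null));
    final AtomicInteger completed = new AtomicInteger();
    final AtomicBoolean finished = new AtomicBoolean();

    for (int i = 0; i < n; i++) {
      final int idx = i;
      tasks.get(i).run((err, value) -> {
        if (err != null) {
          // First error wins; the guard keeps the final callback at-most-once.
          if (finished.compareAndSet(false, true)) fin.accept(err, null);
          return;
        }
        slots.set(idx, value);                      // write the slot first...
        if (completed.incrementAndGet() == n        // ...then publish it through the counter
            && finished.compareAndSet(false, true)) {
          fin.accept(null, slots);
        }
      });
    }
  }

  public static void main(String[] args) {
    List<Task<String>> tasks = List.of(
        c -> c.done(null, "a"),
        c -> c.done(null, "b"));
    parallel(tasks, (err, results) -> System.out.println(err == null ? results : err));
  }
}
```

The per-call cost in this sketch is one list, two atomics, and one lambda per task; nothing is queued on a shared structure.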

// virtual-thread native

Loom is a co-processor, not a replacement

Pass Executors.newVirtualThreadPerTaskExecutor() to NeoQueue or your tasks and every fan-out spawns on a virtual thread. The library handles the orchestration; Loom handles the threads.

// predictable

At-most-once final callback

Hardened in v0.2.x with dedup guards, atomic counters, slot-write-before-counter-increment ordering, and a v0.2.4 fix for the ArrayList resize race under high-throughput fan-out. Adversarial fuzz tests pin the at-most-once contract across all combinators.
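One common way to implement a dedup guard is a compare-and-set flag in front of the user's callback. The sketch below illustrates that idea with JDK classes only; `OnceSketch.once` is a hypothetical helper, not a method the library is known to expose.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

final class OnceSketch {

  /** Wraps an error-first callback so that only the first invocation reaches it. */
  static <T> BiConsumer<Throwable, T> once(BiConsumer<Throwable, T> delegate) {
    final AtomicBoolean fired = new AtomicBoolean();
    return (err, value) -> {
      // compareAndSet succeeds for exactly one caller, even under contention.
      if (fired.compareAndSet(false, true)) {
        delegate.accept(err, value);
      }
      // Any later call (duplicate success, late error) is dropped.
    };
  }

  public static void main(String[] args) {
    BiConsumer<Throwable, String> cb = once((err, v) -> System.out.println("final: " + v));
    cb.accept(null, "first");
    cb.accept(null, "second"); // ignored by the guard
  }
}
```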

// v0.2.4 ergonomics

c.success(v) / c.fail(e)

Shorthand for c.done(null, v) and c.done(e, null). The continuation parameter is named c — short for continuation — everywhere in the docs.
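The shorthand can be pictured as two default methods on an error-first callback interface. This is a sketch of the shape described above, assuming a single abstract `done(err, value)` method; the interface name `Cb` is hypothetical and the library's real type may differ.

```java
final class CallbackSketch {

  /** Error-first continuation with the two shorthands layered on top of done(). */
  @FunctionalInterface
  interface Cb<T> {
    void done(Throwable err, T value);

    /** Same as done(null, value). */
    default void success(T value) { done(null, value); }

    /** Same as done(err, null). */
    default void fail(Throwable err) { done(err, null); }
  }

  public static void main(String[] args) {
    Cb<String> c = (err, v) -> System.out.println(err == null ? "value: " + v : "error: " + err.getMessage());
    c.success("hello");
    c.fail(new IllegalStateException("boom"));
  }
}
```

Because the shorthands are default methods, a lambda can still implement the interface through `done` alone.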

// no boilerplate

WrapErrFirst.wrap(...)

Wrap a value-only consumer into an error-first callback and skip the if (err != null)... preamble. Throws on unhandled errors; pair with an explicit error consumer if you want both branches.
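The following sketch shows what such a wrapper might look like, using `BiConsumer` as a stand-in for the library's callback type. Both `wrap` overloads and the choice of `IllegalStateException` are assumptions made for illustration; the source only says that the one-argument form throws on unhandled errors.

```java
import java.util.function.BiConsumer;
import java.util.function.Consumer;

final class WrapSketch {

  /** Value-only form: an error has nowhere to go, so it is rethrown (exception type is an assumption). */
  static <T> BiConsumer<Throwable, T> wrap(Consumer<T> onValue) {
    return (err, value) -> {
      if (err != null) throw new IllegalStateException("unhandled async error", err);
      onValue.accept(value);
    };
  }

  /** Two-branch form: errors go to onError, values go to onValue. */
  static <T> BiConsumer<Throwable, T> wrap(Consumer<Throwable> onError, Consumer<T> onValue) {
    return (err, value) -> {
      if (err != null) onError.accept(err);
      else onValue.accept(value);
    };
  }

  public static void main(String[] args) {
    BiConsumer<Throwable, String> cb = WrapSketch.<String>wrap(
        e -> System.out.println("failed: " + e.getMessage()),
        v -> System.out.println("got: " + v));
    cb.accept(null, "result");
    cb.accept(new RuntimeException("lookup timed out"), null);
  }
}
```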

// composability

Combinators nest cleanly

Waterfall wrapping a Map wrapping a Parallel wrapping a Race is a perfectly normal pipeline — they all use the same error-first callback shape. See the composability showcase.

Install

JitPack (live within minutes of a git tag).

Add the JitPack repository and pin the version. Releases are signed git tags on the main repo; see the releases page for the latest.

<!-- pom.xml -->
<repositories>
  <repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
  </repository>
</repositories>

<dependency>
  <groupId>com.github.async-java</groupId>
  <artifactId>async.java</artifactId>
  <version>v0.2.6</version>
</dependency>

For Gradle, see the JitPack page for v0.2.6. The library targets JDK 11 and is tested on JDK 11, 17, and 21.

Combinators

A small, composable surface.

Every combinator takes tasks (or values) and an error-first final callback. Compose them freely — they nest without surprises because they all honor the same at-most-once final-callback contract.

Asyncc.Parallel, Asyncc.ParallelLimit, Asyncc.Series, Asyncc.Waterfall, Asyncc.Race, Asyncc.Times, Asyncc.Each, Asyncc.Map, Asyncc.FilterMap, Asyncc.Reduce, Asyncc.GroupBy, Asyncc.Concat, Asyncc.Inject, Asyncc.Whilst, Asyncc.DoWhilst, NeoQueue, NeoLock, NeoRwLock

Examples for each →

Benchmark

async.java vs Akka Streams under load.

Same 5-stage pipeline, both orchestrators, 60-second sustained WebSocket runs from a Rust load tester. Numbers are end-to-end round-trip latency (parse → validate → enrich ∥ → score → serialize) on JDK 21 with a virtual-thread executor. Full methodology in the load-curve post.

offered load            library        p50        p99        max        drops
500 msg/s (50 × 10)     async.java     5.7 ms     14.3 ms    46 ms      0
                        akka-streams   17.8 ms    30.7 ms    55 ms      0
1 000 msg/s (200 × 5)   async.java     5.1 ms     14.8 ms    21 ms      0
                        akka-streams   5.9 ms     54.3 ms    100 ms     0
2 500 msg/s (50 × 50)   async.java     5.0 ms     11.5 ms    18 ms      0
                        akka-streams   2 017 ms   5 230 ms   6 258 ms   ~14 %

The gap is dispatcher queue-wait. async.java's per-call overhead doesn't enqueue anything onto a shared contended structure, so it stays flat as load grows. Akka Streams' per-call runWith queues a fresh actor mailbox; under saturation the queue depth itself becomes the tail latency. Read the full breakdown.

Project Loom

Callbacks are continuations now.

Loom changed what "blocking" costs. It didn't change what coordinating a fan-out costs. async.java handles the coordination; Loom handles the threads. The two compose cleanly.

// One executor for the whole app. Spawning a virtual thread takes ~250 ns, so it is essentially free.
final var vt = Executors.newVirtualThreadPerTaskExecutor();

// Optional: route NeoQueue defaults through VTs too.
NeoQueue.setExecutor(vt);

// Now every task is a virtual thread. Blocking I/O inside a task is a continuation
// park, not a kernel thread block. The orchestration is still callbacks.
Asyncc.ParallelLimit(8, fetchTasks, (err, results) -> {
  // ...
});
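To show how callback orchestration and an executor divide the work, here is a self-contained sketch of a concurrency-limited fan-out. It is an illustration under stated assumptions, not the library's `ParallelLimit`: `LimitSketch`, `Cb`, `Task`, and `parallelLimit` are hypothetical names. The demo uses a cached thread pool so it runs on JDK 11; on JDK 21 the same code works with `Executors.newVirtualThreadPerTaskExecutor()`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

final class LimitSketch {

  interface Cb<T> { void done(Throwable err, T value); }
  interface Task<T> { void run(Cb<T> c); }

  /** Runs tasks with at most `limit` in flight; assumes each task calls back exactly once. */
  static <T> void parallelLimit(int limit, List<Task<T>> tasks, BiConsumer<Throwable, List<T>> fin) {
    final int n = tasks.size();
    if (n == 0) { fin.accept(null, List.of()); return; }

    final List<T> slots = new ArrayList<>(Collections.<T>nCopies(n, null));
    final AtomicInteger next = new AtomicInteger();       // index of the next task to start
    final AtomicInteger completed = new AtomicInteger();
    final AtomicBoolean finished = new AtomicBoolean();
    final Runnable[] pump = new Runnable[1];              // holder so the lambda can refer to itself

    pump[0] = () -> {
      final int idx = next.getAndIncrement();
      if (idx >= n || finished.get()) return;             // nothing left, or already failed
      tasks.get(idx).run((err, value) -> {
        if (err != null) {
          if (finished.compareAndSet(false, true)) fin.accept(err, null);
          return;
        }
        slots.set(idx, value);                            // slot write before counter increment
        if (completed.incrementAndGet() == n) {
          if (finished.compareAndSet(false, true)) fin.accept(null, slots);
        } else {
          pump[0].run();                                  // a finished task frees one slot
        }
      });
    };

    for (int i = 0; i < Math.min(limit, n); i++) pump[0].run();
  }

  public static void main(String[] args) throws InterruptedException {
    // On JDK 21+, swap in Executors.newVirtualThreadPerTaskExecutor().
    ExecutorService exec = Executors.newCachedThreadPool();
    List<Task<Integer>> tasks = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
      final int k = i;
      tasks.add(c -> exec.execute(() -> c.done(null, k * k)));
    }
    CountDownLatch latch = new CountDownLatch(1);
    parallelLimit(2, tasks, (err, results) -> {
      System.out.println(err == null ? results : err);
      latch.countDown();
    });
    latch.await();
    exec.shutdown();
  }
}
```

The combinator never creates threads itself. Each task decides where it runs, which is why swapping the executor is the only change needed to move the fan-out onto virtual threads. Note that tasks completing synchronously recurse through the pump, so a production version would need to bound stack depth.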

For the full Loom-integration story — structured concurrency vs. callbacks, ThreadLocal vs ScopedValue, why NeoLock is still relevant — see the README's Project Loom section.
