Patchbay: Browser-to-Browser Audio in 800 Lines of Code
A WebRTC deep dive. How Patchbay ships peer-to-peer audio with Cloudflare Durable Objects for signaling, AudioWorklet for zero-jank routing, and lossless PCM for musicians.
Patchbay started as a weekend experiment. Two people, one browser tab each, audio flowing peer to peer. No install, no login, no servers touching the audio. It is open source, it is still online at patchbay.eagerhq.com, and the entire thing is under 800 lines of code. This is a walk through how it works.
A room is a Durable Object. A call is a PeerConnection. The server's only job is to introduce the two peers and then get out of the way.
Three pieces, no surprises.
- Signaling. A Cloudflare Worker with Durable Objects. One Durable Object per room, routed by room code.
- STUN and TURN. Public STUN servers. TURN spun up on demand for clients behind symmetric NATs.
- Client. A single Svelte component, a handful of Web Audio nodes, zero third-party analytics.
After signaling completes, audio is strictly peer-to-peer. The server has no way to inspect it.
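As a rough sketch of the "routed by room code" step: the Worker has to turn a URL into a room code before it can pick a Durable Object. The URL shape (`/room/<code>`), the 4 to 12 character limit, and the `ROOM` binding name below are assumptions for illustration, not taken from the Patchbay source.

```typescript
// Hypothetical URL shape: /room/<code>, where <code> is 4-12 letters or digits.
const ROOM_PATH = /^\/room\/([A-Za-z0-9]{4,12})\/?$/;

// Returns the normalized room code, or null if the path is not a room URL.
function roomCodeFromPath(pathname: string): string | null {
  const code = ROOM_PATH.exec(pathname)?.[1];
  // Upper-casing makes "ab12" and "AB12" resolve to the same room.
  return code ? code.toUpperCase() : null;
}

// Inside the Worker's fetch handler, the code would then select the room,
// roughly like this (ROOM is an assumed Durable Object binding name):
//   const id = env.ROOM.idFromName(code);
//   return env.ROOM.get(id).fetch(request);
```

Because `idFromName` is deterministic, two browsers that open the same room code are handed to the same Durable Object instance without any lookup table.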
Durable Objects as rooms.
A room is a single Durable Object instance. Durable Objects give us single-threaded, strongly-consistent state per room with near-zero boilerplate.
    export class Room {
      peers = new Map<string, WebSocket>();

      async fetch(req: Request) {
        const pair = new WebSocketPair();
        const [client, server] = Object.values(pair);
        const id = crypto.randomUUID();

        server.accept();
        this.peers.set(id, server);

        server.addEventListener("message", (ev) => {
          const msg = JSON.parse(ev.data as string);
          for (const [peerId, ws] of this.peers) {
            if (peerId !== id) ws.send(JSON.stringify({ from: id, ...msg }));
          }
        });

        server.addEventListener("close", () => this.peers.delete(id));

        return new Response(null, { status: 101, webSocket: client });
      }
    }

The room relays three kinds of message: offer, answer, and ice. Nothing else touches the server after the PeerConnection is established.
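Since only three message types are ever relayed, the room could reject everything else before forwarding it. The following is a minimal sketch of such a guard; `parseSignal` and the `Signal` type are hypothetical names, and the relay shown above does not perform this check.

```typescript
type SignalType = "offer" | "answer" | "ice";
const SIGNAL_TYPES: readonly string[] = ["offer", "answer", "ice"];

interface Signal {
  type: SignalType;
  [key: string]: unknown; // sdp or candidate payload, passed through untouched
}

// Parses a raw WebSocket frame. Returns null for malformed JSON or for any
// message whose type is not one of the three relayed kinds.
function parseSignal(raw: string): Signal | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const type = (value as { type?: unknown }).type;
  if (typeof type !== "string" || !SIGNAL_TYPES.includes(type)) return null;
  return value as Signal;
}
```

In the message listener, a null result would simply be dropped instead of being broadcast to the other peer.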
Why Durable Objects
- Per-room single-threaded state means zero race conditions on join and leave.
- They hibernate when idle and cost nothing. Patchbay's monthly signaling bill is pennies even with hundreds of rooms per day.
- They migrate transparently if a Cloudflare data center goes down. No extra work to handle failover.
The classic WebRTC handshake.
    const pc = new RTCPeerConnection({
      iceServers: [
        { urls: "stun:stun.cloudflare.com:3478" },
        // TURN fetched from server on demand, short-lived creds
      ],
    });

    pc.onicecandidate = (ev) => {
      if (ev.candidate) signal.send({ type: "ice", candidate: ev.candidate });
    };

    pc.ontrack = (ev) => attachToOutput(ev.streams[0]);

    // Caller
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    signal.send({ type: "offer", sdp: offer });

    // Callee receives offer
    await pc.setRemoteDescription(offer);
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    signal.send({ type: "answer", sdp: answer });

The polite-peer pattern handles glare, where both sides try to renegotiate at the same time. We follow the perfect negotiation example from the W3C spec with minor tweaks for reconnects.
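The heart of the polite-peer pattern is one decision: when an offer arrives while we are mid-negotiation ourselves, do we ignore it or yield to it? Here is a simplified sketch of that decision as pure functions. The `NegotiationState` shape and function names are illustrative, and the spec's example tracks one more flag (for a remote answer that is still being applied) that is left out here.

```typescript
interface NegotiationState {
  polite: boolean;        // fixed per peer, e.g. the callee is polite
  makingOffer: boolean;   // true while our own offer is being created and sent
  signalingState: string; // mirrors RTCPeerConnection.signalingState
}

// An incoming offer collides if we are busy with an offer of our own.
function isOfferCollision(descType: string, s: NegotiationState): boolean {
  return descType === "offer" && (s.makingOffer || s.signalingState !== "stable");
}

// The impolite peer ignores colliding offers and keeps its own.
// The polite peer never ignores: it accepts the remote offer, and
// setRemoteDescription rolls back its own pending offer implicitly.
function shouldIgnoreOffer(descType: string, s: NegotiationState): boolean {
  return !s.polite && isOfferCollision(descType, s);
}
```

Exactly one side backs down on every collision, so glare cannot deadlock the two peers.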
The one thing that breaks.
ICE restart is where nearly every WebRTC codebase falls over. Networks change. Users walk from a cafe WiFi onto mobile. The PeerConnection does not magically recover.
- We listen for iceconnectionstatechange. On disconnected, we wait 2 seconds for a natural recovery.
- If we land in failed, the caller initiates an ICE restart with pc.restartIce() and a fresh offer with iceRestart: true.
- If the restart itself fails twice, we tear down and rebuild the whole PeerConnection. This is ugly but reliable.
Build ICE restart on day one, not sprint three. Every shipped WebRTC product lives or dies on this code path.
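The recovery ladder described above can be sketched as a small state tracker that maps each ICE connection state to the next action. The `IceRecovery` class and its action names are hypothetical, written to match the three rules in the list rather than copied from the Patchbay source.

```typescript
type IceState =
  | "new" | "checking" | "connected" | "completed"
  | "disconnected" | "failed" | "closed";

type RecoveryAction = "none" | "wait" | "restart" | "rebuild";

class IceRecovery {
  private restarts = 0;
  private readonly maxRestarts: number;

  constructor(maxRestarts = 2) {
    this.maxRestarts = maxRestarts;
  }

  // Call with pc.iceConnectionState on every iceconnectionstatechange event.
  next(state: IceState): RecoveryAction {
    switch (state) {
      case "connected":
      case "completed":
        this.restarts = 0; // healthy again, forget earlier failures
        return "none";
      case "disconnected":
        return "wait"; // caller starts the 2 second grace timer
      case "failed":
        if (this.restarts >= this.maxRestarts) {
          this.restarts = 0;
          return "rebuild"; // close and recreate the PeerConnection
        }
        this.restarts += 1;
        return "restart"; // pc.restartIce(), then send a fresh offer
      default:
        return "none";
    }
  }
}
```

Keeping the policy separate from the PeerConnection makes this path testable without a browser, which matters for code that otherwise only runs when a network drops.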
Two profiles, one toggle.
Patchbay offers two audio modes.
- HD Opus at 256 kbps. Default. Great for voice, decent for music, wide device support.
- Lossless PCM at 48 kHz. For musicians and collaborators who need frame-accurate audio. Forces a higher-bandwidth path but bypasses codec artefacts.
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: {
        echoCancellation: mode === "voice",
        noiseSuppression: mode === "voice",
        autoGainControl: mode === "voice",
        sampleRate: 48000,
      },
    });

    if (mode === "lossless") {
      const transceiver = pc.addTransceiver(stream.getAudioTracks()[0]);
      const params = transceiver.sender.getParameters();
      params.encodings = [{ maxBitrate: 1_600_000, priority: "high" }];
      await transceiver.sender.setParameters(params);
    }

For voice mode we let Chrome's built-in DSP do echo cancellation and noise suppression. For lossless we disable all of it and let the musician hear exactly what is on the wire.
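One plausible reading of the 1,600,000 bit/s cap is that it leaves room for uncompressed stereo audio. The arithmetic below assumes 16-bit samples and two channels, neither of which the article states, and it ignores packet overhead.

```typescript
// Raw PCM bitrate in bits per second: samples/s * bits/sample * channels.
function pcmBitrate(sampleRateHz: number, bitsPerSample: number, channels: number): number {
  return sampleRateHz * bitsPerSample * channels;
}

const MAX_BITRATE = 1_600_000; // the cap used in lossless mode

const monoBps = pcmBitrate(48_000, 16, 1);   // 768_000
const stereoBps = pcmBitrate(48_000, 16, 2); // 1_536_000

// True under these assumptions: stereo 16-bit PCM fits below the cap.
const stereoFits = stereoBps <= MAX_BITRATE;
```

At 24-bit depth the same stereo stream would need 2,304,000 bit/s, so the cap would have to rise if that were the target format.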
Zero-jank routing.
If you are touching audio in 2026, use AudioWorklet. ScriptProcessorNode runs on the main thread, gets blocked by anything that janks the UI, and has been deprecated for years.
    class LevelMeter extends AudioWorkletProcessor {
      process(inputs) {
        const input = inputs[0][0];
        if (!input) return true;

        let peak = 0;
        for (let i = 0; i < input.length; i++) {
          const v = Math.abs(input[i]);
          if (v > peak) peak = v;
        }

        this.port.postMessage(peak);
        return true;
      }
    }

    registerProcessor("level-meter", LevelMeter);

The level meter, any in-line filters, and the clip detector all run on the audio thread. The main thread only ever sees a post-message with a scalar value.
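On the main thread, that scalar still has to become something drawable. Below is a small sketch of the conversion, assuming the peak is a linear amplitude where 1.0 is full scale. The function names and the -60 dB meter floor are illustrative choices.

```typescript
// Converts a linear peak (0..1) to decibels relative to full scale.
function peakToDbfs(peak: number): number {
  return peak > 0 ? 20 * Math.log10(peak) : -Infinity;
}

// A sample at or above full scale counts as a clip.
function isClipping(peak: number, threshold = 1.0): boolean {
  return peak >= threshold;
}

// Maps the peak to a 0..1 meter fill, with floorDb as the empty end.
function meterFraction(peak: number, floorDb = -60): number {
  const db = peakToDbfs(peak);
  if (db <= floorDb) return 0;
  return Math.min(1, (db - floorDb) / -floorDb);
}

// Hypothetical wiring, where meterNode is the AudioWorkletNode for "level-meter":
//   meterNode.port.onmessage = (ev) => drawMeter(meterFraction(ev.data));
```

A logarithmic scale matches how loudness is perceived, so a meter drawn this way moves usefully at quiet levels instead of hugging the bottom.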
On-demand and cheap.
We only spin up TURN when ICE reports no direct path. A symmetric NAT somewhere between the peers is the usual cause.
- The client POSTs to a Worker endpoint which issues short-lived TURN creds signed with a shared secret.
- Creds expire in 10 minutes. If the call outlives that, we issue fresh ones on the next renegotiation.
- Bandwidth cost for TURN is maybe 10 percent of our monthly spend because the common case stays direct.
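A common convention for short-lived TURN credentials, used by coturn's shared-secret mode, puts the expiry timestamp in the username and derives the password as an HMAC of that username. Whether Patchbay uses this exact scheme is an assumption. The sketch below covers only the expiry bookkeeping and leaves the HMAC step as a comment. All names are hypothetical.

```typescript
const TTL_SECONDS = 10 * 60; // the 10 minute lifetime mentioned above

// Username format in this convention: "<unix expiry seconds>:<user id>".
// The matching password would be base64(HMAC-SHA1(sharedSecret, username)),
// computed on the server only so the secret never reaches the client.
function turnUsername(nowSec: number, userId: string, ttlSec: number = TTL_SECONDS): string {
  return `${nowSec + ttlSec}:${userId}`;
}

// Reads the expiry back out of a username; malformed input counts as expired.
function expiresAt(username: string): number {
  const expiry = Number.parseInt(username.split(":")[0] ?? "", 10);
  return Number.isNaN(expiry) ? 0 : expiry;
}

// True when the credentials are expired or within marginSec of expiring.
function needsRefresh(username: string, nowSec: number, marginSec = 30): boolean {
  return nowSec >= expiresAt(username) - marginSec;
}
```

The client can run `needsRefresh` before each renegotiation and request new credentials from the Worker only when it returns true.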
Simplicity as a feature.
- No accounts. A room code is the only primitive.
- No rooms over two peers. Two is the sweet spot for P2P. Adding a third participant pushes you into SFU territory and triples the complexity.
- No recording. Patchbay's whole point is that audio never hits our servers.
- No analytics. The Worker logs request counts and nothing else.
Patchbay has never been our most support-heavy project. That is not an accident.
Read the source.
The whole codebase is under 800 lines, MIT licensed, and sitting under the EagerHQ org on GitHub. Clone, deploy, extend. If you want to build something similar and want a second pair of eyes, write to hello@eagerhq.com.