{"id":403,"date":"2026-03-09T13:22:18","date_gmt":"2026-03-09T13:22:18","guid":{"rendered":"https:\/\/blog.rebalai.com\/en\/2026\/03\/09\/webassembly-in-2026-where-it-actually-makes-sense\/"},"modified":"2026-03-18T22:30:14","modified_gmt":"2026-03-18T22:30:14","slug":"webassembly-in-2026-where-it-actually-makes-sense","status":"publish","type":"post","link":"https:\/\/blog.rebalai.com\/en\/2026\/03\/09\/webassembly-in-2026-where-it-actually-makes-sense\/","title":{"rendered":"WebAssembly in 2026: Where It Actually Makes Sense to Replace JavaScript"},"content":{"rendered":"<p>I have been meaning to write this post for months. I kept putting it off because every time I sat down, the answer felt more nuanced than I wanted it to be \u2014 and nuanced answers don&#8217;t get clicks. But here we are.<\/p>\n<p>So: I spent about six weeks last fall doing a real evaluation of WebAssembly across several different projects. Not toy benchmarks. Actual work problems, on a five-person team building an AI-assisted code review product. My takeaway is not &#8220;Wasm is finally ready&#8221; or &#8220;Wasm is overhyped.&#8221; It&#8217;s more specific than either of those, and specificity is the whole point.<\/p>\n<h2>The Compute-Bound Cases Where Wasm Actually Delivers<\/h2>\n<p>The honest answer to &#8220;where does Wasm win?&#8221; is: wherever you&#8217;re doing heavy computation that doesn&#8217;t need to touch the DOM, and where startup time isn&#8217;t your bottleneck.<\/p>\n<p>Image processing is the canonical example and it holds up. We use a Wasm-compiled custom image pipeline for thumbnail generation in our CI preview system. Pure Rust, compiled with wasm-pack 0.13.1, targeting <code>wasm32-unknown-unknown<\/code>. 
Compared to the equivalent JS implementation, we see roughly 4\u20135x throughput improvement on the image manipulation steps. That matches what I&#8217;ve seen elsewhere, so no surprise there.<\/p>\n<p>What&#8217;s less talked about is compression. We switched from a JS implementation of zstd to a Wasm-compiled version late last year. Before: decompressing large artifact bundles took around 340ms in profiling. After: 80\u201390ms. Real numbers, real workload \u2014 your mileage may vary depending on hardware. The main artifact is a ~2.4MB compressed blob, if that gives context.<\/p>\n<p>Cryptographic operations are another clean win, if you need algorithms outside what the Web Crypto API covers. For everything in SubtleCrypto, just use that. But we had a specific need for a custom hash function for deduplication (legacy system, long story), and the Wasm version was night-and-day faster.<\/p>\n<pre><code class=\"language-rust\">\/\/ wasm-exposed hash function, simplified\nuse wasm_bindgen::prelude::*;\n\n#[wasm_bindgen]\npub fn compute_dedup_hash(data: &amp;[u8]) -&gt; String {\n    \/\/ custom parameterized blake3 variant\n    \/\/ this was the bottleneck \u2014 ~20ms per call in the JS version\n    let hash = my_custom_hasher::hash_with_params(data, DEDUP_PARAMS);\n    hex::encode(hash)\n}\n<\/code><\/pre>\n<p>The call overhead from JS into Wasm is real \u2014 I&#8217;ll get to that \u2014 but when the computation per call is expensive enough, the boundary cost gets swamped. That threshold is somewhere around 5\u201310ms of actual computation in my experience. 
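<\/p>\n<p>On the JS side, calling an export like the hash above goes through wasm-bindgen&#8217;s generated glue. A minimal sketch, assuming a wasm-pack build with <code>--target web<\/code> (the package path is hypothetical):<\/p>\n<pre><code class=\"language-javascript\">\/\/ hypothetical package name: wasm-pack writes the glue module into .\/pkg\nimport init, { compute_dedup_hash } from \".\/pkg\/dedup_hash.js\";\n\nawait init(); \/\/ fetch and instantiate the .wasm binary once, up front\n\nconst bytes = new TextEncoder().encode(\"artifact contents to fingerprint\");\n\/\/ The Uint8Array is copied into Wasm linear memory on every call;\n\/\/ that copy is the boundary cost discussed above.\nconst hash = compute_dedup_hash(bytes);\n<\/code><\/pre>\n<p>The rule of thumb again: each call needs 5\u201310ms of real work to be worth the trip. 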
Below that, you start paying more in overhead than you save.<\/p>\n<p>Practical takeaway: if you&#8217;re doing image processing, codec work, compression, or custom crypto in the browser or Node, Wasm is a straightforward choice. The tooling (wasm-pack, Emscripten 3.x) is mature enough that the setup cost is real but predictable.<\/p>\n<h2>The AI Inference Story Is Better Than I Expected \u2014 With One Catch<\/h2>\n<p>This is where I expected to get burned, and I sort of did, but not in the way I anticipated.<\/p>\n<p>We run small classification models client-side for a few features in our product. Mostly for privacy reasons: code snippets contain customer IP and we&#8217;d rather not round-trip them. I tested ONNX Runtime Web&#8217;s Wasm backend against the WebGPU backend for our specific models \u2014 a couple of fine-tuned DistilBERT-class models, nothing huge; the largest is about 45MB.<\/p>\n<p>The Wasm backend was more consistent than I expected. Not fast, exactly, but consistent. WebGPU is theoretically faster, but in practice \u2014 at least when we evaluated, running ONNX Runtime Web v1.18 \u2014 the WebGPU backend had initialization variance that made it hard to use for features where latency predictability matters more than raw throughput. One model showed WebGPU at 3x faster steady state, but with a 1.2s cold start on Chrome and a 4s cold start on Firefox 134. 
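<\/p>\n<p>For reference, pinning ONNX Runtime Web to its Wasm backend is a one-line choice at session creation, and releasing the session afterwards matters. A sketch; the model path, input name, and tensor are hypothetical:<\/p>\n<pre><code class=\"language-javascript\">import * as ort from \"onnxruntime-web\";\n\n\/\/ Force the Wasm execution provider for predictable latency.\nconst session = await ort.InferenceSession.create(\"classifier.onnx\", {\n  executionProviders: [\"wasm\"],\n});\nconst results = await session.run({ input: inputTensor });\n\/\/ Release explicitly: Wasm linear memory is not reclaimed by the JS GC.\nawait session.release();\n<\/code><\/pre>\n<p>Backend choice aside, a cold start measured in seconds is hard to hide. 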
That&#8217;s a real problem when the feature is supposed to feel instant.<\/p>\n<p>The Wasm backend had roughly 200\u2013300ms cold start and ~110ms inference per call on our test machine (M2 MacBook Air \u2014 not exactly representative of user hardware, I know). Slower, but predictable.<\/p>\n<p>One thing I noticed: ONNX Runtime Web&#8217;s Wasm build as of v1.18 includes SIMD support by default, and that made a meaningful difference \u2014 roughly 30\u201340% on our workloads compared to the non-SIMD build. If you&#8217;re on an older version, check whether you&#8217;re getting <code>ort-wasm-simd-threaded.wasm<\/code> or the fallback. It matters.<\/p>\n<p>Here&#8217;s where I got genuinely caught out, though. I thought startup time would be my main problem. It turned out that memory pressure \u2014 running multiple inference sessions simultaneously \u2014 was the thing that actually bit us. Wasm linear memory doesn&#8217;t interact with the browser&#8217;s GC the same way the JS heap does. We had workers leaking memory across sessions, which took me an embarrassingly long time to track down. The fix was trivial: call <code>session.release()<\/code> after inference completes. Obvious in retrospect, completely invisible in the moment.<\/p>\n<p>Bottom line for client-side ML: Wasm is the pragmatic choice right now if you need broad browser support and predictable latency. WebGPU will probably flip that calculus by late 2026, but it&#8217;s not there yet across the range of hardware and browser versions our users actually run.<\/p>\n<h2>Where Wasm Keeps Disappointing Me<\/h2>\n<p>Right, so \u2014 the wins are real but narrowly scoped. 
Here&#8217;s where I&#8217;ve repeatedly reached for Wasm and pulled my hand back.<\/p>\n<p>Startup time is the killer for anything that needs to feel fast. Our first Wasm attempt was for syntax highlighting. I thought: the existing JS implementation is a hotspot in our profiles, Wasm should help. I pushed this on a Friday afternoon and had to revert before end of day. The .wasm binary was 480KB (after wasm-opt), initialization was adding 60\u201380ms to first meaningful paint (measured with PerformanceObserver, not a gut feeling), and users noticed. The JS library we were replacing was tree-shakeable to ~15KB with zero startup cost. That comparison was humiliating.<\/p>\n<p>The JS-to-Wasm boundary tax is under-discussed. Every call from JS into Wasm has overhead. For primitives, it&#8217;s small. But passing strings or arrays \u2014 anything requiring serialization into linear memory \u2014 adds up fast. We benchmarked passing 50KB of text data into a Wasm function 100 times: serialization overhead alone was about 8ms per call. For a function computing in 2ms, that&#8217;s catastrophic.<\/p>\n<pre><code class=\"language-javascript\">\/\/ naive: serialization overhead dominates\nfor (const chunk of chunks) {\n  result.push(wasmModule.processChunk(chunk)); \/\/ serializes chunk on every call\n}\n\n\/\/ better: one boundary crossing, process everything in Wasm\nconst merged = mergeChunks(chunks);          \/\/ one JS allocation\nconst out = wasmModule.processAll(merged);   \/\/ one boundary crossing\n\/\/ parse out back into JS objects\n<\/code><\/pre>\n<p>DOM manipulation \u2014 obvious, but I&#8217;ll say it anyway. If your code touches the DOM at all, Wasm doesn&#8217;t help. There&#8217;s no shortcut here regardless of what you&#8217;ve read about Wasm Components. The DOM lives in JS land. If you&#8217;re trying to speed up rendering or layout, look at virtualization, CSS containment, <code>content-visibility<\/code>. 
Wasm won&#8217;t touch those problems.<\/p>\n<p>AssemblyScript is worth mentioning because TypeScript-to-Wasm sounds genuinely appealing. I experimented with it for about two weeks. The experience was rough \u2014 tooling gaps, limited stdlib, and the mental model mismatch between AS and TypeScript is bigger than the syntax similarity suggests. I haven&#8217;t gone back since.<\/p>\n<h2>The Developer Experience Tax That Doesn&#8217;t Show Up in Benchmarks<\/h2>\n<p>Shipping Wasm adds real ongoing maintenance overhead, and I don&#8217;t see this acknowledged often enough.<\/p>\n<p>Debugging Wasm in browser devtools is passable if you have DWARF symbols embedded, but nowhere near JS debugging ergonomics. Wasm-pack&#8217;s source map output is inconsistent across my team&#8217;s setups \u2014 we have a mix of Intel and ARM Macs plus Linux in CI, and getting consistent debug symbols across all three took more configuration than I&#8217;d like to admit. I spent an afternoon tracking down why a panicking Rust function was producing an opaque error in the browser console. Turned out <code>console_error_panic_hook<\/code> wasn&#8217;t enabled in the release build. Obvious once you know. Invisible before.<\/p>\n<p>Binary size is a whole separate project. Our Rust\u2192Wasm binary before optimization: 1.1MB. 
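<\/p>\n<p>Getting it down took a small build pipeline. A sketch of the shape, with hypothetical file names:<\/p>\n<pre><code class=\"language-bash\"># build the release .wasm with wasm-pack\nwasm-pack build --release --target web\n\n# shrink it with Binaryen wasm-opt\nwasm-opt -O3 pkg\/module_bg.wasm -o pkg\/module_bg.opt.wasm\n\n# pre-compress for serving\nbrotli -q 11 -o pkg\/module_bg.opt.wasm.br pkg\/module_bg.opt.wasm\n<\/code><\/pre>\n<p>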
After <code>wasm-opt -O3<\/code>: 380KB. After brotli: 95KB. That&#8217;s fine \u2014 but getting there required adding wasm-opt to the build pipeline and figuring out the right optimization flags. <code>-O3<\/code> regressed one function&#8217;s performance due to inlining decisions, and I&#8217;m still not entirely sure why. We ended up using <code>-O2<\/code> for that specific module.<\/p>\n<p>Wasm GC shipped in Chrome and Firefox in late 2023, with Safari following in late 2024, which opens up Kotlin and Dart as Wasm targets without the historical overhead of bundling a full GC. That matters if you&#8217;re evaluating language options for a non-Rust team. For us it&#8217;s irrelevant, but worth knowing.<\/p>\n<h2>What I&#8217;d Actually Tell My Team If We Were Starting From Scratch<\/h2>\n<p>Stop asking &#8220;should we use Wasm?&#8221; and start asking: what specific computation do I need to run, and what are the latency and startup constraints on it?<\/p>\n<p>My heuristic, not a framework \u2014 just how I think about it now:<\/p>\n<p>Reach for Wasm if the operation is compute-bound (not I\/O-bound), takes more than ~5ms in pure JS, doesn&#8217;t need frequent DOM access, and startup latency is either amortized over a long session or can be hidden behind a loading state. Image processing, compression, custom crypto, ML inference \u2014 yes.<\/p>\n<p>Don&#8217;t reach for Wasm if the code touches the DOM, the operation is under 2ms (boundary cost will dominate), startup time matters and can&#8217;t be hidden, or the existing JS library is already optimized and tree-shakeable. 
Syntax highlighting, string formatting, most UI logic, simple JSON parsing \u2014 stay in JS.<\/p>\n<p>The case I&#8217;m watching more closely than the browser story: WASI 0.2 and the Component Model are enabling server-side Wasm in ways that actually interest me. Cloudflare Workers runs Wasm natively; Fastly&#8217;s Compute platform is built on it. Running Wasm as a compute unit at the edge removes most of the startup-time and binary-size concerns. I&#8217;m more bullish on that story than on the in-browser replacement narrative at this point.<\/p>\n<p>The honest answer to &#8220;is Wasm ready to replace JavaScript?&#8221; is no \u2014 not as a general replacement, and framing it that way was always the wrong question. Wasm fills specific, well-defined gaps. In 2026, those gaps are real, the tooling is workable, and the performance wins are genuine. But you have to know exactly what problem you&#8217;re solving before you pay the complexity tax. Most web app code shouldn&#8217;t touch Wasm. 
The parts that should \u2014 you&#8217;ll know them by their profiler traces.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I have been meaning to write this post for months.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[1],"tags":[],"class_list":["post-403","post","type-post","status-publish","format-standard","hentry","category-general"],"_links":{"self":[{"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/posts\/403","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/comments?post=403"}],"version-history":[{"count":11,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/posts\/403\/revisions"}],"predecessor-version":[{"id":572,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/posts\/403\/revisions\/572"}],"wp:attachment":[{"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/media?parent=403"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/categories?post=403"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.rebalai.com\/en\/wp-json\/wp\/v2\/tags?post=403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}