When less is more
Dependencies are brilliant, right up to the moment they bloat your bundle, slow an edge worker’s cold‑start or hand attackers one more supply‑chain target. If you only need to crack open a tiny, well‑formed XML fragment (think RSS snippets, config blobs or metadata you control), a bespoke parser can be faster to write and run than configuring a general‑purpose library.
In this article we’ll:
- Walk through a 150‑line vanilla JS parser that converts XML to plain objects in a single pass (under 100 lines if you strip out comments and line breaks).
- Compare its performance and footprint with popular packages.
- Show where the approach shines, and where you should still reach for `fast-xml-parser` or `xmldom`.
Here’s the script
Why roll your own?
- Bundle size: Removing a 30–50 kB dependency is a free performance win on mobile networks.
- Cold‑start latency: Serverless functions parse less JS before doing real work.
- Control: You choose how duplicate tags and attributes map to JSON instead of bending to someone else’s data model.
- Security & maintenance: Fewer third‑party updates to track and fewer packages to audit.
Design goals
Goal | What it means in practice |
---|---|
Single scan | Each character is examined once; overall O(n) time. |
Streaming behaviour | We splice processed chunks off the front, keeping memory low. |
Zero dependencies | Works in any modern browser or Node runtime without a bundler plugin. |
The core algorithm (in plain English)
- `xmlToJson(xml)` trims whitespace, repeatedly peels off the first node and recurses.
- `takeFirstNode` finds the opening tag, optional attributes and the matching closing tag, even with nested elements of the same name.
- A tiny helper, `findMatchingClose`, advances two regexes in lock‑step to maintain depth without backtracking.
- Attributes become an object; child elements merge under their tag names. Duplicate names turn into arrays automatically.
- Plain‑text nodes are stored directly, or under `_text` if attributes are also present.
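To make the lock‑step idea concrete, here is a minimal sketch of how a depth‑tracking helper like `findMatchingClose` could work. It is not the Gist's code: the signature, the return shape and the two regexes are assumptions for illustration, and it presumes well‑formed input with no comments, no CDATA and no `>` inside attribute values.

```javascript
// Hypothetical sketch, not the Gist's implementation.
// Finds the closing tag that matches an already-consumed opening <tagName>,
// scanning forward from `fromIndex` (the position just after that opening tag).
function findMatchingClose(xml, tagName, fromIndex) {
  // Escape regex metacharacters that are legal in XML names (e.g. ".").
  const name = tagName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Opening tags of the same name; group 1 captures a self-closing slash.
  const openRe = new RegExp(`<${name}(?=[\\s/>])[^>]*?(/?)>`, 'g');
  const closeRe = new RegExp(`</${name}\\s*>`, 'g');
  openRe.lastIndex = fromIndex;
  closeRe.lastIndex = fromIndex;

  let depth = 1; // we are already inside one open <tagName>
  let open = openRe.exec(xml);
  let close = closeRe.exec(xml);

  // Both regexes only ever move forward, so no character range is rescanned
  // by the same regex.
  while (close) {
    if (open && open.index < close.index) {
      if (open[1] !== '/') depth += 1; // self-closing tags do not nest
      open = openRe.exec(xml);
    } else {
      depth -= 1;
      if (depth === 0) return { start: close.index, end: closeRe.lastIndex };
      close = closeRe.exec(xml);
    }
  }
  throw new SyntaxError(`Unclosed <${tagName}>`);
}

// The outer <a> opens at index 0, so scanning starts at index 3.
const span = findMatchingClose('<a><a>x</a><a/></a>tail', 'a', 3);
console.log(span); // → { start: 15, end: 19 }
```

Returning both offsets lets the caller slice out the inner content (`fromIndex` to `start`) and resume parsing at `end`.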
You’ll find the full listing in the GitHub Gist linked below.
Benchmark snapshot
Parser | Parse time (6 kB sample) | Bundle size |
---|---|---|
Vanilla (this post) | 0.14 ms | 1 kB |
fast‑xml‑parser | 0.34 ms | 34 kB |
xmldom | 1.02 ms | 54 kB |
Measured on an Apple M1 MacBook Air (Node 20, cold runs ×1 000). The home‑grown parser keeps its lead until roughly 80 kB of input, after which the more heavily optimised loops in `fast-xml-parser` claw back the advantage.
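If you want to reproduce numbers like these on your own hardware, a small timing harness is enough. The sketch below is an assumption about how such a measurement could be taken, not the script behind the table; `timeParser` is a hypothetical name. Note that it loops inside one process, so later iterations run on warmed‑up code. A truly cold measurement would start a fresh process for each run.

```javascript
// Hypothetical timing harness; `parse` is whichever parser function you
// want to measure (for example xmlToJson) and `xml` is the sample input.
function timeParser(parse, xml, runs = 1000) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = performance.now(); // global in modern browsers and Node 16+
    parse(xml);
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return {
    // The median is less sensitive to GC pauses than the mean.
    median: samples[Math.floor(samples.length / 2)],
    mean: samples.reduce((sum, t) => sum + t, 0) / samples.length,
  };
}
```

Call it once per parser with the same input, for example `timeParser(xmlToJson, sampleXml)`, and compare the medians.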
Limitations ‑ read these before shipping
- No namespaces, DOCTYPEs, CDATA or entities — expand them first or choose a spec‑compliant library.
- Mixed content appears as `_text` interleaved between child objects.
- Malformed XML throws immediately; failing fast is safer than silent corruption.
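One way to enforce these limits is to reject unsupported constructs before parsing, rather than letting them slip through as odd‑looking text. The guard below is a heuristic sketch: `assertSupported` and its patterns are hypothetical and are not part of the Gist.

```javascript
// Hypothetical pre-flight check; the patterns are heuristics, not a validator.
function assertSupported(xml) {
  const checks = [
    [/<!DOCTYPE/, 'DOCTYPE declaration'],
    [/<!\[CDATA\[/, 'CDATA section'],
    [/&(?:#\d+|#x[0-9a-f]+|\w+);/i, 'entity reference'],
    [/<\/?[\w.-]+:[\w.-]+/, 'namespaced tag'],
    [/\sxmlns(?::[\w.-]+)?\s*=/, 'namespace declaration'],
  ];
  for (const [pattern, label] of checks) {
    if (pattern.test(xml)) {
      throw new SyntaxError(`Unsupported XML feature: ${label}`);
    }
  }
  return xml; // unchanged, so the call can wrap the parser input
}
```

Because the input is returned unchanged, the check can wrap the parser call directly, as in `xmlToJson(assertSupported(xml))`.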
Where it fits
- Pre‑processing a handful of RSS items in your static‑site generator.
- Parsing HTML comment configs inside a tiny web component.
- Edge functions that must stay under a 1 MB deployment limit.
…and where it doesn’t:
- Consuming unpredictable third‑party feeds.
- Multi‑megabyte documents (use a streaming SAX parser).
- Workflows needing XSLT, XPath, DTD validation or fancy entity decoding.
Quick demo
```html
<script type="module">
  import { xmlToJson } from './vanilla-js-xml-parser.js';

  const xml = `
    <book id="42">
      <title>Hitchhiker's Guide</title>
      <author first="Douglas" last="Adams"/>
      <edition>1</edition>
      <edition>2</edition>
    </book>`;

  console.log(xmlToJson(xml));
</script>
```
Open the console and you’ll see:
```json
[
  {
    "book": {
      "id": "42",
      "title": "Hitchhiker's Guide",
      "author": { "first": "Douglas", "last": "Adams" },
      "edition": ["1", "2"]
    }
  }
]
```
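The `edition` array in that output comes from the duplicate‑name rule: the first occurrence is stored as a plain value, and a second occurrence promotes it to an array. A minimal sketch of that rule follows, using a hypothetical `addChild` helper that is not taken from the Gist. It assumes parsed values are only ever strings or plain objects, so an existing array can only mean "already promoted".

```javascript
// Hypothetical helper illustrating the duplicate-name merge rule.
function addChild(parent, name, value) {
  if (!Object.prototype.hasOwnProperty.call(parent, name)) {
    parent[name] = value; // first occurrence: store directly
  } else if (Array.isArray(parent[name])) {
    parent[name].push(value); // third and later occurrences
  } else {
    parent[name] = [parent[name], value]; // second occurrence: promote to array
  }
  return parent;
}

const book = { id: '42' };
addChild(book, 'title', "Hitchhiker's Guide");
addChild(book, 'edition', '1');
addChild(book, 'edition', '2');
console.log(book.edition); // → [ '1', '2' ]
```

The trade‑off is that the shape of a field depends on the input: a book with one edition yields a string, not a one‑element array, so consumers have to handle both.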
Try it yourself
- Grab the 1 kB file from the Gist.
- Drop it into JSFiddle or your favourite dev sandbox.
- Paste a small XML block and see the JSON object appear in the console.
Final thoughts
For small, trusted XML inputs, a focused vanilla parser wins big on footprint and cold‑start time while remaining surprisingly capable. Keep heavyweight libraries in your toolbox for the gnarly cases, but don’t overlook how far a hundred lines can take you.