Home
- Release history — what was added and when.
- Recipes — learn tricky parts of streams.
stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can:
- Parse JSON files far exceeding available memory — even individual keys, strings, and numbers can be streamed piece-wise.
- All components are optimized for throughput. See Performance for tuning tips.
- Stream using a SAX-inspired event-based API.
- Handle huge Django-like JSON database dumps.
- Support JSON Streaming (concatenated/line-delimited JSON).
Built on stream-chain for pipeline composition. TypeScript definitions are bundled.
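A minimal sketch of the typical pattern, with an illustrative file name: stream the elements of a large top-level JSON array one item at a time.

```js
const fs = require('fs');
const {chain} = require('stream-chain');
const {parser} = require('stream-json/parser.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');

const pipeline = chain([
  fs.createReadStream('huge.json'), // illustrative: a large top-level JSON array
  parser(),
  streamArray()
]);

// Only one assembled element is held in memory at a time.
pipeline.on('data', ({key, value}) => console.log(key, value));
pipeline.on('end', () => console.log('All items processed.'));
```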
Companion projects:
- stream-chain — pipeline composition from streams, functions, and generators.
- stream-csv-as-json — streams CSV files in a stream-json-compatible format.
- stream-join — merges side channels for complex pipeline topologies.
Both CommonJS require() and ESM import are supported. Always include the .js extension in paths — see FAQ for details.
```js
// CommonJS
const {parser} = require('stream-json/parser.js');
const {pick} = require('stream-json/filters/pick.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');
const {chain} = require('stream-chain');
```

```js
// ESM
import {parser} from 'stream-json/parser.js';
import {pick} from 'stream-json/filters/pick.js';
import {streamArray} from 'stream-json/streamers/stream-array.js';
import chain from 'stream-chain';
```

The rest of this page is an overview and cheat sheet. Click component names for detailed API docs and examples.
The main module returns a factory function that creates a parser decorated with emit().
Parser is the core — a streaming JSON parser that consumes text and produces a token stream. Both standard JSON and JSON Streaming are supported.
```js
const {parser} = require('stream-json');
const pipeline = fs.createReadStream('data.json').pipe(parser.asStream());
```

For invalid JSON input, use Verifier to pinpoint the exact error position.
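Because the main-module parser comes decorated with `emit()`, token events can be observed directly on the pipeline. A minimal sketch, assuming the decoration carries over to the stream returned by `asStream()`:

```js
const fs = require('fs');
const {parser} = require('stream-json');

const pipeline = fs.createReadStream('data.json').pipe(parser.asStream());

// Count objects by listening to token events, as with emit() further below.
let counter = 0;
pipeline.on('startObject', () => ++counter);
pipeline.on('finish', () => console.log(counter, 'objects'));
```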
Filters edit a token stream on the fly:
- Pick — selects matching subobjects, ignoring the rest.

  ```js
  const {pick} = require('stream-json/filters/pick.js');
  chain([source, parser(), pick({filter: 'data'})]);
  ```

  If `pick` selects more than one subobject, follow it with `streamValues()`.
- Replace — substitutes matching subobjects.

  ```js
  const {replace} = require('stream-json/filters/replace.js');
  chain([source, parser(), replace({filter: /^\d+\.extra\b/})]);
  ```
- Ignore — removes subobjects entirely.

  ```js
  const {ignore} = require('stream-json/filters/ignore.js');
  chain([source, parser(), ignore({filter: /^\d+\.extra\b/})]);
  ```
- Filter — filters subobjects while preserving the original JSON shape.

  ```js
  const {filter} = require('stream-json/filters/filter.js');
  chain([source, parser(), filter({filter: /^data\b/})]);
  ```
Filters go after the parser and can be chained. They return functions for use in `chain()` — for `.pipe()` usage, use `.withParserAsStream()`.
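For instance, a minimal sketch of the `.pipe()` style, with an illustrative file name and filter:

```js
const fs = require('fs');
const {pick} = require('stream-json/filters/pick.js');

// withParserAsStream() bundles parser() + pick() into a single duplex
// stream that fits a classic .pipe() chain.
const picked = fs
  .createReadStream('data.json')
  .pipe(pick.withParserAsStream({filter: 'data'}));
```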
While source files are huge, individual data items often fit in memory. Streamers assemble tokens into JavaScript objects. All streamers support early rejection via `objectFilter` — see StreamBase and the sketch after the list below.
- StreamValues — streams successive values (JSON Streaming or after `pick()`).

  ```js
  const {streamValues} = require('stream-json/streamers/stream-values.js');
  ```
- StreamArray — streams elements of a single top-level array.

  ```js
  const {streamArray} = require('stream-json/streamers/stream-array.js');
  ```
- StreamObject — streams top-level properties of a single object.

  ```js
  const {streamObject} = require('stream-json/streamers/stream-object.js');
  ```
Streamers go after the parser and optional filters.
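As promised above, a minimal sketch of early rejection via `objectFilter`. It assumes the filter receives an assembler whose `current` property holds the partially built item and that returning `undefined` defers the decision; the `active` field and file name are illustrative:

```js
const fs = require('fs');
const {chain} = require('stream-chain');
const {parser} = require('stream-json/parser.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');

const pipeline = chain([
  fs.createReadStream('items.json'),
  parser(),
  streamArray({
    objectFilter: asm => {
      const item = asm.current; // the partially assembled item (assumed shape)
      if (item && 'active' in item) return item.active === true; // decide once the flag is seen
      // fall through (undefined) to defer the decision until more tokens arrive
    }
  })
]);
pipeline.on('data', ({value}) => console.log(value));
```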
- Assembler — reconstructs JavaScript objects from a token stream (EventEmitter, not a stream).

  ```js
  const Assembler = require('stream-json/assembler.js');

  const pipeline = chain([
    fs.createReadStream('data.json.gz'),
    zlib.createGunzip(),
    parser()
  ]);
  const asm = Assembler.connectTo(pipeline);
  asm.on('done', asm => console.log(asm.current));
  ```
- Disassembler — converts JavaScript objects into a token stream.

  ```js
  const {disassembler} = require('stream-json/disassembler.js');

  const pipeline = chain([
    fs.createReadStream('array.json.gz'),
    zlib.createGunzip(),
    parser(),
    streamArray(),
    disassembler(),
    pick({filter: 'value'}),
    streamValues()
  ]);
  ```
- Stringer — converts a token stream back into JSON text.

  ```js
  const stringer = require('stream-json/stringer.js');

  chain([
    fs.createReadStream('data.json.gz'),
    zlib.createGunzip(),
    parser(),
    pick({filter: 'data'}),
    stringer(),
    zlib.createGzip(),
    fs.createWriteStream('edited.json.gz')
  ]);
  ```
- Emitter — re-emits tokens as named events.

  ```js
  const emitter = require('stream-json/emitter.js');

  const e = emitter();
  chain([fs.createReadStream('data.json'), parser(), e]);

  let counter = 0;
  e.on('startObject', () => ++counter);
  e.on('finish', () => console.log(counter, 'objects'));
  ```
- emit() — attaches token events to any stream (lightweight Emitter).

  ```js
  const emit = require('stream-json/utils/emit.js');

  const pipeline = chain([fs.createReadStream('data.json'), parser()]);
  emit(pipeline);

  let counter = 0;
  pipeline.on('startObject', () => ++counter);
  pipeline.on('finish', () => console.log(counter, 'objects'));
  ```
- withParser() — creates a `parser() + component` pipeline. Most components expose `.withParser()` and `.withParserAsStream()` as static methods.

  ```js
  const {streamArray} = require('stream-json/streamers/stream-array.js');
  const pipeline = streamArray.withParserAsStream();
  ```
- FlexAssembler — like Assembler but with custom containers (Map, Set, etc.) at specific paths.

  ```js
  const FlexAssembler = require('stream-json/utils/flex-assembler.js');

  const asm = FlexAssembler.connectTo(pipeline, {
    objectRules: [{
      filter: () => true,
      create: () => new Map(),
      add: (map, key, value) => map.set(key, value)
    }]
  });
  ```
- Batch — groups items into arrays of a configurable size (see the sketch after this list).
- Verifier — validates JSON text; errors include offset, line, and position.
- Utf8Stream — sanitizes multibyte UTF-8 input split across chunk boundaries.
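As referenced in the Batch item above, a sketch of batching for bulk writes. The module path, export name, and `batchSize` option are assumptions based on the description, and `saveBatch()` is a hypothetical sink:

```js
const {batch} = require('stream-json/utils/batch.js'); // assumed path and export
const {parser} = require('stream-json/parser.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');
const {chain} = require('stream-chain');
const fs = require('fs');

chain([
  fs.createReadStream('items.json'),
  parser(),
  streamArray(),
  batch({batchSize: 1000}),                   // assumed option: emit arrays of up to 1000 items
  items => saveBatch(items.map(i => i.value)) // hypothetical bulk sink
]);
```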
Efficient JSONL support:
- jsonl/Parser — parses JSONL into `{key, value}` objects (like `StreamValues`).
- jsonl/Stringer — serializes objects to JSONL text.
```js
const jsonlParser = require('stream-json/jsonl/parser.js');
const jsonlStringer = require('stream-json/jsonl/stringer.js');
const {chain} = require('stream-chain');
const fs = require('fs');
const zlib = require('zlib');

const pipeline = chain([
  fs.createReadStream('sample1.jsonl.br'),
  zlib.createBrotliDecompress(),
  jsonlParser(),
  data => data.value,
  jsonlStringer(),
  zlib.createBrotliCompress(),
  fs.createWriteStream('sample2.jsonl.br')
]);
```
JSONC (JSON with Comments) support:
- jsonc/Parser — streaming JSONC parser with comment and whitespace tokens.
- jsonc/Stringer — converts JSONC token streams back to text.
- jsonc/Verifier — validates JSONC text; errors include offset, line, and position.
```js
const {parser: jsoncParser} = require('stream-json/jsonc/parser.js');
const {stringer: jsoncStringer} = require('stream-json/jsonc/stringer.js');
const {chain} = require('stream-chain');
const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('settings.jsonc'),
  jsoncParser(),
  jsoncStringer()
]);
```
All existing filters, streamers, and utilities are compatible with the JSONC parser — they ignore tokens they don't recognize.
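For instance, a minimal sketch that reuses `streamObject()` on JSONC input, with an illustrative file name; per the compatibility note above, comment and whitespace tokens are simply skipped by the streamer:

```js
const fs = require('fs');
const {chain} = require('stream-chain');
const {parser: jsoncParser} = require('stream-json/jsonc/parser.js');
const {streamObject} = require('stream-json/streamers/stream-object.js');

const pipeline = chain([
  fs.createReadStream('settings.jsonc'), // JSON with // and /* */ comments
  jsoncParser(),
  streamObject() // unrecognized comment tokens pass through harmlessly
]);
pipeline.on('data', ({key, value}) => console.log(key, '=', value));
```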
- Recipes — streaming basics, splitting objects, and more.
- FAQ — frequently asked questions and tips.
- Performance — tuning considerations.
- Benchmarks — micro-benchmarks comparing related components.
- Migrating from 1.x to 2.x.
- Migrating from 0.x to 1.x.
tests/sample.json.gz combines several public datasets (Japanese birth/marriage statistics, US HUD metadata catalog, and a synthetic sample with non-ASCII data).
tests/sample.jsonl.gz is the first 100 rows of the CDC "Database of COVID-19 Research Articles" (7/9/2020 snapshot), converted to JSONL.