
Dashboard



About

stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can:

  • Parse JSON files far exceeding available memory — even individual keys, strings, and numbers can be streamed piece-wise.
    • All components are optimized for throughput. See Performance for tuning tips.
  • Stream using a SAX-inspired event-based API.
  • Handle huge Django-like JSON database dumps.
  • Support JSON Streaming (concatenated/line-delimited JSON).

Built on stream-chain for pipeline composition. TypeScript definitions are bundled.

Companion projects:

  • stream-chain — pipeline composition from streams, functions, and generators.
  • stream-csv-as-json — streams CSV files in a stream-json-compatible format.
  • stream-join — merges side channels for complex pipeline topologies.

Usage

Both CommonJS require() and ESM import are supported. Always include the .js extension in paths — see FAQ for details.

// CommonJS
const {parser} = require('stream-json/parser.js');
const {pick} = require('stream-json/filters/pick.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');
const {chain} = require('stream-chain');
// ESM
import {parser} from 'stream-json/parser.js';
import {pick} from 'stream-json/filters/pick.js';
import {streamArray} from 'stream-json/streamers/stream-array.js';
import {chain} from 'stream-chain';
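
Putting those imports to work, here is a minimal sketch. It assumes a data.json whose top-level data property holds a large array; the pipeline counts its elements without ever materializing the whole file:

const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('data.json'),
  parser(),
  pick({filter: 'data'}),
  streamArray()
]);

let counter = 0;
pipeline.on('data', () => ++counter);
pipeline.on('end', () => console.log(counter, 'items'));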

Documentation

Overview and cheat sheet. Click component names for detailed API docs and examples.

The main module

The main module returns a factory function that creates a parser decorated with emit().
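
A minimal sketch (assuming the factory forwards its options to Parser and returns a pipeable stream):

const makeParser = require('stream-json');
const fs = require('fs');

const tokens = makeParser();
fs.createReadStream('data.json').pipe(tokens);

// emit() re-emits tokens as named events
let objects = 0;
tokens.on('startObject', () => ++objects);
tokens.on('finish', () => console.log(objects, 'objects'));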

Parser

Parser is the core — a streaming JSON parser that consumes text and produces a token stream. Both standard JSON and JSON Streaming are supported.

const fs = require('fs');
const {parser} = require('stream-json');

const pipeline = fs.createReadStream('data.json').pipe(parser.asStream());
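
Downstream components see a stream of token objects of the form {name, value}. A quick way to inspect them (the comment shows stream-json's standard token vocabulary):

const {chain} = require('stream-chain');

const tokens = chain([fs.createReadStream('data.json'), parser()]);

// {"a": 1} yields: startObject, startKey, stringChunk, endKey, keyValue,
// startNumber, numberChunk, endNumber, numberValue, endObject
tokens.on('data', token => console.log(token.name, token.value));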

For invalid JSON input, use Verifier to pinpoint the exact error position.

Filters

Filters edit a token stream on the fly:

  • Pick — selects matching subobjects, ignoring the rest.

    const {pick} = require('stream-json/filters/pick.js');
    chain([source, parser(), pick({filter: 'data'})]);

    If pick selects more than one subobject, follow it with streamValues() (see the sketch after this list).

  • Replace — substitutes matching subobjects.

    const {replace} = require('stream-json/filters/replace.js');
    chain([source, parser(), replace({filter: /^\d+\.extra\b/})]);
  • Ignore — removes matching subobjects entirely.

    const {ignore} = require('stream-json/filters/ignore.js');
    chain([source, parser(), ignore({filter: /^\d+\.extra\b/})]);

  • Filter — filters subobjects while preserving the original JSON shape.

    const {filter} = require('stream-json/filters/filter.js');
    chain([source, parser(), filter({filter: /^data\b/})]);

Filters go after the parser and can be chained. They return functions for use in chain() — for .pipe() usage, use .withParserAsStream().
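
As noted under Pick, a filter that matches multiple subobjects should be followed by streamValues(). A sketch (with source as in the snippets above; the regexp matches the name property of every element of a top-level array):

const {pick} = require('stream-json/filters/pick.js');
const {streamValues} = require('stream-json/streamers/stream-values.js');

const pipeline = chain([
  source,
  parser(),
  pick({filter: /^\d+\.name\b/}),
  streamValues()
]);

pipeline.on('data', ({value}) => console.log(value));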

Streamers

While source files are huge, individual data items often fit in memory. Streamers assemble tokens into JavaScript objects. All streamers support early rejection via objectFilter — see StreamBase.

  • StreamValues — streams successive values (JSON Streaming or after pick()).
    const {streamValues} = require('stream-json/streamers/stream-values.js');
  • StreamArray — streams elements of a single top-level array.
    const {streamArray} = require('stream-json/streamers/stream-array.js');
  • StreamObject — streams top-level properties of a single object.
    const {streamObject} = require('stream-json/streamers/stream-object.js');

Streamers go after the parser and optional filters.
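
Every streamer produces {key, value} pairs: array indices for StreamArray and StreamValues, property names for StreamObject. A minimal sketch (assuming array.json holds a single top-level array):

const fs = require('fs');
const {chain} = require('stream-chain');
const {parser} = require('stream-json/parser.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');

const pipeline = chain([
  fs.createReadStream('array.json'),
  parser(),
  streamArray()
]);

// each data event carries one fully assembled array element
pipeline.on('data', ({key, value}) => console.log(key, '=>', value));
pipeline.on('end', () => console.log('done'));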

Essentials

  • Assembler — reconstructs JavaScript objects from a token stream (EventEmitter, not a stream).

    const Assembler = require('stream-json/assembler.js');
    
    const pipeline = chain([fs.createReadStream('data.json.gz'), zlib.createGunzip(), parser()]);
    
    const asm = Assembler.connectTo(pipeline);
    asm.on('done', asm => console.log(asm.current));
  • Disassembler — converts JavaScript objects into a token stream.

    const {disassembler} = require('stream-json/disassembler.js');
    
    const pipeline = chain([
      fs.createReadStream('array.json.gz'),
      zlib.createGunzip(),
      parser(),
      streamArray(),
      disassembler(),
      pick({filter: 'value'}),
      streamValues()
    ]);
  • Stringer — converts a token stream back into JSON text.

    const {stringer} = require('stream-json/stringer.js');
    
    chain([
      fs.createReadStream('data.json.gz'),
      zlib.createGunzip(),
      parser(),
      pick({filter: 'data'}),
      stringer(),
      zlib.createGzip(),
      fs.createWriteStream('edited.json.gz')
    ]);
  • Emitter — re-emits tokens as named events.

    const {emitter} = require('stream-json/emitter.js');
    
    const e = emitter();
    
    chain([fs.createReadStream('data.json'), parser(), e]);
    
    let counter = 0;
    e.on('startObject', () => ++counter);
    e.on('finish', () => console.log(counter, 'objects'));

Utilities

  • emit() — attaches token events to any stream (lightweight Emitter).

    const emit = require('stream-json/utils/emit.js');
    
    const pipeline = chain([fs.createReadStream('data.json'), parser()]);
    emit(pipeline);
    
    let counter = 0;
    pipeline.on('startObject', () => ++counter);
    pipeline.on('finish', () => console.log(counter, 'objects'));
  • withParser() — creates a parser() + component pipeline. Most components expose .withParser() and .withParserAsStream() as static methods.

    const {streamArray} = require('stream-json/streamers/stream-array.js');
    const pipeline = streamArray.withParserAsStream();
  • FlexAssembler — like Assembler but with custom containers (Map, Set, etc.) at specific paths.

    const FlexAssembler = require('stream-json/utils/flex-assembler.js');
    const asm = FlexAssembler.connectTo(pipeline, {
      objectRules: [{filter: () => true, create: () => new Map(), add: (m, k, v) => m.set(k, v)}]
    });
  • Batch — groups items into arrays of a configurable size (see the sketch after this list).

  • Verifier — validates JSON text; errors include offset, line, and position.

  • Utf8Stream — sanitizes multibyte UTF-8 input split across chunk boundaries.
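
A minimal Batch sketch (assuming the factory is a named export following the conventions above and accepts a batchSize option):

const fs = require('fs');
const {chain} = require('stream-chain');
const {parser} = require('stream-json/parser.js');
const {streamArray} = require('stream-json/streamers/stream-array.js');
const {batch} = require('stream-json/utils/batch.js');

const pipeline = chain([
  fs.createReadStream('array.json'),
  parser(),
  streamArray(),
  batch({batchSize: 1000}),
  // each step downstream now receives an array of up to 1000 {key, value}
  // items, e.g. for bulk database inserts
  items => console.log('batch of', items.length)
]);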

JSONL

Efficient JSONL support:

  • jsonl/Parser — parses JSONL into {key, value} objects (like StreamValues).

  • jsonl/Stringer — serializes objects to JSONL text.

    const {stringer: jsonlStringer} = require('stream-json/jsonl/stringer.js');
    const {parser: jsonlParser} = require('stream-json/jsonl/parser.js');
    const {chain} = require('stream-chain');
    
    const fs = require('fs');
    const zlib = require('zlib');
    
    const pipeline = chain([
      fs.createReadStream('sample1.jsonl.br'),
      zlib.createBrotliDecompress(),
      jsonlParser(),
      data => data.value, // unwrap {key, value} pairs into plain values
      jsonlStringer(),
      zlib.createBrotliCompress(),
      fs.createWriteStream('sample2.jsonl.br')
    ]);

JSONC

JSONC (JSON with Comments) support:

  • jsonc/Parser — streaming JSONC parser with comment and whitespace tokens.

  • jsonc/Stringer — converts JSONC token streams back to text.

  • jsonc/Verifier — validates JSONC text; errors include offset, line, and position.

    const {parser: jsoncParser} = require('stream-json/jsonc/parser.js');
    const {stringer: jsoncStringer} = require('stream-json/jsonc/stringer.js');
    const {chain} = require('stream-chain');
    
    const fs = require('fs');
    
    const pipeline = chain([fs.createReadStream('settings.jsonc'), jsoncParser(), jsoncStringer()]);

All existing filters, streamers, and utilities are compatible with the JSONC parser — they ignore tokens they don't recognize.
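
For example, a sketch that reads a commented settings file into a plain object (assuming settings.jsonc holds a single top-level object; the streamer skips the comment tokens):

const fs = require('fs');
const {chain} = require('stream-chain');
const {parser: jsoncParser} = require('stream-json/jsonc/parser.js');
const {streamObject} = require('stream-json/streamers/stream-object.js');

const settings = {};
const pipeline = chain([
  fs.createReadStream('settings.jsonc'),
  jsoncParser(),
  streamObject()
]);

pipeline.on('data', ({key, value}) => (settings[key] = value));
pipeline.on('end', () => console.log(settings));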


Credits

tests/sample.json.gz combines several public datasets (Japanese birth/marriage statistics, US HUD metadata catalog, and a synthetic sample with non-ASCII data).

tests/sample.jsonl.gz is the first 100 rows of the CDC "Database of COVID-19 Research Articles" (7/9/2020 snapshot), converted to JSONL.
