Sharryy#284

Open
sharryy wants to merge 2 commits into tempestphp:main from sharryy:sharryy

sharryy commented Mar 15, 2026

No description provided.

- gc_disable() to eliminate GC pauses
- Flat integer array with pre-computed date IDs (no ksort needed)
- Binary pack/unpack IPC instead of serialize
- Pre-computed output strings
- stream_set_read_buffer(0) to avoid double buffering
- Parent process works on last chunk in parallel
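
As a rough illustration of two of the items above (the flat array indexed by pre-computed date IDs, and `pack`/`unpack` for IPC), here is a minimal sketch. It is not the PR's code: `buildDateIds`, the `$pathCount` dimension and the three-day date range are hypothetical names and values chosen for the example. Because the IDs are assigned in chronological order, reading the slots in index order already yields sorted dates, which is presumably why no `ksort` is needed.

```php
<?php
// Illustrative only: the date range, the "path" dimension and all names
// below are assumptions, not taken from the PR's code.

// Map each "YYYY-MM-DD" string to a dense integer ID in chronological order.
function buildDateIds(string $firstDay, int $days): array
{
    $ids = [];
    $start = strtotime($firstDay . ' UTC');
    for ($i = 0; $i < $days; $i++) {
        $ids[gmdate('Y-m-d', $start + $i * 86400)] = $i;
    }
    return $ids;
}

$dateIds   = buildDateIds('2026-01-01', 3);
$dateCount = count($dateIds);
$pathCount = 2; // hypothetical number of distinct keys

// One flat counter array: slot = pathId * dateCount + dateId.
$counts = array_fill(0, $pathCount * $dateCount, 0);
$counts[1 * $dateCount + $dateIds['2026-01-02']]++;
$counts[1 * $dateCount + $dateIds['2026-01-02']]++;
$counts[0 * $dateCount + $dateIds['2026-01-03']]++;

// Worker side: encode every counter as an unsigned 32-bit little-endian
// integer (this assumes no counter exceeds 2^32 - 1).
$payload = pack('V*', ...$counts);

// Parent side: unpack() returns a 1-indexed array, so re-index from 0.
$decoded = array_values(unpack('V*', $payload));
```

In this sketch the payload is a fixed 4 bytes per slot, so the parent can merge worker results by adding slots at the same index without parsing any keys.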

sharryy commented Mar 15, 2026

Author

/bench

brendt commented Mar 15, 2026

Member

Benchmarking complete! Best execution time: 4.513s

Full results:

{
  "results": [
    {
      "command": "cd /Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../.benchmark/pr-284 && php -dmax_execution_time=300 tempest data:parse --input-path=\"/Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../data/real-data.csv\" --output-path=\"/Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../data/real-data-actual.json\"",
      "mean": 4.5277077714,
      "stddev": 0.01018006581088136,
      "median": 4.5331425880000005,
      "user": 29.960297319999995,
      "system": 2.28186738,
      "min": 4.513275255,
      "max": 4.53747038,
      "times": [
        4.53747038,
        4.5331425880000005,
        4.520957463,
        4.533693171,
        4.513275255
      ],
      "memory_usage_byte": [
        102727680,
        102744064,
        102744064,
        102744064,
        102744064
      ],
      "exit_codes": [
        0,
        0,
        0,
        0,
        0
      ]
    }
  ]
}

- Unroll inner parsing loop 8x with fence guard to reduce loop overhead
- Scan first 200 lines to detect actual date range (narrower dateCount)
- Shorter variable names in hot path to reduce opcode overhead
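
One plausible reading of "unroll 8x with fence guard" is sketched below; this is an interpretation, not the PR's code. The `key,value\n` row format, the `countKeys` name and the `$maxLineLen` parameter are all assumptions. The idea is that while the read position is below a "fence" offset, eight complete rows are known to remain, so the unrolled body can skip per-row bounds checks; a conventional checked loop then handles the tail.

```php
<?php
// Illustrative only: the "key,value\n" row format, the function name and
// $maxLineLen are assumptions, not taken from the PR's code.
function countKeys(string $buf, int $maxLineLen): array
{
    $counts = [];
    $pos = 0;
    $len = strlen($buf);
    // Fence: while $pos is below it, more than 8 * $maxLineLen bytes remain,
    // so 8 complete rows follow. This holds only if no row (newline included)
    // is longer than $maxLineLen bytes and every row contains a comma.
    $fence = $len - 8 * $maxLineLen;

    while ($pos < $fence) {
        // Same body repeated 8 times, with no per-row bounds checks.
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
        $c = strpos($buf, ',', $pos); $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1; $pos = strpos($buf, "\n", $c) + 1;
    }

    // Tail: remaining rows, one at a time, with full bounds checks.
    while ($pos < $len) {
        $c = strpos($buf, ',', $pos);
        if ($c === false) {
            break;
        }
        $k = substr($buf, $pos, $c - $pos);
        $counts[$k] = ($counts[$k] ?? 0) + 1;
        $nl = strpos($buf, "\n", $c);
        $pos = $nl === false ? $len : $nl + 1;
    }

    return $counts;
}
```

Whether this kind of unrolling pays off depends on the PHP version and on JIT settings, so it is worth measuring rather than assuming.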

sharryy commented Mar 15, 2026

Author

/bench

brendt commented Mar 15, 2026

Member

Benchmarking complete! Best execution time: 4.341s

Full results:

{
  "results": [
    {
      "command": "cd /Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../.benchmark/pr-284 && php -dmax_execution_time=300 tempest data:parse --input-path=\"/Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../data/real-data.csv\" --output-path=\"/Users/brentroose/Dev/100-million-row-challenge/app/Commands/../../data/real-data-actual.json\"",
      "mean": 4.3574054884399995,
      "stddev": 0.014048715864259763,
      "median": 4.358004955039999,
      "user": 29.034052420000002,
      "system": 2.3102427800000003,
      "min": 4.341690455039999,
      "max": 4.3776522470399994,
      "times": [
        4.341690455039999,
        4.34712470504,
        4.36255508004,
        4.358004955039999,
        4.3776522470399994
      ],
      "memory_usage_byte": [
        78790656,
        78954496,
        78954496,
        79101952,
        79151104
      ],
      "exit_codes": [
        0,
        0,
        0,
        0,
        0
      ]
    }
  ]
}

brendt commented Mar 15, 2026

Member

I think there's room for one more improvement… 👀
🏆 leaderboard.csv

brendt removed the verified label Mar 15, 2026