Restore multi-threaded parallelism in NeutronNova prove #123
Merged
The optimizations in baac9f5 inadvertently serialized several parallel operations in the NeutronNova prover, causing a ~2x regression on multi-core machines while improving single-threaded performance.

This commit restores parallelism in four areas without affecting the single-threaded optimizations:

1. Rerandomization: par_iter_mut() for step circuit rerandomization
2. Instance/witness generation: rayon::join + par_iter_mut with order-preserving collect (not try_reduce, which scrambles order)
3. Non-i64 NIFS rounds 1+: parallel fold+prove via par_chunks_mut on A/B/C layers (mirrors the existing i64 path structure)
4. Witness fold in evaluation claims: restore par_iter for the folded_W + core_witness linear combination

Benchmark (32 circuits x 1024B SHA-256, target-cpu=native):

| Threads | Old | baac9f5 | This commit |
|---|---|---|---|
| 1 | 11003 ms | 6849 ms | 6738 ms |
| 16 | 1752 ms | 3732 ms | 1318 ms |
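As a rough illustration of two of the patterns named above (order-preserving parallel generation, and an in-place chunked linear combination), here is a minimal sketch. It is not the code from this PR: it uses only the standard library, with `std::thread::scope` and `chunks_mut` standing in for rayon's `par_iter_mut` / `par_chunks_mut`, and plain `u64` wrapping arithmetic standing in for field elements. The names `generate_witness`, `generate_all`, `fold_in_place`, and `chunk_len` are hypothetical.

```rust
use std::thread;

/// Hypothetical stand-in for per-circuit witness generation; it may fail.
fn generate_witness(circuit_id: usize) -> Result<u64, String> {
    if circuit_id == usize::MAX {
        return Err("invalid circuit".to_string());
    }
    Ok((circuit_id as u64) * 10 + 1)
}

/// Chunk length that splits `len` items across at most `threads` workers.
fn chunk_len(len: usize, threads: usize) -> usize {
    let t = threads.max(1);
    ((len + t - 1) / t).max(1)
}

/// Generates all witnesses in parallel and returns them in input order.
/// Each worker writes into its own disjoint slice of `slots`, so the
/// output position of a result never depends on thread scheduling.
fn generate_all(ids: &[usize], threads: usize) -> Result<Vec<u64>, String> {
    let mut slots: Vec<Option<Result<u64, String>>> = vec![None; ids.len()];
    let chunk = chunk_len(ids.len(), threads);
    thread::scope(|s| {
        for (id_chunk, slot_chunk) in ids.chunks(chunk).zip(slots.chunks_mut(chunk)) {
            s.spawn(move || {
                for (id, slot) in id_chunk.iter().zip(slot_chunk.iter_mut()) {
                    *slot = Some(generate_witness(*id));
                }
            });
        }
    });
    // Collecting into Result keeps input order and stops at the first error.
    slots
        .into_iter()
        .map(|r| r.expect("every slot is filled by exactly one worker"))
        .collect()
}

/// In-place linear combination folded[i] += r * core[i], split into
/// disjoint mutable chunks (u64 wrapping arithmetic is an assumption
/// made for illustration only).
fn fold_in_place(folded: &mut [u64], core: &[u64], r: u64, threads: usize) {
    assert_eq!(folded.len(), core.len());
    let chunk = chunk_len(folded.len(), threads);
    thread::scope(|s| {
        for (f, c) in folded.chunks_mut(chunk).zip(core.chunks(chunk)) {
            s.spawn(move || {
                for (x, y) in f.iter_mut().zip(c) {
                    *x = x.wrapping_add(r.wrapping_mul(*y));
                }
            });
        }
    });
}
```

Calling `generate_all(&[0, 1, 2, 3, 4], 2)` yields the witnesses in the same order as the input ids regardless of which worker finishes first, which is the property item 2 relies on. With rayon, the same shape would typically be written with a parallel iterator collected into a `Result<Vec<_>, _>`.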