Skip to content

Conversation

@wujingyue
Copy link
Collaborator

No description provided.

@wujingyue
Copy link
Collaborator Author

!test

@github-actions
Copy link

github-actions bot commented Feb 6, 2026

Review updated until commit 8dec3f9

Description

  • Removed unnecessary 0-dim tensor filtering logic in contiguous inner dimensions mapper

  • Simplified function by eliminating corner case check that was deemed unnecessary

  • Function now proceeds directly to dimension matching logic after contiguity computation

Changes walkthrough

Relevant files
Enhancement
vectorize_helper.cpp
Remove 0-dim tensor filtering corner case                               

csrc/scheduler/vectorize_helper.cpp

  • Removed block that filtered out 0-dimensional tensors (lines with
    alloc_no_reductions_size < 1 check)
  • Eliminated early return path for empty allocation sizes
  • Streamlined function logic by removing unnecessary corner case
    handling
  • Function now proceeds directly to dimension matching validation
  • +0/-5     

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Corner Case Removal Validation

    The removal of the 0-dimensional tensor check (alloc_no_reductions_size < 1) needs validation. Verify that: 1) This corner case cannot occur in practice, 2) Any existing code paths that might produce 0-dim tensors are properly handled elsewhere, 3) The behavior when alloc_no_reductions_size < 0 is well-defined and won't cause crashes or incorrect results.

    auto alloc_no_reductions_size = alloc_no_reductions.size();
    
    NVF_ERROR_EQ(alloc_no_reductions_size, contiguity.size());
    
    // Order is important, need to make sure dimensions match up correctly with

    Test failures

    • (Medium, 3) Shape mismatch in Thunder nvFuser higher-order inplace alias update tests

      Test Name A100 GB200 H100 Source
      thunder.tests.test_update_aliases.test_higher_order_inplace_alias_update_nvfuser_cuda_thunder.dtypes.float32
    • (Medium, 1) Scalar numerical mismatch in thunder nanogpt NVFuser CUDA test_networks

      Test Name H100 Source
      thunder.tests.test_networks.test_nanogpt_complete_autograd_nvfuser_cuda_thunder.dtypes.float32

    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Feb 6, 2026

    Greptile Overview

    Greptile Summary

    • Removes an early-return branch in ContiguousInnerDimensionsMapper::getContigMergeOfInnerSize for the (commented) “0-dim tensor” case.
    • The function still computes the contiguous inner-extent product by walking allocation dims from inner to outer and stopping when contiguity/mapping/device-split constraints break.
    • No behavioral changes elsewhere; change is localized to csrc/scheduler/vectorize_helper.cpp and relies on existing identity product initialization when there are no eligible dims.

    Confidence Score: 5/5

    • This PR is safe to merge with minimal risk.
    • Change is a small deletion of an early-return that was redundant with the existing identity initialization (oneVal()) and loop structure; core invariants are still enforced by NVF_ERROR_EQ and contiguity handling, and no other code paths are modified.
    • No files require special attention

    Important Files Changed

    Filename Overview
    csrc/scheduler/vectorize_helper.cpp Removes an early-return branch in ContiguousInnerDimensionsMapper::getContigMergeOfInnerSize; remaining code already handles 0-dim tensors via size checks and returns the identity product when no alloc dims exist.

    Sequence Diagram

    sequenceDiagram
      participant Caller as Scheduler/Heuristic
      participant Mapper as ContiguousInnerDimensionsMapper
      participant TV as TensorView
      participant TD as TensorDomain
      participant Dep as DependencyCheck
      participant IR as SimplifyingIrBuilder
    
      Caller->>Mapper: getContigMergeOfInnerSize(tv)
      Mapper->>TV: getMaybeAllocationDomain()
      Mapper->>Mapper: mappedLogicalIds(tv)
      Mapper->>TD: noReductions(alloc)
      Mapper->>TV: domain()->contiguity()
      alt contiguity includes reductions
        Mapper->>Mapper: filter contiguity to alloc_no_reductions
      end
      Mapper->>Mapper: assert alloc_no_reductions.size == contiguity.size
      loop iterate inner alloc dims (reverse)
        Mapper->>Mapper: skip if extent==1 or broadcast
        Mapper->>Dep: getAllExprsBetween(logicalDomain, alloc_iid)
        Mapper->>Mapper: validate only device splits
        Mapper->>IR: divExpr(getProjectedExtent, num_devices)
        Mapper->>IR: mulExpr(product, sharded_extent)
      end
      Mapper-->>Caller: simplifyExpr(product_of_inner_extents)
    
    Loading

    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    1 file reviewed, 1 comment

    Edit Code Review Agent Settings | Greptile

    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Feb 6, 2026

    Additional Comments (1)

    csrc/scheduler/vectorize_helper.cpp
    0-dim contiguity invariant

    Removing the of_tv_alloc_no_reductions_size < 1 early-return means 0-dim tensors now hit NVF_ERROR(of_tv_alloc_no_reductions_size == contiguity.size()). If domain()->contiguity() can be empty for 0-dim while getMaybeAllocationDomain() still has entries (or vice versa), this will now fail even though product_of_inner_extents is already 1. Please confirm the invariant (both sizes are 0 for 0-dim) or adjust logic so 0-dim tensors still return 1 without requiring contiguity alignment.

    @wujingyue wujingyue requested a review from jjsjann123 February 6, 2026 19:51
    @wujingyue
    Copy link
    Collaborator Author

    !test

    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    1 file reviewed, no comments

    Edit Code Review Agent Settings | Greptile

    @wujingyue wujingyue merged commit 231d48c into main Feb 8, 2026
    62 of 65 checks passed
    @wujingyue wujingyue deleted the wjy/corner branch February 8, 2026 01:11
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants