@wujingyue
Collaborator

No description provided.

@github-actions

github-actions bot commented Feb 4, 2026

Description

  • Add new test_transpose function to reproduce misaligned memory access issue

  • Move mesh definition inside FusionDefinition context in test_pointwise

  • Test implements complex device mesh sharding with allocation domain manipulation

  • Validates transpose operations across multi-device configurations

Changes walkthrough

Relevant files
Tests
test_multidevice.py
Add transpose test and fix mesh definition

tests/python/multidevice/test_multidevice.py

  • Move mesh definition inside FusionDefinition context in test_pointwise
    function
  • Add new test_transpose function with complex device mesh and
    allocation domain setup
  • Implement multi-device tensor sharding validation for transpose
    operations
  • Create reproduction case for memory alignment issues in multi-device
    scenarios
  • +48/-1   

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Missing Output Validation

    The new test_transpose function creates test data and executes the fusion but lacks any assertions to validate the correctness of the output. The test should verify that the transpose operation produces the expected results.

    inp = multidevice_test.shard_tensor(inp_ref, inp_tv)
    fd.execute([inp])
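
One way to close this gap is to compare each device's output against the matching shard of a transposed reference. The sketch below shows only the pattern, using plain nested lists instead of tensors. The helpers `transpose`, `shard_rows`, `shard_cols`, and `check_transpose_output` are hypothetical stand-ins and not part of nvFuser or of this PR. The row-wise input sharding is also an assumption, since the actual sharding in test_transpose is not shown here.

```python
def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*matrix)]


def shard_rows(matrix, num_devices, rank):
    """Return the contiguous block of rows owned by `rank` (hypothetical helper)."""
    rows_per_device = len(matrix) // num_devices
    start = rank * rows_per_device
    return matrix[start:start + rows_per_device]


def shard_cols(matrix, num_devices, rank):
    """Return the contiguous block of columns owned by `rank` (hypothetical helper)."""
    return transpose(shard_rows(transpose(matrix), num_devices, rank))


def check_transpose_output(inp_ref, num_devices, rank, run):
    """Run `run` on this rank's input shard and compare against the reference.

    `run` stands in for executing the fusion on one device.
    """
    inp = shard_rows(inp_ref, num_devices, rank)
    out = run(inp)
    # Transposing a row shard yields a column shard of the transposed reference.
    expected = shard_cols(transpose(inp_ref), num_devices, rank)
    if out != expected:
        raise AssertionError(f"rank {rank}: got {out}, expected {expected}")


# Hypothetical 4x3 reference input split across two devices.
inp_ref = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
for rank in range(2):
    check_transpose_output(inp_ref, num_devices=2, rank=rank, run=transpose)
```

In the real test, the same comparison would presumably be made on tensors with a tolerance-aware assertion rather than list equality.
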
    Complex Allocation Domain

    The set_allocation_domain call with axis reordering (lines 79-89) creates a complex memory layout transformation. This should be carefully reviewed to ensure the axis ordering is correct and the transformation achieves the intended memory layout.

    out_tv.set_allocation_domain(
        (
            out_tv.axis(2),
            out_tv.axis(0),
            out_tv.axis(1),
            out_tv.axis(3),
            out_tv.axis(4),
            out_tv.axis(5),
        ),
        True,
    )
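
To help check the axis ordering, the following sketch computes the strides implied by an allocation order. It assumes the six axes can be treated as a plain 6-D shape, that the allocation order lists axes from outermost to innermost in memory, and that the trailing `True` marks the layout as fully contiguous. The extents are made up, and `strides_for_allocation_order` is a hypothetical helper, not an nvFuser function.

```python
def strides_for_allocation_order(shape, alloc_order):
    """Return per-axis strides (in elements) for a contiguous buffer whose
    axes are laid out in memory in `alloc_order`, outermost first."""
    if sorted(alloc_order) != list(range(len(shape))):
        raise ValueError("alloc_order must be a permutation of the axes")
    strides = [0] * len(shape)
    running = 1
    # Walk from the innermost allocation axis outward, accumulating extents.
    for axis in reversed(alloc_order):
        strides[axis] = running
        running *= shape[axis]
    return strides


# Hypothetical extents; the real extents in test_transpose are not shown here.
shape = (2, 3, 4, 5, 6, 7)
alloc_order = (2, 0, 1, 3, 4, 5)  # same permutation as the call above
strides = strides_for_allocation_order(shape, alloc_order)
print(strides)  # [630, 210, 1260, 42, 7, 1]
```

Under these assumptions, axis 2 has the largest stride and is therefore outermost in memory, followed by axes 0 and 1. Axes 3 to 5 keep their original relative order, with axis 5 innermost.
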
    Mesh Definition Location

    The mesh definition was moved from outside to inside the FusionDefinition context in test_pointwise. This change should be validated to ensure it doesn't affect the mesh lifecycle or cause any scoping issues.

    mesh = nvfuser.multidevice.DeviceMesh(torch.arange(num_devices))
    for tv in [inp_tv, tv1, tv2]:
        tv.set_device_mesh(mesh)
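
On the scoping question, a Python `with` block does not introduce a new scope, so a name bound inside it stays bound after the block exits. The sketch below demonstrates only that language rule. `fake_definition` is a hypothetical stand-in for a definition context and says nothing about how nvFuser manages the mesh object's lifetime internally.

```python
from contextlib import contextmanager


@contextmanager
def fake_definition():
    """Hypothetical stand-in for a definition context; not the nvFuser API."""
    yield {}


with fake_definition() as fd:
    # Stand-in for a device mesh created inside the block.
    mesh = list(range(4))
    fd["mesh"] = mesh

# `mesh` is still bound here because `with` does not open a new scope.
devices_after_block = len(mesh)
```

Whether the library holds its own reference to the mesh after the context exits is a separate question that this sketch does not answer.
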

    @wujingyue wujingyue closed this Feb 8, 2026
    @wujingyue wujingyue deleted the wjy/transpose branch February 8, 2026 01:12