Fix Conv1d (Convnd) implementation by sentient-codebot · Pull Request #157 · microsoft/LoRA

sentient-codebot · 2024-02-12T09:58:24Z

The current Conv1d (and Conv3d) layers do not work because the product (lora_B @ lora_A) has a shape that is incompatible with conv.weight. I changed only the initialization of lora_B: its shape now depends on the number of dimensions of conv.weight, so it works for the 1-d through n-d cases.

Before

self.lora_B = nn.Parameter(
    self.conv.weight.new_zeros((out_channels//self.conv.groups*kernel_size, r*kernel_size))
)

After

self.lora_B = nn.Parameter(
    self.conv.weight.new_zeros((out_channels//self.conv.groups*kernel_size**(self.conv.weight.dim()-3), r*kernel_size))
)

Fixes #115
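To make the shape argument concrete, here is a minimal sketch in plain Python (no PyTorch required) that compares the element count of lora_B @ lora_A with that of conv.weight before and after the change. The helper names `conv_weight_shape`, `lora_delta_shape`, and `can_view` are hypothetical and exist only for this illustration. The sketch assumes an integer kernel_size and the usual nn.ConvNd weight layout (out_channels, in_channels // groups, k, ..., k).

```python
from math import prod


def conv_weight_shape(out_channels, in_channels, kernel_size, groups, n_spatial):
    # nn.ConvNd stores its weight as (out_channels, in_channels // groups, k, ..., k).
    return (out_channels, in_channels // groups) + (kernel_size,) * n_spatial


def lora_delta_shape(out_channels, in_channels, kernel_size, r, groups, n_spatial, patched):
    # lora_A is (r * k, in_channels * k) both before and after the PR.
    a_rows, a_cols = r * kernel_size, in_channels * kernel_size
    # weight.dim() equals n_spatial + 2, so the PR's exponent
    # weight.dim() - 3 is n_spatial - 1. Before the PR it was always 1.
    exponent = n_spatial - 1 if patched else 1
    b_rows = out_channels // groups * kernel_size ** exponent
    b_cols = r * kernel_size
    assert b_cols == a_rows  # inner dimensions of lora_B @ lora_A must agree
    return (b_rows, a_cols)


def can_view(out_channels, in_channels, kernel_size, r, groups, n_spatial, patched):
    # (lora_B @ lora_A).view(weight.shape) needs equal element counts.
    delta = lora_delta_shape(out_channels, in_channels, kernel_size, r, groups, n_spatial, patched)
    weight = conv_weight_shape(out_channels, in_channels, kernel_size, groups, n_spatial)
    return prod(delta) == prod(weight)


for n in (1, 2, 3):
    before = can_view(8, 4, 3, r=2, groups=1, n_spatial=n, patched=False)
    after = can_view(8, 4, 3, r=2, groups=1, n_spatial=n, patched=True)
    print(f"Conv{n}d: before={before}, after={after}")
# Conv1d: before=False, after=True
# Conv2d: before=True, after=True
# Conv3d: before=False, after=True
```

For Conv2d the exponent is 1 in both versions, which is why only the 1-d and 3-d cases were broken.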

vzmath · 2024-03-03T01:29:36Z

Hi Nan (@sentient-codebot), great implementation! I'm wondering whether you have attempted nn.ConvTranspose1d, nn.ConvTranspose2d, or nn.ConvTranspose3d. It has been bugging me for quite a while. Would you mind sharing your implementation if you did? Thanks!

meeselizabeth · 2024-05-16T15:17:27Z

Hi, I am trying to make LoRA applicable to ConvTranspose3d with this code:

`class ConvTransposeLoRA(nn.Module, LoRALayer):
    def __init__(self, conv_module, in_channels, out_channels, kernel_size, r=0, lora_alpha=1, lora_dropout=0., merge_weights=True, **kwargs):
        super(ConvTransposeLoRA, self).__init__()
        self.conv = conv_module(in_channels, out_channels, kernel_size, **kwargs)
        LoRALayer.__init__(self, r=r, lora_alpha=lora_alpha, lora_dropout=lora_dropout, merge_weights=merge_weights)
        assert isinstance(kernel_size, int)
        # Actual trainable parameters
        if r > 0:
            self.lora_A = nn.Parameter(
                self.conv.weight.new_zeros((r * kernel_size, in_channels * kernel_size))
            )
            self.lora_B = nn.Parameter(
                self.conv.weight.new_zeros((out_channels//self.conv.groups*kernel_size, r * kernel_size))
            )
            self.scaling = self.lora_alpha / self.r
            # Freezing the pre-trained weight matrix
            self.conv.weight.requires_grad = False
        self.reset_parameters()
        self.merged = False

    def reset_parameters(self):
        self.conv.reset_parameters()
        if hasattr(self, 'lora_A'):
            # initialize A the same way as the default for nn.Linear and B to zero
            nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
            nn.init.zeros_(self.lora_B)

    def train(self, mode=True):
        super(ConvTransposeLoRA, self).train(mode)
        if mode:
            if self.merge_weights and self.merged:
                if self.r > 0:
                    # Make sure that the weights are not merged
                    self.conv.weight.data -= (self.lora_B @ self.lora_A).view(self.conv.weight.shape) * self.scaling
                self.merged = False
        else:
            if self.merge_weights and not self.merged:
                if self.r > 0:
                    # Merge the weights and mark it
                    self.conv.weight.data += (self.lora_B @ self.lora_A).view(self.conv.weight.shape) * self.scaling
                self.merged = True

    def forward(self, x, output_size = None):
        if self.r > 0 and not self.merged:
            print(x.shape)

            num_spatial_dims = 3

            output_size = (33, 33, 33)

            output_padding = nn.ConvTranspose3d._output_padding(x,
                                                  output_size,
                                                  self.conv.stride,
                                                  self.conv.padding,
                                                  self.conv.kernel_size,  # type: ignore[arg-type]
                                                  num_spatial_dims,
                                                  self.conv.dilation)  # type: ignore[arg-type]

            return F.conv_transpose3d(x,
                                      self.conv.weight + (self.lora_B @ self.lora_A).view(self.conv.weight.shape) * self.scaling,
                                      self.conv.bias,
                                      self.conv.stride,
                                      self.conv.padding,
                                      output_padding,
                                      self.conv.groups,
                                      self.conv.dilation)

        return self.conv(x, output_size)`

However, a problem occurs because output_size is None by default, and I cannot hard-code output_size because it varies between calls. Would you know how to solve this?
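One possible direction, offered as a sketch rather than an answer from this thread: as far as I know, PyTorch's ConvTranspose modules fall back to the output_padding they were constructed with when output_size is None, and otherwise derive the padding from the documented output-size formula. The plain-Python helper below, with the hypothetical name `resolve_output_padding`, mirrors that logic so the LoRA forward would not need a hard-coded size. It assumes every argument except output_size is a tuple with one entry per spatial dimension, and that input_size holds only the spatial sizes (for example `x.shape[2:]`).

```python
def resolve_output_padding(input_size, output_size, stride, padding,
                           kernel_size, dilation, default_output_padding):
    """Return the per-dimension output_padding for a transposed convolution."""
    if output_size is None:
        # No target size requested: use the padding the layer was built with.
        return tuple(default_output_padding)
    result = []
    for i, target in enumerate(output_size):
        # Smallest output size the layer can produce (output_padding = 0),
        # from the ConvTranspose output-size formula.
        min_size = ((input_size[i] - 1) * stride[i] - 2 * padding[i]
                    + dilation[i] * (kernel_size[i] - 1) + 1)
        extra = target - min_size
        # Assumption: accept only 0 <= extra < stride, the range a
        # transposed convolution can reach by adjusting output_padding.
        if not 0 <= extra < stride[i]:
            raise ValueError(
                f"requested size {target} in dim {i} is outside "
                f"[{min_size}, {min_size + stride[i] - 1}]"
            )
        result.append(extra)
    return tuple(result)


# Hypothetical 3-d example: 16^3 input, stride 2, padding 1, kernel 3.
args = dict(stride=(2, 2, 2), padding=(1, 1, 1), kernel_size=(3, 3, 3),
            dilation=(1, 1, 1), default_output_padding=(0, 0, 0))
print(resolve_output_padding((16, 16, 16), (32, 32, 32), **args))  # (1, 1, 1)
print(resolve_output_padding((16, 16, 16), None, **args))          # (0, 0, 0)
```

In the forward above, the equivalent would be to use `self.conv.output_padding` whenever output_size is None and to compute the padding only when a caller passes a size. Separately, the lora_B shape in the snippet is the pre-fix one and ConvTranspose weights are laid out as (in_channels, out_channels // groups, k, ..., k), so for a 3-d kernel the `.view(self.conv.weight.shape)` call would likely fail on element count unless kernel_size is 1.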

emi-dm

I tested it and it works properly.

sentient-codebot · 2024-11-21T16:10:43Z

Should we merge?

emi-dm · 2024-11-21T17:31:05Z

Yes @sentient-codebot

wsty1234 · 2025-03-24T08:47:35Z

Great job! Thanks a lot for solving my problems, @sentient-codebot.

Fix Conv1d (Convnd) implementation

d3a8d08

sentient-codebot mentioned this pull request Feb 12, 2024

Conv1d and Conv3d are not working #115

Open

emi-dm approved these changes Nov 8, 2024

sentient-codebot wants to merge 1 commit into microsoft:main from sentient-codebot:main

5 participants
