FPTQuant end-to-end training #4089

@phiahr

Description

I'm trying to reproduce the results of the FPTQuant paper, but the results I get from my own end-to-end training are substantially worse.

So my questions are:

  1. Is there an official or reference implementation of FPTQuant (or AIMET configuration) end-to-end training (not only the local optimization) that reproduces the results reported in the paper?
  2. AIMET's implementation of MultiHeadValueTransformOp is not explicitly constrained to be invertible, even though the method requires it to be. Is a properly configured end-to-end training expected to never produce a singular transform, or can specific models or hyperparameters still cause one?
    (In my end-to-end training, the MultiHeadValue transform becomes non-invertible during training with the current implementation.)
  3. Could the degradation and non-invertible transform I’m seeing be due to applying this method to a smaller model (e.g., LLaMA 160M), or should the approach still work in this regime?
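For context on question 2: since the op isn't explicitly constrained, two common workarounds are (a) monitoring the transform's condition number during training to detect when it drifts toward singularity, and (b) reparameterizing the transform through a matrix exponential, which is invertible by construction. The sketch below is not AIMET's API; the class and function names are hypothetical illustrations of those two ideas.

```python
import torch


def is_well_conditioned(weight: torch.Tensor, max_cond: float = 1e4) -> bool:
    # Flag transforms drifting toward singularity; works on (d, d) or
    # batched (heads, d, d) weights. A singular matrix yields an
    # inf/nan condition number, which fails the isfinite check.
    cond = torch.linalg.cond(weight)
    return bool(torch.all(torch.isfinite(cond) & (cond < max_cond)))


class InvertibleTransform(torch.nn.Module):
    # Hypothetical reparameterization: learn A and use T = exp(A),
    # which is always invertible with exact inverse exp(-A).
    def __init__(self, dim: int):
        super().__init__()
        self.log_weight = torch.nn.Parameter(0.01 * torch.randn(dim, dim))

    def weight(self) -> torch.Tensor:
        return torch.linalg.matrix_exp(self.log_weight)

    def inverse_weight(self) -> torch.Tensor:
        return torch.linalg.matrix_exp(-self.log_weight)
```

One could log `is_well_conditioned` on the MultiHeadValue transform every few hundred steps to see exactly when it degenerates, which would help distinguish a hyperparameter issue from a structural one.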

Any guidance on these points would be very helpful for understanding whether this is a usage issue or a limitation in the current implementation.
