I'm trying to reproduce the results of the FPTQuant paper, but the results from my own end-to-end training are significantly worse.
My questions are:
- Is there an official or reference implementation (or AIMET configuration) of FPTQuant's end-to-end training (not only the local optimization) that reproduces the results reported in the paper?
- AIMET's MultiHeadValueTransformOp is not explicitly constrained to be invertible, even though the transform should be. Is a properly configured end-to-end training expected to never produce a singular transform, or can specific models or hyperparameters cause one? (In my end-to-end runs, the MultiHeadValue transform does become non-invertible during training with the current implementation.)
- Could the degradation and non-invertible transform I’m seeing be due to applying this method to a smaller model (e.g., LLaMA 160M), or should the approach still work in this regime?
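For reference, here is a minimal sketch of the diagnostic I use to detect when the transform drifts toward singularity: logging the per-head condition number of each transform matrix during training. The function name and the `(num_heads, d, d)` layout are my own assumptions, not AIMET's API; inside a PyTorch training loop, `torch.linalg.cond` would be the analogous call.

```python
import numpy as np

def head_condition_numbers(transform: np.ndarray, threshold: float = 1e6) -> np.ndarray:
    """Return the 2-norm condition number of each head's transform matrix.

    `transform` is assumed to be shaped (num_heads, d, d), one square matrix
    per attention head (a hypothetical layout; adapt to the actual
    MultiHeadValueTransformOp parameterization).
    """
    # np.linalg.cond broadcasts over the leading axis: one value per head,
    # computed as sigma_max / sigma_min. Very large values mean the matrix
    # is numerically close to singular.
    conds = np.linalg.cond(transform)
    for head, c in enumerate(conds):
        if c > threshold:
            print(f"head {head}: cond = {c:.3e} -- numerically near-singular")
    return conds
```

Calling this every N training steps makes it easy to see whether singularity appears gradually (ill-conditioning that grows over training) or abruptly (e.g., after a learning-rate spike).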
Any guidance on these points would be very helpful for understanding whether this is a usage issue or a limitation in the current implementation.