Compensate metadata.loudness when _ScaleOutputHook scales head_scale #668
_ScaleOutputHook undoes a dataset y_scale on export by scaling
config.head_scale and the duplicated weights[-1]. Until now the
exported metadata.loudness still described the pre-compensation model,
so the loudness in the .nam file disagreed with what a plugin actually
loads when output RMS normalization was used during training.
WaveNet (no top-level head) and SlimmableContainer outputs are linear
in head_scale, so the dB adjustment is exact and closed-form:
loudness_new = loudness_old + 20 * log10(self._scale)
metadata.gain is a normalized compression heuristic that is invariant
under uniform output scaling and is left alone.
The adjustment lives next to the head_scale mutation that creates the
need for it, so the hook's full effect is visible in one place. No
changes to the dict hook contract or to model state. Hooks that don't
attach (no normalization used) produce identical output.
sdatkinson left a comment
Couple nits. I can pick them up on a follow-up PR.
Thanks for getting this!
    )
    # head_scale was actually compensated on disk
    assert entry["model"]["config"]["head_scale"] == _pytest.approx(
        0.25 * scale
Where's the 0.25 come from? Line 26 I think?
Can you make this a const in this file (`_DEFAULT_HEAD_SCALE`) or pass it as an argument to `packed_config` so that it's not a literal?
    scale = 0.5
    container = {
        "architecture": "SlimmableContainer",
        "metadata": {"loudness": -18.0, "gain": 0.4},
Eek, these aren't supposed to be the averages of the values in the submodels 😅
It's just a test, but I think I'd prefer for these to track the values of the highest-quality submodel. If that's not happening already, then that's a bug that should also be squashed.
Ideally, there'd be validation (i.e. Pydantic) to enforce this.
Not the end of the world if it's already happening elsewhere though--I have to admit I'm not sure I know the answer off the top of my head.
Summary
When dataset output RMS normalization is used (e.g. `nam.data.normalize_joint_dataset_output` in the data config), `Dataset._ScaleOutputHook` scales the exported `config.head_scale` (and the duplicated `weights[-1]`) so the `.nam` produces the original capture level at inference. Until now `metadata.loudness` was not adjusted, so its value described the pre-compensation in-memory model instead of the file a plugin actually loads. This PR fixes that.
Root cause
`metadata.loudness` is set inside `Exportable._get_export_dict()` before `_apply_export_model_dict_post_hooks` runs. The `_ScaleOutputHook` then mutates `config.head_scale` on the way out, but the loudness number is never refreshed.
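The ordering problem can be reproduced in isolation. The sketch below is a simplified stand-in, not the project's code: `build_export_dict`, `apply_post_hooks`, and `scale_head_only` are hypothetical names, and the dict layout is assumed to mirror the `config` / `metadata` keys mentioned in this PR.

```python
import math


def build_export_dict(head_scale, measured_loudness_db):
    # Hypothetical stand-in: metadata is computed from the in-memory model
    # *before* any post-export hook has a chance to run.
    return {
        "config": {"head_scale": head_scale},
        "metadata": {"loudness": measured_loudness_db},
    }


def apply_post_hooks(export_dict, hooks):
    # Hypothetical stand-in: each hook receives the dict and returns it.
    for hook in hooks:
        export_dict = hook(export_dict)
    return export_dict


def scale_head_only(scale):
    # Mimics the old behavior: head_scale is mutated, loudness is not.
    def hook(export_dict):
        export_dict["config"]["head_scale"] *= scale
        return export_dict

    return hook


exported = apply_post_hooks(build_export_dict(0.25, -18.0), [scale_head_only(0.5)])

# Loudness the file should report if output is linear in head_scale.
expected_loudness = -18.0 + 20.0 * math.log10(0.5)
stale_by_db = exported["metadata"]["loudness"] - expected_loudness
```

With a scale of 0.5, the stored loudness ends up about 6 dB higher than what the exported file would actually produce.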
Fix
Co-locate the loudness adjustment with the head_scale mutation inside the hook. Output of WaveNet (no top-level head) and SlimmableContainer is linear in `head_scale`, so the dB adjustment is exact and closed-form:
```
loudness_new = loudness_old + 20 * log10(self._scale)
```
`metadata.gain` is a normalized compression heuristic that is invariant under uniform output scaling and is left alone.
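A minimal sketch of the adjustment, assuming the export dict has `config.head_scale` and `metadata.loudness` keys as described here. `compensate_export_dict` and `rms_db` are hypothetical helpers written for illustration; the real hook also scales the duplicated `weights[-1]`, which is omitted.

```python
import copy
import math


def compensate_export_dict(export_dict, scale):
    # Hypothetical helper: scale head_scale and shift loudness by the same
    # factor expressed in dB. Works on a copy so the input stays untouched.
    if scale <= 0:
        raise ValueError("scale must be positive")
    out = copy.deepcopy(export_dict)
    out["config"]["head_scale"] *= scale
    metadata = out.get("metadata") or {}
    if metadata.get("loudness") is not None:
        metadata["loudness"] += 20.0 * math.log10(scale)
    # metadata["gain"] is deliberately left alone.
    return out


def rms_db(samples):
    # RMS level of a signal in dB relative to full scale 1.0.
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


example = {
    "config": {"head_scale": 0.25},
    "metadata": {"loudness": -18.0, "gain": 0.4},
}
compensated = compensate_export_dict(example, 0.5)

# Sanity check of the closed form on an arbitrary signal: scaling every
# sample by `scale` shifts the RMS level by 20 * log10(scale) dB.
signal = [0.1, -0.2, 0.3, -0.05]
shift_db = rms_db([0.5 * s for s in signal]) - rms_db(signal)
```

The `rms_db` check illustrates why the formula is exact for a uniformly scaled output, whatever the signal. It says nothing about how the project measures loudness internally.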
Tests
Test plan
Compatibility
Hooks that don't attach (no normalization used) produce identical output. No change to the dict hook contract or to model state.