The team has done an impressive job setting up Tabarena.
To further improve usability, it would be very helpful to have clear documentation outlining how to integrate a new model into the Tabarena pipeline and evaluate its performance within the current framework.
At present, much of the material is spread across tabarena_benchmarking_examples and tabrepo, which makes it challenging to determine the correct workflow and run the necessary scripts in the right order for a proper, consistent comparison with other models in Tabarena.
Would it be possible to consolidate these resources or provide step-by-step guidance for model integration and benchmarking?
This would ensure a smooth process for apples-to-apples comparisons between newly added models and existing ones.
Thank you