Validity and Stability Pipeline Development FS Benchmark #288

Open
schaeferbasti wants to merge 9 commits into autogluon:fe_benchmark_main from schaeferbasti:fe_benchmark_main_val_pipeline

@schaeferbasti (Contributor) commented Apr 10, 2026

Issue #, if available:

Description of changes:

  • Edit feature_selection_benchmark_runner.py for validity and stability (fix bugs, make it usable from the CLI, save results to CSV files)
  • Planned: add a batch script that executes the runner for all datasets, methods, and repeats

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@schaeferbasti schaeferbasti marked this pull request as ready for review April 10, 2026 11:18

print(result)
result = pd.DataFrame([result.__dict__])
path = f"results/{args.mode}_{args.method_name}_{args.data_foundry_task_id.split('|')[3].split('/')[0]}_{args.repeat}.csv"
Collaborator

Consider checking whether the cached result file already exists and skipping the run in that case, unless --ignore-cache is passed.
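
A minimal sketch of that check, not the PR's actual code. The `--ignore-cache` flag comes from the comment above; the helper names `should_run` and `build_parser` are hypothetical, and the sketch assumes the cache is simply the result CSV written at the end of a run.

```python
import argparse
from pathlib import Path


def should_run(result_path: Path, ignore_cache: bool) -> bool:
    """Return True when the experiment has to run.

    The run is skipped only if a result file already exists
    and the caller did not ask to ignore the cache.
    """
    return ignore_cache or not result_path.exists()


def build_parser() -> argparse.ArgumentParser:
    """Parser with only the cache flag; the runner's other arguments are omitted."""
    parser = argparse.ArgumentParser()
    # Flag name taken from the review comment; defaults to False.
    parser.add_argument(
        "--ignore-cache",
        action="store_true",
        help="Recompute even if a result CSV already exists.",
    )
    return parser
```

In the runner, the call would sit before the expensive work, for example `if not should_run(Path(path), args.ignore_cache): return`.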

Collaborator

Also, maybe move this logic into a function so that we can easily go from the arguments to the cache path.
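
One way this could look, as a sketch only. The function name `result_cache_path` and the `results_dir` parameter are assumptions; the file name pattern and the task-ID parsing are copied from the f-string in the diff above.

```python
from pathlib import Path


def result_cache_path(mode, method_name, data_foundry_task_id, repeat, results_dir="results"):
    """Map runner arguments to the CSV path used for caching results."""
    # Same extraction as in the diff: take the fourth '|'-separated field
    # of the task ID and keep everything before its first '/'.
    dataset = data_foundry_task_id.split("|")[3].split("/")[0]
    return Path(results_dir) / f"{mode}_{method_name}_{dataset}_{repeat}.csv"
```

With a made-up task ID such as `"a|b|c|ds/extra"`, calling `result_cache_path("validity", "m", "a|b|c|ds/extra", 0)` gives `results/validity_m_ds_0.csv`. The real task-ID format is not shown in this thread, so the example ID is purely illustrative.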



@dataclass
class ExtraBenchmarkSetup2026:
Collaborator

Let's keep this out of the TabArena TabFlow code and put it in the experimental part instead. Also, I don't think this kind of setup will work the same way here. It might be a good starting point, but you can likely make it much simpler and hardcode a lot of choices. In theory, you just need to generate a loop plus an array job over the values of that loop. For example, check out some suggestions from Claude Code or ChatGPT for how to do this.

SLURM does not need all this setup logic; the most important part is the batch file. The submitit Python package is also an alternative.
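
A rough sketch of the loop plus array job idea, using only the standard library. The dataset and method names and the repeat count are placeholders, and `job_grid` and `select_job` are hypothetical helpers. The only SLURM-specific piece is the `SLURM_ARRAY_TASK_ID` environment variable, which SLURM sets for each task of an array job.

```python
import itertools
import os

# Placeholder values; the real datasets, methods, and repeat count
# are not given in this thread.
DATASETS = ["dataset_a", "dataset_b"]
METHODS = ["method_x", "method_y"]
REPEATS = range(3)


def job_grid(datasets, methods, repeats):
    """Flatten the nested loop into an ordered list of (dataset, method, repeat)."""
    return list(itertools.product(datasets, methods, repeats))


def select_job(grid, task_id):
    """Pick the combination that belongs to one array task index."""
    if not 0 <= task_id < len(grid):
        raise IndexError(f"task_id {task_id} is outside 0..{len(grid) - 1}")
    return grid[task_id]


if __name__ == "__main__":
    grid = job_grid(DATASETS, METHODS, REPEATS)
    # Fall back to index 0 when run outside of a SLURM array job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    dataset, method, repeat = select_job(grid, task_id)
    print(f"task {task_id}: dataset={dataset} method={method} repeat={repeat}")
```

With these placeholder values the grid has 12 entries, so the batch file would request `--array=0-11` and each array task would run the benchmark runner for its own combination.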

2 participants