Open
Description
See comment in this PR
Streamline Data Designer metadata generation: We currently generate three files: column_configs.json, model_configs.json, and metadata.json. Ideally, we should collapse these into two: a single metadata.json (sanitized for Hugging Face) and a new sdg.json file (the name is open to change). The latter would capture the serialized form of the entire Data Designer SDG pipeline, so that anyone can recreate it with the DataDesignerConfigBuilder.from_config(...) API. This lets us focus on the push_to_hub integration first, since "re-hydrating" a DatasetCreationResults object via pull_to_hub seems more complex. I can create a GitHub issue and a PR for this at the beginning of next year.
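
To make the proposed collapse concrete, here is a minimal sketch using only the Python standard library. It assumes column_configs.json and model_configs.json sit side by side in one artifact directory. The helper name `collapse_configs` and the top-level keys `columns` and `model_configs` are hypothetical choices for illustration, not the actual Data Designer schema; the real sdg.json layout would need to match whatever DataDesignerConfigBuilder.from_config(...) expects.

```python
import json
from pathlib import Path


def collapse_configs(artifact_dir: Path) -> Path:
    """Merge column_configs.json and model_configs.json into one sdg.json.

    Hypothetical helper: the output keys below are an assumed layout,
    not the real Data Designer serialization format.
    """
    columns = json.loads((artifact_dir / "column_configs.json").read_text())
    models = json.loads((artifact_dir / "model_configs.json").read_text())

    # Write both configs into a single file so the pipeline definition
    # travels as one artifact alongside metadata.json.
    sdg_path = artifact_dir / "sdg.json"
    sdg_path.write_text(
        json.dumps({"columns": columns, "model_configs": models}, indent=2)
    )
    return sdg_path
```

If the final format is compatible, the resulting file could presumably be handed to DataDesignerConfigBuilder.from_config(...) to rebuild the pipeline, which is the recreation path described above.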