Add EUVD mirror pipeline #1
base: main
Signed-off-by: Samk <[email protected]>
Signed-off-by: Samk <[email protected]>
ziadhany
left a comment
@Samk1710 The code looks good 🚀. Some nits for your consideration.
if path.exists():
    self.log(f"skip existing file: {path}")
    return
Why do we skip updating the existing file?
We skip existing files to make the sync idempotent and resumable. If the script is interrupted or re-run, it won't overwrite data that was already successfully saved. Because the data is fetched using the dateUpdated param and updates are forward-only, skipping previously written files is safe.
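For illustration, the skip-if-exists behaviour described above can be sketched roughly as follows. This is not the PR's code: `write_once` is a hypothetical helper, and writing through a temporary file is an extra assumption added here so that an interrupted run cannot leave a half-written file that a later run would then skip.

```python
import json
import tempfile
from pathlib import Path


def write_once(path: Path, record: dict) -> bool:
    """Write `record` as JSON unless `path` already exists.

    Returns True if the file was written, False if it was skipped.
    Hypothetical helper, not taken from sync_catalog.py.
    """
    if path.exists():
        # A previous run already saved this record; leave it untouched.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: write to a temp file first, then rename into place,
    # so a crash mid-write never produces a partial file at `path`.
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(record, indent=2), encoding="utf-8")
    tmp.replace(path)
    return True


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp()) / "advisories" / "example.json"
    print(write_once(out, {"id": "example"}))  # first run writes the file
    print(write_once(out, {"id": "changed"}))  # re-run skips it
```

Re-running the function with the same path is a no-op, which is what makes an interrupted sync safe to resume.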
sync_catalog.py
Outdated
if not isinstance(data, dict):
    return {}
Do we have any examples of this invalid output from the endpoint?
This is a safety check to handle unexpected API responses. The EUVD API could potentially return non-dict responses (e.g., an empty response, error strings, or malformed objects). It's just a safety measure for now; I don't have a concrete example yet :)
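As a rough sketch of what such a guard could cover, assuming the response body arrives as a string: `parse_page` is a hypothetical name, and the inputs below are made-up examples, not observed EUVD responses.

```python
import json


def parse_page(raw: str) -> dict:
    """Return the decoded JSON object, or {} for anything that is not a dict.

    Hypothetical helper illustrating the isinstance check under review.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Empty body or malformed JSON.
        return {}
    if not isinstance(data, dict):
        # Valid JSON, but a bare string, list, number, or null.
        return {}
    return data


if __name__ == "__main__":
    for body in ("", '"error"', "[]", '{"items": []}'):
        print(repr(body), "->", parse_page(body))
```

Returning an empty dict lets callers use `.get(...)` without special-casing bad responses, at the cost of hiding the failure unless it is logged separately.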
Signed-off-by: Samk <[email protected]>
I have made the changes as per the suggestions. If any other changes are required, do let me know. Looking forward!
EUVD Mirror Pipeline
The script has two modes:
As of now, the PR focuses on the code; once it is reviewed and approved, I can add the locally collected backfill data.