fix: limit metadata items for validation to source INSPIRE #42

Merged
DajanaSnopkova merged 1 commit into main from fix/limit-to-inspire on Apr 28, 2026

Conversation

@stempler

Member

No description provided.

@stempler stempler requested a review from DajanaSnopkova April 28, 2026 07:51
@DajanaSnopkova DajanaSnopkova merged commit 2c1c912 into main Apr 28, 2026
3 checks passed
@pvgenuchten

Contributor

Consider that harvest.items has a versioning mechanism, so multiple editions exist for each identifier. It is important to fetch the latest edition of any identifier; the latest edition is stored in harvest.vw_unique_harvest_items.
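
For illustration, a minimal sketch of what "fetch the latest edition of each identifier" could look like. It uses an in-memory SQLite table as a stand-in for harvest.items; the table name `items`, the columns `identifier` and `source`, the sample rows, and the use of `insert_date` to order editions are all assumptions made for the example, not the actual schema or the definition of the view.

```python
import sqlite3

# Hypothetical stand-in for harvest.items; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (identifier TEXT, insert_date TEXT, source TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("rec-1", "2026-01-10 08:00:00", "INSPIRE"),
        ("rec-1", "2026-03-02 09:30:00", "INSPIRE"),  # newer edition of rec-1
        ("rec-2", "2026-02-15 12:00:00", "INSPIRE"),
        ("rec-3", "2026-02-20 12:00:00", "OTHER"),
    ],
)

# Keep only the row with the greatest insert_date per identifier,
# then limit the result to one source.
LATEST_EDITIONS = """
    SELECT i.identifier, i.insert_date
    FROM items AS i
    JOIN (
        SELECT identifier, MAX(insert_date) AS latest
        FROM items
        GROUP BY identifier
    ) AS m ON m.identifier = i.identifier AND m.latest = i.insert_date
    WHERE i.source = ?
    ORDER BY i.identifier
"""

latest = conn.execute(LATEST_EDITIONS, ("INSPIRE",)).fetchall()
print(latest)
```

In this sample the older edition of `rec-1` is dropped and only one row per identifier remains. Whether the real view orders editions by `insert_date` or by another column would need to be checked against its definition.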

@stempler

Member Author

> Consider that harvest.items has a versioning mechanism, so multiple editions exist for each identifier. It is important to fetch the latest edition of any identifier; the latest edition is stored in harvest.vw_unique_harvest_items.

@pvgenuchten Considering that vw_unique_harvest_items is not used in the scripts, can we assume that the selection is currently done incorrectly? Could you suggest how the queries should be adapted, and possibly a way to verify this?

@pvgenuchten

pvgenuchten commented Apr 29, 2026

Contributor

Just discussed with Dajana: it seems the validator script includes a mechanism to select the latest edition, so it would not require the view (although it would be better to use the view).

Indeed, insert_date > COALESCE(last_validation, '1900-01-01'::timestamp) will also give a similar result.

The view has a similar structure to harvest.items, so select * from harvest.items can be replaced with select * from harvest.vw_unique_harvest_items.
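
As a rough way to compare the two approaches, the sketch below runs the same incremental filter against a table and against a latest-edition view. It uses in-memory SQLite, so the PostgreSQL cast `::timestamp` is dropped and timestamps are compared as ISO strings. The names `items`, `vw_unique_items`, and `pending`, the sample rows, and passing `last_validation` as a query parameter are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for harvest.items (column names assumed).
    CREATE TABLE items (identifier TEXT, insert_date TEXT);
    INSERT INTO items VALUES
        ('rec-1', '2026-04-01 08:00:00'),
        ('rec-1', '2026-04-20 08:00:00'),
        ('rec-2', '2026-03-01 08:00:00');
    -- Hypothetical stand-in for harvest.vw_unique_harvest_items.
    CREATE VIEW vw_unique_items AS
        SELECT i.identifier, i.insert_date
        FROM items AS i
        WHERE i.insert_date = (
            SELECT MAX(insert_date) FROM items WHERE identifier = i.identifier
        );
""")

def pending(relation, last_validation):
    """Rows inserted after the last validation; None means 'never validated'."""
    # relation is a fixed internal name in this sketch, never user input.
    sql = (
        f"SELECT identifier, insert_date FROM {relation} "
        "WHERE insert_date > COALESCE(?, '1900-01-01') "
        "ORDER BY identifier, insert_date"
    )
    return conn.execute(sql, (last_validation,)).fetchall()

first_run_table = pending("items", None)
first_run_view = pending("vw_unique_items", None)
later_run_table = pending("items", "2026-04-10 00:00:00")
later_run_view = pending("vw_unique_items", "2026-04-10 00:00:00")
```

With this sample data, the later run returns the same single row from both the table and the view, while the first run returns both editions of `rec-1` from the table but only the newest from the view. A comparison of this kind against the real schema could serve as a check on whether the current selection logic and the view agree.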

@stempler

Member Author

> Indeed, insert_date > COALESCE(last_validation, '1900-01-01'::timestamp) will also give a similar result.

@pvgenuchten @DajanaSnopkova The script that is currently used (src/validationINSPIRE/validationByTestSuites.py) does not seem to use this filter, so should we switch to using the view there?

@DajanaSnopkova

Contributor

But are we running that script in the workflow? I thought we were running the concurrent one.

I would say that for now we keep the selection logic as it is, and improve it in the next iteration. I don't think this is causing errors now.

Btw Simon, could you please also check PR #46? That one was causing an error...

@stempler

stempler commented Apr 30, 2026

Member Author

> But are we running that script in the workflow? I thought we were running the concurrent one.

Yes, we are. In your email from March 18th you suggested this one as the preferred option:

> From MU's side, we consider the development finished, we suggest using the validationByTestSuites.py script, since it is most reliable, and only the first run will take long. Later, it will validate only newly added records, which should not be a problem.

I have now switched to the concurrent script, as you suggested in our call, and sent you the log by email. In principle the run looked good, but it was suspicious that it took only ~90s.

3 participants