@ziadhany
Collaborator

@ziadhany ziadhany commented Nov 3, 2025

@ziadhany ziadhany force-pushed the advisory-fix-commit-1 branch from 2af10cf to a8ec9f1 Compare November 4, 2025 15:58
@ziadhany ziadhany marked this pull request as ready for review November 4, 2025 16:01
@ziadhany ziadhany changed the title from "Add support for affected_by_commits and fixed_by_commits" to "Add support for affected_by_commits, fixed_by_commits, and OSV code fix commits" Nov 5, 2025
@ziadhany ziadhany requested review from TG1999 and keshav-space and removed request for keshav-space November 5, 2025 15:40
@TG1999
Contributor

TG1999 commented Nov 6, 2025

@ziadhany please add a description to the PR!

@ziadhany ziadhany requested a review from TG1999 November 7, 2025 02:48
@TG1999
Contributor

TG1999 commented Nov 7, 2025

@ziadhany mostly looks good! Please run the importer once and paste the logs here. Thanks!

I want to see whether we are missing any data from the OSV format, and how AdvisoryData and ImpactedPackages look with the new CommitData. Thanks!

@ziadhany
Collaborator Author

ziadhany commented Nov 7, 2025

@TG1999 This is the log output for the following importers:

  • pysec_importer_v2
  • pypa_importer_v2
  • oss_fuzz_importer_v2

importers_logs.zip

Database query results:
vulnerabilities_advisoryv2 Total rows: 10274
vulnerabilities_impactedpackage_fixed_by_commits Total rows: 4013
vulnerabilities_impactedpackage_affecting_commits Total rows: 3623
vulnerabilities_codecommit Total rows: 3791

@ziadhany ziadhany requested a review from TG1999 November 7, 2025 14:56
@TG1999
Contributor

TG1999 commented Nov 10, 2025

@ziadhany

Invalid VersionRange  for affected_pkg: {'package': {'name': 'apache-commons-io', 'ecosystem': 'OSS-Fuzz', 'purl': 'pkg:generic/apache-commons-io'}, 'ranges': [{'type': 'GIT', 'repo': 'https://github.com/apache/commons-io.git', 'events': [{'introduced': '72b1f88fb722def136ce87c9b2bfdd3c9126bb3d'}, {'fixed': 'd3e5bd6de8bc96abbadccea8b934dc038a32e90c'}]}], 'versions': ['commons-io-2.14.0-RC1', 'rel/commons-io-2.14.0'], 'ecosystem_specific': {'severity': 'LOW'}, 'database_specific': {'introduced_range': 'c511d15294d1a406a177368804014313948e2601:06fde31494c279ad940149e1a3d4944040c73c0d', 'fixed_range': '247c8e7d85a8df293011c7e9c94fd50bb2986fb7:d3e5bd6de8bc96abbadccea8b934dc038a32e90c'}} for OSV id: 'OSV-2023-962': error:InvalidVersion("'commons-io-2.14.0-RC1' is not a valid <class 'univers.versions.SemverVersion'>")
Invalid VersionRange  for affected_pkg: {'package': {'name': 'apache-commons-io', 'ecosystem': 'OSS-Fuzz', 'purl': 'pkg:generic/apache-commons-io'}, 'ranges': [{'type': 'GIT', 'repo': 'https://github.com/apache/commons-io.git', 'events': [{'introduced': '72b1f88fb722def136ce87c9b2bfdd3c9126bb3d'}, {'fixed': 'd3e5bd6de8bc96abbadccea8b934dc038a32e90c'}]}], 'versions': ['commons-io-2.14.0-RC1', 'rel/commons-io-2.14.0'], 'ecosystem_specific': {'severity': 'LOW'}, 'database_specific': {'introduced_range': 'c511d15294d1a406a177368804014313948e2601:06fde31494c279ad940149e1a3d4944040c73c0d', 'fixed_range': '247c8e7d85a8df293011c7e9c94fd50bb2986fb7:d3e5bd6de8bc96abbadccea8b934dc038a32e90c'}} for OSV id: 'OSV-2023-618': error:InvalidVersion("'commons-io-2.14.0-RC1' is not a valid <class 'univers.versions.SemverVersion'>")

Why are we getting these in the logs? The commit data should have been created for these.

@TG1999
Contributor

TG1999 commented Nov 10, 2025

See all the Invalid VersionRange errors. Why are these occurring?

{'package': {'name': 'apache-commons-codec', 'ecosystem': 'OSS-Fuzz', 'purl': 'pkg:generic/apache-commons-codec'}, 'ranges': [{'type': 'GIT', 'repo': 'https://gitbox.apache.org/repos/asf/commons-codec.git', 'events': [{'introduced': '44e4c4d778c3ab87db09c00e9d1c3260fd42dad5'}, {'fixed': '3bf874e2141dc08550c0b330c7a7006f358bb0f0'}]}], 'versions': ['commons-codec-1.16.1-RC1', 'rel/commons-codec-1.16.1'], 'ecosystem_specific': {'severity': 'LOW'}, 'database_specific': {'fixed_range': '72c40fe6f62410bcaa019dbf2cb570ee4e49b70e:3bf874e2141dc08550c0b330c7a7006f358bb0f0'}} for OSV id: 'OSV-2023-1195': error:InvalidVersion("'commons-codec-1.16.1-RC1' is not a valid <class 'univers.versions.SemverVersion'>")

These occur even though we have introduced and fixed events from which to create code commit data.
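
To illustrate the point, here is a minimal sketch of collecting the introduced and fixed commits from an OSV `affected` entry without touching its `versions` list, so an unparsable version string does not prevent commit data from being created. The function name `extract_git_commits` is hypothetical and is not the importer's actual code. The field layout follows the record quoted in the log above.

```python
def extract_git_commits(affected_pkg):
    """Collect (repo, commit) pairs from the GIT ranges of an OSV affected entry.

    Hypothetical helper for illustration only. It never reads the "versions"
    list, so invalid version strings cannot cause the commits to be dropped.
    """
    introduced, fixed = [], []
    for rng in affected_pkg.get("ranges", []):
        if rng.get("type") != "GIT":
            continue
        repo = rng.get("repo")
        for event in rng.get("events", []):
            if "introduced" in event:
                introduced.append((repo, event["introduced"]))
            if "fixed" in event:
                fixed.append((repo, event["fixed"]))
    return introduced, fixed


# Trimmed copy of the OSV-2023-1195 record from the log above.
affected_pkg = {
    "package": {"name": "apache-commons-codec", "ecosystem": "OSS-Fuzz"},
    "ranges": [
        {
            "type": "GIT",
            "repo": "https://gitbox.apache.org/repos/asf/commons-codec.git",
            "events": [
                {"introduced": "44e4c4d778c3ab87db09c00e9d1c3260fd42dad5"},
                {"fixed": "3bf874e2141dc08550c0b330c7a7006f358bb0f0"},
            ],
        }
    ],
    "versions": ["commons-codec-1.16.1-RC1", "rel/commons-codec-1.16.1"],
}

introduced, fixed = extract_git_commits(affected_pkg)
```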

@ziadhany
Collaborator Author

ziadhany commented Nov 11, 2025

I updated the script to handle unsupported packages (especially for OSS-Fuzz). CodeCommit is no longer ignored even if the package is unsupported, and logs are now more meaningful.

These are the updated logs:
importers_v2.zip

Database query results:
vulnerabilities_advisoryv2 Total rows: 17041
vulnerabilities_impactedpackage_fixed_by_commits Total rows: 7343
vulnerabilities_impactedpackage_affecting_commits Total rows: 6553
vulnerabilities_codecommit Total rows: 6553

Related issues:

  • pysec_importer_v2 / pypa_importer_v2:
  • oss_fuzz_importer_v2
    • Unsupported package type: None in OSV: 'OSV-2021-1227'. This means the package type is unknown (e.g., generic) and there is no PURL associated with it.
    • Invalid VersionRange for affected_pkg: whether this is logged depends on whether the version string is valid for the expected scheme, for example whether or not it is a valid semver version.
      Example:
      > SemverVersion('commons-io-2.14.0-RC1')
      > univers.versions.InvalidVersion: 'commons-io-2.14.0-RC1' is not a valid <class 'univers.versions.SemverVersion'>
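
As a rough illustration of why the tag name above is rejected, the following sketch checks strings against a simplified SemVer pattern. This regular expression is a stand-in written for this example. It is not the rule that univers applies, and univers may accept or reject slightly different inputs.

```python
import re

# Simplified SemVer pattern (major.minor.patch, optional pre-release and build).
# Assumption: used only as a stand-in for a real version parser.
SEMVER_RE = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


def looks_like_semver(value):
    """Return True if the whole string matches the simplified pattern."""
    return SEMVER_RE.fullmatch(value) is not None


# The OSS-Fuzz entries are git tag names with a project prefix, so the
# string does not start with a numeric major version and does not match.
tag_is_semver = looks_like_semver("commons-io-2.14.0-RC1")
bare_is_semver = looks_like_semver("2.14.0-RC1")
```

With this pattern the prefixed tag name does not match, while the bare `2.14.0-RC1` does.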

@TG1999
Contributor

TG1999 commented Nov 11, 2025

ERROR 2025-11-11 13:34:49.213781 UTC Unsupported PyPI advisory data file: GHSA-227r-w5j2-6243.json

This log does not tell me much. What is the data, and why is it unsupported?

@TG1999
Contributor

TG1999 commented Nov 11, 2025

Invalid VersionRange for affected_pkg: ['0.8', '0.9', '0.9.3', '0.9.4', '0.9.5', '0.9.6', '0.9.7', '0.9.8', '0.9.9', '2.0.1', '2.0.1rc1', '2.0.1rc2-git', '2.0.1rc3', '2.0.1rc4', '2.0.2', '2.0.3', '2.0.4', '2.0.5', '2.0b4', '2.0b5', '2.0b6', '2.0b7', '2.0b8', '2.0b9', '3.0.0', '3.0.0b1', '3.0.0b2', '3.0.1', '3.0.2', '3.0.3', '3.0.4', '3.0.5', '3.1', '3.2', '3.2.1', '3.2.2', '3.2.3', '3.2.4', '3.2.5', '3.3', '3.4', '3.4.1', '3.4.2', '3.4.3', '3.4.4', '3.4.5', '3.5', '3.5b1', '3.6', '3.6.1', '3.6.2', '3.6.3', '3.6.4'] for OSV id: 'PYSEC-2021-859': error:InvalidVersion("'2.0.1rc2-git' is not a valid <class 'univers.versions.PypiVersion'>")

One version in the list might be invalid, but all the others are valid. Are we ingesting them, or skipping the whole list if we can't ingest one?

@ziadhany
Collaborator Author

ERROR 2025-11-11 13:34:49.213781 UTC Unsupported PyPI advisory data file: GHSA-227r-w5j2-6243.json

This log does not tell me much. What is the data, and why is it unsupported?

@TG1999 We are ignoring GHSA files since we target only PYSEC files.
https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/pipelines/v2_importers/pysec_importer.py#L54

@TG1999
Contributor

TG1999 commented Nov 11, 2025

ERROR 2025-11-11 13:34:49.213781 UTC Unsupported PyPI advisory data file: GHSA-227r-w5j2-6243.json

This log does not tell me much. What is the data, and why is it unsupported?

@TG1999 We are ignoring GHSA files since we target only PYSEC files.

https://github.com/aboutcode-org/vulnerablecode/blob/main/vulnerabilities/pipelines/v2_importers/pysec_importer.py#L54

Then add that to the log as well :)
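
One way to make the skip message self-explanatory is to build the reason into the text that gets logged. The sketch below is an assumption-laden illustration: the helper name `skip_reason` and the `PYSEC-` file-name prefix check are hypothetical and are not taken from the importer's code.

```python
from pathlib import Path


def skip_reason(path):
    """Return None if the file should be imported, else a readable reason.

    Hypothetical helper: assumes PYSEC advisory files can be recognised
    by a "PYSEC-" file-name prefix.
    """
    name = Path(path).name
    if name.startswith("PYSEC-"):
        return None
    return (
        f"Skipping {name}: this importer only processes PYSEC-* advisory "
        "files; other advisory files (for example GHSA-*) are ignored."
    )


reason = skip_reason("GHSA-227r-w5j2-6243.json")
if reason:
    print(reason)
```

The caller can pass the returned string to whatever logger the pipeline uses, so the log line names both the file and the reason it was skipped.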

@ziadhany
Collaborator Author

ziadhany commented Nov 11, 2025

Invalid VersionRange for affected_pkg: ['0.8', '0.9', '0.9.3', '0.9.4', '0.9.5', '0.9.6', '0.9.7', '0.9.8', '0.9.9', '2.0.1', '2.0.1rc1', '2.0.1rc2-git', '2.0.1rc3', '2.0.1rc4', '2.0.2', '2.0.3', '2.0.4', '2.0.5', '2.0b4', '2.0b5', '2.0b6', '2.0b7', '2.0b8', '2.0b9', '3.0.0', '3.0.0b1', '3.0.0b2', '3.0.1', '3.0.2', '3.0.3', '3.0.4', '3.0.5', '3.1', '3.2', '3.2.1', '3.2.2', '3.2.3', '3.2.4', '3.2.5', '3.3', '3.4', '3.4.1', '3.4.2', '3.4.3', '3.4.4', '3.4.5', '3.5', '3.5b1', '3.6', '3.6.1', '3.6.2', '3.6.3', '3.6.4'] for OSV id: 'PYSEC-2021-859': error:InvalidVersion("'2.0.1rc2-git' is not a valid <class 'univers.versions.PypiVersion'>")

One version in the list might be invalid, but all the others are valid. Are we ingesting them, or skipping the whole list if we can't ingest one?

We are skipping this since the version range would likely be inconsistent if we processed it.
I also created a related issue in univers:

I can change this if needed.

@TG1999
Contributor

TG1999 commented Nov 11, 2025

Invalid VersionRange for affected_pkg: ['0.8', '0.9', '0.9.3', '0.9.4', '0.9.5', '0.9.6', '0.9.7', '0.9.8', '0.9.9', '2.0.1', '2.0.1rc1', '2.0.1rc2-git', '2.0.1rc3', '2.0.1rc4', '2.0.2', '2.0.3', '2.0.4', '2.0.5', '2.0b4', '2.0b5', '2.0b6', '2.0b7', '2.0b8', '2.0b9', '3.0.0', '3.0.0b1', '3.0.0b2', '3.0.1', '3.0.2', '3.0.3', '3.0.4', '3.0.5', '3.1', '3.2', '3.2.1', '3.2.2', '3.2.3', '3.2.4', '3.2.5', '3.3', '3.4', '3.4.1', '3.4.2', '3.4.3', '3.4.4', '3.4.5', '3.5', '3.5b1', '3.6', '3.6.1', '3.6.2', '3.6.3', '3.6.4'] for OSV id: 'PYSEC-2021-859': error:InvalidVersion("'2.0.1rc2-git' is not a valid <class 'univers.versions.PypiVersion'>")

One version in the list might be invalid, but all the others are valid. Are we ingesting them, or skipping the whole list if we can't ingest one?

We are skipping this since the version range would likely be inconsistent if we processed it.

I also created a related issue in univers:

I can change this if needed.

@keshav-space @pombredanne thoughts on this one?

@TG1999
Contributor

TG1999 commented Nov 11, 2025

For PYSEC data we would use the GitHub version range, because the versions are Semver. If a version is not parsable, that version should be skipped, not the entire range. We should also introduce a flag for advisories that were not completely parsed, so that if our parsing improves in the future we can replace the incompletely parsed advisory with a new one.
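
The behaviour described here can be sketched as follows: keep every version that parses, record the ones that do not, and expose a flag saying the result is partial. Everything in this block is hypothetical. `ParsedVersions`, `parse_versions`, and the simplified regular expression are stand-ins for illustration and do not reflect the univers API or the importer's actual models.

```python
import re
from dataclasses import dataclass, field

# Simplified stand-in for a real version parser: dotted numbers with an
# optional a/b/rc pre-release suffix. Assumption, not a full version grammar.
_VERSION_RE = re.compile(r"\d+(\.\d+)*((a|b|rc)\d+)?")


@dataclass
class ParsedVersions:
    valid: list = field(default_factory=list)
    skipped: list = field(default_factory=list)

    @property
    def is_partial(self):
        # True when at least one version could not be parsed, so the
        # advisory can be re-imported once parsing improves.
        return bool(self.skipped)


def parse_versions(raw_versions):
    """Keep parsable versions and record the rest instead of failing."""
    result = ParsedVersions()
    for raw in raw_versions:
        if _VERSION_RE.fullmatch(raw):
            result.valid.append(raw)
        else:
            result.skipped.append(raw)
    return result


# A few versions taken from the PYSEC-2021-859 log line above.
result = parse_versions(["2.0.1", "2.0.1rc1", "2.0.1rc2-git", "2.0b4", "3.5b1"])
```

Here only `2.0.1rc2-git` ends up in `result.skipped`, and `result.is_partial` is the flag that would mark the advisory as incompletely parsed.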

Member

@keshav-space keshav-space left a comment

Thanks @ziadhany. See the comments below, and also make sure to adjust the insert_advisory method accordingly.

@ziadhany
Collaborator Author

This is the log output for the following importers:

  • pysec_importer_v2
  • pypa_importer_v2
  • oss_fuzz_importer_v2
  • github_osv_importer_v2

importers.zip

@TG1999
Contributor

TG1999 commented Nov 17, 2025

Failed to extract fixed commits: ValueError('Commit must be a valid a commit_hash.')

We need to know the hash here. Please log the hash as well.
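
A minimal sketch of a validation error that carries the offending value is shown below. The function name `validate_commit_hash` and the rule "7 to 64 hexadecimal characters" are assumptions made for this example. The project's actual validation rule is not shown in this conversation.

```python
import re

# Assumption: a commit hash is 7 to 64 hexadecimal characters. The real
# validator in the project may use a different rule.
_COMMIT_HASH_RE = re.compile(r"[0-9a-fA-F]{7,64}")


def validate_commit_hash(commit_hash):
    """Return the hash unchanged, or raise ValueError naming the bad value."""
    if not isinstance(commit_hash, str) or not _COMMIT_HASH_RE.fullmatch(commit_hash):
        raise ValueError(f"Commit must be a valid commit hash, got: {commit_hash!r}")
    return commit_hash


try:
    validate_commit_hash("not-a-hash")
except ValueError as exc:
    # The message now includes the rejected value, so the log is actionable.
    message = str(exc)
```

Including `repr()` of the value also makes empty strings and `None` visible in the log.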

Member

@keshav-space keshav-space left a comment

Thanks @ziadhany, LGTM!

@TG1999
Contributor

TG1999 commented Dec 15, 2025

@ziadhany LGTM! please rebase and adjust the migrations! great work 🙌

@ziadhany ziadhany closed this Dec 15, 2025
@ziadhany ziadhany deleted the advisory-fix-commit-1 branch December 15, 2025 17:55
@ziadhany ziadhany restored the advisory-fix-commit-1 branch December 15, 2025 17:56
@ziadhany ziadhany reopened this Dec 15, 2025
Fix patch_checksum constraint
Remove unused imports

Signed-off-by: ziad hany <[email protected]>
Update get_or_create_advisory_references to store the reference type correctly.
Update get_or_create_advisory_package_commit_patches to correctly create or update the patch_text field.

Signed-off-by: ziad hany <[email protected]>
Signed-off-by: ziad hany <[email protected]>
Add constraint to make sure we have at least one field to create a valid Patch obj.
Update patch_text only if patch_text field is empty.
Return multiple objects for classify_patch_source function
Add patch in AdvisoryData.from_dict()

Signed-off-by: ziad hany <[email protected]>
Update migration file

Signed-off-by: ziad hany <[email protected]>
@ziadhany ziadhany force-pushed the advisory-fix-commit-1 branch from 747dc5f to f7ee8c2 Compare December 15, 2025 19:13
@ziadhany
Collaborator Author

@ziadhany LGTM! please rebase and adjust the migrations! great work 🙌

@TG1999 Done, please merge 🚀

Contributor

@TG1999 TG1999 left a comment

LGTM!

@TG1999 TG1999 merged commit 32d9724 into aboutcode-org:main Dec 15, 2025
5 of 6 checks passed
@ziadhany ziadhany deleted the advisory-fix-commit-1 branch December 15, 2025 19:38