Add path-aware matching for duplicate package names #10

Open
neddp wants to merge 3 commits into anthonyharrison:main from neddp:add-path-aware-matching

neddp commented Feb 8, 2026

When comparing SBOMs, packages with the same name at different filesystem locations were deduplicated, which made it impossible to track version changes for packages embedded in multiple binaries (e.g., the Go stdlib in 5 different executables).

Instead of keying on the package name alone, the code now keys on (name, path) tuples. For CycloneDX files, it looks for any property whose name contains both "location" and "path", so it works with syft's syft:location:0:path and similar conventions from other tools. When displaying differences, if a path exists, the output appends the binary name, as in stdlib (service-a), so you know which binary the package belongs to. SPDX files don't carry this path metadata, so they use an empty string as the path, which keeps the key shape consistent.
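For reviewers, here is a minimal sketch of the keying and display logic described above. It is not the code in this PR: the function names `extract_path`, `package_key`, and `display_name` are hypothetical, the component is assumed to be a plain dict in CycloneDX JSON shape, and the substring match is assumed to be case-sensitive.

```python
import posixpath


def extract_path(component: dict) -> str:
    """Return the value of the first property whose name contains both
    'location' and 'path', or an empty string if there is none.

    Assumption: properties are a list of {"name": ..., "value": ...} dicts,
    as in CycloneDX JSON.
    """
    for prop in component.get("properties", []):
        prop_name = prop.get("name", "")
        if "location" in prop_name and "path" in prop_name:
            return prop.get("value", "")
    return ""


def package_key(component: dict) -> tuple:
    """Build the (name, path) key; path is '' when no location is recorded."""
    return (component.get("name", ""), extract_path(component))


def display_name(key: tuple) -> str:
    """Append the binary name in parentheses only when a path is present."""
    name, path = key
    binary = posixpath.basename(path.rstrip("/"))
    return f"{name} ({binary})" if binary else name


component = {
    "name": "stdlib",
    "version": "1.21.0",
    "properties": [
        {"name": "syft:package:type", "value": "go-module"},
        {"name": "syft:location:0:path", "value": "/usr/bin/service-a"},
    ],
}

key = package_key(component)
label = display_name(key)          # "stdlib (service-a)"
plain = display_name(("stdlib", ""))  # "stdlib"
```

Matching on substrings rather than an exact property name is what lets the same lookup cover tools that use a different prefix or index in the property name.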

You can now track the same package appearing in different binaries independently. If the Go stdlib is embedded in 5 executables and one is updated, you'll see exactly which one changed. This works with any SBOM generation tool that puts path info in properties (syft, grype, trivy, etc.), and SBOMs without paths still work: every key gets an empty path, so matching is effectively by name alone. The binary name is shown only when a path is present, which keeps the output clean.
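The effect on a comparison can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual diff code: `index_packages` and `version_changes` are hypothetical helpers, and each package is assumed to be a simplified dict with `name`, `version`, and an optional `path`.

```python
def index_packages(packages: list) -> dict:
    """Map (name, path) -> version. A missing path becomes ''."""
    return {(p["name"], p.get("path", "")): p["version"] for p in packages}


def version_changes(old: list, new: list) -> list:
    """Return (key, old_version, new_version) for keys present in both
    SBOMs whose version differs, sorted by key."""
    old_idx = index_packages(old)
    new_idx = index_packages(new)
    changes = []
    for key in sorted(old_idx.keys() & new_idx.keys()):
        if old_idx[key] != new_idx[key]:
            changes.append((key, old_idx[key], new_idx[key]))
    return changes


old_sbom = [
    {"name": "stdlib", "version": "1.21.0", "path": "/usr/bin/service-a"},
    {"name": "stdlib", "version": "1.21.0", "path": "/usr/bin/service-b"},
    {"name": "libfoo", "version": "2.0"},  # no path: key is ("libfoo", "")
]
new_sbom = [
    {"name": "stdlib", "version": "1.21.0", "path": "/usr/bin/service-a"},
    {"name": "stdlib", "version": "1.21.5", "path": "/usr/bin/service-b"},
    {"name": "libfoo", "version": "2.0"},
]

changes = version_changes(old_sbom, new_sbom)
```

With name-only keys, the two stdlib entries would collapse into one and the change in service-b could be hidden; with (name, path) keys, only the service-b entry is reported.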

Added 42 tests.

Fixes #8

Development

Successfully merging this pull request may close these issues.

Handling multiple versions of same component