Support multi-file spec loading, --entity filter, and --emit-spec #203
Open
amc-corey-cox wants to merge 2 commits into main
- Add spec merge utility (resolve paths, load sub-spec list format, merge class/enum/slot derivations) in utils/spec_merge.py
- Change -T to accept multiple files/directories on all CLI commands
- Add --entity filter to map-data and validate-spec for processing a single class_derivation by name
- Add --emit-spec PATH to map-data (side-effect file) and validate-spec (file or stdout with '-')
- Add --merge flag to validate-spec for combined validation

Closes #202
Pull request overview
Adds first-class support for composing transformation specs at runtime (multiple `-T` values and directory inputs), filtering execution to a single entity, and emitting the resolved spec for inspection/reproducibility. The work is implemented in shared spec-merge utilities and wired through the CLI/engine.
Changes:
- Introduces `spec_merge` utilities to resolve spec paths (files/dirs), load both dict and list-of-blocks YAML formats, and merge specs with conflict detection.
- Extends CLI commands to accept multiple `-T` values, adds `--entity` filtering (map-data + validate-spec), and adds `--emit-spec` plus `--merge` for validate-spec.
- Updates streaming engine to optionally process only a single class derivation by name; adds unit + CLI integration tests for the new behaviors.
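The merge semantics described above can be sketched roughly as follows. This is a minimal illustration, not the code in `spec_merge.py`: the names `merge_specs` and `SpecMergeConflict` are hypothetical, specs are assumed to be already-parsed dicts with `class_derivations` as a list, and treating identical redefinitions as harmless is an assumption rather than confirmed behavior.

```python
from copy import deepcopy
from typing import Any


class SpecMergeConflict(ValueError):
    """Hypothetical error for two specs that define one name differently."""


def merge_specs(specs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed spec dicts (illustrative sketch, not the PR's implementation)."""
    merged: dict[str, Any] = {
        "class_derivations": [],
        "enum_derivations": {},
        "slot_derivations": {},
    }
    for spec in specs:
        # class_derivations are appended in input order.
        merged["class_derivations"].extend(
            deepcopy(spec.get("class_derivations", []))
        )
        # enum/slot derivations are unioned by name.
        for key in ("enum_derivations", "slot_derivations"):
            for name, body in spec.get(key, {}).items():
                # Assumption: an identical redefinition is tolerated,
                # a differing one is reported as a conflict.
                if name in merged[key] and merged[key][name] != body:
                    raise SpecMergeConflict(
                        f"{key}.{name} is defined differently in two specs"
                    )
                merged[key][name] = deepcopy(body)
    return merged
```

Raising on the first differing definition keeps the failure close to its cause; a real implementation might instead collect all conflicts and report which files they came from.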
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_utils/test_spec_merge.py | Unit tests for path resolution, load formats, merge semantics, and conflict handling. |
| tests/test_cli/test_cli_multi_spec.py | CLI integration coverage for multiple -T flags, directory specs, --entity, --emit-spec, and validate-spec --merge. |
| src/linkml_map/utils/spec_merge.py | New utilities to load/merge multiple spec inputs (including list-of-blocks sub-spec format). |
| src/linkml_map/transformer/transformer.py | Adds load_transformer_specifications() to load + normalize merged specs. |
| src/linkml_map/transformer/engine.py | Adds entity filter support to transform_spec() iteration over class derivations. |
| src/linkml_map/cli/cli.py | Wires multiple -T flags, --entity, --emit-spec, and validate-spec --merge into CLI behaviors. |
Comments suppressed due to low confidence (1)
src/linkml_map/cli/cli.py:277
`--entity` is accepted for single-object inputs, but it does not affect the transformation path here (it only impacts `--emit-spec`). This means `map-data some.yaml --entity Foo` silently ignores the filter for actual mapping, unlike the streaming path where `entity` is enforced in `transform_spec()`. Either reject `--entity` for non-streaming inputs with a clear error, or apply it by selecting/validating the class_derivation used for `map_object` so behavior is consistent.
    if emit_spec:
        _emit_spec_to_file(tr, emit_spec, entity)
    # Load input data (YAML or JSON)
    with open(input_data) as file:
        content = file.read()
    try:
        input_obj = yaml.safe_load(content)
    except yaml.YAMLError:
        import json
        input_obj = json.loads(content)
    tr.index(input_obj, source_type)
    try:
        tr_obj = tr.map_object(input_obj, source_type)
    except TransformationError as err:
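
One way to make the single-object path consistent with the streaming path is to resolve the `--entity` value against the loaded class derivations before mapping, and fail loudly when nothing matches. The sketch below only illustrates that idea: `filter_class_derivations` is a hypothetical helper, and class derivations are assumed to be plain dicts with a `name` key rather than the project's actual model objects.

```python
from typing import Any


def filter_class_derivations(
    class_derivations: list[dict[str, Any]], entity: str
) -> list[dict[str, Any]]:
    """Return the top-level class derivations whose name equals `entity`.

    Hypothetical helper: raises instead of silently ignoring an
    unknown entity, so the filter cannot be a no-op.
    """
    matches = [cd for cd in class_derivations if cd.get("name") == entity]
    if not matches:
        known = ", ".join(sorted(str(cd.get("name")) for cd in class_derivations))
        raise ValueError(
            f"--entity {entity!r} matches no class_derivation (known: {known})"
        )
    return matches
```

A CLI wrapper could convert the `ValueError` into a usage error, which covers the "reject with a clear error" option, while the returned list covers the "apply the filter" option.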
- Apply --entity filter before validation in validate-spec --merge
- Expand directories to individual files in non-merge validate-spec
- Fix docstring ordering guarantee in resolve_spec_paths
- Tighten click.Path constraints on --emit-spec options
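
Expanding directories into individual files with a predictable order might look like the sketch below. It is not the PR's `resolve_spec_paths`; the function name `resolve_paths_sketch`, the non-recursive scan, the `.yaml`/`.yml` suffix filter, and the sort-within-directory ordering are all assumptions made for illustration.

```python
from pathlib import Path


def resolve_paths_sketch(
    inputs: list[str], suffixes: tuple[str, ...] = (".yaml", ".yml")
) -> list[Path]:
    """Expand a mix of files and directories into a flat list of spec files.

    Inputs keep their command-line order; files found inside a directory
    are sorted so the result is deterministic across platforms.
    """
    resolved: list[Path] = []
    for raw in inputs:
        path = Path(raw)
        if path.is_dir():
            # Assumption: only direct children with a YAML suffix count.
            resolved.extend(
                sorted(p for p in path.iterdir() if p.is_file() and p.suffix in suffixes)
            )
        elif path.is_file():
            resolved.append(path)
        else:
            raise FileNotFoundError(f"Spec path does not exist: {raw}")
    return resolved
```

With a list like this, a non-merge validation can simply loop over the returned paths and validate each file on its own.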
Summary
- Multi-file `-T` support: All CLI commands (`map-data`, `validate-spec`, `compile`, `derive-schema`, `invert`) now accept multiple `-T` flags and/or directories. Specs are merged at load time: `class_derivations` are appended, while `enum_derivations` and `slot_derivations` are unioned by name with conflict detection.
- `--entity` filter: `map-data` and `validate-spec` accept `--entity <class_name>` to restrict processing to matching top-level `class_derivations`. Nested `object_derivations` are unaffected.
- `--emit-spec`: On `map-data`, writes the resolved (merged + filtered) spec to a file as a side effect. On `validate-spec` (with `--merge`), writes to a file or stdout (`-`).
- `--merge` on `validate-spec`: Merges all input spec files before validation, enabling combined validation and `--entity`/`--emit-spec` usage.
- Sub-spec list format (`- class_derivations: {Entity: ...}`) used by per-variable spec repos.

Test plan
Closes #202
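
For readers unfamiliar with the list-of-blocks sub-spec format mentioned in the summary, the following sketch shows how a loader could accept both top-level shapes after YAML parsing. The helper names `normalize_loaded_spec` and `class_derivations_as_list` are hypothetical, and the conversion of a name-keyed `class_derivations` mapping into a list of named entries is an assumption about how such blocks could be flattened, not a description of the PR's code.

```python
from typing import Any


def normalize_loaded_spec(loaded: Any) -> list[dict[str, Any]]:
    """Return a list of spec blocks from either supported top-level shape."""
    if isinstance(loaded, dict):
        # Regular spec: a single mapping.
        return [loaded]
    if isinstance(loaded, list) and all(isinstance(b, dict) for b in loaded):
        # Sub-spec list format: a sequence of mapping blocks.
        return list(loaded)
    raise TypeError(f"Unsupported spec shape: {type(loaded).__name__}")


def class_derivations_as_list(block: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten `class_derivations` to a list of dicts carrying a `name` key."""
    derivations = block.get("class_derivations") or {}
    if isinstance(derivations, dict):
        # Assumption: a name-keyed mapping becomes one named entry per key.
        return [{"name": name, **(body or {})} for name, body in derivations.items()]
    return list(derivations)
```

Normalizing every input to a list of blocks first means the merge step only has to deal with one shape, regardless of which format a given file used.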