
Support multi-file spec loading, --entity filter, and --emit-spec#203

Open
amc-corey-cox wants to merge 2 commits into main from multi-spec-loading


@amc-corey-cox
Contributor

Summary

  • Multi--T support: All CLI commands (map-data, validate-spec, compile, derive-schema, invert) now accept multiple -T flags and/or directories. Specs are merged at load time: class_derivations are appended, while enum_derivations and slot_derivations are unioned by name with conflict detection.
  • --entity filter: map-data and validate-spec accept --entity <class_name> to restrict processing to matching top-level class_derivations. Nested object_derivations are unaffected.
  • --emit-spec: On map-data, writes the resolved (merged + filtered) spec to a file as a side-effect. On validate-spec (with --merge), writes to a file or stdout (-).
  • --merge on validate-spec: Merges all input spec files before validation, enabling combined validation and --entity/--emit-spec usage.
  • Also handles the list-of-blocks sub-spec format (e.g., - class_derivations: {Entity: ...}) used by per-variable spec repos.
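
The merge rules described above can be sketched roughly as follows. This is a minimal illustration, not the code in utils/spec_merge.py: the function name `merge_specs`, the exception `SpecMergeConflict`, and the assumed shapes (class_derivations as a list of dicts, enum_derivations and slot_derivations as dicts keyed by name) are hypothetical. Treating identical duplicate definitions as non-conflicting is also an assumption.

```python
from copy import deepcopy


class SpecMergeConflict(ValueError):
    """Hypothetical error: two specs define the same name differently."""


def merge_specs(specs):
    """Merge already-parsed spec dicts into a single dict.

    Assumed shapes (not confirmed by the PR): class_derivations is a list
    of dicts; enum_derivations and slot_derivations are dicts keyed by name.
    """
    merged = {
        "class_derivations": [],
        "enum_derivations": {},
        "slot_derivations": {},
    }
    for spec in specs:
        # class_derivations are appended in input order.
        merged["class_derivations"].extend(
            deepcopy(spec.get("class_derivations") or [])
        )
        # enum/slot derivations are unioned by name.
        for key in ("enum_derivations", "slot_derivations"):
            for name, body in (spec.get(key) or {}).items():
                if name in merged[key] and merged[key][name] != body:
                    raise SpecMergeConflict(
                        f"{key}.{name} is defined differently in two specs"
                    )
                merged[key][name] = deepcopy(body)
    return merged


# Usage: two specs that agree on the shared enum merge cleanly.
spec_a = {
    "class_derivations": [{"name": "Person"}],
    "enum_derivations": {"Sex": {"mirror_source": True}},
}
spec_b = {
    "class_derivations": [{"name": "Visit"}],
    "enum_derivations": {"Sex": {"mirror_source": True}},
}
merged = merge_specs([spec_a, spec_b])
```

Raising on a mismatch instead of letting the last file win keeps the result independent of the order in which -T values are given, at least for enums and slots.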

Test plan

  • 25 unit tests for spec merge logic (resolve paths, load formats, merge strategies, conflict detection)
  • 12 CLI integration tests covering multi--T, --entity filtering, --emit-spec on both commands, and --merge validation
  • All 569 existing tests pass unchanged

Closes #202

- Add spec merge utility (resolve paths, load sub-spec list format,
  merge class/enum/slot derivations) in utils/spec_merge.py
- Change -T to accept multiple files/directories on all CLI commands
- Add --entity filter to map-data and validate-spec for processing
  a single class_derivation by name
- Add --emit-spec PATH to map-data (side-effect file) and
  validate-spec (file or stdout with '-')
- Add --merge flag to validate-spec for combined validation

Closes #202
Copilot AI review requested due to automatic review settings April 10, 2026 18:39
Contributor

Copilot AI left a comment


Pull request overview

Adds first-class support for composing transformation specs at runtime (multi--T and directory inputs), filtering execution to a single entity, and emitting the resolved spec for inspection and reproducibility. The logic is implemented in shared spec-merge utilities and wired through the CLI and engine.

Changes:

  • Introduces spec_merge utilities to resolve spec paths (files/dirs), load both dict and list-of-blocks YAML formats, and merge specs with conflict detection.
  • Extends CLI commands to accept multiple -T values, adds --entity filtering (map-data + validate-spec), and adds --emit-spec plus --merge for validate-spec.
  • Updates streaming engine to optionally process only a single class derivation by name; adds unit + CLI integration tests for the new behaviors.
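
Resolving a mix of file and directory inputs could look roughly like the sketch below. It borrows the name `resolve_spec_paths` from this PR, but the body is an assumption, not the PR's implementation: it assumes directories are scanned non-recursively, that only `.yaml`/`.yml` files count as specs, and that directory contents are sorted by name so the result is deterministic.

```python
from pathlib import Path

# Assumption: only these suffixes are treated as spec files.
SPEC_SUFFIXES = {".yaml", ".yml"}


def resolve_spec_paths(inputs):
    """Expand files and directories into a flat list of spec file paths.

    Inputs keep their command-line order; the files found inside each
    directory are sorted by name (non-recursive scan, by assumption).
    """
    resolved = []
    for raw in inputs:
        path = Path(raw)
        if path.is_dir():
            resolved.extend(
                sorted(
                    p
                    for p in path.iterdir()
                    if p.is_file() and p.suffix in SPEC_SUFFIXES
                )
            )
        elif path.is_file():
            resolved.append(path)
        else:
            raise FileNotFoundError(f"Spec path does not exist: {path}")
    return resolved
```

Sorting within a directory matters because class_derivations are appended in load order, so an unsorted directory listing could change the merged spec between runs.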

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/test_utils/test_spec_merge.py | Unit tests for path resolution, load formats, merge semantics, and conflict handling. |
| tests/test_cli/test_cli_multi_spec.py | CLI integration coverage for multi--T, directory specs, --entity, --emit-spec, and validate-spec --merge. |
| src/linkml_map/utils/spec_merge.py | New utilities to load/merge multiple spec inputs (including list-of-blocks sub-spec format). |
| src/linkml_map/transformer/transformer.py | Adds load_transformer_specifications() to load + normalize merged specs. |
| src/linkml_map/transformer/engine.py | Adds entity filter support to transform_spec() iteration over class derivations. |
| src/linkml_map/cli/cli.py | Wires multi--T, --entity, --emit-spec, and validate-spec --merge into CLI behaviors. |
Comments suppressed due to low confidence (1)

src/linkml_map/cli/cli.py:277

  • --entity is accepted for single-object inputs, but it does not affect the transformation path here (it only impacts --emit-spec). This means map-data some.yaml --entity Foo silently ignores the filter for actual mapping, unlike the streaming path where entity is enforced in transform_spec(). Either reject --entity for non-streaming inputs with a clear error, or apply it by selecting/validating the class_derivation used for map_object so behavior is consistent.
    if emit_spec:
        _emit_spec_to_file(tr, emit_spec, entity)

    # Load input data (YAML or JSON)
    with open(input_data) as file:
        content = file.read()
        try:
            input_obj = yaml.safe_load(content)
        except yaml.YAMLError:
            import json

            input_obj = json.loads(content)

    tr.index(input_obj, source_type)
    try:
        tr_obj = tr.map_object(input_obj, source_type)
    except TransformationError as err:
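
One way to give --entity the same meaning on both paths is to apply a single filter to the top-level class derivations before mapping, and to fail loudly when the name is unknown. The sketch below is only an illustration of that idea: `filter_class_derivations` is a hypothetical helper, the derivations are assumed to be plain dicts with a "name" key, and raising on an unknown entity is an assumption, not behavior confirmed by this PR.

```python
def filter_class_derivations(class_derivations, entity=None):
    """Return the top-level derivations to process.

    With entity=None everything is kept. Otherwise only derivations whose
    "name" matches are returned; nested content inside each derivation
    (for example object_derivations) is passed through untouched.
    """
    if entity is None:
        return list(class_derivations)
    selected = [cd for cd in class_derivations if cd.get("name") == entity]
    if not selected:
        known = sorted({str(cd.get("name")) for cd in class_derivations})
        raise ValueError(
            f"No class_derivation named {entity!r}; available: {known}"
        )
    return selected
```

Calling the same helper from the streaming and the single-object code paths would avoid the case where the filter changes the emitted spec but not the actual mapping.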

- Apply --entity filter before validation in validate-spec --merge
- Expand directories to individual files in non-merge validate-spec
- Fix docstring ordering guarantee in resolve_spec_paths
- Tighten click.Path constraints on --emit-spec options
