Missing support for "YAML Schema" (spec 1.2) #703

@ulidtko

Hi!

Let me open a new thread to discuss a solution to some of the annoyingly recurring issues.

Parsing of Date, right? See the huge thread in #262, plus #393 and #483; the reports range from the oldest, #137, to the new generation, #672 and #676, and there are more.

Datetime in the same bucket.

Parsing of Symbol — far from a hardcore Ruby-head myself — but I see it in the same bucket, too.

The thing is: spec 1.2 solves it all. Let me explain.

Spec TL;DR

I'll reword from Chapter 10 of the spec, the last chapter.

It defines:

  • Failsafe Schema,
  • JSON Schema,
  • Core Schema,
  • and hints at Other Schemas.

To simplify, think of "YAML Schema" as a list of classes (yaml "tags") that are allowed to de/serialize.

The FAILSAFE_SCHEMA comprises tags [map, seq, str].

The JSON_SCHEMA comprises tags [null, bool, int, float] plus those from FAILSAFE_SCHEMA.

The CORE_SCHEMA comprises the same tags as JSON_SCHEMA; it only accepts more spellings for them, such as ~ for null, True for bool, and 0x1F for int. (Find the exact differences in the spec.)

Why did I change to ALL_CAPS?.. Because these are supposed to be constants in the YAML library's API. ⚠️
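To make the idea concrete, here is a rough Ruby sketch of what schema-aware loading could look like when built on Psych's existing AST (`YAML.parse` and the `Psych::Nodes` classes). Everything under `SchemaSketch` is hypothetical: the module, its constants, and its `load` method are not part of Psych. For brevity the sketch treats JSON_SCHEMA like CORE_SCHEMA, keeps explicitly tagged scalars as strings, and rejects aliases.

```ruby
require "yaml"

# Hypothetical module: nothing here exists in Psych itself.
module SchemaSketch
  FAILSAFE_SCHEMA = %w[map seq str].freeze
  JSON_SCHEMA     = (%w[null bool int float] + FAILSAFE_SCHEMA).freeze
  CORE_SCHEMA     = JSON_SCHEMA # same tags; only scalar resolution is more lenient

  module_function

  # Parse to Psych's AST, then build plain Ruby objects under the given schema.
  def load(source, schema: CORE_SCHEMA)
    document = YAML.parse(source) # returns false for an empty stream
    document ? build(document.root, schema) : nil
  end

  def build(node, schema)
    case node
    when Psych::Nodes::Mapping
      # Mapping children alternate key, value, key, value, ...
      node.children.each_slice(2).to_h { |k, v| [build(k, schema), build(v, schema)] }
    when Psych::Nodes::Sequence
      node.children.map { |child| build(child, schema) }
    when Psych::Nodes::Scalar
      # Failsafe knows only strings; quoted or tagged scalars stay strings too.
      return node.value if schema == FAILSAFE_SCHEMA || node.quoted || node.tag
      resolve(node.value)
    else
      raise ArgumentError, "unsupported node: #{node.class}"
    end
  end

  # Core Schema tag resolution for plain scalars (spec 1.2, chapter 10).
  # Note there is no date rule: 2024-12-27 falls through to a string.
  def resolve(text)
    case text
    when /\A(?:null|Null|NULL|~|)\z/    then nil
    when /\A(?:true|True|TRUE)\z/       then true
    when /\A(?:false|False|FALSE)\z/    then false
    when /\A[-+]?[0-9]+\z/              then Integer(text, 10)
    when /\A0o[0-7]+\z/                 then text[2..].to_i(8)
    when /\A0x[0-9a-fA-F]+\z/           then text[2..].to_i(16)
    when /\A[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?\z/
      text.to_f
    when /\A[-+]?\.(?:inf|Inf|INF)\z/
      text.start_with?("-") ? -Float::INFINITY : Float::INFINITY
    when /\A\.(?:nan|NaN|NAN)\z/        then Float::NAN
    else text
    end
  end
end

core     = SchemaSketch.load("field: 2024-12-27\ncount: 12\n")
failsafe = SchemaSketch.load("count: 12\n", schema: SchemaSketch::FAILSAFE_SCHEMA)
p core      # the date stays a String, count becomes an Integer
p failsafe  # everything stays a String
```

The point of the sketch is only that a schema is a small, swappable set of resolution rules; a real implementation would presumably live inside Psych's scalar scanner rather than in a separate AST walk.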

Example

I recently hit an issue trying to validate a bunch of YAML files containing (you guessed it) dates against a pre-existing json-schema.

Used json_schemer, nice library. Long story short, davishmcclurg/json_schemer#203 — it didn't work well. I found no better solution than a filthy monkey-patch.

It's because of a basic impedance mismatch:

  • The best json-schema can do is {"type": "string", "format": "date"}. Reference if you don't believe me. (Although it should be obvious: JSON has no concept of a Date type, only strings constrained by regexes or ABNF.)
    • The validator library fully supports that.
  • YAML, however, has a notion of date tag (type) — which is distinct from str tag (a string). That's why field: "2024-12-27" and field: 2024-12-27 do not — and should not! — have the same meaning and behavior.
    • The validator library blows up when verifying {"type": "string", "format": "date"} on #<Date: 2024-12-26 ((2460671j,0s,0n),+0s,-Infj)> parsed from yaml, saying … is not a string. It indeed, correctly, isn't.
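The mismatch is easy to reproduce with Psych alone. The snippet below is a minimal illustration, assuming a Psych version that supports the `permitted_classes:` keyword on `YAML.safe_load`; the key name `field` is just the example from above.

```ruby
require "yaml"
require "date"

# Date must be permitted explicitly, otherwise safe_load raises Psych::DisallowedClass.
quoted   = YAML.safe_load(%(field: "2024-12-27"), permitted_classes: [Date])
unquoted = YAML.safe_load("field: 2024-12-27", permitted_classes: [Date])

puts quoted["field"].class    # String
puts unquoted["field"].class  # Date

# This is what a {"type": "string", "format": "date"} check trips over:
puts unquoted["field"].is_a?(String)  # false
```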

What gives

Now, a bit of cross-pollination.

JavaScript folks hit the exact same conundrum not so long ago; see ajv-validator/ajv-cli#122.

How did they solve it?.. Well, their yaml library supports ✨ YAML Schemas! ✨

So their fix was literally a single-line change that switched the validator's parser to CORE_SCHEMA. YAML-native date tag no more; date validation issues no more; solved, done!

Now, I can't do the same using Psych, can I?

Will I?
