Add callable support for decimal_format option#978
Add callable support for decimal_format option#978NyanFisher wants to merge 1 commit intojcrist:mainfrom
decimal_format option#978Conversation
2357ed5 to
076b58c
Compare
2ee9741 to
31effee
Compare
117001b to
b5ff6b7
Compare
|
CI failures are unrelated to this change:
All build, test, and wheel jobs pass across all platforms. |
b5ff6b7 to
09b8d01
Compare
|
Code looks solid and CI is green across the matrix - nice work, especially the test coverage in One API design question I'd like to raise before this moves forward: the current shape places An alternative would be to attach quantization to the type via Price = Annotated[Decimal, Meta(decimal_quantize="0.0001", decimal_rounding="ROUND_HALF_EVEN")]That composes naturally with per-field configuration, lives next to the type where the constraint is logically defined, and matches how Did you consider the cc @jcrist @ofek — this expands the encoder API surface, so I'd like your read on whether the encoder-kwarg shape is the one we want, or whether |
|
Instead of two new options for quantization, how about adding a single # Uses default rounding
enc = Encoder(decimal_format="0.0001")
# Custom rounding
enc = Encoder(decimal_format=lambda d: d.quanitize(decimal.Decimal("0.001"), "ROUND_DOWN"))I like this since it's more flexible, and also only adds a single new option. Otherwise I'd worry about other users needing further customization, resulting in a number of I wouldn't expect a callable here to have a perf cost - calling into python here is negligible, most of the time will be in the
In For now a single setting on an |
|
@Siyet @jcrist Hello! Thanks for the review!
I hadn't considered using
I like this idea, but the |
|
After thinking it through I'm coming around to @jcrist's single-kwarg shape. One slot for everything is, in my view, the right call here. Encoder(decimal_format=lambda d: d.quantize(Decimal("0.001"), ROUND_DOWN)) # custom
Encoder(decimal_format="string") # existing
Encoder(decimal_format="number") # existingWe could split it along There is also the naming angle: @NyanFisher regarding your concern about overloading an existing kwarg: the three shapes ( |
09b8d01 to
f1d9799
Compare
decimal_format option
9485ce8 to
4810273
Compare
|
Please review this PR when you have a moment 🙂 I changed the implementation. |
Siyet
left a comment
There was a problem hiding this comment.
Re-checked after the rework. CI is green across the matrix, design matches what we landed on. Ran some scenarios beyond the existing tests locally (WSL Ubuntu 22.04, Python 3.10, build at 4810273), three blockers inline.
Plus docs: docs/supported-types.rst:595-606 only mentions 'string'/'number', would be good to add a callable example covering the use case from #848.
Nit: test_encoder_decimal_callable_raise_error_if_fn_return_decimal should use match="must not return a Decimal" to pin the message rather than any TypeError.
| else if (PyUnicode_CheckExact(decimal_format)) { | ||
| bool ok = false; | ||
| if (PyUnicode_CheckExact(decimal_format)) { | ||
| if (PyUnicode_CompareWithASCIIString(decimal_format, "string") == 0) { |
There was a problem hiding this comment.
Indentation broke after flattening the nesting. The inner if/elif here are at 12 spaces instead of 8. Same below: bodies of else if (PyCallable_Check) (lines 9596-9600) and else (lines 9601-9608) are at 8 spaces instead of 4.
| @@ -23,15 +24,21 @@ schema_hook_sig = Optional[Callable[[type], dict[str, Any]]] | |||
|
|
|||
| class Encoder: | |||
| enc_hook: enc_hook_sig | |||
There was a problem hiding this comment.
Callable[[Decimal], Union[str, float]] is narrower than what the runtime actually accepts. After the callable returns, the value goes through the regular encode path and accepts anything encodable (int, bool, dict, Struct, bytes, etc.). A realistic case: scaled integer for cents (lambda d: int(d * 100)), which mypy rejects with the current stub but works at runtime. Should be Callable[[Decimal], Any]. Same for msgpack.pyi.
| if (type == (PyTypeObject *)(self->mod->DecimalType)) { | ||
| PyErr_SetString( | ||
| PyExc_TypeError, | ||
| "decimal_format callable must not return a Decimal" |
There was a problem hiding this comment.
The guard only catches a direct Decimal return, not a nested one. Verified locally:
lambda d: [Decimal("0.5")] -> RecursionError
lambda d: {"v": Decimal("0.5")} -> RecursionError
lambda d: Struct(inner=Decimal("0.5")) -> RecursionError
Worse: with sys.setrecursionlimit(10**6) (common in projects with deep graphs) the Python-level safety net is gone and the encoder hits SIGSEGV with a core dump. Cleanest fix: add an in_decimal_callable flag to EncoderState, on re-entry into *_encode_decimal raise TypeError: callable returned a value containing a Decimal. ~5 lines of C, covers all the nested cases. Same applies to _core.c:14028 (json_encode_decimal).
Hello!
Description of the problem solved by this PR
The msgspec library has many useful features, but the current version lacks the ability to correctly quantize Decimal values during encoding. In applications related to finance and precise calculations, it is critical to take into account maximum accuracy and return values rounded to a specified precision. Without implementing this functionality, a complete transition from pydantic to msgspec is not possible.
Changes implemented in this PR
Core Functionality
Validation & Safety
Type Hints
Examples
Rounding to 2 Decimal Places
MessagePack with Rounding
Error: Returning Decimal from Callable
I would appreciate any comments on improving or restructuring the code, as I don't often write in C.
Fix my issue - Closes #848