Add callable support for `decimal_format` option by NyanFisher · Pull Request #978 · jcrist/msgspec

NyanFisher · 2026-02-11T11:34:59Z

Hello!

Description of the problem solved by this PR

The msgspec library has many useful features, but the current version cannot quantize Decimal values during encoding. In financial applications and other code that depends on exact arithmetic, values must keep their full accuracy internally and be returned rounded to a specified precision. Without this functionality, a complete transition from pydantic to msgspec is not possible.

Changes implemented in this PR

Core Functionality

Added DECIMAL_FORMAT_CALLABLE enum value to support callable decimal_format
Added decimal_callable field to EncoderState and Encoder structs
Implemented callable invocation for both JSON and MessagePack encoders

Validation & Safety

Added runtime check preventing callable from returning Decimal (avoids infinite recursion)
Updated error messages to reflect new callable option

Type Hints

Updated json.pyi and msgpack.pyi stubs to include callable type hints:

decimal_format: Union[
    Literal["string", "number"],
    Callable[[decimal.Decimal], Union[str, float]],
]

Examples

Rounding to 2 Decimal Places

import msgspec
import decimal

enc = msgspec.json.Encoder(
    decimal_format=lambda d: str(d.quantize(decimal.Decimal("0.01")))
)

value = decimal.Decimal("123.456789")
print(enc.encode(value))  # b'"123.46"'
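
The callable above leaves the rounding mode to `Decimal.quantize`, which falls back to the active decimal context (ROUND_HALF_EVEN unless the context has been changed). A minimal standard-library sketch of how the choice of mode changes the result; the sample value and the `CENT` name are illustrative only and do not come from the PR:

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")  # illustrative quantum: two decimal places
value = Decimal("2.665")  # sits exactly halfway between 2.66 and 2.67

# With no rounding argument, quantize() uses the active context's mode.
default_rounded = value.quantize(CENT)

# Passing a mode explicitly makes the behaviour independent of the context.
half_even = value.quantize(CENT, rounding=ROUND_HALF_EVEN)  # ties go to the even digit
half_up = value.quantize(CENT, rounding=ROUND_HALF_UP)      # ties go away from zero
truncated = value.quantize(CENT, rounding=ROUND_DOWN)       # extra digits are dropped

print(default_rounded, half_even, half_up, truncated)
```

A callable passed as decimal_format could pin the mode in the same way, so the encoded output would not depend on whatever context happens to be active.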

MessagePack with Rounding

import msgspec
import decimal

# MessagePack with custom rounding
enc = msgspec.msgpack.Encoder(
    decimal_format=lambda d: float(d.quantize(decimal.Decimal("0.001")))
)

value = decimal.Decimal("3.14159265")
msg = enc.encode(value)
print(msgspec.msgpack.decode(msg))  # 3.142

Error: Returning Decimal from Callable

import msgspec
import decimal

# INVALID: callable must not return Decimal
enc = msgspec.json.Encoder(
    decimal_format=lambda d: d.quantize(decimal.Decimal("0.01"))  # Error!
)

try:
    enc.encode(decimal.Decimal("1.234"))
except TypeError as e:
    print(e)  # decimal_format callable must not return a Decimal

I would appreciate any comments on improving or restructuring the code, as I don't often write in C.

Fix my issue - Closes #848

NyanFisher · 2026-04-08T04:22:34Z

CI failures are unrelated to this change:

"Profile Windows (ARM64)" - runner infra issue ("Install command runner" step)
"build" (documentation) - link checker fails on a pre-existing broken link (Broken link to ravyn.dev encoders example in docs #976)

All build, test, and wheel jobs pass across all platforms.

Siyet · 2026-04-10T12:28:57Z

Code looks solid and CI is green across the matrix - nice work, especially the test coverage in test_common.py.

One API design question I'd like to raise before this moves forward: the current shape places decimal_quantize / decimal_rounding on the encoder itself, which means every Decimal field in every struct passing through that encoder gets the same scale and rounding mode. In financial code it's common to have heterogeneous Decimal fields in the same payload (e.g. price at scale 4, quantity at scale 0, tax_rate at scale 6) - with an encoder-level setting you'd need separate encoders per shape, which defeats most of the ergonomic win.
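
To make the heterogeneous-scale case concrete, here is a rough standard-library sketch that quantizes each field at its own scale before anything reaches an encoder. The `FIELD_QUANTA` table and the `quantize_fields` helper are hypothetical names used for illustration, not part of msgspec or of this PR:

```python
from decimal import Decimal

# Hypothetical per-field quanta, mirroring the scales mentioned above.
FIELD_QUANTA = {
    "price": Decimal("0.0001"),       # scale 4
    "quantity": Decimal("1"),         # scale 0
    "tax_rate": Decimal("0.000001"),  # scale 6
}


def quantize_fields(payload, quanta):
    """Return a copy of payload with each listed Decimal field quantized."""
    out = dict(payload)
    for name, quantum in quanta.items():
        if name in out:
            out[name] = out[name].quantize(quantum)
    return out


order = {
    "price": Decimal("19.123456"),
    "quantity": Decimal("3.0"),
    "tax_rate": Decimal("0.0825"),
}
normalized = quantize_fields(order, FIELD_QUANTA)
print(normalized)
```

The point of the sketch is only that the scale is keyed by field name, which is information a single encoder-wide setting does not have.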

An alternative would be to attach quantization to the type via Annotated[Decimal, Meta(...)], e.g.

Price = Annotated[Decimal, Meta(decimal_quantize="0.0001", decimal_rounding="ROUND_HALF_EVEN")]

That composes naturally with per-field configuration, lives next to the type where the constraint is logically defined, and matches how gt/ge/pattern etc. already work today. The downside is more plumbing through TypeNode instead of one encoder kwarg.
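
A sketch of the general idea using only the standard library, under the assumption that quantization metadata travels with the annotation. The `Quantize` marker and the `quantized` helper are hypothetical stand-ins; msgspec's Meta does not accept these arguments today:

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Quantize:
    """Hypothetical annotation marker; not part of msgspec."""

    quantum: str
    rounding: str = "ROUND_HALF_EVEN"


Price = Annotated[Decimal, Quantize("0.0001")]
Quantity = Annotated[Decimal, Quantize("1", "ROUND_DOWN")]


@dataclass
class Order:
    price: Price
    quantity: Quantity


def quantized(obj):
    """Apply each field's Quantize marker, if present, and return a dict."""
    hints = get_type_hints(type(obj), include_extras=True)
    result = {}
    for name, hint in hints.items():
        value = getattr(obj, name)
        # Annotated types expose their extra arguments as __metadata__.
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Quantize):
                value = value.quantize(Decimal(meta.quantum), rounding=meta.rounding)
        result[name] = value
    return result


result = quantized(Order(Decimal("19.123456"), Decimal("3.9")))
print(result)
```

Each field carries its own scale and rounding mode here, at the cost of walking the type hints for every object.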

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

cc @jcrist @ofek — this expands the encoder API surface, so I'd like your read on whether the encoder-kwarg shape is the one we want, or whether Meta-based quantization is preferable.

jcrist · 2026-04-10T17:13:30Z

Instead of two new options for quantization, how about adding a single decimal_format option to Encoder? This would take either a string to pass to quantize (something like decimal.quantize(Decimal(decimal_format))), or a callable that takes in the decimal and returns a new value to encode. A few examples:

# Uses default rounding
enc = Encoder(decimal_format="0.0001")

# Custom rounding
enc = Encoder(decimal_format=lambda d: d.quantize(decimal.Decimal("0.001"), "ROUND_DOWN"))

I like this since it's more flexible, and also only adds a single new option. Otherwise I'd worry about other users needing further customization, resulting in a number of decimal_* kwargs.

I wouldn't expect a callable here to have a perf cost - calling into python here is negligible, most of the time will be in the quantize call itself.

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

In msgspec, (currently) encoding doesn't have any type-level information, it only has the values. This means customization for encoding cannot rely on information in annotations, it has to rely on the actual object instances themselves. This is admittedly less flexible in cases where you might want to encode different values differently, but keeps the encoder simple and supports values that exist outside of containers with attached annotations (e.g. encode(decimal_object) wouldn't have annotations, but encode(struct_with_a_decimal_field) would).

For now a single setting on an Encoder is both straightforward to implement, and matches the current conventions.

NyanFisher · 2026-04-13T09:17:16Z

@Siyet @jcrist Hello! Thanks for the review!

@Siyet

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

I hadn't considered using Meta, but I think that approach would result in a large number of TypeNodes. I work at a bank and know that a single Price isn't enough, since it's too general a concept. But it's a good idea for the future 😃

@jcrist

Instead of two new options for quantization, how about adding a single decimal_format option to Encoder?

I like this idea, but the decimal_format parameter already exists. If you plan to extend the interface with additional types, I don’t think this is the best solution, as it will confuse users. I suggest using a separate additional parameter called decimal_quantize with the types Decimal | Callable[[Decimal], Decimal], which would be responsible exclusively for quantization.
This way, we’ll retain the ability to convert Decimal to “string”/“number”, add quantization, and maintain backward compatibility.

Siyet · 2026-04-15T10:04:03Z

After thinking it through I'm coming around to @jcrist's single-kwarg shape. One slot for everything is, in my view, the right call here.

Encoder(decimal_format=lambda d: d.quantize(Decimal("0.001"), ROUND_DOWN))  # custom
Encoder(decimal_format="string")                                            # existing
Encoder(decimal_format="number")                                            # existing

We could split it along dataclasses.field(default=..., default_factory=...) lines (value in one kwarg, callable in another), but that split exists specifically to disambiguate "the value is a callable" from "call this to produce the value", and neither "string" nor "number" is callable. Introducing a separate decimal_hook just to satisfy a pattern we do not actually need feels like overcomplicating the interface.

There is also the naming angle: decimal_format reads as a verb just as naturally as it reads as a noun ("how to format the decimal"), which makes "pass a callable that does the formatting" fit the name rather than fight it.

@NyanFisher regarding your concern about overloading an existing kwarg: the three shapes ("string" / "number" / callable) dispatch unambiguously on type (string vs. callable), so the dispatch logic in C stays simple and the user-facing docs just enumerate the three accepted shapes in one place.
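
The dispatch described here can be modelled in a few lines of Python. This is only a sketch of the decision logic, not the C implementation; the `classify_decimal_format` name and the error messages are assumptions made for illustration:

```python
def classify_decimal_format(decimal_format):
    """Return which of the three accepted shapes a decimal_format value is."""
    if isinstance(decimal_format, str):
        # Strings are never callable, so this branch cannot shadow the next one.
        if decimal_format in ("string", "number"):
            return decimal_format
        raise ValueError(
            "decimal_format must be 'string', 'number', or a callable, "
            f"got {decimal_format!r}"
        )
    if callable(decimal_format):
        return "callable"
    raise TypeError(
        "decimal_format must be 'string', 'number', or a callable, "
        f"got {type(decimal_format).__name__}"
    )


shapes = [
    classify_decimal_format("string"),
    classify_decimal_format("number"),
    classify_decimal_format(lambda d: str(d)),
]
print(shapes)
```

Because the string check and the callable check are mutually exclusive, the order of the two branches does not affect the outcome.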

NyanFisher · 2026-04-28T05:38:12Z

@Siyet

Please review this PR when you have a moment 🙂 I changed the implementation.

Siyet

Re-checked after the rework. CI is green across the matrix, design matches what we landed on. Ran some scenarios beyond the existing tests locally (WSL Ubuntu 22.04, Python 3.10, build at 4810273), three blockers inline.

Plus docs: docs/supported-types.rst:595-606 only mentions 'string'/'number', would be good to add a callable example covering the use case from #848.

Nit: test_encoder_decimal_callable_raise_error_if_fn_return_decimal should use match="must not return a Decimal" to pin the message rather than any TypeError.

Siyet · 2026-04-28T16:46:18Z

+    else if (PyUnicode_CheckExact(decimal_format)) {
        bool ok = false;
-        if (PyUnicode_CheckExact(decimal_format)) {
            if (PyUnicode_CompareWithASCIIString(decimal_format, "string") == 0) {


Indentation broke after flattening the nesting. The inner if/else if branches here are at 12 spaces instead of 8. Same below: the bodies of else if (PyCallable_Check) (lines 9596-9600) and else (lines 9601-9608) are at 8 spaces instead of 4.

Siyet · 2026-04-28T16:46:18Z

@@ -23,15 +24,21 @@ schema_hook_sig = Optional[Callable[[type], dict[str, Any]]]

 class Encoder:
    enc_hook: enc_hook_sig


Callable[[Decimal], Union[str, float]] is narrower than what the runtime actually accepts. After the callable returns, the value goes through the regular encode path and accepts anything encodable (int, bool, dict, Struct, bytes, etc.). A realistic case: scaled integer for cents (lambda d: int(d * 100)), which mypy rejects with the current stub but works at runtime. Should be Callable[[Decimal], Any]. Same for msgpack.pyi.
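
For reference, a sketch of the widened hint next to the scaled-integer case. `DecimalFormat` and `to_cents` are illustrative names, not identifiers from the stubs:

```python
from decimal import Decimal
from typing import Any, Callable, Literal, Union

# Widened alias: the callable may return any encodable value.
DecimalFormat = Union[Literal["string", "number"], Callable[[Decimal], Any]]


def to_cents(d: Decimal) -> int:
    """Scale to integer cents; int() truncates any fraction of a cent."""
    return int(d * 100)


# This assignment type-checks against the widened alias; a return type of
# Union[str, float] on the callable would not accept a function returning int.
fmt: DecimalFormat = to_cents

cents = to_cents(Decimal("12.34"))
print(cents)
```

Note that `int()` truncates, so a caller who needs rounding to the nearest cent would quantize before converting.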

Siyet · 2026-04-28T16:46:18Z

+        if (type == (PyTypeObject *)(self->mod->DecimalType)) {
+            PyErr_SetString(
+                PyExc_TypeError,
+                "decimal_format callable must not return a Decimal"


The guard only catches a direct Decimal return, not a nested one. Verified locally:

lambda d: [Decimal("0.5")] -> RecursionError
lambda d: {"v": Decimal("0.5")} -> RecursionError
lambda d: Struct(inner=Decimal("0.5")) -> RecursionError

Worse: with sys.setrecursionlimit(10**6) (common in projects with deep graphs) the Python-level safety net is gone and the encoder hits SIGSEGV with a core dump. Cleanest fix: add an in_decimal_callable flag to EncoderState, on re-entry into *_encode_decimal raise TypeError: callable returned a value containing a Decimal. ~5 lines of C, covers all the nested cases. Same applies to _core.c:14028 (json_encode_decimal).
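
A pure-Python model of the proposed guard, to show the control flow only. `ToyEncoder` is a hypothetical stand-in and does not reflect the layout of EncoderState or the C encode functions:

```python
from decimal import Decimal


class ToyEncoder:
    """Pure-Python stand-in for the encoder; not msgspec's C code."""

    def __init__(self, decimal_format):
        self.decimal_format = decimal_format
        self.in_decimal_callable = False  # flag named in the review comment

    def encode(self, obj):
        if isinstance(obj, Decimal):
            return self._encode_decimal(obj)
        if isinstance(obj, (list, tuple)):
            return [self.encode(item) for item in obj]
        if isinstance(obj, dict):
            return {key: self.encode(val) for key, val in obj.items()}
        return obj

    def _encode_decimal(self, obj):
        # Re-entry means the callable's result still contains a Decimal.
        if self.in_decimal_callable:
            raise TypeError("callable returned a value containing a Decimal")
        converted = self.decimal_format(obj)
        self.in_decimal_callable = True
        try:
            return self.encode(converted)
        finally:
            # Always reset so the encoder stays usable after an error.
            self.in_decimal_callable = False


good = ToyEncoder(str)
encoded = good.encode({"v": Decimal("0.5")})

nested = ToyEncoder(lambda d: [d])  # returns a list that contains a Decimal
try:
    nested.encode(Decimal("0.5"))
    error = None
except TypeError as exc:
    error = str(exc)

print(encoded, error)
```

In this model a direct Decimal return is caught by the same check, since encoding the returned value re-enters `_encode_decimal` while the flag is set.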

NyanFisher force-pushed the decimal-quantize branch from 2357ed5 to 076b58c Compare February 11, 2026 12:06

NyanFisher changed the title ~~Implement quantization for Decimal type when encode~~ Draft: Implement quantization for Decimal type when encode Feb 11, 2026

NyanFisher force-pushed the decimal-quantize branch 2 times, most recently from 2ee9741 to 31effee Compare February 11, 2026 13:51

NyanFisher changed the title ~~Draft: Implement quantization for Decimal type when encode~~ Implement quantization for Decimal type when encode Feb 11, 2026

NyanFisher force-pushed the decimal-quantize branch 2 times, most recently from 117001b to b5ff6b7 Compare February 11, 2026 15:32

NyanFisher force-pushed the decimal-quantize branch from b5ff6b7 to 09b8d01 Compare April 9, 2026 10:07

NyanFisher force-pushed the decimal-quantize branch from 09b8d01 to f1d9799 Compare April 27, 2026 17:36

NyanFisher closed this Apr 27, 2026

NyanFisher reopened this Apr 27, 2026

NyanFisher changed the title ~~Implement quantization for Decimal type when encode~~ Draft: Implement quantization for Decimal type when encode Apr 27, 2026

Add callable support for decimal_format option

4810273

NyanFisher changed the title ~~Draft: Implement quantization for Decimal type when encode~~ Add callable support for decimal_format option Apr 28, 2026

NyanFisher force-pushed the decimal-quantize branch from 9485ce8 to 4810273 Compare April 28, 2026 05:26

Siyet reviewed Apr 28, 2026

NyanFisher wants to merge 1 commit into jcrist:main from NyanFisher:decimal-quantize
