BUG: Bug when inputs have uint dtype #97

@MImmesberger

Bug description

TTSIM silently produces wrong results when inputs have uint dtypes. Variables are often stored as uints (e.g. earnings, which cannot be negative by definition). Internally, however, we apply all sorts of operations to them (such as subtraction) whose results can fall outside the unsigned range, so the uint dtype is no longer valid.

I stumbled over this issue when looking at the output of the following policy function:

@policy_function()
def einnahmen_nach_abzug_werbungskosten_y(
    einnahmen__bruttolohn_y: float,
    werbungskosten_y: float,
) -> float:
    """Take gross wage and deduct Werbungskosten."""
    return max(einnahmen__bruttolohn_y - werbungskosten_y, 0.0)

pyarrow would raise an ArrowInvalid error here. Internally, however, we convert uint dtypes to numpy uints, which wrap around silently in these cases:

import numpy as np

def test_uint_bruttolohn_does_not_overflow():
    result = max(np.uint32(0) - np.uint32(1230), 0.0)
    assert result == 0.0  # fails: 0 - 1230 wraps around to 4294966066 (2**32 - 1230)
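
The same wraparound can be seen at the array level, which is probably closer to what happens once inputs have been converted to numpy. This is only a minimal sketch: the arrays and variable names below are made up for illustration and are not taken from TTSIM internals.

```python
import numpy as np

# Illustrative inputs (assumed values, not from TTSIM): gross wage and deductions as uint32.
bruttolohn_y = np.array([0, 50_000], dtype=np.uint32)
werbungskosten_y = np.array([1230, 1230], dtype=np.uint32)

# uint32 - uint32 stays uint32, so a negative difference wraps modulo 2**32.
difference = bruttolohn_y - werbungskosten_y
wrapped = np.maximum(difference, 0)

print(difference.dtype)  # uint32
print(wrapped)           # first entry is 4294966066 instead of 0
```

Array arithmetic gives no warning here, so the wrong value only shows up in the output.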

The following reproducer demonstrates that the bug still occurs when the input is uint32[pyarrow]:

def test_uint_bruttolohn_does_not_overflow():
    result = main(
        main_target=MainTarget.results.tree,
        tt_targets=TTTargets.qname([
            "einkommensteuer__einkünfte__aus_nichtselbstständiger_arbeit__einnahmen_nach_abzug_werbungskosten_y",
        ]),
        input_data=InputData.tree({
            "p_id": pd.Series([1]),
            "einnahmen": {
                "bruttolohn_y": pd.Series([0], dtype="uint32[pyarrow]"),
            },
        }),
        policy_date_str="2023-01-01",
    )

    einnahmen = result["einkommensteuer"]["einkünfte"]["aus_nichtselbstständiger_arbeit"]["einnahmen_nach_abzug_werbungskosten_y"]
    assert float(einnahmen[0]) == 0.0

Proposed Solution

Convert uints to regular (signed) ints or floats internally.
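
One way this could look, as a rough sketch only: `widen_unsigned` is a hypothetical helper, not an existing TTSIM function, and the choice of int64 (and float64 for uint64) is an assumption about which target dtypes would be acceptable.

```python
import numpy as np


def widen_unsigned(values):
    """Hypothetical helper: cast unsigned integer arrays to a signed or float dtype."""
    arr = np.asarray(values)
    if arr.dtype.kind != "u":
        # Not an unsigned integer dtype: leave untouched.
        return arr
    if arr.dtype.itemsize < 8:
        # uint8/uint16/uint32 always fit into int64.
        return arr.astype(np.int64)
    # uint64 can exceed the int64 range; fall back to float64
    # (assumption: losing exactness above 2**53 is acceptable here).
    return arr.astype(np.float64)


bruttolohn_y = widen_unsigned(np.array([0, 50_000], dtype=np.uint32))
werbungskosten_y = widen_unsigned(np.array([1230, 1230], dtype=np.uint32))

# With signed inputs the difference can go negative and is clipped at zero as intended.
einnahmen = np.maximum(bruttolohn_y - werbungskosten_y, 0)
print(einnahmen)  # [    0 48770]
```

Applying such a conversion once, where input data enters the system, would keep individual policy functions free of dtype handling.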

Related to #94

Labels: bug
