Skip to content

Fix SWO issue in ICL core templates#57

Open
gl84 wants to merge 6 commits into
boostorg:developfrom
gl84:fix_icl_swo
Open

Fix SWO issue in ICL core templates#57
gl84 wants to merge 6 commits into
boostorg:developfrom
gl84:fix_icl_swo

Conversation

@gl84

@gl84 gl84 commented Jun 8, 2026

Copy link
Copy Markdown

Overview

LLVM 22 libc++ switched from the classic search to an improved single root-to-leaf descent for std::set/std::map. The change is standard conforming and on a balanced tree of height H it almost halves the number of comparisons to find an element. This requires the std::set/std::map comparators to satisfy Strict Weak Ordering (SWO).

Boost.ICL stores intervals in a std::set/std::map keyed by exclusive_less_than (let's call it $\prec$). For two intervals A and B the relation $A \prec B \iff A$ lies entirely, exclusively, to the left of $B$ without a shared point. The relation it induces $A \approx B \iff \lnot (A \prec B) \land \lnot (B \prec A)$ means "A and B overlap or touch inclusively".

A SWO requires that the relation $\approx$ to be an equivalence relation (particularly transitive). But overlap isn't transitive:
$A=[0,3]$, $B=[1,5]$ and $C=[4,5]$ we have $A \approx B$ and $B \approx C$ but $A \not \approx B$. Transitivity fails, so $\prec$ is not a SWO. This is intrinsic to any "overlap" comparator and not a fixable corner case.

This PR fixes the SWO issue from a mathematical point of view focussing on minimal changes in the ICL library.

Why LLVM 22 breaks Boost.ICL

A red-black tree's invariant (every nodes key sorts after its left subtree, before its right subtree) only needs the comparator to be consistent to the keys actually stored. ICL stores pairwise-disjoint intervals (and on this pairwise disjoint set $\prec$ is a perfectly good total order - disjoint intervals are linearly arranged on the line. This means that the tree itself is always well-formed and the problem isn't in the stored intervals. The danger lies in querying the red-black tree with a non-stored key Q. The correctness of the lower_bound(Q)/upper_bound(Q) descend in the tree comparing the query Q against stored nodes. The correctness of the descent depends on transitivity property between the query and the stored keys. Q can be $\approx$ to two stored intervals that are themselves $\prec$ ordered.

TLDR: the container is fine, the interval-keyed query is the non-SWO operation and the optimized descent of LLVM 22 exposes that.

Fix

A single domain point p, compared against a set of pairwise-disjoint intervals, induces a relation that is a strict weak ordering. p lies in at most one of them, every other stored interval is strictly left of p or strictly right of p. This means left/right of a point is a total order consistent with the intervals own order. The fix is therefore only ask the tree point questions.

For lower_bound(Q) we anchor on lower(Q), for upper_bound(Q) we anchor on upper(Q). We have to be careful if our query's own open/closed bound at the endpoint.

lower_bound: It can happen that we need to step RIGHT, at most once. The closed anchor point can land on an interval that the query interval only kisses at an open lower bound e.g. stored [...,2] and we query (2,...]. This means the intervals don't overlap and we need to do one step forward to find genuinely overlapping interval.

upper_bound: It can happen that we need to step LEFT (back), at most once. The upper_point(point) returns the first element strictly right of upper(Q). The same reasoning as for the lower_bound applies and we need to exclude a kiss at an open upper bound.

TLDR: Get a correct starting position using a point search. Afterwards fix for kissing open bounds.

Fixes #51

@gl84 gl84 marked this pull request as draft June 9, 2026 08:04
@gl84 gl84 marked this pull request as ready for review June 9, 2026 10:52
@joaquintides

Copy link
Copy Markdown
Member

Hi Gernot! Can you compare this solution with #54? Both in terms of applicability (#57 seems to not fix the problem for right_open_interval) and performance? Based on this comparison we can decide whether to apply one or the other. Thank you!

@gl84

gl84 commented Jun 10, 2026

Copy link
Copy Markdown
Author

Re-reading #54 I see that it tackles the same problem in a more subtle, technical way (sidestepping SWO through heterogeneous lookups). I honestly had no idea that there exist lookups that are different in nature. You're right about the right_open_interval, that is something I overlooked. Maybe a the best solution is to combine the approaches and use heterogeneous point lookups. For me those point queries are easier to grasp...

@gl84

gl84 commented Jun 26, 2026

Copy link
Copy Markdown
Author

This commit fixes the problem @joaquintides raised (right_open_interval).

Two points remain:

  • C++11 Fallback to boost::container. Is this corner case worth the fix? I have no idea about plans of making C++14 the current baseline.
  • performance evaluation

For the performance evaluation my expectation would be as follows: The descent complexity stays O(log(N)), but the kiss correction adds a fixed 2 comparisons at the end. So when only counting comparisons the #54 is slightly cheaper. I think the reason libc++ switched to single root-to-leaf descent is higher performance (both lookup and insert/erase) and lower memory consumption. What performance evaluations (discrete or dynamic/static and continuous, what compilers, library versions) did you have in mind?

Honestly I think they are very, very close in practice.

…guards

numeric_minimum relied solely on is_numeric<T>::value to decide whether to
query std::numeric_limits<T>. For types where is_numeric is true but
numeric_limits is not specialized (e.g. boost::rational), min()/max() silently
fall back to a default-constructed T(), producing wrong bound checks
(boostorg#58). It also used min() for the lower extreme, which for non-integral types
is the smallest *positive* value rather than the most negative one.

Changes:
  - Guard all bound checks on std::numeric_limits<T>::is_specialized; unspecialized
    numeric domains now impose no constraint instead of comparing against a bogus
    default-constructed bound.
  - Use lowest() (not min()) for the lower extreme.
  - Add a numeric_maximum trait mirroring numeric_minimum to detect domain_next
    overflow, and assert it wherever an interval boundary is built via domain_next
    (singleton, unit_trail, hull) and in first(), making first()/last() symmetric.
  - Add regression tests for the bound traits (int, double, rational) and for
    first()/last() across all bound flavors.

Fixes boostorg#58
@gl84 gl84 marked this pull request as draft June 27, 2026 13:02
gl84 added 3 commits June 27, 2026 15:16
The previous commit added the SWO-safe lower/upper_bound wrappers but
left several call sites still using the raw container directly: find(),
the BOOST_ASSERT in gap_insert, and the collision-range detection in
the add/insert loops. Also adds the regression tests from PR boostorg#53 for
broader coverage of the fixed accessors.
This commit fixes the problem that @joaquintides raised (right_open_interval).

Continuous domains have no singleton to anchor on, so on >=C++14 the
accessors now anchors on the domain point via heterogeneous lookup, with a
single forward/back correction for an open-bound "kiss". To enable this,
exclusive_less_than gains is_transparent plus interval-vs-point and
point-vs-interval operator() overloads.

One open question is wheter one includes a C++11 fallback with boost::container.
The breakage occurs with libc++22 which implies an existing C++14 toolchain.

interval_base_map and interval_base_set now return end() instead of
m.lower_bound/upper_bound(interval) for empty queries.
Passing intervals overlapping stored elements violates
the comparator's SWO precondition (formally UB). In practice every standard
library tested tolerates it: a unique insert's single descent lands on the
overlapping run and rejects {pos,false}, the collision signal add() relies on,
and the hinted overload falls back to the same search. No tested library
misplaces or corrupts, at any hint position.

So this commit is hardening only, not a bug fix.

    - Add _swo_insert (un-hinted + hinted) to interval_base_set/_map: probe via
        the SWO-safe member lower_bound and insert only a disjoint key with a
        correct hint, so the container only ever compares against disjoint
        neighbours regardless of its algorithm; on collision report the colliding
        position without mutating.
    - Route add/insert_at/gap_insert through it. The residual/gap inserts in
        add_front/add_segment/add_rear stay raw (disjoint-by-construction).
    - Guard the C++11 singleton-anchored lower/upper_bound against the domain
        min/max, where singleton() is undefined (left_open/open) or silently wraps
        (right_open/open): fall back to begin()/end(), and back-correct with the
        interval-vs-point comparator so it is valid at the extreme.
    - Document the SWO container invariant atop interval_base_set.
    - Add regression tests for overlapping un-hinted and hinted adds on
        interval_set and interval_map.
@gl84 gl84 marked this pull request as ready for review June 27, 2026 14:58
@gl84

gl84 commented Jun 27, 2026

Copy link
Copy Markdown
Author

Last commit hardens ICL inserts against SWO comparator violations.

Rebased on PR #59. This is required as the hardening triggers a bug with boost::rational and numeric_minimum.

The point-anchored lower_bound/upper_bound/equal_range accessors and the
_swo_insert hardening were duplicated, near-identically, in interval_base_set
and interval_base_map. Extract them into a single home,
detail/interval_lookup.hpp, parameterized by a key projection
(identity_key for sets, the value's interval key for maps).

Both containers now forward their lookup and insert members to the shared
helper, and the full strict-weak-ordering rationale lives in one place
instead of being copied across two files. No behavioral change.
@gl84

gl84 commented Jun 27, 2026

Copy link
Copy Markdown
Author

The SWO-safe interval lookup has been refactored and is now centralized.

Two points remain:

- C++11 Fallback to boost::container worth the effort?
- performance evaluation

@pdimov pdimov closed this Jun 28, 2026
@pdimov pdimov reopened this Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

exclusive_less_than may not be compliant with C++ standard requirements for comparator

3 participants