Skip to content

Forward Solvers: Split local unknowns by digests #1971

Open
jerhard wants to merge 26 commits intofwd_constrsys_newfrom
fwd_constrsys_digests
Open

Forward Solvers: Split local unknowns by digests #1971
jerhard wants to merge 26 commits intofwd_constrsys_newfrom
fwd_constrsys_digests

Conversation

@jerhard
Copy link
Copy Markdown
Member

@jerhard jerhard commented Mar 30, 2026

This PR splits the local unknowns by digests for the forward constraint systems. It uses the module P of the analysis specifications that was previously used for pathsensitivity.

Local unknowns now are four-tuples, consisting of the node, the calling context, the calling digest (called original_digest in the implementation) and the current digest.

After the application of a transfer function, when the new abstract state is computed, the new current digest is derived from the abstract state via the function P.of_elt. In case a transfer function should introduce some splitting, it can use man.split, which is the same mechanism as for pathsensitivity.

The pathsensitivity (without splitting of unknowns) is used when solvers.fwd.digests is false, and the digests are used otherwise.

Additionally, there are changes to the XSLT-output, now grouping result entries consisting of context, calling digest, current digest and abstract state together.

jerhard added 20 commits March 17, 2026 10:41
This breaks comparing constraint systems for the backward constraint systems (for now).
Handle man.split with the digest system for the foward-propagating constraint system. Handling of procedure calls still needs some work.
When defaulting to "bu" as the default solver this commit increases the number of failing sanity tests to 38.
This introduces makes the domain for function return nodes a mapping from unknowns to abstract values. The unknowns that are used as keys may differ by digest, in particular.
This will allow to have the globals defined by the analyses, globals for returns split by current digest, and globals for returns joined over current digests.
…on call pollute results. Also a problem without digests.
In fwdConstraints, in case that the callee does not return, introduce a return path with bottom for the longjmpLifter to do its work.
…viely.

This implementation using man.split should work when using digests and when using pathsensitivity. The event list passed to man.split is empty, since there is nothing further to do.
…onal path-sensitivity with option solvers.fwd.digests.
@jerhard jerhard requested a review from michael-schwarz March 30, 2026 13:08
@sim642 sim642 added the feature label Mar 31, 2026
@sim642 sim642 requested review from sim642 and removed request for sim642 March 31, 2026 04:47
@sim642
Copy link
Copy Markdown
Member

sim642 commented Mar 31, 2026

I just realized that this isn't merging into master. That's why there's no forward solver in the diff here.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces digest-based splitting of local unknowns in the forward constraint system to realize path-sensitivity via analysis-spec P digests (configurable via solvers.fwd.digests). It also adapts result aggregation and the XSLT report output to present results grouped by (context, calling digest, current digest, abstract state).

Changes:

  • Add digest-aware local variables and forward constraint generation that propagates/updates digests and supports splitting via man.split.
  • Extend the framework with Spec'/LVarSet support and a new 3-way lifted global lattice (Lift3 / GVar3) to represent spec globals + per-return-node globals + return-node sets.
  • Update XSLT/XML result output and add new regression tests under tests/regression/90-digests.

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
xslt/node.xsl Render node results as grouped <tuple> entries with optional digest sections.
tests/regression/90-digests/01-function-call.c New regression test for digest-sensitive behavior across function calls.
tests/regression/90-digests/02-mutex-lock-split.c New regression test for digest-sensitive mutex locking.
tests/regression/90-digests/03-function-call-split.c New regression test for digest changes during a function call.
tests/regression/90-digests/04-garbage-digest.c New regression test for digest “garbage collection” expectations.
tests/regression/90-digests/05-garbage-call.c New regression test involving calls/precision; currently has a C11 declaration-order concern.
tests/regression/90-digests/06-longjmp-return.c New regression test for longjmp/setjmp interactions with the solver.
src/solver/td_simplified_ref_improved.ml Formatting-only adjustments.
src/lifters/noDigestLifter.ml New lifter to expose unit digests when solvers.fwd.digests is disabled.
src/lifters/longjmpLifter.ml Adjust setjmp handling to use man.split for separating normal vs longjmp return flows.
src/framework/fwdControl.ml Switch forward analysis to Spec', integrate digest-aware result type, and wire in the new comparator.
src/framework/fwdConstraints.ml Core change: digest-aware locals (VarDigestF), digest propagation/update, and return handling via return-node sets.
src/framework/fwdCompareConstraints.ml New forward comparator adapted to digest-aware locals and the new forward globals.
src/framework/control.ml Switch to Spec' in the (non-forward) control pipeline for shared infrastructure.
src/framework/compareConstraints.ml Refactor comparator to work against SpecSys and adjust local-joining logic structure.
src/framework/analysisResult.ml Change XML printing to emit <tuple> entries and add a digest-aware result entry type.
src/framework/analyses.ml Introduce VarDigestF, GVarFCNW return variants, GVar3, Spec', and Spec2Spec'.
src/domain/lattice.ml Add Lift3 lattice combinator to support 3-way lifted globals.
src/config/options.schema.json Add new option solvers.fwd.digests (default false).
src/common/domains/printable.ml Add Lift3Conf printable and Prod4 printable used by new result/global types.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +256 to +261
let s = Seq.map return_value return_nodes in
if Seq.is_empty s then
(* In case the calle does not return, create one bottom return path for LongjmpLifter to do its work *)
[S.D.bot ()]
else
List.of_seq s
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seq.is_empty s consumes the first element of the sequence s, so the subsequent List.of_seq s will miss that element (and may even become empty). This can drop return paths and lead to incorrect call/return semantics. Materialize the sequence once (e.g., convert to a list and test that list), or use Seq.uncons/equivalent to check emptiness without losing elements.

Suggested change
let s = Seq.map return_value return_nodes in
if Seq.is_empty s then
(* In case the calle does not return, create one bottom return path for LongjmpLifter to do its work *)
[S.D.bot ()]
else
List.of_seq s
let return_nodes_list = List.of_seq return_nodes in
if return_nodes_list = [] then
(* In case the calle does not return, create one bottom return path for LongjmpLifter to do its work *)
[S.D.bot ()]
else
List.map return_value return_nodes_list

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this a problem, the sequence should be persistent, or?

include S
module P = struct
include Printable.Unit
let of_elt _ = ()
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoDigestLifter uses Printable.Unit for P, which will render digests as "()" in XML/HTML. Since fwdControl always outputs <original_digest>/<current_digest>, this makes the UI show digest sections even when solvers.fwd.digests is disabled. Consider overriding P.printXml/P.show to output an empty value (so the XSLT normalize-space(...) checks hide it), or conditionally omit digest tags in the result XML when digests are disabled.

Suggested change
let of_elt _ = ()
let of_elt _ = ()
let show _ = ""
let printXml _formatter _t = ()

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +73
if get_bool "dbg.trace.context" then (* Print context and digest *)
dprintf "(%a, %a, %a) on %a" Node.pretty_trace x.node C.pretty x.context P.pretty x.original_digest CilType.Location.pretty (getLocation x)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VarDigestF.pretty_trace only prints original_digest, not current_digest. Since the local variable identity includes both digests, traces can become ambiguous once current_digest starts changing. Include both digests (or at least print current_digest when it differs) to make debugging reliable.

Suggested change
if get_bool "dbg.trace.context" then (* Print context and digest *)
dprintf "(%a, %a, %a) on %a" Node.pretty_trace x.node C.pretty x.context P.pretty x.original_digest CilType.Location.pretty (getLocation x)
if get_bool "dbg.trace.context" then (* Print context and digests *)
dprintf "(%a, %a, %a, %a) on %a" Node.pretty_trace x.node C.pretty x.context P.pretty x.original_digest P.pretty x.current_digest CilType.Location.pretty (getLocation x)

Copilot uses AI. Check for mistakes.

module V =
struct
(* TODO: Consider splitting unknowns by digest for constraint system in [FwdConstraints] *)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment about "splitting unknowns by digest" looks outdated now that digests are implemented in FwdConstraints. Consider removing or updating it to reflect the current mechanism (e.g., that this lifter uses man.split, which is implemented via digests when solvers.fwd.digests is enabled).

Suggested change
(* TODO: Consider splitting unknowns by digest for constraint system in [FwdConstraints] *)
(* This lifter uses [man.split] for splitting unknowns; when [solvers.fwd.digests] is enabled, this is implemented via digests in [FwdConstraints]. *)

Copilot uses AI. Check for mistakes.
Comment on lines +191 to 195
man.split normal_return [];
man.split longjmped [];
D.bot ()
)
| Longjmp {env; value} ->
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new forward solver semantics, man.split paths are propagated separately (not just joined), and the transfer function return value is also propagated as its own path. Returning D.bot () here therefore introduces an extra bottom path/state in addition to normal_return and longjmped, which can create spurious digest entries and extra solver work. Prefer returning one of the real states (e.g., normal_return) and only man.split the other(s) to avoid propagating a standalone bottom path.

Suggested change
man.split normal_return [];
man.split longjmped [];
D.bot ()
)
| Longjmp {env; value} ->
man.split longjmped [];
normal_return
)
| Longjmp {env; value} ->
| Longjmp {env; value} ->

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Member

@michael-schwarz michael-schwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I was wondering if now that you have understood all the details may be a good idea to leave some more comments in these modules. I think if someone comes back to this in a few years, it will all be very hard to understand.

Also: Maybe establishing some kind of terminology for the return variable where occuring digests are accumulated may be helpful?

include Std

let show (x,y,z,w) =
(* TODO: remove ref *)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is needed to remove the TODO here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a copy-paste leftover from other modules in the file. It should be enough to just inline them.

end

(** Functor for locals with digests. *)
module VarDigestF (C: Printable.S) (P : Printable.S) =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can one not reuse VarF with an appropriate LD here?

Comment on lines +165 to +168
(* For return vars given by function, context, original digest *)
module ReturnVars = Printable.Prod3 (CilType.Fundec) (C) (P)
(* For return vars given by function, context, original digest and current digest *)
module SingleReturnVars = VarDigestF (C) (P)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we need the first one?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these the set constructions? Might be worth commenting on this.

Comment on lines +59 to +60
; control_context = (fun () -> Obj.magic var.context) (** TODO: Just for testing *)
; context = (fun () -> Obj.magic var.context)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these Obj.magic now instead of Obj.obj?

M.info ~category:Analyzer "Using special for defined function %s" f.vname;
tf_special_call man f
| fd ->
(* TODO: Handle this properly by handling splitting also here. *)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What still needs to be done here? Is this about enter or combine splitting?

let return_unknown d =
let target_unknown = target_unknown d in
(* GVar.single_return *)
target_unknown
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what is the advantage of this over calling target_unknown x directly?

in

List.iter sideg_target_unkonwn r;
(* TODO: Remove need to also propagate to locals for returns *)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this TODO about?

Comment on lines +406 to +413
let set = S.LVarSet.bot () in
let add_entry set d =
let return_unknown = return_unknown d in
S.LVarSet.add return_unknown set
in
let g = GVar.return (fd, x.context, x.original_digest) in
let contrib = List.fold add_entry set r |> G.create_return in
sideg g contrib
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use a comment: It creates a set of those digests appearing for the return point and side-effects them to the appropriate helper unknown?

Comment on lines +378 to +379
let sidel_target_unknowns ds =
List.iter sidel_target_unkonwn ds
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe propagate or something like that is the better name here?


module V =
struct
(* TODO: Consider splitting unknowns by digest for constraint system in [FwdConstraints] *)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that here we still have the set construction? Which is fine, I'm just worried whether this will work together nicely.

@michael-schwarz
Copy link
Copy Markdown
Member

@arkocal Can you have a look if the test failures as you would expect them to be?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants