Skip to content

Panic on invalid programs and bad seed inputs#94

Draft
morehouse wants to merge 3 commits into
masterfrom
panic_on_invalid_program
Draft

Panic on invalid programs and bad seed inputs#94
morehouse wants to merge 3 commits into
masterfrom
panic_on_invalid_program

Conversation

@morehouse
Copy link
Copy Markdown
Owner

Demonstrates how we can simplify our code using the approach described in #92 (comment).

Pros:

Cons:

  • our custom mutator library is now required for IR fuzzing
  • hand-created seeds or seeds from previous versions of the IR are likely to trigger panics instead of silently being skipped

Lightly tested with a 10 minute LDK fuzzing session. If we decide we want to do this, we'll need to test longer runs of all 4 targets.

morehouse added 3 commits May 22, 2026 13:13
Our custom mutators are the only supported producers of IR programs and
are designed to *always* create valid programs.  Thus validating
programs on every mutation iteration is useless work for all supported
use cases.  The only time it's helpful is when not using our custom
mutators or when the seed corpus was generated by something other than
the current custom mutators.

Subsequent commits will remove program validation entirely, instead
relying on the ProgramBuilder and executor panicking when programs are
invalid.
Now that smite-ir-mutator no longer calls Program::validate, nothing
else uses it.
Make the executor treat ill-formed IR programs as panics rather than
recoverable errors.  We now require our AFL++ custom mutator shim to be
used when fuzzing the IR scenario -- using the default AFL++ mutators or
any other mutator library is highly likely to produce ill-formed
programs that trigger panics.

A major benefit of this approach is that any bugs in our mutators or
generators can surface as panics rather than being hidden as errors that
the executor recovered from.  The tradeoff is that manually-created
seeds or even seeds from previous versions of Smite will cause panics
rather than being silently discarded.
@Chand-ra
Copy link
Copy Markdown

Chand-ra commented May 23, 2026

Our custom mutators are the only supported producers of IR programs and
are designed to always create valid programs. Thus validating
programs on every mutation iteration is useless work for all supported
use cases.

I'm not sure how future proof this is. I think this approach will force mutators to do two things:

  1. Mutate a given program.
  2. Verify that you never produce an invalid mutation.

IMO, mutators should be dumb and try to do only (1). I guess doing (2) is fine as well (and we already do it), as long as we're talking about structural validity, i.e. validity of the program graph, because the information required to enforce it is already provided to the mutators (in the form of the program itself).

But as discussed in #75, once we start implementing more advanced mutation schemes, this means we will have to maintain a rulebook of sorts for every new mutator that tries to modify the semantic structure of the program.

Subsequent commits will remove program validation entirely, instead
relying on the ProgramBuilder and executor panicking when programs are
invalid.

Okay, so the "rulebook" is supposed to reside in the executor. I don't feel very strongly about the executor panicking or not panicking, but I think it makes more sense for Program to dictate what a valid version of it looks like.

@NishantBansal2003
Copy link
Copy Markdown
Contributor

I think with this we get slightly faster code, since all the validation paths are gone. Previously, we had to maintain the validity of IR programs for the custom mutator anyway, and the same still holds now. So the main thing we lose is the ability to use AFL++’s default mutators, which were not very helpful anyway (especially once we have more interesting mutators in smite).

I would lean towards the simpler and faster approach here, and be explicit that IR programs should use a custom mutator instead of wasting cycles on Skip. I think optimizations and performance improvements on the IR or mutator side are much more likely to uncover target bugs than relying on AFL++’s default mutators.

The tradeoff is that manually-created
seeds or even seeds from previous versions of Smite will cause panics
rather than being silently discarded.

Though right now, adding a new operation is not backward compatible anyway, so previously collected corpora would already become invalid. Because of that, I don’t think resuming old corpora after adding new operations would help much for further fuzzing, so it is probably fine for them to be paniced/discarded completely.

Copy link
Copy Markdown
Contributor

@ekzyis ekzyis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our custom mutator library is now required for IR fuzzing

nit: Could add a check that AFL_CUSTOM_MUTATOR_LIBRARY (and AFL_CUSTOM_MUTATOR_ONLY) is set, and if not, at least print a warning?

let key_bytes = resolve_private_key(&variables, instr.inputs[0]);
let sk = SecretKey::from_slice(&key_bytes)
.map_err(|_| ExecuteError::InvalidPrivateKey)?;
.expect("mutator produced an invalid private key");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, the message suggests that we're expecting the mutator to produce an invalid private key (but it didn't), while we're actually expecting that it didn't (but it did). So similar to how expect() is used in other places, I suggest this:

Suggested change
.expect("mutator produced an invalid private key");
.expect("valid private key");

@morehouse
Copy link
Copy Markdown
Owner Author

I'm not sure how future proof this is. I think this approach will force mutators to do two things:

1. Mutate a given program.

2. Verify that you never produce an invalid mutation.

IMO, mutators should be dumb and try to do only (1). I guess doing (2) is fine as well (and we already do it), as long as we're talking about structural validity, i.e. validity of the program graph, because the information required to enforce it is already provided to the mutators (in the form of the program itself).

But as discussed in #75, once we start implementing more advanced mutation schemes, this means we will have to maintain a rulebook of sorts for every new mutator that tries to modify the semantic structure of the program.

Subsequent commits will remove program validation entirely, instead
relying on the ProgramBuilder and executor panicking when programs are
invalid.

Okay, so the "rulebook" is supposed to reside in the executor. I don't feel very strongly about the executor panicking or not panicking, but I think it makes more sense for Program to dictate what a valid version of it looks like.

I think we are in agreement here.

I'm using the term valid in the structural sense -- i.e. a valid program is runnable by the executor. I want mutators to always produce runnable programs but otherwise be ignorant of the higher-level LN protocol semantics (e.g., message sequencing). I think generators are better suited to encapsulate the semantics, and mutators should be free to break them.

The main changes proposed in this PR are really:

  1. Remove Program::validate since it is really just an approximation of the validation done during execute.
  2. Make execute panic on any invalidly-structured program instead of returning an error.

@Chand-ra
Copy link
Copy Markdown

I'm using the term valid in the structural sense -- i.e. a valid program is runnable by the executor. I want mutators to always produce runnable programs but otherwise be ignorant of the higher-level LN protocol semantics (e.g., message sequencing). I think generators are better suited to encapsulate the semantics, and mutators should be free to break them.

Right.

The main changes proposed in this PR are really:

  1. Remove Program::validate since it is really just an approximation of the validation done during execute.

Okay, this is where I think the friction arises from. Both Program::validate() and execute() protect against structurally invalid programs for now, but in the future we will need to add validation rules for protocol semantics of LN as well, and Program::validate() is much better equipped to handle that than execute().

  1. Make execute panic on any invalidly-structured program instead of returning an error.

Yeah, I don't feel too strongly about this, either way is fine.

@morehouse
Copy link
Copy Markdown
Owner Author

Okay, this is where I think the friction arises from. Both Program::validate() and execute() protect against structurally invalid programs for now, but in the future we will need to add validation rules for protocol semantics of LN as well, and Program::validate() is much better equipped to handle that than execute().

The main semantic rule I foresee needing to follow at this point is not trying to receive a message that will never come (e.g., RecvAcceptChannel without a corresponding send of OpenChannel). And the only reason for that is to avoid really slow fuzzing where we're just waiting 99% of the time. Currently such programs are not structurally invalid, but with the upcoming affine types (#97) they will be, and the executor can panic if an affine type doesn't resolve properly.

And assuming mutators will always follow the affine type rules, I don't think there's any substantial benefit from having Program::validate in addition to the validation in execute. Mutators will always generate structurally valid programs anyway.

@Chand-ra
Copy link
Copy Markdown

Chand-ra commented May 28, 2026

The main semantic rule I foresee needing to follow at this point is not trying to receive a message that will never come (e.g., RecvAcceptChannel without a corresponding send of OpenChannel). And the only reason for that is to avoid really slow fuzzing where we're just waiting 99% of the time. Currently such programs are not structurally invalid, but with the upcoming affine types (#97) they will be,

True, for now we're mostly trying to avoid receiving a message that the target won't send and #97 should solve that. But there's at least one more case I can think of where we'll need to add additional semantic rules, or modify the current one:

SpliceMutator mismatches payload and state token (discussed very briefly in this comment).

The current solution of 'custom sends + affine state variables' cannot prevent this, and this is something that will come into play when we deal with concurrent operations for multiple channels (planned for Milestone 9+).

This is just one example that I ran into while conducting a brief mental review of the affine architecture, and it's very possible that more such cases exist, perhaps even higher up in the Milestones. This is why I'd prefer the architecture to be as accommodating of additional semantic rules as possible.

and the executor can panic if an affine type doesn't resolve properly.

And assuming mutators will always follow the affine type rules, I don't think there's any substantial benefit from having Program::validate in addition to the validation in execute. Mutators will always generate structurally valid programs anyway.

They will, but not in the same sense as the structural rules. We need to allow the mutators to break semantics and catch these violations in a separate step: somewhere it makes sense to encode LN's protocol semantics. Panicking isn't useful here because there's nothing to "catch early and fix".

Unless of course, we do something like this in the advanced mutators:

fn mutate(program: mut& Program) -> bool {
   // mutation logic...
   return program.validate().is_ok();
}

But we still need to validate the program so...

Edit: Thought about this for some time and maybe there's a middle ground: replace Program::validate() with Progam::validate_semantics() that only checks for semantic errors, and let the executor panic on structural errors?

Since we need to perform an O(N^2) pass over instructions to check for semantic errors, there won't be much of a performance gain with this new validation, but I guess it would simplify the code a bit.

@morehouse
Copy link
Copy Markdown
Owner Author

SpliceMutator mismatches payload and state token (discussed very briefly in this comment).

I don't understand this issue. Can you explain it in detail?

The current solution of 'custom sends + affine state variables' cannot prevent this, and this is something that will come into play when we deal with concurrent operations for multiple channels (planned for Milestone 9+).

Do you mean that we would send a message intended for one channel and attempt to receive a message for a different channel? I don't think it's an issue -- the receive shouldn't time out regardless of which channel the original message was for.

and the executor can panic if an affine type doesn't resolve properly.
And assuming mutators will always follow the affine type rules, I don't think there's any substantial benefit from having Program::validate in addition to the validation in execute. Mutators will always generate structurally valid programs anyway.

They will, but not in the same sense as the structural rules. We need to allow the mutators to break semantics and catch these violations in a separate step: somewhere it makes sense to encode LN's protocol semantics. Panicking isn't useful here because there's nothing to "catch early and fix".

Again, I don't think we want mutators validating semantics at all. They should only care about the program being executable.

Unless of course, we do something like this in the advanced mutators:

fn mutate(program: mut& Program) -> bool {
   // mutation logic...
   return program.validate().is_ok();
}

Besides the existence of a validate function, this is in fact exactly what the intended API is. mutate is only allowed to create valid programs, and if it can't do that, it must return false. But so far all mutators do this by construction -- they don't need a separate validation function.

Edit: Thought about this for some time and maybe there's a middle ground: replace Program::validate() with Progam::validate_semantics() that only checks for semantic errors, and let the executor panic on structural errors?

Since we need to perform an O(N^2) pass over instructions to check for semantic errors, there won't be much of a performance gain with this new validation, but I guess it would simplify the code a bit.

Yeah, I'm still not convinced that we should be validating semantics at all. We shouldn't care about semantics at the program/IR/mutator level. Generators are much better suited to create semantically-valid flows.

@Chand-ra
Copy link
Copy Markdown

Chand-ra commented May 29, 2026

SpliceMutator mismatches payload and state token (discussed very briefly in this comment).

I don't understand this issue. Can you explain it in detail?

The current solution of 'custom sends + affine state variables' cannot prevent this, and this is something that will come into play when we deal with concurrent operations for multiple channels (planned for Milestone 9+).

Do you mean that we would send a message intended for one channel and attempt to receive a message for a different channel? I don't think it's an issue -- the receive shouldn't time out regardless of which channel the original message was for.

With the Recv operations no longer outputting an affine type, as discussed in this comment, this should no longer be an issue.

and the executor can panic if an affine type doesn't resolve properly.
And assuming mutators will always follow the affine type rules, I don't think there's any substantial benefit from having Program::validate in addition to the validation in execute. Mutators will always generate structurally valid programs anyway.

They will, but not in the same sense as the structural rules. We need to allow the mutators to break semantics and catch these violations in a separate step: somewhere it makes sense to encode LN's protocol semantics. Panicking isn't useful here because there's nothing to "catch early and fix".

Again, I don't think we want mutators validating semantics at all. They should only care about the program being executable.

Exactly.

Unless of course, we do something like this in the advanced mutators:

fn mutate(program: mut& Program) -> bool {
   // mutation logic...
   return program.validate().is_ok();
}

Besides the existence of a validate function, this is in fact exactly what the intended API is. mutate is only allowed to create valid programs, and if it can't do that, it must return false. But so far all mutators do this by construction -- they don't need a separate validation function.

Right, mutators create structurally valid programs by construction. But to solve issue #75, they need to respect semantic rules as well (which is just "no Recv before Send" for now). Making the mutators respect this by construction will turn out to be difficult: every mutator would have to implement validate_semantics() in some way (and it will go against the "mutators should only care about the program being executable" discussion we just had).

Edit: Thought about this for some time and maybe there's a middle ground: replace Program::validate() with Progam::validate_semantics() that only checks for semantic errors, and let the executor panic on structural errors?
Since we need to perform an O(N^2) pass over instructions to check for semantic errors, there won't be much of a performance gain with this new validation, but I guess it would simplify the code a bit.

Yeah, I'm still not convinced that we should be validating semantics at all. We shouldn't care about semantics at the program/IR/mutator level. Generators are much better suited to create semantically-valid flows.

I'm not sure I understand.

The entire premise behind validating semantics (again, which is just "no Recv before Send" for now) is to prevent mutators from throwing the program in a state that causes timeouts (that is, violates the semantics). To do this, we need to either:

  1. Force the mutators to always respect the rule by construction.
  2. Check for semantic-validity after each mutation.

How do we do either without encoding the semantic rule at program/IR/mutator level? I understand that it is Generators that are supposed to deal with intricate flows, but that has nothing to with preserving mutation validity.

@morehouse
Copy link
Copy Markdown
Owner Author

Right, mutators create structurally valid programs by construction. But to solve issue #75, they need to respect semantic rules as well (which is just "no Recv before Send" for now).

We're talking past each other. By creating affine types with their own single-use rule, we are turning the limited semantic knowledge we do care about into a structural rule in the IR itself. Mutators and generators will need to respect the new rule. Other semantics will not be structurally encoded in the IR and should be completely ignored by mutators.

Making the mutators respect this by construction will turn out to be difficult: every mutator would have to implement validate_semantics() in some way (and it will go against the "mutators should only care about the program being executable" discussion we just had).

All we need is to enforce the new IR single-use rule, which will be part of the IR just like SSA-form and the type system we currently have. I don't think it's unreasonable or difficult to do this in every mutator, just like we currently do for preserving the SSA rules and type correctness.

  • OperationParamMutator: no changes necessary.
  • InputSwapMutator: one tiny change needed -- remove affine-type inputs from the candidate list.
  • InstructionReorderMutator: no changes necessary.
  • InstructionDeleteMutator: one extra pass needed when an affine-type producing instruction is selected for deletion plus some tweaks to the healing logic.
  • GeneratorInsertionMutator: no changes necessary.
  • SpliceMutator: very likely no changes necessary.

When it's this easy to follow the structural rules, doing the shotgun approach (creating invalid mutations and catching them with validate) seems a bit ridiculous.

The entire premise behind validating semantics (again, which is just "no Recv before Send" for now) is to prevent mutators from throwing the program in a state that causes timeouts (that is, violates the semantics). To do this, we need to either:

1. Force the mutators to always respect the rule by construction.

2. Check for semantic-validity after each mutation.

Yep -- I think it's best to do (1) for the new affine type single-use rule, which will become a structural part of the IR. Calling a validate_semantics function is counterintuitive when we're actually checking a structural rule, and we don't use validate this way for the other structural rules either -- SSA form and type-correctness are guaranteed by all current mutators.

@Chand-ra
Copy link
Copy Markdown

We're talking past each other. By creating affine types with their own single-use rule, we are turning the limited semantic knowledge we do care about into a structural rule in the IR itself. Mutators and generators will need to respect the new rule. Other semantics will not be structurally encoded in the IR and should be completely ignored by mutators.

Oh okay, so you meant making the mutators respect every structural rule in the IR (which now includes the one semantic rule we've encoded as a structural one).

All we need is to enforce the new IR single-use rule, which will be part of the IR just like SSA-form and the type system we currently have. I don't think it's unreasonable or difficult to do this in every mutator, just like we currently do for preserving the SSA rules and type correctness.

  • OperationParamMutator: no changes necessary.
  • InputSwapMutator: one tiny change needed -- remove affine-type inputs from the candidate list.
  • InstructionReorderMutator: no changes necessary.

True.

  • InstructionDeleteMutator: one extra pass needed when an affine-type producing instruction is selected for deletion plus some tweaks to the healing logic.

We can defer deleting an affine producer until it's consumer exists, which should be integrable in the type-matching pass we already perform.

  • GeneratorInsertionMutator: no changes necessary.

True, since Generators supposed to be sources of truth: they only produce valid programs.

  • SpliceMutator: very likely no changes necessary.

The changes needed for this mutator should already be encoded in ProgramBuilder::pick_variable()'s scheme for affine types. So true, most likely no changes here.

When it's this easy to follow the structural rules, doing the shotgun approach (creating invalid mutations and catching them with validate) seems a bit ridiculous.

Hmm, since we've modified the IR to no longer have Recv* operations output an affine type, the affine rules have flattened quite a bit. As long as we'll not need to add more semantic rules (which you say we won't), I agree, making the mutators respect the single affine rule by construction wouldn't be too difficult.

The entire premise behind validating semantics (again, which is just "no Recv before Send" for now) is to prevent mutators from throwing the program in a state that causes timeouts (that is, violates the semantics). To do this, we need to either:

1. Force the mutators to always respect the rule by construction.

2. Check for semantic-validity after each mutation.

Yep -- I think it's best to do (1) for the new affine type single-use rule, which will become a structural part of the IR. Calling a validate_semantics function is counterintuitive when we're actually checking a structural rule, and we don't use validate this way for the other structural rules either -- SSA form and type-correctness are guaranteed by all current mutators.

And make the executor panic in case of a violation? Since mutators are now supposed to never violate rules (affine or not affine), I guess that makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants