I’m writing a concolic engine for Python using the sys.settrace() functionality. The main task

Question

0

Asked: May 25, 20262026-05-25T12:33:18+00:00 2026-05-25T12:33:18+00:00

I’m writing a concolic engine for Python using the sys.settrace() functionality. The main task

0

I’m writing a concolic engine for Python using the sys.settrace() functionality.

The main task during such kind of execution is to record the constraints on the input variables. The constraints are nothing else than the conditions of the if statements, that create two branches (the ‘then’ and the ‘else’ branch).

When an execution is complete, the engine chooses a constraint and finds appropriate values for the inputs so that the execution will go the down along the other branch (at execution x it goes the ‘then’ branch, at execution x+1 it goes along the ‘else’ branch).

This is to have a bit of context on why I doing what I’m trying to do…

By combining settrace() and the dis module, I get to see the bytecode of each source line, just before it is executed. This way I can easily record the if conditions as they appear during execution.

But then I have the big problem. I need to know which way the if went, which branch the execution took. So if my code is something like:

if x > a:
  print x
else:
  print a

at a certain point my tracing thing will see:

t: if x > 0:

then the python interpreter will execute the if and jump (or not) somewhere. And I will see:

t + 1: print x

So is the instruction t + 1 in the "then" branch or in the "else" one? Keep in mind the trace function sees only some bytecode in the current block.

I know of two way to do this. One is to evaluate the condition to see exactly whether it is true or false. This works only if there are no side-effects.

The other way is to try to look and the instruction pointer at t + 1 and try to understand where we are in the code. This is the way I am using right now, but it very delicate because at t + 1 I could find myself somewhere completely different (another module, a builtin function, etc).

So finally, the question I have is this: is there a way to get from Python itself, or from a C module/extension/whatever, the result of the last conditional jump?

In alternative, are there more fine-grained tracing options? Something like execute bytecode one opcode at a time. With the settrace() functionality the maximum resolution I get is whole source code lines.

In the worst case, I think I can modify the Python interpreter to expose such information, but I would leave that as last resort, for obvious reasons.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-25T12:33:19+00:00

In the end this is what I did. I implemented the AST instrumentation and it works pretty well.

By playing with the AST, you need to move all function calls (also attributes and subscriptions, due to getattr() and friends, out of the if conditions by creating temporary variables. Also you need to split the and and or operators.

Then add a call to your own function at the beginning of each branch, with a boolean parameter, True for the then branch and False for the else branch.

After that I wrote an AST to source converter (there is one somewhere on the net, but does not work on current Python versions).

Working with the AST is very easy and quite simple, I ended up doing three transformation passes, adding also some import statements.

This is the first pass, as an example. It splits if conditions if they contain or or and operators:

class SplitBoolOpPass1(ast.NodeTransformer):
  def visit_If(self, node):
      while isinstance(node.test, ast.BoolOp):
        new_node = ast.If(test=node.test.values.pop(), body=node.body, orelse=node.orelse)
        if isinstance(node.test.op, ast.And):
          if len(node.test.values) == 1:
            node.test = node.test.values[0]
          node.body = [new_node]
        else:
          if len(node.test.values) == 1:
            node.test = node.test.values[0]
          node.orelse = [new_node]
      node = self.generic_visit(node) # recusion
      return node

Probably it is not very useful for code coverage applications because it messes up with the code quite a lot.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m writing a concolic engine for Python using the sys.settrace() functionality. The main task

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply