
The Archive Base

Editorial Team
Asked: June 1, 2026 (2026-06-01T05:55:08+00:00)


I know how function comparison works in Python 3 (it just compares addresses in memory), and I understand why.

I also understand that “true” comparison (do functions f and g return the same result given the same arguments, for any arguments?) is practically impossible.

I am looking for something in between. I want the comparison to work on the simplest cases of identical functions, and possibly some less trivial ones:

(lambda x: x) == (lambda x: x)          # True
(lambda x: 2 * x) == (lambda y: 2 * y)  # True
(lambda x: 2 * x) == (lambda x: x * 2)  # True or False is fine, but must be stable
(lambda x: 2 * x) == (lambda x: x + x)  # True or False is fine, but must be stable
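
For reference, here is a minimal sketch of what CPython 3 does today. Function objects do not define their own equality, so == falls back to identity, and two separately created lambdas never compare equal even when their source is identical. The names f, g, same_object and different_objects are illustrative only.

```python
# Two lambdas with identical source are still two distinct function objects.
f = lambda x: x
g = lambda x: x

# A function always equals itself, because the default comparison is identity.
same_object = (f == f)

# Separately created functions are different objects, so this is False.
different_objects = (f == g)

print(same_object, different_objects)
```

This is the baseline the question wants to improve on: every comment above that says True would currently be False.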

Note that I’m interested in solving this problem for anonymous functions (lambda), but wouldn’t mind if the solution also works for named functions.

The motivation for this is that inside the blist module, it would be nice to verify that two sortedset instances have the same sort function before performing a union or a similar operation on them.

Named functions are of less interest because I can assume them to be different when they are not identical. After all, suppose someone created two sortedsets with a named function in the key argument. If they intend these instances to be “compatible” for the purposes of set operations, they’d probably use the same function, rather than two separate named functions that perform identical operations.

I can only think of three approaches. All of them seem hard, so any ideas appreciated.

  1. Comparing bytecode might work, but it is implementation-dependent, so code that works on one Python implementation or version may break on another.

  2. Comparing tokenized source code seems reasonable and portable. Of course, it's less powerful (since equivalent functions that are written differently are more likely to be reported as different).

  3. A solid heuristic borrowed from some symbolic computation textbook is theoretically the best approach. It might seem too heavy for my purpose, but it actually could be a good fit since lambda functions are usually tiny and so it would run fast.
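
The first approach in the list above can be sketched in a few lines, assuming CPython 3 and its `__code__` attribute. The helper names `_typed` and `same_bytecode` are hypothetical, and this is only one possible reading of "comparing bytecode": it compares the raw instruction bytes, the constants, and the referenced names, and ignores line numbers and local variable names.

```python
def _typed(values):
    """Pair each constant with its type, so that 2 and 2.0 are not treated as equal."""
    return tuple((type(v), v) for v in values)


def same_bytecode(f, g):
    """Return True if f and g compile to the same code, ignoring local names.

    Local variables are addressed by slot index in the bytecode, so renaming
    an argument does not change co_code.
    """
    cf, cg = f.__code__, g.__code__
    return (
        cf.co_argcount == cg.co_argcount
        and cf.co_code == cg.co_code
        and _typed(cf.co_consts) == _typed(cg.co_consts)
        and cf.co_names == cg.co_names
    )


print(same_bytecode(lambda x: 2 * x, lambda y: 2 * y))
print(same_bytecode(lambda x: 2 * x, lambda x: x * 2))
```

The result is stable for a given interpreter version, but, as the list item says, the bytes themselves are an implementation detail. The sketch also says nothing about the values of any globals or closure variables the functions read.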

EDIT

A more complicated example, based on the comment by @delnan:

# global variable
fields = ['id', 'name']

def my_function():
  global fields
  s1 = sortedset(key = lambda x : x[fields[0].lower()])
  # some intervening code here
  # ...
  s2 = sortedset(key = lambda x : x[fields[0].lower()])

Would I expect the key functions for s1 and s2 to evaluate as equal?

If the intervening code contains any function call at all, the value of fields may be modified, resulting in different key functions for s1 and s2. Since we clearly won't be doing control flow analysis to solve this problem, we have to evaluate these two lambda functions as different if we are trying to perform this evaluation before runtime. (Even if fields weren't global, it might have had another name bound to it, etc.) This would severely curtail the usefulness of this whole exercise, since few lambda functions have no dependence on the environment.

EDIT 2:

I realized it's very important to compare the function objects as they exist at runtime. Without that, functions that depend on variables from an outer scope cannot be compared, and most useful functions do have such dependencies. Considered at runtime, all functions with the same signature are comparable in a clean, logical way, regardless of what they depend on, whether they are impure, etc.

As a result, I need not just the bytecode but also the global state as of the time the function object was created (presumably __globals__). Then I have to match all variables from outer scope to the values from __globals__.
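
One way to sketch this runtime comparison is with the standard library's `inspect.getclosurevars`, which resolves the names a function refers to against its closure cells and its `__globals__`. The helpers `runtime_signature`, `runtime_equal` and `make_key` are hypothetical names. Note one assumption that differs from the paragraph above: the sketch looks up the values at the moment of comparison, not at the moment the function object was created.

```python
import inspect


def runtime_signature(func):
    """Collect the compiled code plus the outer-scope values it currently sees."""
    code = func.__code__
    env = inspect.getclosurevars(func)
    return (
        code.co_code,    # instructions; local names are not part of this
        code.co_consts,  # literal constants used by the function
        code.co_names,   # global and attribute names referenced
        env.nonlocals,   # closure variables and their current values
        env.globals,     # referenced module globals and their current values
    )


def runtime_equal(f, g):
    """True if both functions have the same code and see equal outer values."""
    return runtime_signature(f) == runtime_signature(g)


# Mirrors the example from the question.
fields = ['id', 'name']
key1 = lambda x: x[fields[0].lower()]
key2 = lambda x: x[fields[0].lower()]
print(runtime_equal(key1, key2))


def make_key(n):
    # Each call produces a closure over its own value of n.
    return lambda x: x * n


print(runtime_equal(make_key(2), make_key(3)))
```

Because the values are read when `runtime_equal` is called, a later mutation of `fields` affects both lambdas equally and they still compare equal, which matches how they would actually behave as sort keys from that point on.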


1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 5:55 am

    Edited to check whether external state will affect the sorting function as well as if the two functions are equivalent.


    I hacked up dis.dis and friends to write to a global file-like object. I then stripped out line numbers, normalized variable names (without touching constants), and compared the results.

    You could clean this up by having dis.dis and friends yield their lines, so you wouldn't have to trap their output. But this is a working proof of concept for using dis.dis for function comparison with minimal changes. Note that it is Python 2 code (print >> out, func_code, types.ClassType), so it will not run unmodified on Python 3.

    import types
    from opcode import *
    _have_code = (types.MethodType, types.FunctionType, types.CodeType,
                  types.ClassType, type)
    
    def dis(x):
        """Disassemble classes, methods, functions, or code.
    
        With no argument, disassemble the last traceback.
    
        """
        if isinstance(x, types.InstanceType):
            x = x.__class__
        if hasattr(x, 'im_func'):
            x = x.im_func
        if hasattr(x, 'func_code'):
            x = x.func_code
        if hasattr(x, '__dict__'):
            items = x.__dict__.items()
            items.sort()
            for name, x1 in items:
                if isinstance(x1, _have_code):
                    print >> out,  "Disassembly of %s:" % name
                    try:
                        dis(x1)
                    except TypeError, msg:
                        print >> out,  "Sorry:", msg
                    print >> out
        elif hasattr(x, 'co_code'):
            disassemble(x)
        elif isinstance(x, str):
            disassemble_string(x)
        else:
            raise TypeError, \
                  "don't know how to disassemble %s objects" % \
                  type(x).__name__
    
    def disassemble(co, lasti=-1):
        """Disassemble a code object."""
        code = co.co_code
        labels = findlabels(code)
        linestarts = dict(findlinestarts(co))
        n = len(code)
        i = 0
        extended_arg = 0
        free = None
        while i < n:
            c = code[i]
            op = ord(c)
            if i in linestarts:
                if i > 0:
                    print >> out
                print >> out,  "%3d" % linestarts[i],
            else:
                print >> out,  '   ',
    
            if i == lasti: print >> out,  '-->',
            else: print >> out,  '   ',
            if i in labels: print >> out,  '>>',
            else: print >> out,  '  ',
            print >> out,  repr(i).rjust(4),
            print >> out,  opname[op].ljust(20),
            i = i+1
            if op >= HAVE_ARGUMENT:
                oparg = ord(code[i]) + ord(code[i+1])*256 + extended_arg
                extended_arg = 0
                i = i+2
                if op == EXTENDED_ARG:
                    extended_arg = oparg*65536L
                print >> out,  repr(oparg).rjust(5),
                if op in hasconst:
                    print >> out,  '(' + repr(co.co_consts[oparg]) + ')',
                elif op in hasname:
                    print >> out,  '(' + co.co_names[oparg] + ')',
                elif op in hasjrel:
                    print >> out,  '(to ' + repr(i + oparg) + ')',
                elif op in haslocal:
                    print >> out,  '(' + co.co_varnames[oparg] + ')',
                elif op in hascompare:
                    print >> out,  '(' + cmp_op[oparg] + ')',
                elif op in hasfree:
                    if free is None:
                        free = co.co_cellvars + co.co_freevars
                    print >> out,  '(' + free[oparg] + ')',
            print >> out
    
    def disassemble_string(code, lasti=-1, varnames=None, names=None,
                           constants=None):
        labels = findlabels(code)
        n = len(code)
        i = 0
        while i < n:
            c = code[i]
            op = ord(c)
            if i == lasti: print >> out,  '-->',
            else: print >> out,  '   ',
            if i in labels: print >> out,  '>>',
            else: print >> out,  '  ',
            print >> out,  repr(i).rjust(4),
            print >> out,  opname[op].ljust(15),
            i = i+1
            if op >= HAVE_ARGUMENT:
                oparg = ord(code[i]) + ord(code[i+1])*256
                i = i+2
                print >> out,  repr(oparg).rjust(5),
                if op in hasconst:
                    if constants:
                        print >> out,  '(' + repr(constants[oparg]) + ')',
                    else:
                        print >> out,  '(%d)'%oparg,
                elif op in hasname:
                    if names is not None:
                        print >> out,  '(' + names[oparg] + ')',
                    else:
                        print >> out,  '(%d)'%oparg,
                elif op in hasjrel:
                    print >> out,  '(to ' + repr(i + oparg) + ')',
                elif op in haslocal:
                    if varnames:
                        print >> out,  '(' + varnames[oparg] + ')',
                    else:
                        print >> out,  '(%d)' % oparg,
                elif op in hascompare:
                    print >> out,  '(' + cmp_op[oparg] + ')',
            print >> out
    
    def findlabels(code):
        """Detect all offsets in a byte code which are jump targets.
    
        Return the list of offsets.
    
        """
        labels = []
        n = len(code)
        i = 0
        while i < n:
            c = code[i]
            op = ord(c)
            i = i+1
            if op >= HAVE_ARGUMENT:
                oparg = ord(code[i]) + ord(code[i+1])*256
                i = i+2
                label = -1
                if op in hasjrel:
                    label = i+oparg
                elif op in hasjabs:
                    label = oparg
                if label >= 0:
                    if label not in labels:
                        labels.append(label)
        return labels
    
    def findlinestarts(code):
        """Find the offsets in a byte code which are start of lines in the source.
    
        Generate pairs (offset, lineno) as described in Python/compile.c.
    
        """
        byte_increments = [ord(c) for c in code.co_lnotab[0::2]]
        line_increments = [ord(c) for c in code.co_lnotab[1::2]]
    
        lastlineno = None
        lineno = code.co_firstlineno
        addr = 0
        for byte_incr, line_incr in zip(byte_increments, line_increments):
            if byte_incr:
                if lineno != lastlineno:
                    yield (addr, lineno)
                    lastlineno = lineno
                addr += byte_incr
            lineno += line_incr
        if lineno != lastlineno:
            yield (addr, lineno)
    
    class FakeFile(object):
        def __init__(self):
            self.store = []
        def write(self, data):
            self.store.append(data)
    
    a = lambda x : x
    b = lambda x : x # True
    c = lambda x : 2 * x
    d = lambda y : 2 * y # True
    e = lambda x : 2 * x
    f = lambda x : x * 2 # True or False is fine, but must be stable
    g = lambda x : 2 * x
    h = lambda x : x + x # True or False is fine, but must be stable
    
    funcs = a, b, c, d, e, f, g, h
    
    outs = []
    for func in funcs:
        out = FakeFile()
        dis(func)
        outs.append(out.store)
    
    import ast
    
    def outfilter(out):
        for i in out:
            if i.strip().isdigit():
                continue
            if '(' in i:
                try:
                    ast.literal_eval(i)
                except ValueError:
                    i = "(x)"
            yield i
    
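    # Pair each normalized listing with a flag that is True when the function
    # reads a global (LOAD_GLOBAL) or a closure variable (LOAD_DEREF).
    processed_outs = [(out, 'LOAD_GLOBAL' in out or 'LOAD_DEREF' in out)
                                for out in (''.join(outfilter(out)) for out in outs)]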
    
    for (out1, polluted1), (out2, polluted2) in zip(processed_outs[::2], processed_outs[1::2]):
        print 'Bytecode Equivalent:', out1 == out2, '\nPolluted by state:', polluted1 or polluted2
    

    The output is True, True, False, and False, and it is stable across runs. The “Polluted” flag is True if the function's result can depend on external state: either global state or a closure.
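
The same idea can be sketched for Python 3 without trapping printed output, assuming `dis.get_instructions` from the standard library (available since Python 3.4). This is not the code above ported line by line. The names `normalized`, `polluted` and `STATEFUL_OPS` are hypothetical, and the set of opcodes treated as "stateful" is an assumption about current CPython.

```python
import dis

# Opcodes that read module globals, builtins, or closure cells.
STATEFUL_OPS = {"LOAD_GLOBAL", "LOAD_NAME", "LOAD_DEREF"}


def normalized(func):
    """Return the instruction list with line numbers and local names dropped."""
    result = []
    for ins in dis.get_instructions(func):
        if ins.opcode in dis.hasconst:
            # Keep constants as they are, tagged with their type.
            arg = (type(ins.argval), ins.argval)
        elif ins.opcode in dis.hasname:
            # Keep global and attribute names.
            arg = ins.argval
        else:
            # Everything else, including locals, is kept as a raw numeric argument.
            arg = ins.arg
        result.append((ins.opname, arg))
    return result


def polluted(func):
    """True if the function reads state outside its own arguments."""
    return any(ins.opname in STATEFUL_OPS for ins in dis.get_instructions(func))


pairs = [
    (lambda x: x, lambda x: x),
    (lambda x: 2 * x, lambda y: 2 * y),
    (lambda x: 2 * x, lambda x: x * 2),
    (lambda x: 2 * x, lambda x: x + x),
]
for f, g in pairs:
    print('Bytecode Equivalent:', normalized(f) == normalized(g),
          '\nPolluted by state:', polluted(f) or polluted(g))
```

As with the original, a function that calls a builtin such as len is also flagged as polluted, since builtins are looked up through the global namespace.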
