I am trying to write a function that will filter a list of tuples


Asked: May 31, 2026 (2026-05-31T16:59:59+00:00)


I am trying to write a function that will filter a list of tuples (mimicking an in-memory database), using a “nearest neighbour” or “nearest match” type algorithm.

I want to know the best (i.e. most Pythonic) way to go about doing this. The sample code below hopefully illustrates what I am trying to do.

datarows = [(10,2.0,3.4,100),
            (11,2.0,5.4,120),
            (17,12.9,42,123)]

filter_record = (9,1.9,2.9,99) # record that we are seeking to retrieve from 'database' (or nearest match)
weights = (1,1,1,1) # weights to apportion to each field in the filter

def get_nearest_neighbour(data, criteria, weights):
    # for each row in data:
    #     calculate 'distance metric' (e.g. simple differencing) and multiply by relevant weight
    # determine the row which was either an exact match or was 'least dissimilar'
    # return the match (or nearest match)
    pass

if __name__ == '__main__':
    result = get_nearest_neighbour(datarows, filter_record, weights)
    print(result)

For the snippet above, the output should be:

(10,2.0,3.4,100)

since it is the ‘nearest’ to the sample data passed to the function get_nearest_neighbour().

My question, then, is: what is the best way to implement get_nearest_neighbour()? For brevity, assume that we are only dealing with numeric values, and that the ‘distance metric’ we use is simply an arithmetic subtraction of the input data from the current row.


1 Answer

Editorial Team · Answer 1 · 2026-05-31T17:00:00+00:00

Simple out-of-the-box solution:

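import math

def distance(row_a, row_b, weights):
    # absolute difference per field, then a weighted sum of those differences
    diffs = [math.fabs(a - b) for a, b in zip(row_a, row_b)]
    return sum(v * w for v, w in zip(diffs, weights))

def get_nearest_neighbour(data, criteria, weights):
    # min() scans every row once and keeps the one with the smallest distance
    def sort_func(row):
        return distance(row, criteria, weights)
    return min(data, key=sort_func)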
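
As a quick check, the sketch below restates the two functions in compact form and runs them on the data from the question. The second query, other_filter, and the alternative weights are made-up values that are not in the question; they are only there to show how a zero weight makes a field drop out of the comparison.

```python
import math

def distance(row_a, row_b, weights):
    # weighted sum of absolute field-by-field differences
    return sum(w * math.fabs(a - b) for a, b, w in zip(row_a, row_b, weights))

def get_nearest_neighbour(data, criteria, weights):
    # linear scan: evaluate the distance for every row, keep the smallest
    return min(data, key=lambda row: distance(row, criteria, weights))

datarows = [(10, 2.0, 3.4, 100),
            (11, 2.0, 5.4, 120),
            (17, 12.9, 42, 123)]

filter_record = (9, 1.9, 2.9, 99)
result = get_nearest_neighbour(datarows, filter_record, (1, 1, 1, 1))
print(result)

# Hypothetical second query (not from the question): with equal weights the
# last field dominates and the first row wins; with the last field's weight
# set to 0 that field is ignored and the second row becomes the nearest.
other_filter = (12, 2.0, 5.0, 100)
equal_weights_match = get_nearest_neighbour(datarows, other_filter, (1, 1, 1, 1))
ignore_last_match = get_nearest_neighbour(datarows, other_filter, (1, 1, 1, 0))
print(equal_weights_match)
print(ignore_last_match)
```

Because min() looks at every row once, each lookup costs time proportional to the number of rows, which is usually fine for small in-memory tables.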

If you need to work with huge datasets, you should consider switching to NumPy arrays and using SciPy’s KDTree (scipy.spatial.KDTree) to find nearest neighbors. The advantage is twofold: a k-d tree is a more advanced algorithm that avoids comparing the query against every row, and NumPy and SciPy run their inner loops in optimized compiled code rather than in the Python interpreter.
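
A minimal sketch of the tree-based approach, assuming NumPy and SciPy are installed and that all weights are non-negative. The function name nearest_with_kdtree is hypothetical. The weighted sum of absolute differences is reproduced by multiplying every column by its weight and then querying with the Manhattan metric (p=1), since w * |a - b| equals |w*a - w*b| when w is non-negative.

```python
import numpy as np
from scipy.spatial import KDTree

def nearest_with_kdtree(data, criteria, weights):
    # Assumption: weights are non-negative, so scaling the columns by the
    # weights turns the weighted distance into a plain L1 distance.
    w = np.asarray(weights, dtype=float)
    points = np.asarray(data, dtype=float) * w
    tree = KDTree(points)  # build once, reuse for many queries in practice
    # p=1 selects the Manhattan (sum of absolute differences) metric
    dist, idx = tree.query(np.asarray(criteria, dtype=float) * w, p=1)
    return data[idx], dist

datarows = [(10, 2.0, 3.4, 100),
            (11, 2.0, 5.4, 120),
            (17, 12.9, 42, 123)]
filter_record = (9, 1.9, 2.9, 99)
weights = (1, 1, 1, 1)

nearest, dist = nearest_with_kdtree(datarows, filter_record, weights)
print(nearest)
```

The tree pays off when it is built once and queried many times; for a single lookup the plain min() scan is likely just as fast.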
