I have a database with 10,000 adam_id ‘s. For each adam_id , I need

Question

0

Editorial Team

Asked: June 10, 20262026-06-10T04:30:50+00:00 2026-06-10T04:30:50+00:00

I have a database with 10,000 adam_id ‘s. For each adam_id , I need

0

I have a database with 10,000 adam_id‘s. For each adam_id, I need to pull down information through an API.

My table looks like this:

`title`
- adam_id
- success (boolean)
- number_of_tries (# of times success=0 when trying to do the pull down)

Here is the function I would like to create:

def pull_down(cursor):
    work_remains = True
    while work_remains:
        cursor.execute("""SELECT adam_id FROM title WHERE success=0 
                          AND number_of_tries < 5 ORDR BY adam_id LIMIT 1""")
        if len(cursor.fetchall()):
            adam_id = cursor.fetchone()[0]
            do_api_call(adam_id)
        else:
            work_remains = False

def do_api_call(adam_id):
    # do api call
    if success:
        cursor.execute("UPDATE title SET success=1 WHERE adam_id = adam_id")
    else:
        cursor.execute("UPDATE title SET number_of_tries+=1 WHERE adam_id=adam_id")

How would I do the above with n workers using python’s multiprocessing functionality instead of doing it with one synchronous process? I have begun looking over the Multiprocessing module ( http://docs.python.org/library/multiprocessing.html ), but it seems pretty hard to digest for me thus far.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-10T04:30:51+00:00

If the heavy part of the work is the api call, because it goes to an outside resource, then that would be the only part you really would want to make parallel. The database calls are probably really fast. So you might try this:

Batch get the adam_id values in one query
Put the ids into a process pool to do the API calls
Get the results and commit them to the database

This is a rough pseudocode example to show the logic flow:

from multiprocessing import Pool

def pull_down(cursor):
    # get all the data in one query
    count = cursor.execute("""SELECT adam_id FROM title WHERE success=0 
                      AND number_of_tries < 5 ORDR BY adam_id LIMIT 1""")
    if count:
        # Step #1
        adam_id_list = [row[0] for row in cursor.fetchall()]

        # Step #2
        pool = Pool(4)
        results = pool.map(do_api_call, adam_id_list)
        pool.close()

        # Step #3
        update_db(results)

def do_api_call(adam_id):
    # do api call
    success = call_api_with_id(adam_id)
    return (adam_id, success)

def update_db(results):
    # loop over results and built batch queries for the success
    # or failed items

    # (obviously this split up could be optimized)
    succeeded = [result[0] for result in results if result[1]]
    failed = [result[0] for result in results if not result[1]]

    submit_success(succeeded)
    submit_failed(failed)

It would only complicate the code if you tried to make the database calls parallel, because then you have to properly give each process it’s own connection, when really it wouldn’t be the database slowing you down anyways.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a database with 10,000 adam_id ‘s. For each adam_id , I need

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply