The Archive Base

Editorial Team
Asked: May 26, 2026


I have some serial code like this that computes word concordances, i.e. it counts collocated word pairs. The following program works, except that the list of sentences is canned for illustrative purposes.

from collections import defaultdict

# Nested mapping: word -> collocate -> offset -> [sentence indices]
GLOBAL_CONCORDANCE = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

def BuildConcordance(sentences):
    global GLOBAL_CONCORDANCE
    for sentenceIndex, sentence in enumerate(sentences):
        words = sentence.split()

        # Pair every word with itself and each word that follows it;
        # i is the distance between the two words in the sentence.
        for index, word in enumerate(words):
            for i, collocate in enumerate(words[index:]):
                GLOBAL_CONCORDANCE[word][collocate][i].append(sentenceIndex)

def main():
    sentences = ["Sentence 1", "Sentence 2", "Sentence 3", "Sentence 4"]
    BuildConcordance(sentences)
    print(GLOBAL_CONCORDANCE)

if __name__ == "__main__":
    main()

To me, the first for loop can be parallelized because the results computed for each sentence are independent. However, the data structure being modified is a global one.

I tried using Python’s multiprocessing Pool, but I am running into pickling problems, which makes me wonder whether I am using the right design pattern. Can someone suggest a good way to parallelize this code?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 3:50 pm

    In general, multiprocessing is easiest when you use a functional style. In this case, my suggestion would be to return a list of result tuples from each call of the worker function and merge them in the parent process. Building the nested defaultdicts inside the workers doesn’t really gain you anything. Something like this:

    from collections import defaultdict
    from multiprocessing import Pool

    # Nested mapping: word -> collocate -> offset -> [sentence indices].
    # Only the parent process writes to it, so it never has to be pickled.
    GLOBAL_CONCORDANCE = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

    def concordance_worker(index_sentence):
        # Pure function: takes one (index, sentence) pair and returns plain
        # tuples, which can be sent back to the parent without trouble.
        sent_index, sentence = index_sentence
        words = sentence.split()

        return [(word, colo_word, colo_index, sent_index)
                for i, word in enumerate(words)
                for colo_index, colo_word in enumerate(words[i:])]

    def build_concordance(sentences):
        # The with block shuts the pool down once map() has returned.
        with Pool(8) as pool:
            results = pool.map(concordance_worker, enumerate(sentences))

        # Merge the per-sentence results serially in the parent process.
        for result in results:
            for word, colo_word, colo_index, sent_index in result:
                GLOBAL_CONCORDANCE[word][colo_word][colo_index].append(sent_index)

        print(len(GLOBAL_CONCORDANCE))


    def main():
        sentences = ["Sentence 1", "Sentence 2", "Sentence 3", "Sentence 4"]
        build_concordance(sentences)

    if __name__ == "__main__":
        main()
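
One way to check that the tuple-based approach produces the same concordance as the serial loop from the question is sketched below. The helper names `serial_concordance`, `merged_concordance`, and `to_plain`, the `map_fn` parameter, and the two sample sentences are all made up for this illustration. The built-in `map` stands in for `pool.map` so the comparison runs without starting any worker processes.

```python
from collections import defaultdict

def serial_concordance(sentences):
    """The nested-loop logic from the question, writing into a local dict."""
    conc = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for sent_index, sentence in enumerate(sentences):
        words = sentence.split()
        for i, word in enumerate(words):
            for offset, collocate in enumerate(words[i:]):
                conc[word][collocate][offset].append(sent_index)
    return conc

def concordance_worker(index_sentence):
    """One (index, sentence) pair in, a list of plain tuples out."""
    sent_index, sentence = index_sentence
    words = sentence.split()
    return [(word, colo_word, colo_index, sent_index)
            for i, word in enumerate(words)
            for colo_index, colo_word in enumerate(words[i:])]

def merged_concordance(sentences, map_fn=map):
    """Merge worker results; map_fn is hypothetical and defaults to map."""
    conc = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for result in map_fn(concordance_worker, enumerate(sentences)):
        for word, colo_word, colo_index, sent_index in result:
            conc[word][colo_word][colo_index].append(sent_index)
    return conc

def to_plain(d):
    """Recursively convert nested defaultdicts to plain dicts."""
    return {k: to_plain(v) for k, v in d.items()} if isinstance(d, dict) else d

sentences = ["the cat sat", "the cat ran"]  # made-up sample input
serial = to_plain(serial_concordance(sentences))
merged = to_plain(merged_concordance(sentences))
print(serial == merged)      # True
print(merged["the"]["cat"])  # {1: [0, 1]}
```

To run the same check in parallel, `pool.map` can be passed as `map_fn` from inside the `if __name__ == "__main__":` block. Because `pool.map` returns results in input order, the lists of sentence indices should come out in the same order as in the serial version.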
    

    Let me know if that doesn’t generate what you’re looking for.
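
As for the pickling problems mentioned in the question: the failing Pool code was not posted, so this is only a guess, but a likely cause is that the nested defaultdict uses lambdas as default factories, and the standard pickle module cannot serialize lambdas. The sketch below illustrates this under that assumption. `can_pickle` is a helper invented for this example.

```python
import pickle
from collections import defaultdict

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# Default factory is a lambda: pickle cannot look it up by name.
lambda_based = defaultdict(lambda: defaultdict(list))
lambda_based["a"]["b"].append(0)

# Default factory is the built-in list, which pickles by reference.
list_based = defaultdict(list)
list_based["a"].append(0)

# Plain tuples, like the ones the worker function returns.
rows = [("a", "b", 0, 0)]

print(can_pickle(lambda_based))  # False
print(can_pickle(list_based))    # True
print(can_pickle(rows))          # True
```

This is why returning plain tuples from the workers sidesteps the issue: only picklable data crosses the process boundary, and the lambda-based structure stays in the parent.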

© 2021 The Archive Base. All Rights Reserved