The Archive Base

Editorial Team
Asked: May 17, 2026

This particular problem is easy to solve, but I’m not so sure that the solution I’d arrive at would be computationally efficient. So I’m asking the experts!

What would be the best way to go through a large file, collecting stats (for the entire file) on how often two words occur in the same line?

For instance, if the text contained only the following two lines:

“This is the white baseball.”
“These guys have white baseball bats.”

You would end up collecting the following stats:
(this, is: 1), (this, the: 1), (this, white: 1), (this, baseball: 1), (is, the: 1), (is, white: 1), (is, baseball: 1) … and so forth.

For the entry (baseball, white: 2), the value would be 2, since this pair of words occurs in the same line a total of 2 times.

Ideally, the stats should be placed in a dictionary whose keys are alphabetized at the tuple level (i.e., you wouldn’t want separate entries for “this, is” and “is, this”). We don’t care about order here: we just want to find how often each possible pair of words occurs in the same line throughout the text.


1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 2:00 am
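    from collections import defaultdict
    import itertools as it
    import re
    
    pairs = defaultdict(int)
    
    # `lines` is any iterable of strings, e.g. an open file object.
    for line in lines:
        # Lowercase, de-duplicate, and sort the words so each pair is
        # counted once per line and its key is always alphabetized.
        words = sorted(set(re.findall(r'\w+', line.lower())))
        for pair in it.combinations(words, 2):
            pairs[pair] += 1
    
    # Flatten to (word1, word2, occurrences) tuples.
    resultList = [pair + (occurrences,) for pair, occurrences in pairs.items()]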
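As a quick check against the two example lines from the question, here is a minimal sketch of the same idea wrapped in a function. The names `count_pairs`, `WORD_RE`, and `sample` are made up for illustration, and `Counter` is swapped in for `defaultdict(int)` only because its `update` method counts an iterable of keys directly. It assumes a "word" is a run of `\w` characters and that case should be ignored.

```python
from collections import Counter
from itertools import combinations
import re

# Assumption: a "word" is any run of letters, digits, or underscores.
WORD_RE = re.compile(r"\w+")

def count_pairs(lines):
    """Count, for each unordered word pair, the lines that contain both words."""
    pairs = Counter()
    for line in lines:
        # Unique lowercase words, sorted so every key tuple is alphabetized.
        words = sorted(set(WORD_RE.findall(line.lower())))
        # Counter.update adds 1 for each pair yielded by combinations.
        pairs.update(combinations(words, 2))
    return pairs

sample = [
    "This is the white baseball.",
    "These guys have white baseball bats.",
]
counts = count_pairs(sample)
print(counts[("baseball", "white")])  # 2
print(counts[("is", "this")])         # 1
```

Because the function only iterates over `lines`, an open file object can be passed in directly and the file is read one line at a time. The cost per line still grows quadratically with the number of distinct words on that line, since every pair is generated.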
    
