The Archive Base

Editorial Team
Asked: June 17, 2026

I would like to read lines from a text file and build a distance

I would like to read lines from a text file and build a distance matrix based on the Wu-Palmer distance between the words. For example:

           House    Grass   Boat   Cat
House       x        y       ..    ..
Grass       x1       y1      ..    ..
Boat        x2       y2      ..    ..
Cat         x3       y3      ..    ..

I would like to know if there are any existing functions I can use in Python to read lines from a text file and output the lines as rows and columns of the distance matrix.

1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 12:28 pm

    If your input is simply whitespace-delimited words then you can easily iterate through them like this:

    words = set()
    with open("input.txt", "r") as fd:
        for line in fd:
            words.update(line.split())
    

    The use of a set ensures that each word is only ever recorded once – it sounded like this is what you were after.
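As a quick check of the set-based approach, here is a minimal sketch that wraps the same loop in a function. The name `read_words` and the sample text are made up for illustration, and `io.StringIO` stands in for an open file so the example runs without an `input.txt`.

```python
import io


def read_words(lines):
    """Collect the unique whitespace-delimited words from an iterable of lines."""
    words = set()
    for line in lines:
        words.update(line.split())
    return words


# io.StringIO behaves like an open text file for iteration purposes.
sample = io.StringIO("House Grass\nBoat Cat\nHouse\n")
words = read_words(sample)
print(sorted(words))
```

Sets have no defined order, so it is probably worth calling `sorted(words)` before using the words as row and column labels.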

    If your input is running English text then things become a little harder, because you want to catch things like “I’d”. You should also decide whether to class hyphenated words (e.g. “part-time”) as a single word; my example here does, but it’s easy to change. Much as I’m not a fan of them, this is one place where regular expressions are actually quite useful:

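    import re
    import string

    # Split on runs of anything that is not a hyphen, word character or apostrophe.
    non_word_re = re.compile(r"[^-\w']+")
    words = set()
    with open("input.txt", "r") as fd:
        for line in fd:
            # Skip the empty strings that split() yields at line ends, and keep
            # only tokens that start with a letter (string.letters in Python 2).
            words.update(i for i in non_word_re.split(line)
                         if i and i[0] in string.ascii_letters)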
    

    This builds a set of words, where a word is any run of one or more characters from the set [a-zA-Z0-9_'-] whose first character is a letter (in Python 3, \w also matches non-ASCII letters and digits).
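To see what the regular expression keeps and drops, here is a small sketch run on one sample sentence. The helper name `extract_words` and the sample text are illustrative only. It assumes Python 3, where the letters constant is `string.ascii_letters`.

```python
import re
import string

# Same pattern as above: split on anything that is not "-", a word character or "'".
non_word_re = re.compile(r"[^-\w']+")


def extract_words(line):
    """Return the set of tokens in `line` that start with an ASCII letter."""
    return {
        token
        for token in non_word_re.split(line)
        if token and token[0] in string.ascii_letters
    }


tokens = extract_words("I'd like a part-time job, 42 times!")
print(sorted(tokens))
```

Here "42" is dropped because it starts with a digit, while "I'd" and "part-time" survive as single words. One caveat: the pattern only knows the plain ASCII apostrophe, so text containing a typographic apostrophe (U+2019) would be split at that character.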

    After this, you can calculate the distance between each pair of words easily:

    # calculate_distance is a placeholder for your own metric (e.g. Wu-Palmer).
    all_distances = {}
    for word in words:
        all_distances[word] = {i: calculate_distance(word, i) for i in words}
    

    There’s probably a cleaner data structure than the nested dictionaries here, but it’s simple enough that I think that would suffice.
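The snippet above leaves `calculate_distance` undefined. As a rough sketch of what a Wu-Palmer based version could look like, the code below computes the similarity 2 * depth(lcs) / (depth(a) + depth(b)) over a tiny hand-made taxonomy and turns it into a distance as 1 minus the similarity. The `PARENT` table is invented for illustration, and counting the root as depth 1 is an assumption. For real words you would more likely look up WordNet synsets, for instance through NLTK, whose synsets provide a `wup_similarity` method.

```python
# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "object": "entity",
    "artifact": "object",
    "living_thing": "object",
    "structure": "artifact",
    "vehicle": "artifact",
    "plant": "living_thing",
    "animal": "living_thing",
    "house": "structure",
    "boat": "vehicle",
    "grass": "plant",
    "cat": "animal",
}


def path_to_root(word):
    """Return [word, parent, ..., root]."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path


def depth(word):
    """Depth of a node, counting the root as depth 1 (an assumption)."""
    return len(path_to_root(word))


def wup_similarity(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest common ancestor.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


def calculate_distance(a, b):
    """Turn the similarity into a distance in the range [0, 1)."""
    return 1.0 - wup_similarity(a, b)


words = ["house", "grass", "boat", "cat"]
all_distances = {w: {i: calculate_distance(w, i) for i in words} for w in words}
```

In this toy tree "house" and "boat" meet at "artifact", so they come out closer to each other than "house" and "cat", which only meet at "object".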

    Finally, you can output a tab-delimited matrix something like this:

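    with open("output.txt", "w") as fd:
        # Header row: one column per word, in sorted order.
        fd.write("\t" + "\t".join(sorted(all_distances)) + "\n")
        for word1, distances in sorted(all_distances.items()):
            # Sort each row by word so the columns line up with the header.
            row = "\t".join(str(dist) for _, dist in sorted(distances.items()))
            fd.write(word1 + "\t" + row + "\n")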
    

    If you wanted something closer to a pretty-formatted output matrix (i.e. where each column is automatically sized according to its contents) then that’s still not hard per se, but it’s a little fiddly and requires rather more code.
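For what it is worth, one possible sketch of that pretty-formatted output follows. It assumes the nested-dictionary layout used above with numeric distances. The function name `format_matrix`, the two-space column gap and the fixed number of decimals are arbitrary choices for illustration.

```python
def format_matrix(all_distances, precision=2):
    """Render nested {row: {col: distance}} dicts as an aligned text table."""
    words = sorted(all_distances)
    # Render every cell as text first so the column widths can be measured.
    cells = {
        row: [f"{all_distances[row][col]:.{precision}f}" for col in words]
        for row in words
    }
    label_width = max((len(w) for w in words), default=0)
    # Each column is as wide as its header or its widest cell, whichever is larger.
    col_widths = [
        max(len(col), *(len(cells[row][i]) for row in words))
        for i, col in enumerate(words)
    ]
    header = "  ".join(col.rjust(w) for col, w in zip(words, col_widths))
    lines = [" " * label_width + "  " + header]
    for row in words:
        body = "  ".join(cell.rjust(w) for cell, w in zip(cells[row], col_widths))
        lines.append(row.ljust(label_width) + "  " + body)
    return "\n".join(lines)


demo = {
    "Boat": {"Boat": 0.0, "Cat": 0.6},
    "Cat": {"Boat": 0.6, "Cat": 0.0},
}
table = format_matrix(demo)
print(table)
```

Row labels are left-aligned and numbers right-aligned, which tends to read well for a distance matrix; the made-up `demo` values are only there to exercise the layout.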

    As an aside, if you want to read or write files in CSV format then take a look at the Python csv module; it handles tedious things like quoting for you.
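A minimal sketch of the csv route, again assuming the nested-dictionary layout from above. The function name `write_matrix_csv` and the sample values are illustrative, and `io.StringIO` stands in for a real output file.

```python
import csv
import io


def write_matrix_csv(all_distances, fd):
    """Write nested {row: {col: distance}} dicts to `fd` as CSV."""
    words = sorted(all_distances)
    writer = csv.writer(fd)
    # Header row: an empty corner cell followed by the column labels.
    writer.writerow([""] + words)
    for row in words:
        writer.writerow([row] + [all_distances[row][col] for col in words])


demo = {
    "Boat": {"Boat": 0.0, "Cat": 0.6},
    "Cat": {"Boat": 0.6, "Cat": 0.0},
}
buffer = io.StringIO()
write_matrix_csv(demo, buffer)
print(buffer.getvalue())
```

When writing to a real file, the csv documentation recommends opening it with `newline=""`, for example `open("output.csv", "w", newline="")`. Passing `delimiter="\t"` to `csv.writer` would give tab-separated output instead.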

    Was that the sort of thing you were after?
