The Archive Base

Editorial Team
Asked: June 3, 2026

The following program has been running for about 22 hours on two text files (~10 MB each, ~100K rows each). Can someone give me an indication of how inefficient my code is, and perhaps suggest a faster method? The input dicts are ordered, and preserving that order is necessary:

import collections

def uniq(input):
  output = []
  for x in input:
    if x not in output:
      output.append(x)
  return output

Su = {}
with open ('Sucrose_rivacombined.txt') as f:
    for line in f:
        (key, val) = line.split('\t')
        Su[(key)] = val
    Su_OD = collections.OrderedDict(Su)

Su_keys = Su_OD.keys()
Et = {}

with open ('Ethanol_rivacombined.txt') as g:
    for line in g:
        (key, val) = line.split('\t')
        Et[(key)] = val
    Et_OD = collections.OrderedDict(Et)

Et_keys = Et_OD.keys()

merged_keys = Su_keys + Et_keys
merged_keys =  uniq(merged_keys)

d3=collections.OrderedDict()

output_doc = open("compare.txt","w+")

for chr_local in merged_keys:
    line_output = chr_local
    if (Et.has_key(chr_local)):
        line_output = line_output + "\t" + Et[chr_local]
    else:
        line_output = line_output + "\t" + "ND"
    if (Su.has_key(chr_local)):
        line_output = line_output + "\t" + Su[chr_local]
    else:
        line_output = line_output + "\t" + "ND"

    output_doc.write(line_output + "\n")

The input files look as follows (not every key is present in both files):

Su:
chr1:3266359    80.64516129
chr1:3409983    100
chr1:3837894    75.70093458
chr1:3967565    100
chr1:3977957    100


Et:
chr1:3266359    95
chr1:3456683    78
chr1:3837894    54.93395855
chr1:3967565    100
chr1:3976722    23

I would like the output to look as follows:

chr1:3266359    80.645    95
chr1:3456683    ND        78
1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 7:13 am

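    You don’t need your uniq function. Its "x not in output" test scans the whole list for every key, so the de-duplication step alone is quadratic in the number of keys.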
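To illustrate the cost of that helper: a membership test on a list is linear, while a membership test on a set is constant time on average. The following is a minimal sketch of an order-preserving de-duplication that tracks seen keys in a set, assuming the keys are hashable strings. `uniq_fast` is a hypothetical name introduced here, not something from the question.

```python
def uniq_fast(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()          # set membership is O(1) on average
    output = []
    for x in items:
        if x not in seen:
            seen.add(x)
            output.append(x)
    return output

keys = ["chr1:3266359", "chr1:3409983", "chr1:3266359", "chr1:3456683"]
print(uniq_fast(keys))
```

The approach described next avoids the separate de-duplication pass altogether, so this helper is only relevant if the merged key list is needed for something else.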

    Pseudocode:

    1. Read file 2 into an OrderedDict.
    2. Process file 1 line by line, writing out its items (they are already in the correct order).
    3. For the last column of each output line, pop the key from the file 2 dict, with a default for keys that are missing.
    4. After file 1 is consumed, write out whatever remains in the OrderedDict from file 2.
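Step 3 relies on `pop(key, default)`, which returns the stored value and removes the key, or returns the default when the key is absent. A small sketch of that behaviour, using a couple of keys from the sample data in the question (the variable names are chosen here for illustration):

```python
from collections import OrderedDict

et = OrderedDict([("chr1:3266359", "95"), ("chr1:3456683", "78")])

# Key present in both files: the value is returned and the key is removed.
matched = et.pop("chr1:3266359", "ND")

# Key present only in the Sucrose file: the default is returned instead.
missing = et.pop("chr1:3409983", "ND")

print(matched, missing)   # 95 ND
print(list(et))           # ['chr1:3456683']
```

Because matched keys are removed as they are seen, whatever is left in the dict at the end is exactly the set of keys that appear only in file 2, still in their original order.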

    Also, I love comprehensions: a generator expression lets you read the file with:

    OrderedDict(line.strip().split('\t') for line in open('Ethanol_rivacombined.txt'))
    

    Only one OrderedDict is built, and ‘Sucrose_rivacombined.txt’ is streamed line by line, so it never sits in memory as a whole. It should be very fast.

    EDIT: complete code (not sure about your output line format):

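    from collections import OrderedDict

    # Load the Ethanol file into an ordered key -> value mapping.
    with open('Ethanol_rivacombined.txt') as et_file:
        Et_OD = OrderedDict(line.strip().split('\t') for line in et_file)

    with open('compare.txt', 'w') as output_doc, open('Sucrose_rivacombined.txt') as su_file:
        # Stream the Sucrose file; pop() removes each matched key from Et_OD.
        for line in su_file:
            key, val = line.strip().split('\t')
            line_out = '\t'.join((key, val, Et_OD.pop(key, 'ND')))
            output_doc.write(line_out + '\n')

        # Whatever is left in Et_OD never appeared in the Sucrose file.
        for key, val in Et_OD.items():
            line_out = '\t'.join((key, 'ND', val))
            output_doc.write(line_out + '\n')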
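As a quick sanity check of the approach above, the same logic can be wrapped in a function and run on a few rows from the sample data without touching the disk. This is only a sketch: `merge_lines` and the two sample lists are hypothetical names introduced here, and the column order (key, Sucrose, Ethanol) is an assumption based on the desired output shown in the question.

```python
from collections import OrderedDict

def merge_lines(su_lines, et_lines):
    """Merge two iterables of 'key<TAB>value' lines into key, Su, Et rows."""
    et = OrderedDict(line.strip().split("\t") for line in et_lines)
    rows = []
    for line in su_lines:
        key, val = line.strip().split("\t")
        rows.append("\t".join((key, val, et.pop(key, "ND"))))
    # Keys left over were only present in the Ethanol data.
    for key, val in et.items():
        rows.append("\t".join((key, "ND", val)))
    return rows

su_sample = ["chr1:3266359\t80.64516129\n", "chr1:3409983\t100\n"]
et_sample = ["chr1:3266359\t95\n", "chr1:3456683\t78\n"]

for row in merge_lines(su_sample, et_sample):
    print(row)
```

Note that keys found only in the Ethanol data are written after all Sucrose keys rather than interleaved by position, which matches the order the original merged key list would have produced.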
    
© 2021 The Archive Base. All Rights Reserved