The Archive Base

Editorial Team
Asked: May 14, 2026 (2026-05-14T04:14:17+00:00)

I want to find a programmatic solution using C++. I have a 900 files

I want to find a programmatic solution using C++.

I have 900 files, each about 27 MB in size (just to give a sense of the scale).

Each file has 55K rows and a varying number of columns, but the header names the columns.

I want to sort the rows by the value in one column.

I wrote a sorting algorithm for this (definitely a newbie attempt, you may say).
It works for a small number of rows but fails for larger inputs.

Here is the code.
Basic functions I defined for use in the main code:

int getNumberOfColumns(const string& aline)
{
 int ncols=0;
 istringstream ss(aline);
 string s1;
 while(ss>>s1) ncols++;
 return ncols;
}

vector<string> getWordsFromSentence(const string& aline)
{
 vector<string>words;
 istringstream ss(aline);
 string tstr;
 while(ss>>tstr) words.push_back(tstr);
 return words;
}

// Takes the vector by const reference to avoid copying it on every call.
bool findColumnName(const vector<string>& vs, const string& colName)
{
 return find(vs.begin(), vs.end(), colName) != vs.end();
}

// Returns the zero-based index of colName in vs, or -1 if it is absent.
int getIndexForColumnName(const vector<string>& vs, const string& colName)
{
 vector<string>::const_iterator it = find(vs.begin(), vs.end(), colName);
 if ( it == vs.end() ) return -1;
 return static_cast<int>(it - vs.begin());
}

// I like recursive functions, so I tried to write one here.
// This worked for small inputs (say, 20 rows), but it core dumps for 55K rows.
void sort2D(vector<string>vn, vector<string> &srt, int columnIndex)
{
 vector<double> pVals;
 for ( int i = 0; i < vn.size(); i++) {
  vector<string>meancols = getWordsFromSentence(vn[i]);
  pVals.push_back(stringToDouble(meancols[columnIndex]));
 }

 srt.push_back(vn[max_element(pVals.begin(), pVals.end())-pVals.begin()]);
 if (vn.size() > 1 ) {
  vn.erase(vn.begin()+(max_element(pVals.begin(), pVals.end())-pVals.begin()) );
  vector<string> vn2 = vn;
  //cout<<srt[srt.size() -1 ]<<endl;
  sort2D(vn2 , srt, columnIndex);
 }
}

Now the main code:

 for ( int i = 0; i < TissueNames.size() -1; i++)
 {
  for ( int j = i+1; j < TissueNames.size(); j++)
  {
   //string fname = path+"/gse7307_Female_rma"+TissueNames[i]+"_"+TissueNames[j]+".txt";
   //string fname2 = sortpath2+"/gse7307_Female_rma"+TissueNames[i]+"_"+TissueNames[j]+"Sorted.txt";
   string fname = path+"/gse7307_Male_rma"+TissueNames[i]+"_"+TissueNames[j]+".txt";
   string fname2 = sortpath2+"/gse7307_Male_rma"+TissueNames[i]+"_"+TissueNames[j]+"4Columns.txt";
   vector<string>AllLinesInFile;
   BioInputStream fin(fname);
   string aline;
   getline(fin,aline);
   replace (aline.begin(), aline.end(), '"',' ');
   string headerline = aline;
   vector<string> header = getWordsFromSentence(aline);

   int pindex = getIndexForColumnName(header,"p-raw");
   int xcindex = getIndexForColumnName(header,"xC");
   int xeindex = getIndexForColumnName(header,"xE");
   int prbindex = getIndexForColumnName(header,"X");

   string newheaderline = "X\txC\txE\tp-raw";
   BioOutputStream fsrt(fname2);
   fsrt<<newheaderline<<endl;

   int newpindex=3;
   while ( getline(fin, aline) ){

   replace (aline.begin(), aline.end(), '"',' ');
   istringstream ss2(aline);
   string tstr;
   ss2>>tstr;
   tstr = ss2.str().substr(tstr.length()+1);
   vector<string> words = getWordsFromSentence(tstr);
   string values = words[prbindex]+"\t"+words[xcindex]+"\t"+words[xeindex]+"\t"+words[pindex];
    AllLinesInFile.push_back(values);
   }

   vector<string>SortedLines; 
   sort2D(AllLinesInFile, SortedLines,newpindex);

   for ( int si = 0; si < SortedLines.size(); si++)
    fsrt<<SortedLines[si]<<endl;
   cout<<"["<<i<<","<<j<<"] = "<<SortedLines.size()<<endl;
  }
 }

Can someone suggest a better way of doing this?
Why does it fail for larger inputs?

The primary function of interest for this query is the sort2D function.

Thanks for your time and patience.

prasad.

1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 4:14 am (2026-05-14T04:14:17+00:00)

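    I’m not sure exactly why your code is crashing, but recursion here only makes the code less readable. I doubt it’s a stack overflow, because each call uses little stack space. Note, though, that every call takes vn by value and copies it again into vn2, so with 55K rows the recursion goes 55K levels deep and all of those copies are alive at once; running out of memory is a plausible cause.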

    C++ already has std::sort, why not use that instead? You could do it like this:

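    // Functor that orders two lines by the numeric value in one column.
    // Note: std::binary_function was deprecated in C++11 and removed in C++17;
    // on newer compilers, drop the base class (std::sort does not need it).
    class CompareByStringValue : public std::binary_function<string, string, bool>
    {
    public:
        explicit CompareByStringValue(int columnIndex) : idx_(columnIndex) {}
        bool operator()(const string& s1, const string& s2) const
        {
            double val1 = stringToDouble(getWordsFromSentence(s1)[idx_]);
            double val2 = stringToDouble(getWordsFromSentence(s2)[idx_]);
            return val1 < val2;   // ascending; use > for largest-first, as sort2D produced
        }
    private:
        int idx_;
    };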
    

    To then sort your lines you would call

    std::sort(vn.begin(), vn.end(), CompareByStringValue(columnIndex));
    

    Now, there is one problem: this will be slow, because std::sort makes O(n log n) comparisons and each one calls stringToDouble and getWordsFromSentence again on strings it has already parsed. You would probably want to generate a separate vector holding the precalculated value of each string, and then have CompareByStringValue just use that vector as a lookup table.
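A minimal sketch of that precomputation idea, assuming C++11 or later and whitespace-separated columns as in the question. Rather than a separate lookup table, it pairs each line with its parsed key and sorts the pairs, so every line is parsed exactly once. The names keyFromLine and sortLinesByColumn are hypothetical, and keyFromLine only stands in for the asker's stringToDouble and getWordsFromSentence, whose exact behavior is not shown in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse the whitespace-separated field at columnIndex
// as a double. Returns 0.0 when the field is missing or not numeric
// (an assumption; the asker's stringToDouble may behave differently).
double keyFromLine(const std::string& line, std::size_t columnIndex)
{
    std::istringstream ss(line);
    std::string field;
    for (std::size_t i = 0; i <= columnIndex; ++i) {
        if (!(ss >> field)) return 0.0;   // fewer columns than expected
    }
    std::istringstream conv(field);
    double value = 0.0;
    if (!(conv >> value)) return 0.0;     // non-numeric field
    return value;
}

// Sorts lines in place by the numeric value in columnIndex.
// The sort compares cached doubles instead of re-parsing strings.
void sortLinesByColumn(std::vector<std::string>& lines,
                       std::size_t columnIndex,
                       bool descending = false)
{
    using KeyedLine = std::pair<double, std::string>;

    std::vector<KeyedLine> keyed;
    keyed.reserve(lines.size());
    for (std::string& line : lines) {
        const double key = keyFromLine(line, columnIndex);  // parse once
        keyed.emplace_back(key, std::move(line));
    }
    assert(keyed.size() == lines.size());

    // stable_sort keeps rows with equal keys in their original order.
    std::stable_sort(keyed.begin(), keyed.end(),
        [descending](const KeyedLine& a, const KeyedLine& b) {
            return descending ? a.first > b.first : a.first < b.first;
        });

    for (std::size_t i = 0; i < keyed.size(); ++i) {
        lines[i] = std::move(keyed[i].second);
    }
}
```

With the question's data, a call such as sortLinesByColumn(AllLinesInFile, newpindex, true) would give the largest-first order that sort2D was producing, without recursion or repeated copies of the whole vector.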

    Another way you can do this is to insert the strings into a std::multimap<double, std::string>. Just insert the entries as (value, str) and then iterate over the map to read them out in sorted order. This is simpler but slower (though it has the same big-O complexity).
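The multimap variant could be sketched as follows, again assuming C++11 or later and whitespace-separated columns. The names columnValue and sortedViaMultimap are hypothetical, and columnValue is only a stand-in for the asker's own parsing helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: numeric value of the whitespace-separated field at
// columnIndex, or 0.0 if the field is missing or not numeric (an assumption).
double columnValue(const std::string& line, std::size_t columnIndex)
{
    std::istringstream ss(line);
    std::string field;
    for (std::size_t i = 0; i <= columnIndex; ++i) {
        if (!(ss >> field)) return 0.0;
    }
    std::istringstream conv(field);
    double value = 0.0;
    if (!(conv >> value)) return 0.0;
    return value;
}

// Returns the lines ordered by ascending value in columnIndex.
// The multimap keeps its entries sorted by key as they are inserted.
std::vector<std::string> sortedViaMultimap(const std::vector<std::string>& lines,
                                           std::size_t columnIndex)
{
    std::multimap<double, std::string> byValue;
    for (const std::string& line : lines) {
        byValue.insert(std::make_pair(columnValue(line, columnIndex), line));
    }

    std::vector<std::string> sorted;
    sorted.reserve(byValue.size());
    for (const auto& entry : byValue) {   // iteration is in ascending key order
        sorted.push_back(entry.second);
    }
    assert(sorted.size() == lines.size());
    return sorted;
}
```

For largest-first output, one option is to declare the map as std::multimap<double, std::string, std::greater<double>> (from <functional>) instead of reversing the result afterwards.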

    EDIT: Cleaned up some incorrect code and derived from binary_function.
