Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9259875
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 18, 20262026-06-18T12:45:38+00:00 2026-06-18T12:45:38+00:00

Segmentation fault occurs when datact=10736 when trying to insert into an unordered_map (see commented

  • 0

Segmentation fault occurs when datact=10736 when trying to insert into an unordered_map (see commented line for which call throws the fault). See below for attempted fixes.

When SIGSEGV is thrown it points me to line 764 of hashtable_policy.h

INPUT: data file with column1=counts, column2=16-character strings

PURPOSE: cluster 16-character sequences by adding all the counts of 1-substitution different sequences together. First sequence seen is “origin” by which all its 1-substitution friends are identified.

PSEUDOCODE: for each line in file:

  1. read count, read sequence.

  2. if sequence key_value exists in hash ‘location’ (type
    unordered_map), add current count;

  3. otherwise make a new key_value, have it point to the count here, and
    assign all 1-substitution sequences to also point to this count.

Code:

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cmath>
#include <vector>
#include <unordered_map>
#include <map>

#include "matrix.h"

using namespace std;

int nuc2num(char currchar)
{   // returns 0,1,2,3 for A,C,T,G respectively
    int outnum;

    if (currchar=='A')
    {
        outnum=0;
    }
    else if (currchar=='C')
    {
        outnum=1;
    }
    else if (currchar=='T')
    {
        outnum=2;
    }
    else
    {
        outnum=3;
    }
    return outnum;
}

int main(int argc, char* argv[])
{
    //command line arguments
    //  arg1=filename, arg2=barcode sequence length, arg3=#mismatches permitted

    //input handling
    //  file format: column 1 | column 2
    //               counts   | sequence    [int | string]

    string filename;
    string funnelstring;

    // define lookup matrix; rows=ACTG, cols = ACTG without row element
    Matrix <char> sub_lookup(4,3);
    sub_lookup[0][0] = 'C';
    sub_lookup[0][1] = 'T';
    sub_lookup[0][2] = 'G';
    sub_lookup[1][0] = 'A';
    sub_lookup[1][1] = 'T';
    sub_lookup[1][2] = 'G';
    sub_lookup[2][0] = 'A';
    sub_lookup[2][1] = 'C';
    sub_lookup[2][2] = 'G';
    sub_lookup[3][0] = 'A';
    sub_lookup[3][1] = 'C';
    sub_lookup[3][2] = 'T';

    int L,k;

    int j=0;

    const int buffersize=10000;
    int currentsize=buffersize;

    int datact=0;
    int currchar;

    vector <unsigned int> ctarr(buffersize);
    vector <string> seqarr(buffersize);

    filename=argv[1];
    L=atoi(argv[2]);
    k=atoi(argv[3]);

    unsigned int sct;
    int substrlen;
    string sequence,textct;
    ifstream seqfile (filename.c_str());

    //map <string,unsigned int*> location;
    unordered_map <string,unsigned int*> location;

    if (seqfile.is_open())
    {
        getline(seqfile,textct,'\n');
        while (textct != "")
        {
            sct=atoi(textct.c_str());
            substrlen=textct.length();
            //cout << textct << endl;
            sequence=textct.substr(substrlen-L,L);
            //cout << sequence << endl;

            //is there an associated guy?
            if (location.find(sequence) != location.end()) //just asks whether this key has been assigned
            {   //there's a value in the region
                *location[sequence]+=sct;
            }
            else
            {   //no value in region, make a footprint
                ctarr[datact]=sct;
                seqarr[datact]=sequence;
                location[sequence]=&ctarr[datact]; //assign current key to point to data count


                //assign k substitution "funnel" region to point to this count as well
                for (j=0; j<L; j++)
                {
                    funnelstring=sequence;
                    currchar = nuc2num(sequence[j]);

                    if (datact==10736 && j==13)
                    {
                        cout << "here" << endl;
                        cout << sequence << endl;
                    }

                    for (k=0; k<3; k++)
                    {
                        funnelstring[j]=sub_lookup[currchar][k];

//                    if (datact==10736 && j==13)
//                    {
//                        cout << funnelstring << endl;
//                        cout << location.max_size() << " | " << location.size() << endl;
//                        string asdf;
//                        asdf="AAAAAAAAAAAAAAAA";
//                        location[asdf]=&ctarr[datact]; //still segfaults
//                    }

                        if (location.find(funnelstring) == location.end()) // asks whether this key has been assigned
                        {   //this region is not assigned to another funnel
                            location[funnelstring]=&ctarr[datact]; //LINE THAT CAUSES SIGSEGV
                        }
                    }
                }
                datact++;
                cout << datact << endl;
                if (datact>=currentsize)
                {
                    ctarr.resize(currentsize+buffersize);
                    seqarr.resize(currentsize+buffersize);
                    currentsize+=buffersize;
                }
            }

            getline(seqfile,textct,'\n');
         }
        seqfile.close();
    }

Explorations.

  1. It doesn’t matter what key is added, when datact==10736 and j=13, any
    key added to (unordered_map) location results in a SIGSEGV.
  2. The line of code in question (marked by a comment above) is called many times prior, and operates correctly.
  3. Exchanging unordered_map for map results in the same event, but at a different (higher) count of datact. Still low (between datact==16-35
    thousand).
  4. Rewriting this code almost exactly, but with randomly generated 16-character sequences works perfectly, as far as I can tell (no
    segfaults for datact up to 200,000, didnt test higher).
  5. In the code in (4), it does appear that there is a rehash right around 10000, which could be related or coincidence.

If need be, I can post the datafile that is read.

EDIT: Solved
Instead of unordered_map <string,unsigned int*> location, replaced with unordered_map <string,unsigned int> location (value_type is int instead of int*). Now value_type holds the index in the ctarr[]. Runs fine. Thanks!

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-18T12:45:40+00:00Added an answer on June 18, 2026 at 12:45 pm

    Pointers to vector elements can be invalidated when you call vector::resize(). This is because the entire data may have to be moved around in order to find a contiguous memory block that fits the new size. In other words: As soon as you call resize, all your location data suddenly becomes useless garbage.

    Possible solutions:

    • Let location store the index of the required element in ctarr as its value rather than a pointer. (This will certainly not change the semantics of your program.)
    • Let location store the actual unsigned int value instead of a pointer. Depending on your program logic and how you change and access this data, this may not be what you want.

    Also note that although the segfault occurs in hashtable_policy.h, this bug has nothing to do with the implementation of unordered_map (or vector) – it is entirely your fault for not reading the reference for vector::resize() 😉 : http://www.cplusplus.com/reference/vector/vector/resize/ (section ‘Iterator validity’)


    The other thing I noticed about your code is that you use operator[] for accessing your vector elements. This disables out-of-bounds checking. If I came across an error like yours in my code (hard to trace back because it occurs somewhere far from my erroneous code), my first course of action would be swapping operator[] for vector::at() (actually, I always start with at() and only switch if I can prove beyond reasonable doubt that bounds checking is a performance bottleneck for this particular purpose). This would not have helped with your problem, but is often an invaluable help in spotting mistakes.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

i am getting a segmentation fault while trying to call the addEdge(int, int) method.
I'm getting a segmentation fault the first time I call malloc() after I protect
I have a multithreaded C program, which consistently generates a segmentation fault at a
When a segmentation fault occurs, the printf() before it does not execute. main() {
When allocating memory for the 2-dimensional array using malloc() ,segmentation fault occurs when the
So here is the problem: When I call glGenTextures I get a segmentation fault.
Whenever a segmentation fault occurs, do we always have a page fault? Anyone with
Possible Duplicate: Segmentation fault on large array sizes A segmentation fault occurs when I
Possible Duplicate: segmentation fault using scanf I am trying to run a C program
This code causes a Segmentation Fault: int main(){ char *p; char a[50] = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa;

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.