
The Archive Base


Editorial Team
Asked: May 24, 2026

I have a huge dataset (around 5 000 000 rows in a database) which


I have a huge dataset (around 5 000 000 rows in a database) which I want to represent as a graph. For algorithmic reasons, the dataset has to be stored as an adjacency matrix. The matrix will be very sparse and symmetric.

First I thought of storing the graph in a database table. This would require 5 000 000 rows, which should be no problem. But 5 000 000 columns? I don’t know much about databases, but I have the feeling that this is not a recommended way of doing it.

After some searching on Google, I found SciPy, which has several sparse matrix classes. lil_matrix and coo_matrix seem to be what I need.

Since I will operate on this matrix using Python, SciPy seems a good way to go. The question for me now is how to store the graph, i.e. the sparse matrix.

Should I use a CSV file? Should I use coo_matrix to save the matrix into a database table? Both would result in around 2 500 000 000 000 rows/lines.

Or is there a far better way to create and store such a symmetric, sparse matrix of dimension around 5 000 000 in Python?

I am using NumPy and some self-written algorithms in Python, which I want to run on the matrix. So it would be great if the suggestions made it easy to work with the graph from Python.

I don’t know if I provided enough information for an answer. If you need more information, feel free to ask me in a comment. I will gladly edit my question.

Thanks in advance for any suggestion!


1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 7:32 pm

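    You can use SciPy's sparse matrix formats (scipy.sparse); NumPy itself has no sparse matrix type. All of your questions depend on the number of non-zero entries (NNZ) in the matrix: storage and the cost of many computations depend (approximately) only on the NNZ, not on the 5 000 000 × 5 000 000 dimension. The scipy.sparse documentation is a good place to start.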
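
As a rough illustration of that point, here is a minimal sketch that builds a small symmetric adjacency matrix from an edge list, converts it to CSR, and stores it on disk with scipy.sparse.save_npz. The edge list, the node count `n`, and the file name `adjacency.npz` are made up for the example; in practice the `rows`, `cols`, and `vals` arrays would presumably be filled from the database.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Hypothetical toy edge list for an undirected graph: edge k joins
# rows[k] and cols[k] with weight vals[k]. Each edge is listed once.
n = 6  # number of nodes; around 5 000 000 in the question
rows = np.array([0, 0, 1, 3])
cols = np.array([1, 2, 4, 5])
vals = np.array([1.0, 2.0, 3.0, 4.0])

# COO is convenient for construction from (row, col, value) triples.
upper = sparse.coo_matrix((vals, (rows, cols)), shape=(n, n))

# Mirror the entries to make the matrix symmetric, then convert to CSR,
# which is better suited to row slicing and matrix-vector products.
adj = (upper + upper.T).tocsr()

# Only the non-zero entries are stored: two per undirected edge here.
print(adj.nnz)  # 8

# Approximate in-memory size: it grows with NNZ (plus n + 1 row pointers).
size_bytes = adj.data.nbytes + adj.indices.nbytes + adj.indptr.nbytes
print(size_bytes)

# Save to and reload from a single compressed .npz file.
path = os.path.join(tempfile.mkdtemp(), "adjacency.npz")
sparse.save_npz(path, adj)
loaded = sparse.load_npz(path)
```

Since the matrix is symmetric, storing only the upper triangle and mirroring it after loading would halve the file size; whether that is worth the extra step depends on the NNZ of the real data.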

