The Archive Base


Editorial Team
Asked: June 2, 2026

I am in the process of setting up a mysql server to store some

I am in the process of setting up a MySQL server to store some data, but I realized (after reading a bit this weekend) that I might have a problem uploading the data in time.

I basically have multiple servers generating daily data and then sending it to a shared queue to process/analyze. The data is about 5 billion rows (although each row is very small: an ID number in one column and a dictionary of ints in another). Most of the performance reports I have seen show insert speeds of 60k to 100k rows/second, which would take over 10 hours. We need the data loaded very quickly so we can work on it that day, and then we may discard it (or archive the table to S3 or something).
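The "over 10 hours" figure follows from simple arithmetic. A minimal sketch, assuming a steady insert rate with no pauses; the function name `load_hours` is only illustrative:

```python
def load_hours(rows: int, rows_per_second: float) -> float:
    """Return the hours needed to insert `rows` at a steady `rows_per_second`."""
    return rows / rows_per_second / 3600


ROWS = 5_000_000_000  # daily volume stated in the question

# The two insert rates quoted in the question.
for rate in (60_000, 100_000):
    print(f"{rate:>7,} rows/s -> {load_hours(ROWS, rate):.1f} hours")
```

At the quoted rates this works out to roughly 14 to 23 hours per daily batch, so the load would not finish in time to work on the data the same day.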

What can I do? I have 8 servers at my disposal (in addition to the database server). Can I somehow use them to make the uploads faster? At first I was thinking of using them to push data to the server at the same time, but I'm also thinking maybe I can load the data onto each of them and then somehow merge all the separate data into one server?

I was going to use MySQL with InnoDB (I can use any other settings if it helps), but it's not finalized, so if MySQL doesn't work, is there something else that will? (I have used HBase before but was looking for a MySQL solution first, since it seems more widely used and easier to get help with if I have problems.)

1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 3:23 pm

    Wow. That is a lot of data you’re loading. It’s probably worth quite a bit of design thought to get this right.

    Multiple MySQL server instances won't help with loading speed. What will make a difference is fast processors and a very fast disk I/O subsystem on your MySQL server. If you can use a 64-bit processor and provision it with a LOT of RAM, you may be able to use the MEMORY storage engine for your big table, which will be very fast indeed. (But if that will work for you, a gigantic Java HashMap may work even better.)

    Ask yourself: Why do you need to stash this info in a SQL-queryable table? How will you use your data once you’ve loaded it? Will you run lots of queries that retrieve single rows or just a few rows of your billions? Or will you run aggregate queries (e.g. SUM(something) ... GROUP BY something_else) that grind through large fractions of the table?

    Will you have to access the data while it is incompletely loaded? Or can you load up a whole batch of data before the first access?

    If all your queries need to grind the whole table, then don’t use any indexes. Otherwise do. But don’t throw in any indexes you don’t need. They are going to cost you load performance, big time.

    Consider using MyISAM rather than InnoDB for this table; MyISAM's lack of transaction semantics makes it faster to load. MyISAM will do fine at handling either aggregate queries or few-row queries.

    You probably want to have a separate table for each day’s data, so you can “get rid” of yesterday’s data by either renaming the table or simply accessing a new table.
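One way to picture the table-per-day idea is a small helper that builds the SQL for each day's table. This is only a sketch: the `readings` prefix, the `archived_` prefix, and the two-column schema (an ID plus the serialized dictionary in a `payload` column) are assumptions made for illustration, not details from the question.

```python
from datetime import date


def daily_table_name(day: date, prefix: str = "readings") -> str:
    """Build a per-day table name such as readings_20260602 (hypothetical scheme)."""
    return f"{prefix}_{day:%Y%m%d}"


def create_daily_table_sql(day: date, prefix: str = "readings") -> str:
    """CREATE TABLE statement for one day's data; the columns are assumed."""
    name = daily_table_name(day, prefix)
    return (
        f"CREATE TABLE `{name}` (\n"
        "  id BIGINT UNSIGNED NOT NULL,\n"
        "  payload BLOB NOT NULL\n"
        ") ENGINE=MyISAM"
    )


def retire_daily_table_sql(day: date, prefix: str = "readings") -> str:
    """Move a finished day's table out of the way by renaming it."""
    name = daily_table_name(day, prefix)
    return f"RENAME TABLE `{name}` TO `archived_{name}`"


print(create_daily_table_sql(date(2026, 6, 2)))
print(retire_daily_table_sql(date(2026, 6, 1)))
```

The helpers only build statement strings; running them against the server is left to whatever client library is in use.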

    You should consider using the LOAD DATA INFILE command.

    http://dev.mysql.com/doc/refman/5.1/en/load-data.html

    This command causes the MySQL server to read a file from the MySQL server's file system and bulk-load it directly into a table. It's way faster than doing INSERT commands from a client program on another machine. But it's also trickier to set up in production: your shared queue needs access to the MySQL server's file system to write the data files for loading.
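As a rough sketch of that workflow, the queue consumer could write each batch as a tab-separated file and then issue a LOAD DATA INFILE statement for it. The helper names, the `id` and `payload` columns, the table name, and the `key:value,key:value` payload encoding are all assumptions for illustration. The sketch also assumes payloads contain no tabs, newlines, or backslashes, and that the path is a POSIX path with no quote characters.

```python
import os
import tempfile


def write_batch(rows, path):
    """Write (id, payload) pairs as tab-separated lines, one row per line."""
    with open(path, "w", newline="") as f:
        for row_id, payload in rows:
            f.write(f"{row_id}\t{payload}\n")


def load_data_sql(path, table):
    """Build a LOAD DATA INFILE statement for a file written by write_batch."""
    return (
        f"LOAD DATA INFILE '{path}' INTO TABLE `{table}` "
        "FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n' "
        "(id, payload)"
    )


# Demo: write a tiny batch to a temporary directory and show the statement.
with tempfile.TemporaryDirectory() as tmp:
    batch_path = os.path.join(tmp, "batch_0001.tsv")
    write_batch([(1, "3:10,7:42"), (2, "5:1")], batch_path)
    with open(batch_path) as f:
        written = f.read()
    statement = load_data_sql(batch_path, "readings_20260602")

print(written, end="")
print(statement)
```

In production the file would have to land somewhere the MySQL server itself can read, which is the setup cost the answer describes.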

    You should consider disabling indexing, then loading the whole table, then re-enabling indexing, but only if you don’t need to query partially loaded tables.
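The disable, load, re-enable sequence might look like the sketch below, which only builds the ordered list of statements. It assumes a MyISAM table, where `ALTER TABLE ... DISABLE KEYS` suspends updates to non-unique indexes; InnoDB does not honor it. The table name and file paths are hypothetical, and the files are assumed to use LOAD DATA's default format (tab-separated fields, newline-terminated lines).

```python
def bulk_load_plan(table, batch_paths):
    """Return the SQL statements, in order, for an index-deferred bulk load."""
    statements = [f"ALTER TABLE `{table}` DISABLE KEYS"]
    for path in batch_paths:
        # Each file is loaded while non-unique index maintenance is suspended.
        statements.append(f"LOAD DATA INFILE '{path}' INTO TABLE `{table}`")
    # Re-enabling rebuilds the suspended indexes in a single pass.
    statements.append(f"ALTER TABLE `{table}` ENABLE KEYS")
    return statements


plan = bulk_load_plan(
    "readings_20260602",
    ["/data/load/batch_0001.tsv", "/data/load/batch_0002.tsv"],
)
for statement in plan:
    print(statement + ";")
```

Queries that rely on those indexes will be slow until the final statement completes, which is why this only suits tables that are not queried while partially loaded.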
