The Archive Base

Editorial Team
Asked: May 27, 2026

I started developing a website analytics system in MySQL for a project I’m working

I started developing a website analytics system in MySQL for a project I’m working on, but I quickly realised it’s not going to be sufficient for my needs (in terms of scalability, speed, etc.). After a fair bit of research, MongoDB keeps cropping up as a good candidate. The only problem is that I have no experience with it, and I don’t know the best practices for large, high-performance MongoDB databases as well as I do for MySQL.

When a user visits a website, the system needs to record the standard info (IP, browser info, website ID, URL, username). It also needs to record every subsequent page the user visits (current timestamp, URL). If a user leaves the website and comes back 10 days later, it needs to log that visit and also record that this is a returning user (identified by their username).

In addition to logging visits for multiple websites (I’m looking at 500 records being added per second), it needs to have reporting capability. I’m fine with producing graphs etc., but I need to know how to extract the data from the database efficiently. I’d like to be able to provide graphs that show activity for every 15 minutes, but an hour would be sufficient if that’s more practical.

As a side thought it’d be nice if it could be capable of real-time reporting in the future, but that’s outside the scope of the current project.

Now, I’ve read the article at http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics, but it doesn’t mention anything about high-traffic websites; for all I know, it could only be capable of dealing with a few thousand records. Do I follow the concept of that post and pull reporting directly from that collection, or would it be better to pre-analyse the data and archive it into a separate collection?

Any thoughts on the data insertion, database structure and reporting would be hugely appreciated!

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 11:26 am

    (MySQL) not going to be sufficient for my needs (in terms of scalability, speed etc)

    Well… it seems Facebook uses MySQL to a great degree. When it comes to NoSQL, I believe what matters is not necessarily the technology but the data structures and algorithms.


    What you are facing is a situation of potentially high write throughput. One approach that fits your problem well is sharding: no matter how big the machine and how efficient the software, there is a limit on the number of writes a single machine can handle. Sharding splits the data across multiple servers, so writes can go to different servers. For example, users A-M write to server 1, users N-Z to server 2.
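The A-M / N-Z split above can be sketched as a small routing function. This is only an illustration of range-based routing in application code; the shard names and the fallback rule are assumptions, and MongoDB's auto-sharding would normally make this decision for you.

```python
# Hypothetical shard names mapped to the first-letter range they own.
SHARDS = {"server1": ("a", "m"), "server2": ("n", "z")}


def shard_for(username: str) -> str:
    """Pick a shard by the first letter of the username (range-based routing)."""
    first = username[:1].lower()
    for name, (low, high) in SHARDS.items():
        if low <= first <= high:
            return name
    # Assumption: names that do not start with a-z (digits, empty strings)
    # fall back to the first shard.
    return "server1"
```

Every write for a given user then goes to `shard_for(username)`, so the write load is divided between the two servers as long as usernames are spread reasonably evenly across the alphabet.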

    Now, sharding comes at the cost of complexity: it needs balancing, aggregations across all shards can be tricky, you need to maintain multiple independent databases, etc.

    That’s a technology thing: MongoDB sharding is rather simple, because it supports auto-sharding, which does most of the nasty stuff for you. I don’t think you’ll need it at 500 inserts per second, but it’s good to know it’s there.

    For the schema design, it’s important to think about the shard key, which is used to determine which shard is responsible for a document. The right choice might depend on your traffic patterns. Suppose you have a customer who operates a fair. Once a year, his website goes totally nuts, but for the other 360 days it is one of the lower-traffic sites. Now if you shard on CustomerId, that particular customer might lead to problems, because the whole spike lands on a single shard. On the other hand, if you shard on VisitorId, you’ll have to hit each shard for a simple count().
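To make the trade-off concrete, here is a rough simulation with synthetic traffic. It hashes each event to a shard by either key and counts writes per shard. The cluster size, field names, traffic volumes, and the use of `crc32` as the hash are all assumptions for illustration, not how MongoDB assigns chunks.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4  # assumed cluster size


def shard_of(key: str) -> int:
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode()) % NUM_SHARDS


def writes_per_shard(events, key_field):
    """Count how many writes each shard receives for a given shard key."""
    return Counter(shard_of(event[key_field]) for event in events)


# Synthetic traffic: the "fair" customer is spiking, 20 other customers are quiet.
events = [{"customer_id": "fair", "visitor_id": f"v{i}"} for i in range(10_000)]
events += [{"customer_id": f"c{i % 20}", "visitor_id": f"w{i}"} for i in range(1_000)]

by_customer = writes_per_shard(events, "customer_id")
by_visitor = writes_per_shard(events, "visitor_id")
print("busiest shard by customer_id:", max(by_customer.values()))
print("busiest shard by visitor_id:", max(by_visitor.values()))
```

Sharding on the customer puts every write of the spike on one shard, while sharding on the visitor spreads them out. The price is that a per-customer count then has to ask every shard.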

    The analysis part depends largely on the queries you want to support. Real slice-and-dice analysis is rather challenging, I’d say, in particular if you want to support near-real-time analytics. A much easier approach is to limit the user’s options and only provide a small set of operations. The results can also be cached, so you won’t have to do all aggregations every time.
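One limited operation of that kind is a hit count per website per 15-minute interval, which matches the graphs asked for in the question. The sketch below keeps those counters in memory. The key layout and function names are assumptions; in MongoDB the same counter could be kept in a separate collection with an upsert that uses the `$inc` operator.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 15 * 60


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bucket (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


def record_hit(counters: Counter, website_id: str, ts: datetime) -> None:
    """Increment the pre-aggregated counter for this website and bucket."""
    counters[(website_id, bucket_start(ts))] += 1


counters = Counter()
record_hit(counters, "site-1", datetime(2026, 5, 27, 11, 26, 48, tzinfo=timezone.utc))
record_hit(counters, "site-1", datetime(2026, 5, 27, 11, 29, 59, tzinfo=timezone.utc))
record_hit(counters, "site-1", datetime(2026, 5, 27, 11, 30, 0, tzinfo=timezone.utc))
```

A report for one day then reads at most 96 counters per website instead of scanning the raw log, and hourly figures are just sums of four adjacent buckets.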

    In general, analytics can be tricky because there are many features that need relations. For example, cohort analysis will require you to consider only those log entries that were generated by a specific group of users. An $in query will do the trick for smaller cohorts, but if we’re talking about tens of thousands of users, it won’t do. You could select only a random subset of users, because that should be statistically sufficient, but of course it depends on your specific requirements.
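A minimal sketch of the sampling idea: build the `$in` filter document directly for small cohorts, and fall back to a random subset once the cohort exceeds a chosen size. The `username` field follows the question; `max_size` and `seed` are hypothetical parameters, and the right threshold depends on your data.

```python
import random


def cohort_filter(usernames, max_size=1000, seed=None):
    """Build a MongoDB-style filter document for a (possibly sampled) cohort."""
    users = list(usernames)
    if len(users) > max_size:
        # Assumption: a uniform random sample is representative enough.
        users = random.Random(seed).sample(users, max_size)
    return {"username": {"$in": users}}


small = cohort_filter(["alice", "bob"])
big = cohort_filter((f"user{i}" for i in range(50_000)), max_size=1000, seed=1)
```

The returned dictionary is the kind of filter a driver would accept as a query document. Any statistic computed on the sampled cohort is an estimate and would need scaling or a confidence interval.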

    For the analysis of large amounts of data, Map/Reduce comes in handy: it will do the processing on the server, and Map/Reduce also benefits from sharding, because the jobs can be processed individually by each shard. However, depending on a gazillion factors, these jobs will take some time.
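The way Map/Reduce benefits from sharding can be illustrated in plain Python: each shard counts its own documents, and only the small partial results are merged at the end. This is a sketch of the idea, not of MongoDB's own Map/Reduce interface; the `url` field and the sample documents are made up.

```python
from collections import Counter
from functools import reduce


def map_shard(docs):
    """Map and local reduce on one shard: count page views per URL."""
    return Counter(doc["url"] for doc in docs)


def merge(left, right):
    """Final reduce: combine the partial counts from two shards."""
    return left + right


# Hypothetical log entries held by two different shards.
shard1 = [{"url": "/home"}, {"url": "/pricing"}, {"url": "/home"}]
shard2 = [{"url": "/home"}, {"url": "/about"}]

partials = [map_shard(shard) for shard in (shard1, shard2)]
totals = reduce(merge, partials, Counter())
```

Because `map_shard` needs only the documents of a single shard, the per-shard work can run in parallel. Only the merge step sees data from all shards, and it handles counters rather than raw log entries.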

    I believe that the blog of Boxed Ice has some information on this; they definitely have experience in handling lots of analytical data using MongoDB.

© 2021 The Archive Base. All Rights Reserved