The Archive Base

Editorial Team
Asked: June 13, 2026

I’ve got multiple websites, where each website has visitors that trigger multiple events I


I have multiple websites, and visitors to each website “trigger” multiple events that I want to track. I have a log of those events from all websites; each event records the website-id, the event-name and the user-id of the user who triggered it (for the sake of simplicity, let’s say that’s all).

The requirements:

  1. Be able to get, per website-id and event-name, how many unique visitors triggered it.
  2. This should also support a date range (distinct unique visitors within the range).
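To make the two requirements concrete, here is a minimal in-memory sketch of the expected result, computed straight from a raw event log. The `EVENTS` sample data, the tuple layout and the `unique_visitors` name are assumptions made for illustration; they are not part of the original question.

```python
# Hypothetical event log rows: (day as yyyyMMdd int, website_id, event_name, user_id).
EVENTS = [
    (20121004, "siteA", "error", 1),
    (20121004, "siteA", "error", 2),
    (20121005, "siteA", "error", 2),
    (20121005, "siteA", "error", 3),
    (20121005, "siteA", "pageViews", 1),
    (20121005, "siteB", "error", 9),
]


def unique_visitors(events, website_id, event_name, start_day=None, end_day=None):
    """Count distinct users for one website/event, optionally within [start_day, end_day]."""
    seen = set()
    for day, site, name, user in events:
        if site != website_id or name != event_name:
            continue
        if start_day is not None and day < start_day:
            continue
        if end_day is not None and day > end_day:
            continue
        seen.add(user)  # a set keeps each user once, however many hits they have
    return len(seen)


# Requirement 1: all-time unique visitors for one website and event.
all_time = unique_visitors(EVENTS, "siteA", "error")
# Requirement 2: unique visitors restricted to a date range.
one_day = unique_visitors(EVENTS, "siteA", "error", 20121005, 20121005)
```

Note that the per-day unique counts cannot simply be summed to get the range total: user 2 appears on both days in the sample, so any storage model has to keep enough information to deduplicate users across days.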

I was thinking of creating a collection per “website-id” with the following data model (as an example):

collection ev_{websiteId}:
[
    {
        _id: "error",
        dailyStats: [
            {
                _id: 20121005, // yyyyMMdd int, should be indexed!
                hits: 5,
                users: [
                         {
                            _id: 1, // should be indexed!
                            hits: 1
                         },
                         {
                            _id: 2,
                            hits: 3
                         },
                         {
                            _id: 3,
                            hits: 1
                         }
                ]
            },
            {
                _id: 20121004,
                hits: 8,
                users: [
                         {
                            _id: 1,
                            hits: 2
                         },
                         {
                            _id: 2,
                            hits: 3
                         },
                         {
                            _id: 3,
                            hits: 3
                         }
                ]
            }
        ]
    },
    {
        _id: "pageViews",
        dailyStats: [
            {
                _id: 20121005,
                hits: 500,
                users: [
                         {
                            _id: 1,
                            hits: 100
                         },
                         {
                            _id: 2,
                            hits: 300
                         },
                         {
                            _id: 3,
                            hits: 100
                         }
                ]
            },
            {
                _id: 20121004,
                hits: 800,
                users: [
                         {
                            _id: 1,
                            hits: 200
                         },
                         {
                            _id: 2,
                            hits: 300
                         },
                         {
                            _id: 3,
                            hits: 300
                         }
                ]
            }
        ]
    }
]

I’m using the _id to hold the event-id.
I’m using dailyStats._id to hold when it happened (an integer in yyyyMMdd format).
I’m using dailyStats.users._id to represent a user’s unique-id hash.

To get the unique users, I should basically be able to run a distinct count (mapreduce?) over the items in the users array(s) for the given date range (I will convert the date range to yyyyMMdd).

My questions:

  1. Does this data model make sense to you? I’m concerned about the scalability of this model over time (if a client has a lot of daily unique visitors, it may cause a huge document).
    I was thinking of deleting dailyStats documents with _id < [date as yyyyMMdd]. This way I can keep my document size sane, but still, there are limits here.
  2. Is there an easy way to run an “upsert” that will also create the dailyStats entry if it does not already exist, add the user if not already present, and increment the “hits” property for both?
  3. What about map-reduce? How would you approach it (I need to run distinct on users._id across all subdocuments in the given date range)? Is there an easier way with the new aggregation framework?
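For the third question, one possible shape of an aggregation pipeline over the nested model is sketched below, written as plain Python dicts. This is an assumption about how it could be expressed, not a tested production query: the function names and the trimmed `DOCS` sample are hypothetical, and the pipeline is only built here, not sent to a server. A driver such as PyMongo would take it via `collection.aggregate(pipeline)`. The second function computes the same number in memory so the intended result can be checked.

```python
def distinct_users_pipeline(event_name, start_day, end_day):
    """Build (not run) an aggregation pipeline counting distinct users in a day range."""
    return [
        {"$match": {"_id": event_name}},                 # one event document
        {"$unwind": "$dailyStats"},                      # one row per day
        {"$match": {"dailyStats._id": {"$gte": start_day, "$lte": end_day}}},
        {"$unwind": "$dailyStats.users"},                # one row per (day, user)
        {"$group": {"_id": "$dailyStats.users._id"}},    # deduplicate users across days
        {"$group": {"_id": None, "uniqueUsers": {"$sum": 1}}},
    ]


def distinct_users_reference(docs, event_name, start_day, end_day):
    """In-memory equivalent of the pipeline above, for checking the expected count."""
    users = set()
    for doc in docs:
        if doc["_id"] != event_name:
            continue
        for day in doc["dailyStats"]:
            if start_day <= day["_id"] <= end_day:
                users.update(u["_id"] for u in day["users"])
    return len(users)


# Trimmed, hypothetical sample in the shape of the ev_{websiteId} collection.
DOCS = [
    {"_id": "error", "dailyStats": [
        {"_id": 20121005, "hits": 4, "users": [{"_id": 1, "hits": 1}, {"_id": 2, "hits": 3}]},
        {"_id": 20121004, "hits": 5, "users": [{"_id": 2, "hits": 2}, {"_id": 3, "hits": 3}]},
    ]},
]

pipeline = distinct_users_pipeline("error", 20121004, 20121005)
unique_in_range = distinct_users_reference(DOCS, "error", 20121004, 20121005)
```

The two `$unwind` stages expand every day and every user of the matched event document before filtering, so the cost grows with the size of that single document, which is the same scalability concern raised in the first question.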

By the way, another option for counting unique visitors is Redis Bitmaps, but I am not sure it’s worth maintaining multiple data stores.
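The bitmap idea can be illustrated without Redis. The sketch below uses Python integers as stand-in bitmaps, one per day, and assumes each user id can be mapped to a small integer offset (true for the sample ids 1, 2, 3, but an assumption in general). The helper names are hypothetical; in Redis the corresponding commands would be `SETBIT`, `BITOP OR` and `BITCOUNT`.

```python
def set_bit(bitmap, user_offset):
    """Mark a user as seen (the role SETBIT plays in Redis)."""
    return bitmap | (1 << user_offset)


def union(bitmaps):
    """Combine several daily bitmaps (the role of BITOP OR)."""
    result = 0
    for bitmap in bitmaps:
        result |= bitmap
    return result


def count_bits(bitmap):
    """Number of set bits, i.e. distinct users (the role of BITCOUNT)."""
    return bin(bitmap).count("1")


# One bitmap per day for a single website/event pair (hypothetical sample data).
daily = {20121004: 0, 20121005: 0}
for day, user in [(20121004, 1), (20121004, 2), (20121005, 2), (20121005, 3)]:
    daily[day] = set_bit(daily[day], user)

uniques_per_day = {day: count_bits(bits) for day, bits in daily.items()}
uniques_in_range = count_bits(union(daily.values()))
```

A user who shows up on several days sets the same bit in each daily bitmap, so the OR across the range counts that user once.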

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 10:50 am

    A few comments on the architecture above.

    As you’ve pointed out, I’m slightly worried about the scalability and about how much pre-aggregation you’re really doing.

    Most of the Mongo instances I’ve worked with for metrics have cases similar to yours, but you seem to be relying heavily on updating a single document and upserting various parts of it, which is going to slow down and potentially cause some locking.

    I might suggest a different path, one that Mongo themselves suggest when you talk with them about doing metrics. Since you already have a structure in mind, I’d create something along the lines of:

    {
      "_id": "20121005_siteKey_page",
      "hits": 512,
      "users": [
        {
          "uid": 5,
          "hits": 512
        }
      ]
    }
    

    This way you limit your document sizes to something that is reasonable to do quick upserts on. From here you can run mapreduce jobs in batches to further extend what you’re looking to see.
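A rough sketch of how a hit might be recorded against this per-day document follows. It assumes a two-step update: first try to increment an existing user entry with the positional `$` operator, and only if nothing matched, upsert the document and `$push` the user. The helper names are hypothetical, the filter/update pairs are built as plain dicts rather than sent to a server, and `record_hit` is only an in-memory stand-in showing the resulting document shape.

```python
def doc_id(day, site_key, event_name):
    """Compose the per-day key used as _id, e.g. '20121005_siteKey_page'."""
    return f"{day}_{site_key}_{event_name}"


def increment_existing_user(day, site_key, event_name, uid):
    """Step 1: filter/update pair for when the user is already in the users array."""
    flt = {"_id": doc_id(day, site_key, event_name), "users.uid": uid}
    upd = {"$inc": {"hits": 1, "users.$.hits": 1}}
    return flt, upd


def add_new_user(day, site_key, event_name, uid):
    """Step 2 (only if step 1 matched nothing): meant to be sent with upsert enabled."""
    flt = {"_id": doc_id(day, site_key, event_name), "users.uid": {"$ne": uid}}
    upd = {"$inc": {"hits": 1}, "$push": {"users": {"uid": uid, "hits": 1}}}
    return flt, upd


def record_hit(store, day, site_key, event_name, uid):
    """In-memory stand-in for the two steps above (store is a plain dict)."""
    key = doc_id(day, site_key, event_name)
    doc = store.setdefault(key, {"_id": key, "hits": 0, "users": []})
    doc["hits"] += 1
    for user in doc["users"]:
        if user["uid"] == uid:
            user["hits"] += 1
            break
    else:
        doc["users"].append({"uid": uid, "hits": 1})
    return doc


store = {}
for uid in (5, 5, 7):
    record_hit(store, 20121005, "siteKey", "page", uid)
```

With concurrent writers, the second step can race with another writer that adds the same user first; in that case the upsert would attempt to insert a duplicate `_id`, so a real implementation would likely need to catch that error and retry step 1.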

    It also depends on your end goal: are you looking to provide realtime metrics? What sort of granularity are you attempting to get? Redis Maps may be something you want to at least look at: Great article here.

    Regardless it is a fun problem to solve 🙂

    Hope this has helped!
