The Archive Base

Editorial Team
Asked: May 13, 2026 (2026-05-13T10:16:00+00:00)

I’m building, basically, an ad server. This is a personal project that I’m trying

I’m building, basically, an ad server. This is a personal project that I’m trying to impress my boss with, and I’d love any form of feedback about my design. I’ve already implemented most of what I describe below, but it’s never too late to refactor 🙂

This is a service that delivers banner ads (http://myserver.com/banner.jpg links to http://myserver.com/clicked) and provides reporting on subsets of the data.

For every ad impression served and every click, I need to record a row that has (id, value), where value is the cash value of this transaction (e.g. -$.001 per served banner ad at $1 CPM, or +$.25 for a click). My output is all based on earnings per impression [abbreviated EPC]: (SUM(value)/COUNT(impressions)), but on subsets of the data, like “Earnings per impression where browser == ‘Firefox’”. The goal is to output something like “Your overall EPC is $.50, but where browser == ‘Firefox’, your EPC is $1.00”, so that the end user can quickly see significant factors in their data.
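To make the EPC definition concrete, here is a minimal Python sketch of SUM(value)/COUNT(impressions), overall and per subset. The record layout `(kind, browser, value)`, the sample rows, and the helper names `epc` and `epc_by` are assumptions made for illustration, not part of the actual schema.

```python
from collections import defaultdict

# Hypothetical hit records: (kind, browser, value in dollars).
hits = [
    ("impression", "Firefox", -0.001),
    ("impression", "Firefox", -0.001),
    ("click", "Firefox", 0.25),
    ("impression", "Chrome", -0.001),
    ("impression", "Chrome", -0.001),
]

def epc(rows):
    """Return SUM(value) / COUNT(impressions), or None without impressions."""
    impressions = sum(1 for kind, _, _ in rows if kind == "impression")
    if impressions == 0:
        return None
    return sum(value for _, _, value in rows) / impressions

def epc_by(rows, key_index):
    """Group rows by one field and compute EPC for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return {key: epc(group) for key, group in groups.items()}

overall = epc(hits)            # all hits together
by_browser = epc_by(hits, 1)   # one EPC per browser value
```

With this sample data the Firefox subset comes out well above the overall figure, which is the kind of contrast the report is meant to surface.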

Because there’s a very large number of these subsets (tens of thousands), and reporting output only needs to include the summary data, I’m precomputing the EPC-per-subset with a background cron task and storing these summary values in the database. Roughly once every 2-3 hits, a Hit needs to query the Hits table for other recent Hits by the same Visitor (e.g. “find the REFERER of the last Hit”), but usually each Hit only performs an INSERT. To keep response times down, I’ve split the app across 3 servers [bgprocess, mysql, hitserver].

Right now, I’ve structured all of this as 3 normalized tables: Hits, Events and Visitors. Visitors are unique per visitor session, a Hit is recorded every time a Visitor loads a banner or makes a click, and Events map the distinct many-to-many relationship from Visitors to Hits (e.g. an example Event is “Visitor X at Banner Y”, which is unique, but may have multiple Hits). I’m keeping all the hit data in the same table because, while my example above only describes “Banner impressions -> clickthroughs”, we’re also tracking “clickthroughs -> pixel fires”, “pixel fires -> second clickthrough” and “second clickthrough -> sale page pixel”.
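For readers who want to see the three-table layout spelled out, the sketch below sets it up with Python's built-in `sqlite3` module as a stand-in for MySQL. Every column name here (`browser`, `banner`, `kind`, `value`, `at`) is an assumption for illustration. Only the table names, the uniqueness of an Event, and the `at` column on Hits come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visitors (
    id      INTEGER PRIMARY KEY,
    browser TEXT NOT NULL
);
CREATE TABLE events (
    id         INTEGER PRIMARY KEY,
    visitor_id INTEGER NOT NULL REFERENCES visitors(id),
    banner     TEXT NOT NULL,
    UNIQUE (visitor_id, banner)   -- "Visitor X at Banner Y" occurs once
);
CREATE TABLE hits (
    id       INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES events(id),
    kind     TEXT NOT NULL,       -- e.g. 'impression', 'click', 'pixel'
    value    REAL NOT NULL,       -- cash value of this hit
    at       TEXT NOT NULL        -- timestamp of the hit
);
CREATE INDEX hits_at ON hits (at);
""")

conn.execute("INSERT INTO visitors (id, browser) VALUES (1, 'Firefox')")
conn.execute(
    "INSERT INTO events (id, visitor_id, banner) VALUES (1, 1, 'banner.jpg')"
)
# One Event can carry several Hits (an impression, then a click).
conn.executemany(
    "INSERT INTO hits (event_id, kind, value, at) VALUES (?, ?, ?, ?)",
    [
        (1, "impression", -0.001, "2026-05-13 10:00:00"),
        (1, "click", 0.25, "2026-05-13 10:00:05"),
    ],
)
hit_count = conn.execute(
    "SELECT COUNT(*) FROM hits WHERE event_id = 1"
).fetchone()[0]
```

Keeping a `kind` column on a single Hits table is one way to cover the later funnel steps (pixel fires, second clickthroughs) without adding tables.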

My problem is that the Hits table is filling up quickly, and queries against it are slowing down ~linearly with its size. My test data only has a few thousand clicks, but my background processing is already slowing down. I can throw more servers at it, but before launching the alpha of this, I want to make sure my logic is sound.

So I’m asking you SO-gurus, how would you structure this data? Am I crazy to try to precompute all these tables? Since we rarely need to access Hit records older than one hour, would I benefit from splitting the Hits table into ProcessedHits (with all historical data) and UnprocessedHits (with ~last hour’s data), or does having the Hit.at date column indexed make this superfluous?
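As a rough illustration of the indexed-lookup alternative, the sketch below runs the "REFERER of the last Hit" query against a single hits table with a composite index, using Python's `sqlite3` as a stand-in for MySQL. The direct `visitor_id` column on hits, the `referer` column, the integer timestamps, and the index name `hits_visitor_at` are all assumptions. Whether MySQL picks a comparable plan would need to be checked with its own `EXPLAIN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hits (
    id         INTEGER PRIMARY KEY,
    visitor_id INTEGER NOT NULL,
    referer    TEXT,
    at         INTEGER NOT NULL   -- Unix timestamp in seconds
);
CREATE INDEX hits_visitor_at ON hits (visitor_id, at);
""")
conn.executemany(
    "INSERT INTO hits (visitor_id, referer, at) VALUES (?, ?, ?)",
    [
        (1, "http://a.example/", 1000),
        (1, "http://b.example/", 4000),
        (2, "http://c.example/", 4100),
    ],
)

RECENT_SQL = """
SELECT referer FROM hits
WHERE visitor_id = ? AND at >= ?
ORDER BY at DESC LIMIT 1
"""

def last_referer(conn, visitor_id, now, window=3600):
    """Return the referer of the visitor's newest hit within the window."""
    row = conn.execute(RECENT_SQL, (visitor_id, now - window)).fetchone()
    return row[0] if row else None

# Ask SQLite how it would execute the lookup; the last column of each
# plan row is a human-readable description of that step.
plan = conn.execute("EXPLAIN QUERY PLAN " + RECENT_SQL, (1, 0)).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)
```

If the plan shows an index search rather than a full scan, the lookup cost depends mainly on how many recent rows match, not on how much history the table holds.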

This probably needs some elaboration. Sorry if I’m not clear; I’ve been working on it for the past ~3 weeks straight 🙂 TIA for all input!

1 Answer

Editorial Team
Added an answer on May 13, 2026 at 10:16 am

You should be able to build an app like this so that it doesn’t slow down linearly with the number of hits.

From what you said, it sounds like you have two main potential performance bottlenecks. The first is inserts. If your inserts land at the end of the table (in InnoDB, for example, by using a monotonically increasing primary key such as AUTO_INCREMENT), that will minimize fragmentation and maximize throughput. If they land in the middle of the table, performance will suffer as fragmentation increases.

The second area is the aggregations. Whenever you run a large aggregation, be careful that it doesn’t force all the in-memory buffers to be purged to make room for the incoming data. Try to minimize how often the aggregations have to run, and be smart about how you group and count things, to minimize disk head movement (or consider using SSDs).
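One way to run aggregations less often and over less data is to fold only the hits added since the previous run into a running summary. The following is a minimal sketch of that idea using Python's `sqlite3` as a stand-in for MySQL. The `summary` and `watermark` tables, their columns, and the function names are assumptions for illustration, not something taken from your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hits (
    id      INTEGER PRIMARY KEY,
    browser TEXT NOT NULL,
    kind    TEXT NOT NULL,
    value   REAL NOT NULL
);
CREATE TABLE summary (
    browser     TEXT PRIMARY KEY,
    total_value REAL NOT NULL,
    impressions INTEGER NOT NULL
);
CREATE TABLE watermark (last_hit_id INTEGER NOT NULL);
INSERT INTO watermark VALUES (0);
""")

def aggregate_new_hits(conn):
    """Fold hits added since the last run into the summary table."""
    with conn:  # one transaction: summary and watermark move together
        (last_id,) = conn.execute(
            "SELECT last_hit_id FROM watermark").fetchone()
        (max_id,) = conn.execute(
            "SELECT COALESCE(MAX(id), 0) FROM hits").fetchone()
        rows = conn.execute(
            "SELECT browser, SUM(value), SUM(kind = 'impression') "
            "FROM hits WHERE id > ? AND id <= ? GROUP BY browser",
            (last_id, max_id),
        ).fetchall()
        for browser, value, impressions in rows:
            cur = conn.execute(
                "UPDATE summary SET total_value = total_value + ?, "
                "impressions = impressions + ? WHERE browser = ?",
                (value, impressions, browser),
            )
            if cur.rowcount == 0:  # first time this subset is seen
                conn.execute(
                    "INSERT INTO summary (browser, total_value, impressions) "
                    "VALUES (?, ?, ?)",
                    (browser, value, impressions),
                )
        conn.execute("UPDATE watermark SET last_hit_id = ?", (max_id,))

def add_hits(conn, rows):
    """Append (browser, kind, value) rows to the hits table."""
    with conn:
        conn.executemany(
            "INSERT INTO hits (browser, kind, value) VALUES (?, ?, ?)", rows)

add_hits(conn, [("Firefox", "impression", -0.001),
                ("Firefox", "impression", -0.001),
                ("Firefox", "click", 0.25)])
aggregate_new_hits(conn)   # processes hits 1-3
add_hits(conn, [("Firefox", "impression", -0.001)])
aggregate_new_hits(conn)   # processes only hit 4

total_value, impressions = conn.execute(
    "SELECT total_value, impressions FROM summary WHERE browser = 'Firefox'"
).fetchone()
epc = total_value / impressions
```

Each run reads only the id range after the stored watermark, so its cost follows the number of new hits instead of the size of the whole table.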

You might also be able to do some of the accumulation at the web tier, based entirely on the incoming data rather than on new queries, perhaps with a fallback of some kind in case the server goes down before the collected data is written to the DB.
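A rough sketch of what web-tier accumulation could look like in Python follows. The class name `HitAccumulator`, the `flush` callback, and the `max_pending` threshold are all hypothetical. In a real deployment the callback would write the batch to the database, and the fallback mentioned above (for instance a local append-only log) would still have to be added.

```python
from collections import defaultdict

class HitAccumulator:
    """Sum hit values per subset key in memory and flush them in batches."""

    def __init__(self, flush, max_pending=1000):
        self._flush = flush              # callable that receives one batch
        self._max_pending = max_pending  # hits to buffer before flushing
        self._totals = defaultdict(lambda: [0.0, 0])  # key -> [value, impressions]
        self._pending = 0

    def record(self, key, kind, value):
        totals = self._totals[key]
        totals[0] += value
        if kind == "impression":
            totals[1] += 1
        self._pending += 1
        if self._pending >= self._max_pending:
            self.flush()

    def flush(self):
        if self._pending == 0:
            return
        batch = {key: tuple(t) for key, t in self._totals.items()}
        # Hand the batch over first; if the callback raises, the in-memory
        # totals are kept so a later flush can retry.
        self._flush(batch)
        self._totals.clear()
        self._pending = 0

flushed = []  # stands in for the database in this sketch
acc = HitAccumulator(flushed.append, max_pending=3)
acc.record("Firefox", "impression", -0.001)
acc.record("Firefox", "impression", -0.001)
acc.record("Firefox", "click", 0.25)   # third hit triggers a flush
```

This version is not thread-safe and loses whatever is still buffered if the process dies, so it only illustrates the accumulation step, not the durability side.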

Are you using InnoDB or MyISAM?

Here are a few performance principles:

1. Minimize round-trips from the web tier to the DB
2. Minimize aggregation queries
3. Minimize on-disk fragmentation and maximize write speeds by inserting at the end of the table when possible
4. Optimize hardware configuration
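Principles 1 and 3 can be combined by sending hits in multi-row INSERT statements against a table with an auto-incrementing key. Here is a minimal sketch using Python's `sqlite3` as a stand-in for MySQL. The `insert_hits` helper, the column names, and the chunk size of 200 are assumptions chosen for illustration.

```python
import sqlite3

def insert_hits(conn, rows, chunk_size=200):
    """Insert (kind, value) rows using one multi-row INSERT per chunk."""
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        placeholders = ", ".join(["(?, ?)"] * len(chunk))
        params = [field for row in chunk for field in row]
        with conn:  # one transaction per chunk
            conn.execute(
                f"INSERT INTO hits (kind, value) VALUES {placeholders}",
                params,
            )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "  # new rows get ever-larger ids
    "kind TEXT NOT NULL, "
    "value REAL NOT NULL)"
)

rows = [("impression", -0.001)] * 450 + [("click", 0.25)] * 50
insert_hits(conn, rows)  # 500 rows in 3 statements instead of 500
ids = [row[0] for row in conn.execute("SELECT id FROM hits ORDER BY id")]
```

Because the key only grows, each new row is placed after the existing ones, and one statement per chunk replaces one statement per hit.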