The Archive Base

Editorial Team
Asked: May 12, 2026

Our MySQL web analytics database contains a summary table which is updated throughout the

Our MySQL web analytics database contains a summary table which is updated throughout the day as new activity is imported. We use ON DUPLICATE KEY UPDATE in order that the summarization overwrites earlier calculations, but are having difficulty because one of the columns in the summary table’s UNIQUE KEY is an optional FK, and contains NULL values.

These NULLs are intended to mean “not present, and all such cases are equivalent”. Of course, MySQL usually treats NULLs as meaning “unknown, and all such cases are not equivalent”.
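The difference shows up directly in MySQL's comparison operators. As a minimal illustration (not part of our schema), ordinary equality against NULL yields NULL, while MySQL's NULL-safe operator `<=>` treats two NULLs as equal:

```sql
-- Plain equality returns NULL (unknown) when either side is NULL;
-- the NULL-safe operator <=> returns 1 when both sides are NULL.
SELECT NULL = NULL   AS plain_equals       -- NULL
     , NULL <=> NULL AS null_safe_equals   -- 1
     , NULL IS NULL  AS is_null;           -- 1
```

A UNIQUE index follows the first behaviour: it permits any number of rows whose nullable key column is NULL.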

Basic structure is as follows:

An “Activity” table containing an entry for each session, each belonging to a campaign, with optional filter and transaction IDs for some entries.

CREATE TABLE `Activity` (
    `session_id` INTEGER AUTO_INCREMENT
    , `campaign_id` INTEGER NOT NULL
    , `filter_id` INTEGER DEFAULT NULL
    , `transaction_id` INTEGER DEFAULT NULL
    , PRIMARY KEY (`session_id`)
);

A “Summary” table containing daily rollups of the total number of sessions in the Activity table, and the total number of those sessions which contain a transaction ID. These summaries are split up, with one for every combination of campaign and (optional) filter. This is a non-transactional table using MyISAM.

CREATE TABLE `Summary` (
    `day` DATE NOT NULL
    , `campaign_id` INTEGER NOT NULL
    , `filter_id` INTEGER DEFAULT NULL
    , `sessions` INTEGER UNSIGNED DEFAULT NULL
    , `transactions` INTEGER UNSIGNED DEFAULT NULL
    , UNIQUE KEY (`day`, `campaign_id`, `filter_id`)
) ENGINE=MyISAM;

The actual summarization query is something like the following, counting up the number of sessions and transactions, then grouping by campaign and (optional) filter.

INSERT INTO `Summary`
    (`day`, `campaign_id`, `filter_id`, `sessions`, `transactions`)
    -- `day` is assumed to be available on Activity; it is omitted from the simplified schema above
    SELECT `day`, `campaign_id`, `filter_id`
        , COUNT(`session_id`) AS `sessions`
        -- COUNT(expr) skips NULLs, so this counts only sessions that have a transaction
        , COUNT(`transaction_id`) AS `transactions`
    FROM `Activity`
    GROUP BY `day`, `campaign_id`, `filter_id`
ON DUPLICATE KEY UPDATE
    `sessions` = VALUES(`sessions`)
    , `transactions` = VALUES(`transactions`)
;

Everything works great, except for the summary of cases where the filter_id is NULL. In these cases, the ON DUPLICATE KEY UPDATE clause does not match the existing row, and a new row is written every time. This is due to the fact that “NULL != NULL”. What we need, however, is “NULL = NULL” when comparing the unique keys.
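The behaviour can be reproduced with a small sketch against the Summary table defined above. The date, IDs and counts are made-up sample values, and the table is assumed to start empty:

```sql
-- First import of the day: inserts a row with a NULL filter_id.
INSERT INTO `Summary` (`day`, `campaign_id`, `filter_id`, `sessions`, `transactions`)
VALUES ('2026-05-12', 1, NULL, 10, 2)
ON DUPLICATE KEY UPDATE
    `sessions` = VALUES(`sessions`)
    , `transactions` = VALUES(`transactions`);

-- Re-summarization later the same day: intended to overwrite the row above.
INSERT INTO `Summary` (`day`, `campaign_id`, `filter_id`, `sessions`, `transactions`)
VALUES ('2026-05-12', 1, NULL, 12, 3)
ON DUPLICATE KEY UPDATE
    `sessions` = VALUES(`sessions`)
    , `transactions` = VALUES(`transactions`);

-- The unique index permits multiple NULLs, so neither statement hit a
-- duplicate key and both rows now exist.
SELECT COUNT(*) AS null_filter_rows
FROM `Summary`
WHERE `day` = '2026-05-12' AND `campaign_id` = 1 AND `filter_id` IS NULL;
-- null_filter_rows = 2
```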

I am looking for ideas for workarounds or feedback on those we have come up with so far. Workarounds we have thought of so far follow.

  1. Delete all summary entries containing a NULL key value prior to running the summarization. (This is what we are doing now)
    This has the negative side effect of returning results with missing data if a query is executed during the summarization process.

  2. Change the DEFAULT NULL column to DEFAULT 0, which allows the UNIQUE KEY to be matched consistently.
    This has the negative side effect of overly complicating the development of queries against the summary table. It forces us to use a lot of “CASE WHEN filter_id = 0 THEN NULL ELSE filter_id END”, and makes for awkward joining since all of the other tables have actual NULLs for the filter_id.

  3. Create a view which returns “CASE WHEN filter_id = 0 THEN NULL ELSE filter_id END”, and use this view instead of the table directly.
    The summary table contains a few hundred thousand rows, and I’ve been told view performance is quite poor.

  4. Allow the duplicate entries to be created, and delete the old entries after summarization completes.
    Has similar problems to deleting them ahead of time.

  5. Add a surrogate column which contains 0 for NULL, and use that surrogate in the UNIQUE KEY (actually we could use PRIMARY KEY if all columns are NOT NULL).
    This solution seems reasonable, except that the example above is only an example; the actual database contains half a dozen summary tables, one of which contains four nullable columns in the UNIQUE KEY. There is concern by some that the overhead is too much.
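For reference, workaround 5 might be sketched with a stored generated column, so that the surrogate is maintained by MySQL rather than by every summarization query. This is only a sketch under several assumptions: MySQL 5.7 or later, a storage engine that supports stored generated columns (believed to include MyISAM, but worth verifying on your version), and 0 never being a real `filter_id`. The table name `Summary_v2` and column name `filter_key` are hypothetical.

```sql
-- Hypothetical variant of the Summary table. `filter_id` keeps its real
-- NULLs for joins and reports; `filter_key` exists only to make the
-- unique key match rows whose filter_id is NULL.
CREATE TABLE `Summary_v2` (
    `day` DATE NOT NULL
    , `campaign_id` INTEGER NOT NULL
    , `filter_id` INTEGER DEFAULT NULL
    -- Assumption: 0 is never used as a real filter_id.
    , `filter_key` INTEGER GENERATED ALWAYS AS (IFNULL(`filter_id`, 0)) STORED NOT NULL
    , `sessions` INTEGER UNSIGNED DEFAULT NULL
    , `transactions` INTEGER UNSIGNED DEFAULT NULL
    , UNIQUE KEY (`day`, `campaign_id`, `filter_key`)
) ENGINE=MyISAM;
```

With this layout the existing INSERT ... SELECT ... ON DUPLICATE KEY UPDATE statement would not need to mention `filter_key` at all, since generated columns are computed on insert. Whether that addresses the overhead concern for the table with four nullable key columns is untested here.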

Do you have a better workaround, table structure, update process or MySQL best practice which can help?

EDIT: To clarify the “meaning of null”

The data in the summary rows containing NULL columns are considered to belong together only in the sense of being a single “catch-all” row in summary reports, summarizing those items for which that data point does not exist or is unknown. So within the context of the summary table itself, the meaning is “the sum of those entries for which no value is known”. Within the relational tables, on the other hand, these truly are NULL results.

The only reason for putting them into a unique key on the summary table is to allow for automatic update (by ON DUPLICATE KEY UPDATE) when re-calculating the summary reports.

Maybe a better way to describe it is by the specific example that one of the summary tables groups results geographically by the zip code prefix of the business address given by the respondent. Not all respondents provide a business address, so the relationship between the transaction and addresses table is quite correctly NULL. In the summary table for this data, a row is generated for each zip code prefix, containing the summary of data within that area. An additional row is generated to show the summary of data for which no zip code prefix is known.

Altering the rest of the data tables to have an explicit “THERE_IS_NO_ZIP_CODE” 0-value, and placing a special record in the ZipCodePrefix table representing this value, is improper; that relationship truly is NULL.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 9:59 am

    I think something along the lines of (2) is really the best bet — or, at least, it would be if you were starting from scratch. In SQL, NULL means unknown. If you want some other meaning, you really ought to use a special value for that, and 0 is certainly an OK choice.

    You should do this across the entire database, not just this one table. Then you shouldn’t wind up with weird special cases. In fact, you should be able to get rid of a lot of your current ones (example: currently, if you want the summary row where there is no filter, you have the special case “filter is null” as opposed to the normal case “filter = ?”.)

    You should also go ahead and create a “not present” entry in the referred-to table as well, to keep the FK constraint valid (and avoid special cases).

    PS: Tables w/o a primary key are not relational tables and should really be avoided.

    edit 1

    Hmmm, in that case, do you actually need the ON DUPLICATE KEY UPDATE? If you’re doing an INSERT … SELECT, then you probably do. But if your app is supplying the data, just do it by hand: do the update (mapping zip = NULL to zip IS NULL), check how many rows were changed (MySQL returns this), and if 0, do an insert.
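The by-hand approach could look roughly like the sketch below, written against the Summary table from the question. The user variables and their values are hypothetical stand-ins for data supplied by the application. Note one deliberate change from the description above: instead of checking the changed-row count, the insert is guarded by NOT EXISTS. By default MySQL reports only rows whose values actually changed, so an update that matches a row but leaves it identical would report 0 and trigger a duplicate insert.

```sql
-- Hypothetical values for one summary row, as the application would supply them.
SET @day = '2026-05-12', @campaign_id = 1, @filter_id = NULL,
    @sessions = 42, @transactions = 7;

-- Step 1: overwrite the row if it exists. The NULL-safe operator <=>
-- matches filter_id whether @filter_id is NULL or an actual ID.
UPDATE `Summary`
SET `sessions` = @sessions
    , `transactions` = @transactions
WHERE `day` = @day
  AND `campaign_id` = @campaign_id
  AND `filter_id` <=> @filter_id;

-- Step 2: insert the row only if no matching row exists.
INSERT INTO `Summary` (`day`, `campaign_id`, `filter_id`, `sessions`, `transactions`)
SELECT @day, @campaign_id, @filter_id, @sessions, @transactions
FROM DUAL
WHERE NOT EXISTS (
    SELECT 1
    FROM `Summary`
    WHERE `day` = @day
      AND `campaign_id` = @campaign_id
      AND `filter_id` <=> @filter_id
);
```

The two statements are not atomic on a non-transactional MyISAM table, so this assumes a single writer (or an explicit table lock) during summarization.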

© 2021 The Archive Base. All Rights Reserved