Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 82947
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 10, 20262026-05-10T21:41:48+00:00 2026-05-10T21:41:48+00:00

Table1: Everything including the kitchen sink. Dates in the wrong format (year last so

  • 0

Table1: Everything including the kitchen sink. Dates in the wrong format (year last so you cannot sort on that column), Numbers stored as VARCHAR, complete addresses in the ‘street’ column, firstname and lastname in the firstname column, city in the lastname column, incomplete addresses, Rows that update preceeding rows by moving data from one field to another based on some set of rules that has changed over the years, duplicate records, incomplete records, garbage records… you name it… oh and of course not a TIMESTAMP or PRIMARY KEY column in sight.

Table2: Any hope of normalization went out the window upon cracking this baby open. We have a row for each entry AND update of rows in table one. So duplicates like there is no tomorrow (800MB worth) and columns like Phone1 Phone2 Phone3 Phone4 … Phone15 (they are not called phone. I use this for illustration) The foriegn key is.. well take guess. There are three candidates depending on what kind of data was in the row in table1

Table3: Can it get any worse. Oh yes. The ‘foreign key is a VARCHAR column combination of dashes, dots, numbers and letters! if that doesn’t provide the match (which it often doesn’t) then a second column of similar product code should. Columns that have names that bear NO correlation to the data within them, and the obligatory Phone1 Phone2 Phone3 Phone4… Phone15. There are columns Duplicated from Table1 and not a TIMESTAMP or PRIMARY KEY column in sight.

Table4: was described as a work in progess and subject to change at any moment. It is essentailly simlar to the others.

At close to 1m rows this is a BIG mess. Luckily it is not my big mess. Unluckily I have to pull out of it a composit record for each ‘customer’.

Initially I devised a four step translation of Table1 adding a PRIMARY KEY and converting all the dates into sortable format. Then a couple more steps of queries that returned filtered data until I had Table1 to where I could use it to pull from the other tables to form the composit. After weeks of work I got this down to one step using some tricks. So now I can point my app at the mess and pull out a nice clean table of composited data. Luckily I only need one of the phone numbers for my purposes so normalizing my table is not an issue.

However this is where the real task begins, because every day hundreds of employees add/update/delete this database in ways you don’t want to imagine and every night I must retrieve the new rows.

Since existing rows in any of the tables can be changed, and since there are no TIMESTAMP ON UPDATE columns, I will have to resort to the logs to know what has happened. Of course this assumes that there is a binary log, which there is not!

Introducing the concept went down like lead balloon. I might as well have told them that their children are going to have to undergo experimental surgery. They are not exactly hi tech… in case you hadn’t gathered…

The situation is a little delicate as they have some valuable information that my company wants badly. I have been sent down by senior management of a large corporation (you know how they are) to ‘make it happen’.

I can’t think of any other way to handle the nightly updates, than parsing the bin log file with yet another application, to figure out what they have done to that database during the day and then composite my table accordingly. I really only need to look at their table1 to figure out what to do to my table. The other tables just provide fields to flush out the record. (Using MASTER SLAVE won’t help because I will have a duplicate of the mess.)

The alternative is to create a unique hash for every row of their table1 and build a hash table. Then I would go through the ENTIRE database every night checking to see if the hashs match. If they do not then I would read that record and check if it exists in my database, if it does then I would update it in my database, if it doesn’t then its a new record and I would INSERT it. This is ugly and not fast, but parsing a binary log file is not pretty either.

I have written this to help get clear about the problem. often telling it to someone else helps clarify the problem making a solution more obvious. In this case I just have a bigger headache!

Your thoughts would be greatly appreciated.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. 2026-05-10T21:41:49+00:00Added an answer on May 10, 2026 at 9:41 pm

    The Log Files (binary Logs) were my first thought too. If you knew how they did things you would shudder. For every row there are many many entries in the log as pieces are added and changed. Its just HUGE! For now I settled upon the Hash approach. With some clever file memory paging this is quite fast.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Ask A Question

Stats

  • Questions 105k
  • Answers 106k
  • Best Answers 0
  • User 1
  • Popular
  • Answers
  • Editorial Team

    How to approach applying for a job at a company ...

    • 7 Answers
  • Editorial Team

    How to handle personal stress caused by utterly incompetent and ...

    • 5 Answers
  • Editorial Team

    What is a programmer’s life like?

    • 5 Answers
  • Editorial Team
    Editorial Team added an answer Here's a link to the docs for mpmovieplayercontroller: http://developer.apple.com/iphone/library/documentation/MediaPlayer/Reference/MPMoviePlayerController_Class/MPMoviePlayerController/MPMoviePlayerController.html#//apple_ref/occ/instm/MPMoviePlayerController/initWithContentURL: It's… May 11, 2026 at 8:49 pm
  • Editorial Team
    Editorial Team added an answer I would suggest definitely going to a full-on MVC framework.… May 11, 2026 at 8:49 pm
  • Editorial Team
    Editorial Team added an answer split creates an array recs in your code, and recs[1]… May 11, 2026 at 8:49 pm

Related Questions

I have a database with a report table. This table represents report data, and
I'm considering using PostgreSQL's Ltree module in my application to help with threaded comments.
On our live/production database I'm trying to add a trigger to a table, but
I'm developing a WinForms application (.Net 3.5, no WPF) where I want to be

Trending Tags

analytics british company computer developers django employee employer english facebook french google interview javascript language life php programmer programs salary

Top Members

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.