The Archive Base

Editorial Team
Asked: May 23, 2026

A dataset I receive for routine refresh purposes contains a date field that’s actually

A dataset I receive for routine refresh purposes contains a date field that’s actually VARCHAR.

As this will be an indexed/searched field, I’m left with…
1) Converting the field to DATETIME and validating and normalizing the data values when refreshing

or…
2) Leaving the data as-is and forming my queries to accommodate various valid date formats, i.e.,
WHERE DateField = 'CCYYMMDD' OR DateField = 'MM/DD/CCYY' OR ….

The refresh would be on a monthly basis; “cleaning” the data would add about 35% time to the ETL cycle. My queries on the date field would all be equalities; I do not need to range search.
Also, I’m a one-man shop, so the more hands-off the overall solution, the better.

So which scenario am I better off doing? All opinions appreciated.


1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 1:19 am

    I think this is a great question. Here’s my opinion:

    I’m a big believer in the idea that, in the long run, you’ll save more time and have fewer headaches by using data types for their intended purpose. That means dates in date fields, characters in character fields, etc. If you go with option 2, you’ll need to remember to code for every possible date format each time you query the table. If you set this down and come back a year from now, are you going to remember them all?

    By contrast, if you use a date field and do the upfront work in the ETL process of dealing with the dates properly, you will always know just how to interact with the field. And I’m not even going into performance implications.
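To illustrate the difference at query time, here is a small sketch using Python's built-in sqlite3 module. The table names (raw_feed, clean_feed), the column name and the sample rows are all made up for the example. SQLite has no native DATETIME type, so ISO-8601 text stands in for the normalized date column; in your own database the clean column would be a real DATETIME.

```python
import sqlite3

# Hypothetical tables: raw_feed keeps the VARCHAR values as delivered,
# clean_feed holds the same rows after normalization in the ETL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_feed (id INTEGER PRIMARY KEY, date_field TEXT)")
conn.execute("CREATE TABLE clean_feed (id INTEGER PRIMARY KEY, date_field TEXT)")
conn.execute("CREATE INDEX ix_clean_date ON clean_feed (date_field)")

conn.executemany(
    "INSERT INTO raw_feed VALUES (?, ?)",
    [(1, "20240131"), (2, "01/31/2024"), (3, "20240201")],
)
conn.executemany(
    "INSERT INTO clean_feed VALUES (?, ?)",
    [(1, "2024-01-31"), (2, "2024-01-31"), (3, "2024-02-01")],
)

# Option 2: every query has to list every spelling of the same date.
raw_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM raw_feed "
        "WHERE date_field = ? OR date_field = ? ORDER BY id",
        ("20240131", "01/31/2024"),
    )
]

# Option 1: one canonical value, one equality predicate.
clean_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM clean_feed WHERE date_field = ? ORDER BY id",
        ("2024-01-31",),
    )
]
```

Both queries find the same two rows here, but the first one only stays correct for as long as its OR list matches every format that has ever appeared in the feed.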

    And in this case, I’m not sure you’ll even see a short-term benefit. If there are, for example, five different possible date formats in the source data, you’ll need to account for them one way or another: either in the ETL or in the output queries. The code to transform those five formats in the ETL is not materially more complicated than the code to handle them in the output queries.
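As a rough idea of what the ETL side might look like, here is a minimal Python sketch of a multi-format normalizer. The first two formats come from the question (CCYYMMDD and MM/DD/CCYY); the third format and the names KNOWN_FORMATS and normalize_date are assumptions for illustration, not something the source data is known to contain.

```python
from datetime import date, datetime
from typing import Optional

# Formats are tried in order. The first two are from the question;
# the ISO-style third entry is an assumed extra for illustration.
KNOWN_FORMATS = ("%Y%m%d", "%m/%d/%Y", "%Y-%m-%d")


def normalize_date(raw: str, formats=KNOWN_FORMATS) -> Optional[date]:
    """Return a date for the first format that matches, else None."""
    text = raw.strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    return None  # no known format matched; leave it for manual review
```

Supporting a new format means adding one entry to the tuple. One caveat: a value like 03/04/2024 cannot be told apart as MM/DD or DD/MM by parsing alone, so the list should only contain formats the supplier actually uses.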

    And if the data could literally arrive in an infinite number of formats, you have big problems either way. Either your ETL will break or your queries will break. It is, to a certain extent, an irreducible complexity.

    I would suggest that you take the time to code the proper transforms into your ETL. But do yourself a favor and code a preprocessing step that identifies dates in formats that won’t transform properly and alerts you to them. If you see patterns (that is, if any format shows up more than once), code a transform for it. Over time you’ll be left manually cleaning fewer and fewer of those nasty dates. With luck, your 35% will drop to 5% or less.
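One possible shape for that preprocessing step, sketched in Python: reduce every value that fails to parse to a "shape" (digits become 9, letters become A) and count the shapes, so recurring formats stand out. The helper names (parses, shape, audit, recurring) and the two known formats are assumptions for the example, and the values are assumed to arrive as strings.

```python
from collections import Counter
from datetime import datetime

# Assumed starting point: the two formats named in the question.
KNOWN_FORMATS = ("%Y%m%d", "%m/%d/%Y")


def parses(raw, formats=KNOWN_FORMATS):
    """True if the value matches at least one known format."""
    for fmt in formats:
        try:
            datetime.strptime(raw.strip(), fmt)
            return True
        except ValueError:
            continue
    return False


def shape(raw):
    """Reduce a value to its pattern: digits -> 9, letters -> A."""
    out = []
    for ch in raw.strip():
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep separators such as - / . as they are
    return "".join(out)


def audit(values, formats=KNOWN_FORMATS):
    """Count the shapes of all values that no known format can parse."""
    return Counter(shape(v) for v in values if not parses(v, formats))


def recurring(report, min_count=2):
    """Shapes seen at least min_count times: candidates for a new transform."""
    return [s for s, n in report.most_common() if n >= min_count]
```

Run against a monthly file, the output of audit is the alert list, and anything returned by recurring is worth a new transform rather than another round of manual cleanup.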

    Good luck!

© 2021 The Archive Base. All Rights Reserved