Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 713049
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 14, 20262026-05-14T04:55:03+00:00 2026-05-14T04:55:03+00:00

I’m tasked with providing a list of metadata requirements our data warehouse developers might

  • 0

I’m tasked with providing a list of metadata requirements our data warehouse developers might need.

This is not the business metadata (nice descriptions etc), but rather data required for change management (also known as impact assesment), data lineage etc.

I’ve seen this article Meta Meta Data Data – Ralph Kimball but as I’m not the first person to do this I’m throwing it to the SO community.

The actual question is this:
What metadata do datawarehouse developers require to design, develop and manage change in ETL routines?

PS: I’m trying to keep the answer platform agnostic but for context this is an Oracle database with PL/SQL and Datastage.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-14T04:55:04+00:00Added an answer on May 14, 2026 at 4:55 am

    At the most rudimentary level you could maintain a listing of tablename.fieldname’s that are extracted from the source system. If you are extracting from files or other non-database sources then you may or may not be able to list the dependencies as granular as the field level. Such meta data would tell you only that If(source field/fileFormat/table changes) then (some change may be needed in the ETL process). I say may because some changes may be minor enough that the ETL process is not broken, but one should still perform testing and data profiling to ensure it doesn’t invalidate any assumptions that were in place during the initial design of the ETL process.

    While the ETL developer may be intimately familiar with the source system and ETL process, for large ETL projects it would probably be wise to tie each dependency in the source system to specific processes/components of the ETL system. For example, if your ETL process is composed of many stored procedures, then you would want the metadata to relate each source system field/fileFormat/table/etc. to each stored procedure. This would be a many-to-many relationship since many stored procedures might depend on a particular field, and a single stored procedure may depend on many source fields. This would be something manually updated by the ETL developers, as they create or update the ETL system, it would be their responsibility to recognize what fields they are loading/cleaning/conforming and subsequently input that information into the metadata that tracks these dependencies.

    If your ETL system is powered by something besides stored procedures, or some combination there of, then you will need to come up with some naming scheme to reference these components. But the concept is the same, you relate source fields/file formats/etc. to the components of the ETL system.

    This will give you enough information to say that where “Person.FirstName” changes in the source system in some way, then you can compile a report that shows all the stored procedures which will need to be verified, potentially updated, and tested to cope with the change in the source system.

    This kind of implies that knowing that Person.FirstName changed in some way, either size, datatype, and/or removed entirely, requires a manual step of being notified of the change via some database designer, and taking action in response. If you want a really sophisticated system then you’d need to have triggers/audit on DDL that changes your source system so that you can automatically log and notify your ETL architects of the change.

    Should such a change occur, you’d know that sp_Person_Load, sp_Person_Clean, sp_Person_Transform stored procedures all have some dealings with the Person.FirstName field, because the author of those stored procedures noted that in the metadata documenting dependencies.

    You could make it more complicated, where sp_Person_Clean doesn’t depend on Person.Firstname, but actually depends on sp_Person_Load. Such that you build a chain of dependencies. This would make reporting on changes more complicated because you would have to chain dependencies together to determine the impact of a source system change. And you are also building a complex string of dependencies, and potential circular references, that might make maintaining the metadata as difficult as maintaining the ETL process itself. If the ETL system is simple enough that the ETL architect can define dependencies in terms of the fields/files/tables from the source system, then do that to keep things simple.

    Depending on who has authority over your data warehouse, the destination system, you may need to track these dependencies. Often the ETL developer is also the data warehouse developer. However, if someone else is the data warehouse designer, and they have the authority to make changes to the data warehouse, then your ETL developers will need to have a similar system, whether it be automated or manual, to track and be notified of changes and the impact they will have on the ETL process.

    Really when I think about how changes should be tracked, I think about the boundaries of authority. If a ETL developer changes his sp_Person_Clean procedure, he doesn’t need emtadata to tell him that sp_Person_Transform will need to be updated/tested. He already knows this very intuitively. On the other hand, if a third party/vendor system changes, or if a business department within the same organization changes a spreadsheet or file format, then those are things not enacted by the ETL developer. He won’t have the same level of intimacy with the source systems as he does with his own ETL system and data warehouse. Thus he will benefit the most from metadata that shows how components of the source system relates to components of the ETL system.

    Deciding how granular you want to define “components” will really depend on how the systems are designed, and how much meta data you want the developer to document. Too granular and you drown in the metadata. Too course and it is like saying “My house is in Florida”, which doesn’t really help the ETL developer in his work. If the entire ETL process is coded in a single SQL script, then you only have one component. So the system needs to be designed in advance knowing that you need to be able to refer to specific components and/or steps of the ETL process.

    The metadata will only be any good if the developer is diligent in updating it. There are systems/toolkits that can automatically update this kind of metadata, but they would have to box you into their toolkit so that the tool can analyze the dependencies. I have little confidence that these systems are very reliable. There are often times you have to do really hackish things to get data from a source system into the destination in the desired format, and I can imagine a dependency analyzer couldn’t understand the dependencies. For example, if you were using dynamic SQL formed from strings, then the toolkit can’t really figure out what the dependencies are.

    However, I will say I have never dove into great depth into the tools to really know how good they are. I always have gotten to the point that I found things that would normally be easy in SQL to be very cumbersome in the tool, and decided it was more trouble than it was worth. I keep telling myself I will pick something up like Talend and really become an expert at it before I make my final verdict, but there are always other priorities pulling me in other directions.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a string like this: La Torre Eiffel paragonata all’Everest What PHP function
I'm parsing an RSS feed that has an ’ in it. SimpleXML turns this
I need a function that will clean a strings' special characters. I do NOT
I have some data like this: 1 2 3 4 5 9 2 6
link Im having trouble converting the html entites into html characters, (&# 8217;) i
For some reason, after submitting a string like this Jack’s Spindle from a text
this is what i have right now Drawing an RSS feed into the php,
I have this code to decode numeric html entities to the UTF8 equivalent character.
In my XML file chapters tag has more chapter tag.i need to display chapters
We're building an app, our first using Rails 3, and we're having to build

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.