I have a calendar table Excel file. Every month a new column is added.
I can't rely on the column names, because they change every month. When I developed the ETL in May, I added dummy columns through December. The Excel file looked like this (columns):
customer jan12 feb12 mar12 apr12 may12
To build the metadata in the ETL, I added dummy columns through December:
customer jan12 feb12 mar12 apr12 may12 mon mon mon mon mon mon mon
Then in SSIS Excel Source I wrote a query:
select * from [Sheet1$A2:M1624]
(starting at row 2 skips the column names, and ending at column M reads through the December column)
Now in June the Excel file came like this:
customer jan12 feb12 mar12 apr12 may12 jun12
Since I had already created the metadata, I expected the load to go smoothly. Instead, the ETL failed.
At run time, SSIS doesn't load blank columns from an Excel file, even if the metadata for those columns was provided when the source was created.
Problem scenario: we define 10 columns in the Excel source and map them to 10 columns at the destination. If the Excel source encounters only 3 columns at run time, it automatically rebuilds its metadata with only 3 columns. Since the destination is still mapped to 10 columns, the package fails in the validation phase.
So, to read an Excel file with a varying number of columns, we need to rely on a Script task. The limitation is that you must know at least the maximum number of columns the Excel file can have. That is something we can live with.
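The idea behind the Script task approach can be sketched outside SSIS: read each row with however many columns it has, then pad it to the known maximum so the downstream mapping always sees the same shape. The snippet below is a minimal, language-neutral illustration in Python, not actual Script task code. The names MAX_COLUMNS and pad_row and the sample values are assumptions made up for this example; the width of 13 corresponds to the customer column plus twelve month columns (A to M).

```python
from typing import List, Optional, Sequence

# Assumption: customer column plus 12 month columns (Excel columns A..M).
MAX_COLUMNS = 13


def pad_row(row: Sequence[Optional[str]],
            width: int = MAX_COLUMNS) -> List[Optional[str]]:
    """Return a copy of the row padded with None (NULL) up to a fixed width.

    Raises ValueError if the row is wider than the known maximum,
    since the destination has no column to hold the extra values.
    """
    if len(row) > width:
        raise ValueError(
            f"row has {len(row)} columns, expected at most {width}"
        )
    return list(row) + [None] * (width - len(row))


# Hypothetical June row: customer plus six month values (jan12..jun12).
june_row = ["Acme", "10", "12", "9", "14", "11", "13"]
padded = pad_row(june_row)
```

With this shape, the months that have not arrived yet reach the destination as NULLs, and a file with more columns than the agreed maximum fails loudly instead of being silently truncated.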
Solution