Below given SSIS script task takes in a ticker symbol and the date range you would like to get back and returns a CSV formatted download that can be used to extract the price history but it does not work and I have no idea why. Full information about this SSIS well-thought-out concept can be found in the below link.
SSIS / ETL Example – Yahoo Equity & Mutual Fund Price History
You can download the sample SSIS package from the below link.
Following SSIS script task has no errors but it does not download a file:
I just started to understand this code, and it seemed that this link is subdivided to components like here, and with the correct values it must work but I don’t understand why it is not retrieving the file.
ichart.finance.yahoo.com/table.csv?s={symbol}&a={startMM}&b={startDD}&c=
{startYYYY}&d={endMM}&e={endDD}&f={endYYYY}&g={res}&ignore=.csv
Script task code that I am using:
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Configuration;
using System.Collections.Generic;
using System.Data.Sql;
using System.Data.SqlClient;
using System.Net;
using System.Collections.Specialized;
using System.Linq;
using Hash = System.Collections.Generic.Dictionary<string, string>;
namespace ST_361aad0e48354b30b8152952caab8b2b.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
static string dir;
static DateTime end;
const string CSV_FORMAT = "Id,Cusip,Date,Open,High,Low,Close,Volume,Adj Close";
public void Main()
{
// end date is today minus one day (end of day)
end = DateTime.Now;
// output directory stored in SSIS variable
// which can be set at runtime
dir = System.IO.Path.Combine(Dts.Variables["OutputCSV"].Value.ToString(), end.ToString("yyyyMMdd"));
if (!System.IO.Directory.Exists(dir))
System.IO.Directory.CreateDirectory(dir);
// connection string to our database
var connectionString = Dts.Variables["ConnectionString"].Value.ToString();
// the sql command to execute
var sql = Dts.Variables["PriceHistorySqlCommand"].Value.ToString();
var list = new List<Hash>();
using (var cnn = new SqlConnection(connectionString))
{
cnn.Open();
using (var cmd = new SqlCommand(sql, cnn))
{
cmd.CommandTimeout = 0;
var dr = cmd.ExecuteReader();
while (dr.Read())
{
// store result in temporary hash
var h = new Hash();
h["cusip"] = dr["cusip"].ToString();
h["symbol"] = dr["symbol"].ToString();
h["product_id"] = dr["product_id"].ToString();
h["last_price_dt_id"] = dr["last_price_dt_id"].ToString();
list.Add(h);
// process batches of 100 at a time
// (This requires System.Threading.dll (CTP of parallel extensions) to be installed in the GAC)
if (list.Count >= 100)
{
System.Threading.Tasks.Parallel.ForEach(list, item =>
{
var dt = item["last_price_dt_id"].TryGetDateFromDateDimensionId(end.AddYears(-100));
DownloadPriceHistory(item["product_id"], item["cusip"], item["symbol"], dt);
});
list.Clear();
}
}
}
}
// TODO: Add your code here
Dts.TaskResult = (int)ScriptResults.Success;
}
static void DownloadPriceHistory(string id, string cusip, string symbol, DateTime begin)
{
// get write path
var path = System.IO.Path.Combine(dir, cusip + ".csv");
var url = String.Format("http://ichart.finance.yahoo.com/table.csv?s={0}&d={1}&e={2}&f={3}&g=d&a={4}&b={5}&c={6}&ignore=.csv",
symbol.ToUpper(),
(end.Month - 1).ToString("00"), end.Day.ToString("00"), end.Year,
(begin.Month - 1).ToString("00"), begin.Day.ToString("00"), begin.Year);
string csv;
using (WebClient web = new WebClient())
{
try
{
var text = web.DownloadString(url);
var lines = text.Split('\n');
System.Text.StringBuilder sb = new System.Text.StringBuilder();
int i = 0;
foreach (var line in lines)
{
// skip first line its a header
if (i == 0)
sb.AppendLine(CSV_FORMAT);
// ensure line being added is not null
else if (false == String.IsNullOrEmpty(line) && false == String.IsNullOrEmpty(line.Trim()))
sb.AppendLine(id + "," + cusip + "," + line);
i++;
}
// add header and body
csv = sb.ToString();
}
catch (System.Net.WebException)
{
// 404 error
csv = CSV_FORMAT;
}
}
System.IO.File.WriteAllText(path, csv);
}
}
/// <summary>
/// Some simple extension methods.
/// </summary>
public static class ExtensionMethods
{
/// <summary>
/// Gets a datetime object from a dimension id string for example '20090130' would be translated to
/// a proper datetime of '01-30-2009 00:00:00'. If the string is empty than we default to the passed
/// in <paramref name="defaultIfNull"/>.
/// </summary>
/// <param name="str">The string</param>
/// <param name="defaultIfNull">The default null.</param>
/// <returns>Returns the datetime.</returns>
public static DateTime TryGetDateFromDateDimensionId(this string str, DateTime defaultIfNull)
{
if (String.IsNullOrEmpty(str)) return defaultIfNull;
return DateTime.Parse(str.Substring(4, 2) + "/" + str.Substring(6, 2) + "/" + str.Substring(0, 4));
}
}
}
Import ticker price history from Yahoo Finance Chart website using SSIS:
There is another way to import the ticker symbol price history from Yahoo Chart website into database using SSIS. Here is a sample package written using
SSIS 2008 R2with database inSQL Server 2008 R2Create an SSIS package named (say
SO_14797886.dtsx) usingBusiness Intelligence Development Studio (BIDS)and create anOLE DBconnection manager/data source that connects to your database. This sample uses the data sourceOLEDB_Sora.dsthat connects to the databaseSoraon my local machine running the instanceKIWI\SQLSERVER2008R2.KIWIis the machine name andSQLSERVER2008R2is the instance name.Execute the below given script in the database to create two tables.
Table
dbo.TickerSymbolswill hold the information about list of ticker symbols and the start and end dates for which you would like to import the price files along with the resolution of the import. Resolution can contain values likedforday;wforweekly;mformonthly; andyforyearly.Table
dbo.TickerPriceHistorywill hold the price history information of the symbols downloaded from Yahoo Finance Chart website.Insert script has added four records for the ticker symbols
AAPL(Apple);MSFT(Microsoft);GOOG(Google); andYHOO(Yahoo). Each record is set with different date ranges and resolution.Script to create tables and insert few ticker symbols data:
On the SSIS package, create the following variables.
EndDate: The package will use this variable of data typeDateTimeto hold the end date of the symbol being looped through in the record set list.FileExtension: This variable of data typeStringwill hold the file extension to use for the downloaded files. This is optional.FileName: This variable of data typeStringwill hold the name of the file for a given symbol. The name is generated based on timestamp to avoid overwriting previously downloaded files. Click the variable and press F4 to view properties. Change the propertyEvaluateAsExpressiontoTrue. Click on the Ellipsis button against Expression to open the Expression Builder. Set theExpressionto the following value. This expression will evaluate to value likeMSFT_20130210_092519.csv, where MSFT is the symbol and the rest of the information is package start time in the formatyyyMMdd_hhmmssand.csvis the file extension.FilePath: This variable of data typeStringwill hold the complete path of the downloaded file for a given symbol. Click the variable and press F4 to view properties. Change the propertyEvaluateAsExpressiontoTrue. Click on the Ellipsis button against Expression to open the Expression Builder. Set theExpressionto the value@[User::RootFolder] + "\\" + @[User::FileName]. We will use this express toResolution: The package will use this variable of data typeStringto hold the reolution information of the symbol being looped through in the record set list.RootFolder: This variable of data typeStringwill hold the root folder where the files should be downloaded to.SQL_GetSymbols: This variable of data typeStringwill contain the T-SQL query to fetch the ticker symbols information from database. Set the value toSELECT Symbol, StartDate, EndDate, Resolution FROM dbo.TickerSymbolsStartDate: The package will use this variable of data typeDateTimeto hold the start date of the symbol being looped through in the record set list.Symbol: The package will use this variable of data typeStringto hold the ticker symbol as it loops through each record in the record set list.SymbolsList: The package will use this variable of data typeObjectto hold the result set of ticker symbols stored in the database.URLYahooChart: This variable of data typeStringwill hold the URL to Yahoo Finance Chart website with place holders to fill in the appropriate values for query string. Set the value tohttp://ichart.finance.yahoo.com/table.csv?s={0}&a={1}&b={2}&c={3}&d={4}&e={5}&f={6}&g={7}&ignore=.csvOn the package, right-click on the Connection Managers tab and click
Flat File Connection...On the
Generalpage of Flat File Connection Manager Editor, perform the following actions:Set the Name to
FILE_TickerPriceHistorySet the Description to
Read the ticker symbol price history.If you already have a sample file, point to the file location. SSIS will infer the settings from the data in the file. In this case, I already downloaded a file by navigating the URL
http://ichart.finance.yahoo.com/table.csv?s=MSFT&a=9&b=1&c=2012&d=11&e=30&f=2012&g=m&ignore=.csvand saved it under the nameC:\Siva\StackOverflow\Files\14797886\Data\\MSFT_20130210_092519.csvMake sure the Format is set to
Delimited.Make sure the Header row delimiter is set to
{CR}{LF}Check the box
Column names in the first data rowClick Columns page
On the
Columnspage of Flat File Connection Manager Editor, make sure that the Row delimiter is set to{LF}and Column delimiter is set toComma {,}. Click Advanced page.On the
Advancedpage of Flat File Connection Manager Editor, the columns will be created based on the file information. Change the values as shown below so that column names match with names in the database. This way the column mapping will be easier. All columns except the last column should have the ColumnDelimiter set toComma {,}. Lsst column should have the the ColumnDelimiter set to{LF}.You should now see both the connection managers at the bottom of the package.
Drag and drop an
Execute SQL Taskon to the Control Flow tab and perform the following actions on the General tab.Get symbols from databaseFetch the list of symbols and its download settings from database.Full result setbecause the query will return a record set.OLE DBOLEDB_SoraVariablefrom SQLSourceTypeUser::SQL_GetSymbolsfrom SourceVariableOn the Result Set page of Execute SQL Task, click Add and set Result Name to
0indicating the index of the result set. SelectUser::SymbolsListfrom the Variable Name to store the result set into object variable.Drag and drop a Foreach Loop Container and place it after the Execute SQL Task. Connect the Execute SQL Task green arrow to Foreach Loop Container. Double-click Foreach Loop Container to view Foreach Loop Editor. Configure the ForEach Loop Editor as shown below.
On the Variable Mappings page of ForEach Loop Editor, configure it as shown below:
Drag and drop a
Script Taskinside the ForEach Loop Container. Double-click the Script Task to open the Script Task Editor. On the Script page of the Script Task editor, click the Ellipsis button against theReadOnlyVariablesand select the below listed variables. We need to use these inside the Script Task code.Click the
Edit Script...button on the Script Task Editor and type the below code. After typing the code, close the Script Task Editor.Script Task code in C#:
Drag and drop a
Data Flow Taskinside the Foreach Loop Container after the Script Task. Connect the green arrow from Script Task to the Data Flow Task. The Control Flow tab should look as shown below.On the Data Flow Task, drag and drop Flat File Source and configure it as shown below to read the price history CSV files.
Drag and drop a
Derived Column Transformationand create a new column namedSymbolwith the expression(DT_STR,10,1252)@[User::Symbol]to add the Symbol to the data pipeline.Drag and drop OLE DB Destination and configure it as shown below to insert the data into the database.
Your Data Flow tab should look like as shown below:
Before running the package, we need to make couple of changes to prevent any warnings or errors on the design time view due to the absence of the files in the folder.
Click the flat file connection manager
FILE_TickerPriceHistoryand pressF4to view the properties. Change the propertyDelayValidationtoTrue. This will make sure that the validation of file existence will happen during runtime. Click the Ellipsis button against the Expression and set theConnectionStringproperty to the value@[User::FilePath]. This will change the file path as each file is being downloaded from the website.Click the Data Flow Task and press
F4to view the properties. Change the propertyDelayValidationtoTrue. This will make sure that the validation of file existence will happen during runtime.Navigate to the Data Flow tab and click the Flat File Source and press
F4to view the properties. Change the propertyValidateExternalMetadatatoFalse. This will make sure that the validation of flat file existence will happen during runtime.Let us navigate to the folder
C:\Siva\StackOverflow\Files\14797886, where the downloaded files will be saved is empty. The folder does not have to be empty. This is just for execution check.Run the following SQL statements against the database to verify the data in the table. The second table should be empty.
Execute the package. If everything is set up correctly, the package should run successfully and download the files for each symbol listed in table
dbo.TickerSymbolsThe files should be successfully saved to the folder
C:\Siva\StackOverflow\Files\14797886. Notice that each file is named appropriately based on the expressions provided in the package.Run the following SQL statement against the database to verify the data in the table. The table
dbo.TickerPriceHistoryshould now have the data from the price files downloaded from website.The above sample package illustrated how to download price files from Yahoo Finance Chart website for a given list of ticker symbols and load them into the database.