I am trying to load a text file into a database. The file is about 1.6 GB. I need to write a Python script that loads the text file, including all of its headers, into a database.
Any guidelines on how to go about doing this?
Thanks.
Using Python is certainly possible. If you're loading into MySQL, you might check out mysql-python. To read the text file, you can use file = open('filename', 'r') and then file.readline() to get each line and parse it. Reading one line at a time keeps memory use low, which matters for a 1.6 GB file.
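A minimal sketch of that approach, under some assumptions: the file is comma-delimited with the column headers on the first line, and the standard library's sqlite3 stands in for MySQL so the example runs without a server. The function name, the `records` table name and the batch size are hypothetical. With mysql-python the overall shape should be similar, but the placeholder is `%s` rather than `?` and the table would normally be created with explicit column types.

```python
import csv
import sqlite3
from itertools import islice


def load_delimited_file(path, conn, table="records", delimiter=",", batch_size=10_000):
    """Stream a delimited text file into `table`, using the first row as column names.

    Assumes trusted header names (they are only wrapped in double quotes)
    and that every row has the same number of fields as the header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)  # first line holds the column headers
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'

        total = 0
        while True:
            # Pull a bounded batch so the whole file is never held in memory.
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            conn.executemany(insert_sql, batch)
            conn.commit()
            total += len(batch)
    return total
```

Calling `load_delimited_file("data.txt", sqlite3.connect("data.db"))` would return the number of data rows inserted. Committing once per batch rather than once per row is usually the main thing that keeps a load of this size tolerable.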
However, using Python adds overhead. If the text file is orderly (that is, one record per row, each row having the same number of columns with a consistent delimiter such as a comma, tab, or semicolon), then the most efficient way is to load it directly. In MySQL, you'd do this with a LOAD DATA INFILE statement.
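The statement from the original answer is not preserved here, so the following is only a sketch of what such a statement could look like, wrapped in a small Python helper so it can be built from a script. The helper name, the example path and the table name are hypothetical, and nothing is escaped, so it assumes trusted inputs. `IGNORE 1 LINES` skips a header row; drop it if the first line is data.

```python
def build_load_data_sql(path, table, delimiter=",", skip_header=True, local=False):
    """Return a MySQL LOAD DATA statement for a delimited text file.

    Assumes `path`, `table` and `delimiter` are trusted values;
    nothing is escaped or validated here.
    """
    keyword = "LOAD DATA LOCAL INFILE" if local else "LOAD DATA INFILE"
    parts = [
        f"{keyword} '{path}'",
        f"INTO TABLE {table}",
        f"FIELDS TERMINATED BY '{delimiter}'",
        r"LINES TERMINATED BY '\n'",  # raw string: MySQL itself interprets \n
    ]
    if skip_header:
        parts.append("IGNORE 1 LINES")  # skip the header row
    return "\n".join(parts) + ";"


print(build_load_data_sql("/tmp/data.txt", "mytable"))
```

Without `LOCAL`, MySQL reads the file on the server host; with `LOCAL`, the client sends the file, which requires that local loading is enabled on both sides. The target table has to exist already, with columns in the same order as the file.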
If the file needs minor modifications first, such as changing commas or trimming characters at the beginning or end of each line, you might use sed on the command line (it ships with *nix and OS X; on Windows you'll have to install it).
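If sed is not available, the same kind of cleanup can be done in Python one line at a time. This is a rough sketch: the function names, the semicolon-to-comma swap and the characters being trimmed are all assumptions chosen for illustration, and the plain `replace` would also change delimiters that appear inside quoted fields.

```python
def clean_lines(lines, old_delimiter=";", new_delimiter=",", strip_chars=" \t|"):
    """Yield each line with edge characters trimmed and the delimiter swapped."""
    for line in lines:
        body = line.rstrip("\r\n").strip(strip_chars)
        if body:  # skip lines that are empty after trimming
            yield body.replace(old_delimiter, new_delimiter) + "\n"


def clean_file(source_path, target_path, **options):
    """Rewrite source_path into target_path without loading it all into memory."""
    with open(source_path, encoding="utf-8") as source, \
         open(target_path, "w", encoding="utf-8") as target:
        target.writelines(clean_lines(source, **options))
```

Because `clean_lines` is a generator, `clean_file` only ever holds one line at a time, so it should cope with a 1.6 GB input.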
Update
LOAD DATA INFILE will be quickest: http://dev.mysql.com/doc/refman/5.5/en/load-data.html
When you say “start of article 1. some text 2. some text 3. some text MAINO”, are 1., 2., 3. and MAINO different fields? If you had 2 fields such as a header and an article, you might format your text document so that each line holds one record, with a consistent delimiter between the header and the article.
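One way to produce such a two-field file from Python is sketched below. It assumes the records have already been split into (header, article) pairs, which depends on the real layout of the source text, and it assumes a tab as the delimiter. The function name is hypothetical.

```python
import io


def write_two_field_file(records, target):
    """Write (header, article) pairs as one tab-separated record per line."""
    count = 0
    for header, article in records:
        # Collapse every run of whitespace (tabs and line breaks included)
        # to a single space so each record stays on one line.
        fields = [" ".join(value.split()) for value in (header, article)]
        target.write("\t".join(fields) + "\n")
        count += 1
    return count


# Usage with an in-memory buffer; a real run would pass an open file instead.
buffer = io.StringIO()
write_two_field_file([("Title one", "1. some text\n2. some text")], buffer)
```

Collapsing whitespace loses the original line breaks inside an article. If those matter, an alternative is to keep them and quote the fields, at the cost of a more involved load statement.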
Then you could load it with LOAD DATA INFILE, naming the file's field delimiter in the FIELDS TERMINATED BY clause.