I need to extract some data from a 1 GB XML file into <key,value> tables using ets and dets. I have searched the web and this site, but I did not find any simple example of how to handle big XML files.
To begin with, I just want to understand how to read the file without loading all of it into memory.
What you need is a SAX XML parser, such as the one in Erlsom. For small files, it's possible to load the whole file into memory and then parse it, as in the answer I gave to this question. In your case, though, files this big need the SAX method. The SAX examples are here.
SAX ensures that you do not have to load the whole file into memory to parse it. The parser hands you each token (event) as it reads it. You will need to be comfortable with tail recursion, pattern matching and stateful programming.
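As a rough sketch of what that looks like in practice, the module below reads the file in chunks and stores the text of one chosen element in an ETS table. The module name sax_to_ets, the function names, the table name and the chunk size are all made up for illustration. It also assumes that erlsom:parse_sax/4 accepts a continuation_function option of the form {continuation_function, Fun, State}, which is how I understand the erlsom documentation; check it against the version you install.

```erlang
-module(sax_to_ets).
-export([load/2, handle_event/2, next_chunk/2]).

%% Hypothetical chunk size: bytes read from disk per continuation call.
-define(CHUNK, 65536).

%% Stream File through erlsom and return an ETS table holding one
%% {Tag, Text} row per occurrence of the element named Tag (a string).
load(File, Tag) ->
    Tab = ets:new(xml_values, [duplicate_bag, public]),
    {ok, Handle} = file:open(File, [read, raw, binary]),
    try
        %% Assumption: erlsom:parse_sax/4 takes a continuation_function
        %% option and calls next_chunk/2 whenever it runs out of input.
        {ok, _State, _Trailing} =
            erlsom:parse_sax(<<>>, {Tab, Tag, undefined},
                             fun handle_event/2,
                             [{continuation_function, fun next_chunk/2, Handle}]),
        Tab
    after
        file:close(Handle)
    end.

%% Continuation function: append the next chunk to the unparsed tail.
next_chunk(Tail, Handle) ->
    case file:read(Handle, ?CHUNK) of
        {ok, Data} -> {<<Tail/binary, Data/binary>>, Handle};
        eof -> {Tail, Handle}
    end.

%% SAX callback. The state is {Tab, Tag, Buf}: Buf is undefined outside
%% the wanted element and a reversed list of text fragments inside it.
handle_event({startElement, _Uri, Tag, _Prefix, _Attrs}, {Tab, Tag, _}) ->
    {Tab, Tag, []};
handle_event({characters, Text}, {Tab, Tag, Buf}) when is_list(Buf) ->
    {Tab, Tag, [Text | Buf]};
handle_event({endElement, _Uri, Tag, _Prefix}, {Tab, Tag, Buf}) when is_list(Buf) ->
    ets:insert(Tab, {Tag, lists:append(lists:reverse(Buf))}),
    {Tab, Tag, undefined};
handle_event(_Other, State) ->
    State.
```

With erlsom on the code path, something like Tab = sax_to_ets:load("big.xml", "title"), ets:lookup(Tab, "title"). should list every title text, where the file and element names are placeholders. Only one chunk plus the unparsed tail is held in memory at a time, and the same callback could write to a DETS table with dets:insert/2 instead.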
EDIT
Now, download erlsom and extract it into your Erlang lib directory, the location where all built-in applications are located. Rename the extracted folder to erlsom-1.0. Create a file called Emakefile in the erlsom-1.0 folder. Put this inside that file and save:

    {"src/*", [verbose,report,warn_obsolete_guard,{outdir, "ebin"}]}.

The erlsom-1.0 folder should then hold the Emakefile alongside the src folder it compiles and the ebin folder it writes to.
The rest of the files do not matter. Now, open an Erlang shell whose working directory (pwd()) is the erlsom-1.0 folder and run make:all(). in it. Once that finishes, it's done. As long as the erlsom-1.0 folder is in your Erlang lib directory, you can call the erlsom functions from any Erlang shell, whatever its pwd() may be.
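To confirm that a freshly started shell can really see the compiled library, a small check like the following may help. The module name lib_check and the function where/1 are made up for this sketch; code:which/1 is the standard call that reports where a module would be loaded from.

```erlang
-module(lib_check).
-export([where/1]).

%% Report where the code server would load Mod from, or the atom
%% missing if Mod is not on the code path.
where(Mod) ->
    case code:which(Mod) of
        non_existing -> missing;
        Path -> {ok, Path}
    end.
```

Calling lib_check:where(erlsom). from any directory should return a path inside erlsom-1.0/ebin. If it returns missing, compare the folder location with the output of code:lib_dir(). to make sure the library was extracted into the right place.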