I want to write a little program that transforms my TeX files into HTML. I want to parse the documents and turn the macros (the built-in ones and, of course, my own) into HTML pieces. Here are my requirements:
- predefined rules (e.g. \begin{itemize} \item text \end{itemize} => <br> <p>text </p> <br/>)
- defining my own CSS style
- ability to convert formulas (extract the formulas, load them into an image creator, and then save the JPG/PNG)
- easy to maintain and concise
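To make the first and third requirements concrete, here is a minimal sketch of a rule-driven converter. It is written in Python only for illustration; the rule table, the function names (`extract_formulas`, `convert`), the `formula-N.png` file naming, and the choice of `<ul>`/`<li>` as HTML targets are all assumptions, not part of any existing tool. It only handles inline `$...$` math and a couple of macros, and it leaves the actual image rendering to whatever image creator is used.

```python
import re

# Inline math of the form $...$ (display math and escaped \$ are not handled).
FORMULA = re.compile(r"\$([^$]+)\$")

# Hypothetical rule table, applied in order: (pattern, replacement).
RULES = [
    (re.compile(r"\\textbf\{([^{}]*)\}"), r"<b>\1</b>"),
    # An item's text runs until the next backslash command.
    (re.compile(r"\\item\s+([^\\]+)"),
     lambda m: f"<li>{m.group(1).strip()}</li>"),
    (re.compile(r"\\begin\{itemize\}"), "<ul>"),
    (re.compile(r"\\end\{itemize\}"), "</ul>"),
]


def extract_formulas(tex):
    """Replace each $...$ with an <img> placeholder and collect the sources."""
    formulas = []

    def placeholder(match):
        formulas.append(match.group(1))
        n = len(formulas)
        return f'<img src="formula-{n}.png" alt="formula {n}">'

    return FORMULA.sub(placeholder, tex), formulas


def convert(tex):
    """Return (html, formulas); formulas still need to be rendered to images."""
    # Pull formulas out first so the macro rules never touch math like \alpha.
    html, formulas = extract_formulas(tex)
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html, formulas


if __name__ == "__main__":
    html, formulas = convert(r"\begin{itemize} \item one \item $x^2$ \end{itemize}")
    print(html)
    print(formulas)
```

Adding a rule for one of my own macros would mean appending one more `(pattern, replacement)` pair, which is the kind of maintainability I am after. The obvious weakness is that regular expressions cannot handle nested braces or nested environments.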
I know there are several technologies out there, but I don’t know exactly which is best for me. Here are the technologies that come to mind:
- Ruby (I/O is easy, formula loading via webrat)
- XML/XSLT (I don’t think I need that much overhead)
- Perl (there are many libs out there, but I’m not very familiar with it)
- bash (I worked with sed and was surprised how easy it was to work with regular expressions)
- latex2html … (these converters won’t work for me and they don’t give me freedom in parsing)
Any suggestions, hints and comments are welcome.
Thanks for your time, folks.
Have a look at pandoc. It can also be installed on Linux or OS X, though it won’t do your custom macros. The only thing I’ve seen that can do a decent job with custom macros is tex4ht, but to really work well you need to be producing .DVI files. If you have a ton of custom macros, writing your own converter is going to take a boatload of time. Even if you only have a few custom macros, it’s still going to be a pain. Good luck!
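If you want to try pandoc from a script, something like the following rough sketch should do. The helper names (`pandoc_command`, `run_pandoc`) and the file names are made up for illustration; the flags themselves (`-f`, `-t`, `-s`, `-c`, `--mathjax`, `-o`) are standard pandoc options. Note that `--mathjax` renders formulas in the browser instead of producing images, so it is a different approach from the JPG/PNG one described in the question.

```python
import subprocess


def pandoc_command(tex_path, html_path, css=None, mathjax=True):
    """Build the pandoc argument list for a LaTeX -> standalone HTML run."""
    cmd = ["pandoc", tex_path, "-f", "latex", "-t", "html", "-s"]
    if mathjax:
        # Render math client-side with MathJax instead of as images.
        cmd.append("--mathjax")
    if css:
        # Link a custom stylesheet into the generated page.
        cmd += ["-c", css]
    cmd += ["-o", html_path]
    return cmd


def run_pandoc(tex_path, html_path, css=None, mathjax=True):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(tex_path, html_path, css, mathjax), check=True)
```

Calling `run_pandoc("notes.tex", "notes.html", css="style.css")` assumes pandoc is installed and on your PATH.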