I have a large set of molecules from the ZINC database (http://zinc.docking.org/) in mol2 (http://tripos.com/index.php?family=modules,SimplePage,,,&page=sup_mol2&s=0) format. I would like to split this database into an arbitrary number N of smaller databases. What is the best scripting approach in Python, bash, or Perl for this? I read about Open Babel, but it can only split the set into individual molecules.
If that is not possible, I can also convert mol2 to another, more convenient format.
Thanks
csplit can separate the file into individual molecules. If you want something more clever, you can read each molecule into a list or array as a string and then write out as many as you like to each file.
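Here is a rough Python sketch of the "read each molecule into a list" idea, in case it helps. It assumes every record starts with a line beginning with @<TRIPOS>MOLECULE, which is the usual mol2 convention. The function names, the round-robin distribution, and the output file naming (subset_1.mol2, subset_2.mol2, ...) are my own choices for illustration. Note that any comment lines placed before a record header would end up attached to the previous molecule, and that the whole input is held in memory.

```python
from pathlib import Path

# Assumption: each molecule record begins with this tag at the start of a line.
RECORD_START = "@<TRIPOS>MOLECULE"


def read_molecules(path):
    """Yield each molecule record in a mol2 file as a single string."""
    record = []
    seen_start = False
    with open(path) as handle:
        for line in handle:
            if line.startswith(RECORD_START):
                if seen_start:
                    # A new record begins, so emit the one collected so far.
                    yield "".join(record)
                    record = []
                seen_start = True
            # Lines before the first tag stay attached to the first record.
            record.append(line)
    if seen_start:
        yield "".join(record)


def split_mol2(path, n_files, prefix="subset"):
    """Distribute the molecules round-robin over n_files output files."""
    molecules = list(read_molecules(path))
    outputs = []
    for i in range(n_files):
        out = Path(f"{prefix}_{i + 1}.mol2")
        # Every n_files-th molecule, starting at offset i, goes to file i + 1.
        out.write_text("".join(molecules[i::n_files]))
        outputs.append(out)
    return outputs
```

Calling split_mol2("zinc.mol2", 10) (with a hypothetical input file name) would write ten files whose molecule counts differ by at most one. If the database is too large to hold in memory, the same reader could feed N output files that are kept open and written to in turn.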