I’m looking to find the associated source file(s) for specific class(es) in a set of compiled .NET assemblies.
e.g.
MyAsm.Namespace.Foo -> C:\Source\foo.cs
MyAsm.Namespace.Bar -> C:\Source\Code\MoreCode\Common.cs
MyAsm.Namespace2.Bar -> C:\Source\Code\MoreCode\Common.cs
...
I already have the reflection side working: I extract the Type information I’m interested in using standard System.Reflection functionality.
I now need to find the originating .cs source file for each class. While I have a brute force solution in place as a workaround, it’s unacceptably slow.
I would hope to have the entire process complete in ~5 seconds. Currently, the reflection extraction portion takes less than 1 second, while the ‘file association’ takes minutes. I don’t think it’s unreasonable to scan a couple of MB in 4 seconds.
Unfortunately there are a couple of caveats, which prevent shortcuts.
- I don’t know the names of the files, so I need to do a dir /s *.cs every run to enumerate all the potential source files.
- The class name won’t always match the source file; it can hint at a possible location, but it’s not guaranteed to work.
- Multiple classes are defined in the same file in some cases.
- There are ~20k .cs files / 63MB of source.
- I need an association between ~10k of the classes and their files.
- I would prefer not to incrementally build a DB of file names and the classes declared in them, as the file contents will change and I’ll have the trouble of maintaining this DB (though I may have to go down this route if everything else fails).
- The OSes this will run on won’t have Windows Search/indexing enabled, so no joy there either.
What I’ve tried:
- Using findstr.exe – much too slow.
- Creating a .NET app that loads all files into memory – too slow to find *.cs and load all the files, but fast to scan the files once they are in memory.
- Creating one large source file from all the smaller files, then loading and scanning it – again, too slow. It takes minutes to build the file, but it is fast once loaded.
- Reading PDB files – I’m investigating PDB2XML.exe, and while it does output file names and runs quickly, I can’t see how to associate a class with a file name.
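For reference, the in-memory scan described in the second attempt might look roughly like the sketch below. It is only an illustration, not the code actually used: the SourceScanner class, the Scan method and the regular expression are all hypothetical names and choices. The match is purely textual, so it ignores namespaces and will also pick up declarations inside comments or string literals.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class SourceScanner
{
    // Hypothetical heuristic: captures the identifier that follows
    // "class", "struct", "interface" or "enum". It does not parse C#.
    static readonly Regex TypeDecl = new Regex(
        @"\b(?:class|struct|interface|enum)\s+(?<name>[A-Za-z_]\w*)",
        RegexOptions.Compiled);

    // Returns simple type name -> list of .cs files that appear to declare it.
    public static Dictionary<string, List<string>> Scan(string root)
    {
        var index = new Dictionary<string, List<string>>();

        // Equivalent of "dir /s *.cs": walk the whole tree on every run.
        foreach (string path in Directory.EnumerateFiles(
                     root, "*.cs", SearchOption.AllDirectories))
        {
            string text = File.ReadAllText(path);
            foreach (Match m in TypeDecl.Matches(text))
            {
                string name = m.Groups["name"].Value;
                if (!index.TryGetValue(name, out List<string> files))
                {
                    files = new List<string>();
                    index[name] = files;
                }
                if (!files.Contains(path))
                    files.Add(path);
            }
        }
        return index;
    }
}
```

Because the key is the simple name, MyAsm.Namespace.Bar and MyAsm.Namespace2.Bar from the example above would both land under "Bar", which is one reason a text scan can only hint at the right file.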
So, does anyone have alternate suggestions, magic or some experience with PDB2XML?
Using PDBs is your best option IMHO, provided the PDB files are on the disk. The file names (exposed as ISymbolDocument.URL) are attached to sequence points, and sequence points belong to methods (including property get/set accessors), not to classes. On top of that, the source of a single .NET class can be spread across multiple files, for example with partial classes. So you’ll have to walk all the members of a type (using reflection, for example) and collect the documents of their sequence points to determine all the corresponding files.
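A minimal sketch of that walk is shown below. Note the swap: instead of the ISymbolDocument API mentioned above, it uses the System.Reflection.Metadata package, which is an assumption on my part and only reads portable PDBs, not the classic Windows PDB format. The TypeSourceMap class and Build method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

static class TypeSourceMap
{
    // Returns type name -> set of source files that contain at least one
    // sequence point belonging to one of the type's methods.
    // Assumes pdbPath points at a *portable* PDB matching assemblyPath.
    public static Dictionary<string, HashSet<string>> Build(
        string assemblyPath, string pdbPath)
    {
        var map = new Dictionary<string, HashSet<string>>();

        using var peStream = File.OpenRead(assemblyPath);
        using var peReader = new PEReader(peStream);
        using var pdbStream = File.OpenRead(pdbPath);
        using var pdbProvider = MetadataReaderProvider.FromPortablePdbStream(pdbStream);

        MetadataReader md = peReader.GetMetadataReader();
        MetadataReader pdb = pdbProvider.GetMetadataReader();

        foreach (TypeDefinitionHandle typeHandle in md.TypeDefinitions)
        {
            TypeDefinition type = md.GetTypeDefinition(typeHandle);
            // Nested types have an empty namespace here; only their own
            // name is recorded in this sketch.
            string ns = md.GetString(type.Namespace);
            string name = md.GetString(type.Name);
            string fullName = ns.Length == 0 ? name : ns + "." + name;

            var files = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
            foreach (MethodDefinitionHandle method in type.GetMethods())
            {
                MethodDebugInformation info = pdb.GetMethodDebugInformation(method);
                // Abstract/extern methods have no body and no sequence points.
                if (info.SequencePointsBlob.IsNil) continue;

                foreach (SequencePoint sp in info.GetSequencePoints())
                {
                    if (sp.Document.IsNil) continue;
                    Document doc = pdb.GetDocument(sp.Document);
                    files.Add(pdb.GetString(doc.Name));
                }
            }

            // Types with no method bodies (e.g. plain interfaces or enums)
            // have no sequence points and are therefore left out.
            if (files.Count > 0) map[fullName] = files;
        }
        return map;
    }
}
```

Since the lookup goes through methods, a type that has no method bodies at all cannot be resolved this way, and compiler-generated types (closures, iterators) show up as separate entries pointing at the file of the code that produced them.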