I have a C# regex-parser program with three files in it, each containing a static class:
1) one static class filled with string dictionaries
static class MyStringDicts
{
internal static readonly Dictionary<string, string> USstates =
new Dictionary<string, string>()
{
{ "ALABAMA", "AL" },
{ "ALASKA", "AK" },
{ "AMERICAN SAMOA", "AS" },
{ "ARIZONA", "AZ" },
{ "ARKANSAS", "AR" }
// and so on
};
// and some other dictionaries
}
2) A class that compiles these values into Regex
public static class Patterns
{
public static readonly string StateUS =
@"\b(?<STATE>" + CharTree.GenerateRegex(Enumerable.Union(
MyStringDicts.USstates.Keys,
MyStringDicts.USstates.Values))
+ @")\b";
//and some more like these
}
3) some code that runs regular expressions based on these strings:
public static class Parser
{
// heavily simplified example
public static GroupCollection SearchStringForStates(string str)
{
return Regex.Match(str,
"^" + Patterns.StateUS,
RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase).Groups;
}
}
I’d like to be able to generate 2) with a T4 template, as all of this concatenation is identical on every execution:
@"\b(?<STATE><#=CharTree.GenerateRegex(Enumerable.Union(
MyStringDicts.USstates.Keys,
MyStringDicts.USstates.Values))#>)\b";
This works, but if I add a new member to MyStringDicts, or add or remove values from its dictionaries, the T4 template won’t recognize the changes until I exclude Patterns.cs from compilation and recompile. As Parser depends on Patterns, this really isn’t an option: I need the T4 transformation to take into account changes to other files in the same build.
Things I don’t want to do:
- Split MyStringDicts into its own project. I’d like to keep the files in one project, as they are a logical unit.
- Just move MyStringDicts into the top of Patterns.cs. I need the MyStringDicts members for other purposes, too (for dictionary lookups, or in other T4 templates, for example).
I adopted the advice here about using T4Toolbox’s VolatileAssembly and such, but that seems to only work for the reverse direction, when the class files need to be recompiled after editing the T4 template.
Is what I want possible?
edited for clarity
I just created a small test template which uses EnvDTE (Visual Studio Automation) and the T4Toolbox to run through the first file. It picks up the file through the project, so there’s no need to compile before running the template. In fact, it even picks up unsaved changes…
This is basically the same approach as FullSnabel uses, but without the need for Roslyn.
This should work if you want to stick to your original approach.
What you seem to be doing is storing data in a class file. You could consider storing your lists outside code (in an XML or INI file) and generating both files from that data. That way you avoid the problem altogether, and it might also make managing the lists easier.
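To make the data-file idea a bit more concrete, here is a rough sketch of the logic a template (or any build step) could run. The XML layout, the StateData class and its method names are all made up for illustration, and BuildPattern is only a plain-alternation stand-in for your CharTree.GenerateRegex, which I don't have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class StateData
{
    // Hypothetical data; in a real project this would live in a file such as States.xml.
    const string Xml = @"<states>
  <state name=""ALABAMA"" abbr=""AL"" />
  <state name=""ALASKA"" abbr=""AK"" />
</states>";

    // Reads the name/abbreviation pairs from the XML text.
    public static Dictionary<string, string> Load(string xml)
    {
        return XDocument.Parse(xml).Root
            .Elements("state")
            .ToDictionary(e => (string)e.Attribute("name"),
                          e => (string)e.Attribute("abbr"));
    }

    // Stand-in for CharTree.GenerateRegex (an assumption): escaped values
    // joined with '|', longest first.
    public static string BuildPattern(Dictionary<string, string> states)
    {
        var alternatives = states.Keys.Union(states.Values)
            .OrderByDescending(s => s.Length)
            .Select(s => Regex.Escape(s));
        return @"\b(?<STATE>" + string.Join("|", alternatives) + @")\b";
    }

    static void Main()
    {
        Dictionary<string, string> states = Load(Xml);
        Console.WriteLine(BuildPattern(states));
    }
}
```

Both the dictionary class and the Patterns class could then be generated from the same Load result, so neither depends on the other having been compiled first.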
If you don’t care too much about changes to the list you could also put the dictionaries inside the T4 template itself.
Another alternative might be dealing with it fully in code. You could create a subclass of Dictionary which has a ‘Pattern’ property (or a GetPattern() method). The parser would then use MyStringDicts.USstates.Pattern, and the Patterns class won’t be needed anymore. This way you won’t need any code generation.
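A minimal sketch of what such a subclass could look like. PatternDictionary and its constructor argument are hypothetical names, and the alternation built here is only a stand-in for CharTree.GenerateRegex. Note that the pattern is cached on first access, so entries added afterwards would not show up in it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical subclass: a string dictionary that can describe itself as a regex.
class PatternDictionary : Dictionary<string, string>
{
    private readonly string groupName;
    private string pattern; // cached; built on first access

    public PatternDictionary(string groupName)
    {
        this.groupName = groupName;
    }

    public string Pattern
    {
        get
        {
            if (pattern == null)
            {
                // Stand-in for CharTree.GenerateRegex (an assumption).
                var alternatives = Keys.Union(Values)
                    .OrderByDescending(s => s.Length)
                    .Select(s => Regex.Escape(s));
                pattern = @"\b(?<" + groupName + ">"
                    + string.Join("|", alternatives) + @")\b";
            }
            return pattern;
        }
    }
}

static class MyStringDicts
{
    internal static readonly PatternDictionary USstates =
        new PatternDictionary("STATE")
        {
            { "ALABAMA", "AL" },
            { "ALASKA", "AK" }
        };
}

static class Demo
{
    static void Main()
    {
        Match m = Regex.Match("Alaska 99501",
            "^" + MyStringDicts.USstates.Pattern,
            RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase);
        Console.WriteLine(m.Groups["STATE"].Value); // prints "Alaska"
    }
}
```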
Perhaps a wrapper around the actual dictionary would be better because it allows you to hide the actual collection to make sure it’s not changed at runtime. See Is there a read-only generic dictionary available in .NET? for an example of that.
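Something along these lines is what I have in mind for the wrapper. It assumes .NET 4.5 or later for ReadOnlyDictionary; PatternLookup is a made-up name, and the pattern building is again just a placeholder for CharTree.GenerateRegex.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper: offers lookups and a pattern, but no Add/Remove.
sealed class PatternLookup
{
    private readonly ReadOnlyDictionary<string, string> items;
    private readonly Lazy<string> pattern;

    public PatternLookup(string groupName, IDictionary<string, string> source)
    {
        // Copy the source so later changes to it cannot leak into the wrapper.
        items = new ReadOnlyDictionary<string, string>(
            new Dictionary<string, string>(source));

        // Built once, on first use. Stand-in for CharTree.GenerateRegex.
        pattern = new Lazy<string>(() =>
            @"\b(?<" + groupName + ">" +
            string.Join("|", items.Keys.Union(items.Values)
                .OrderByDescending(s => s.Length)
                .Select(s => Regex.Escape(s))) +
            @")\b");
    }

    public string Pattern
    {
        get { return pattern.Value; }
    }

    public bool TryGetValue(string key, out string value)
    {
        return items.TryGetValue(key, out value);
    }
}

static class Demo
{
    static void Main()
    {
        var states = new PatternLookup("STATE", new Dictionary<string, string>
        {
            { "ALABAMA", "AL" },
            { "ALASKA", "AK" }
        });

        string abbr;
        if (states.TryGetValue("ALASKA", out abbr))
            Console.WriteLine(abbr + " via " + states.Pattern);
    }
}
```

Because the contents can never change after construction, the cached pattern cannot go stale the way it could with a plain Dictionary subclass.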