I have a dictionary of structs, where one member of each struct is a list whose elements vary from one dictionary item to the next.
I would like to join these elements against each item, in order to filter them and/or group them by element.
In SQL I’m familiar with joining against tables/queries to obtain multiple rows as desired, but I’m new to C#/Linq. Since a “column” can be an object/list already associated with the proper dictionary items, I wonder how I can use them to perform a join?
Here’s a sample of the structure:
    name    elements
    item1   list: elementA
    item2   list: elementA, elementB
I would like a query that gives this output (count = 3)
    name    element
    item1   elementA
    item2   elementA
    item2   elementB
Ultimately, I want to group them like this:
    element    count
    elementA   2
    elementB   1
Here’s my starting code, which counts the dictionary items.
    public struct MyStruct
    {
        public string name;
        public List<string> elements;
    }

    private void button1_Click(object sender, EventArgs e)
    {
        MyStruct myStruct = new MyStruct();
        Dictionary<string, MyStruct> dict = new Dictionary<string, MyStruct>();

        // Populate 2 items
        myStruct.name = "item1";
        myStruct.elements = new List<string>();
        myStruct.elements.Add("elementA");
        dict.Add(myStruct.name, myStruct);

        myStruct.name = "item2";
        myStruct.elements = new List<string>();
        myStruct.elements.Add("elementA");
        myStruct.elements.Add("elementB");
        dict.Add(myStruct.name, myStruct);

        var q = from t in dict
                select t;
        MessageBox.Show(q.Count().ToString()); // Returns 2
    }
Edit: I don’t really need the output to be a dictionary. I used one to store my data because it works well and prevents duplicates (each item has a unique name, which I store as the key). However, for the purpose of filtering/grouping, I guess it could be a list or array without issues. I can always call .ToDictionary with key = item.name afterwards.
The method here is Enumerable.SelectMany. Using extension method syntax:
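A minimal sketch of that call, assuming the dict and MyStruct from the question. The range variables t and v, the property names name and element, and the helper names BuildSample and CountByElement are illustrative choices, not part of any library. The equivalent query-comprehension form and the grouping the question asks for are included for comparison:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct
{
    public string name;
    public List<string> elements;
}

public static class FlattenDemo
{
    // Builds the two-item dictionary from the question.
    public static Dictionary<string, MyStruct> BuildSample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    // Flattens with SelectMany, then groups the rows by element and counts each group.
    public static Dictionary<string, int> CountByElement(Dictionary<string, MyStruct> dict)
    {
        var rows = dict.SelectMany(t => t.Value.elements.Select(v => new { name = t.Key, element = v }));
        return rows.GroupBy(row => row.element).ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var dict = BuildSample();

        // Extension method syntax: one (name, element) row per list entry.
        var q = dict.SelectMany(t => t.Value.elements.Select(v => new { name = t.Key, element = v }));

        // Equivalent query-comprehension syntax.
        var q2 = from t in dict
                 from v in t.Value.elements
                 select new { name = t.Key, element = v };

        Console.WriteLine(q.Count());  // 3
        Console.WriteLine(q2.Count()); // 3

        // elementA has a count of 2 and elementB a count of 1.
        foreach (var pair in CountByElement(dict))
            Console.WriteLine($"{pair.Key} {pair.Value}");
    }
}
```

Filtering fits in the same pipeline: a Where call on the flattened rows (or a where clause in the query form) can test either name or element before grouping.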
EDIT
Note that you could also use t.Value.name above, instead of t.Key, since these values are equal.
So, what’s going on here?
The query-comprehension syntax is probably easiest to understand; you can write an equivalent iterator block to see what’s going on. We can’t do that simply with an anonymous type, however, so we’ll declare a type to return:
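As a sketch of such an iterator block, assuming the question's MyStruct: NameElement is the declared return type, while the method name GetResults and the helper Sample are hypothetical names chosen for illustration. The two nested foreach loops play the role of the two from clauses:

```csharp
using System;
using System.Collections.Generic;

public struct MyStruct { public string name; public List<string> elements; }

// Named stand-in for the anonymous type, so the iterator can declare a return type.
public class NameElement
{
    public string name;
    public string element;
}

public static class IteratorDemo
{
    // Outer loop: each dictionary entry. Inner loop: each element of that entry's list.
    // One NameElement is yielded per (entry, element) combination.
    public static IEnumerable<NameElement> GetResults(Dictionary<string, MyStruct> dict)
    {
        foreach (var t in dict)
            foreach (var v in t.Value.elements)
                yield return new NameElement { name = t.Key, element = v };
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        // Prints three rows, one per (name, element) combination.
        foreach (var row in GetResults(Sample()))
            Console.WriteLine($"{row.name} {row.element}");
    }
}
```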
How about the extension method syntax (or, what’s really going on here)?
(This is inspired in part by Eric Lippert’s post at https://stackoverflow.com/a/2704795/385844; I had a much more complicated explanation, then I read that, and came up with this:)
Let’s say we want to avoid declaring the NameElement type. We could use an anonymous type by passing in a function. We’d change the call from this:
to this:
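A sketch of the two calls side by side, assuming the iterator method is called GetResults (a hypothetical name) and gains an overload that accepts the function. Minimal versions of both overloads are included only so the example compiles:

```csharp
using System;
using System.Collections.Generic;

public struct MyStruct { public string name; public List<string> elements; }

public class NameElement { public string name; public string element; }

public static class CallSiteDemo
{
    // Before: the result type is fixed to NameElement.
    public static IEnumerable<NameElement> GetResults(Dictionary<string, MyStruct> dict)
    {
        foreach (var t in dict)
            foreach (var v in t.Value.elements)
                yield return new NameElement { name = t.Key, element = v };
    }

    // After: the caller supplies the function that builds each result.
    public static IEnumerable<T> GetResults<T>(
        Dictionary<string, MyStruct> dict, Func<string, string, T> resultSelector)
    {
        foreach (var t in dict)
            foreach (var v in t.Value.elements)
                yield return resultSelector(t.Key, v);
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        var dict = Sample();

        // From this:
        var before = GetResults(dict);

        // to this:
        var after = GetResults(dict, (string1, string2) => new { name = string1, element = string2 });

        foreach (var row in before) Console.WriteLine($"{row.name} {row.element}");
        foreach (var row in after) Console.WriteLine($"{row.name} {row.element}");
    }
}
```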
The lambda expression (string1, string2) => new { name = string1, element = string2 } represents a function that takes two strings, defined by the argument list (string1, string2), and returns an instance of the anonymous type initialized with those strings, defined by the expression new { name = string1, element = string2 }.
The corresponding implementation is this:
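A sketch of what such an implementation might look like. GetResults and Sample are hypothetical names, and the parameter is assumed to be a Func<string, string, T> called resultSelector that the method invokes once per element:

```csharp
using System;
using System.Collections.Generic;

public struct MyStruct { public string name; public List<string> elements; }

public static class SelectorDemo
{
    // T is whatever the caller's function returns; the method never needs to name it.
    public static IEnumerable<T> GetResults<T>(
        IEnumerable<KeyValuePair<string, MyStruct>> pairs,
        Func<string, string, T> resultSelector)
    {
        foreach (var pair in pairs)
            foreach (var e in pair.Value.elements)
                yield return resultSelector(pair.Key, e);
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        // No explicit type argument: the compiler infers T from the lambda's return type.
        var rows = GetResults(Sample(), (string1, string2) => new { name = string1, element = string2 });

        foreach (var row in rows)
            Console.WriteLine($"{row.name} {row.element}");
    }
}
```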
Type inference allows us to call this function without specifying T by name. That’s handy, because (as far as we are aware as C# programmers), the type we’re using doesn’t have a name: it’s anonymous.
Note that the variable t is now pair, to avoid confusion with the type parameter T, and v is now e, for “element”. We’ve also changed the type of the first parameter to one of the interfaces the dictionary implements, IEnumerable<KeyValuePair<string, MyStruct>>. It’s wordier, but it makes the method more useful, and it will be helpful in the end. As the type is no longer a dictionary type, we’ve also changed the name of the parameter from dict to pairs.
We could generalize this further. The second foreach has the effect of projecting a key-value pair to a sequence of type T. That whole effect could be encapsulated in a single function; the delegate type would be Func<KeyValuePair<string, MyStruct>, IEnumerable<T>>. The first step is to refactor the method so we have a single statement that converts the element pair into a sequence, using the Select method to invoke the resultSelector delegate:
Now we can easily change the signature:
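The two steps might be sketched as follows. The method names GetResultsStep1 and GetResultsStep2 are hypothetical and exist only to keep the versions apart, and the second signature assumes the delegate returns IEnumerable<T>, since it has to produce a sequence:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct { public string name; public List<string> elements; }

public static class RefactorDemo
{
    // Step 1: a single expression turns each pair into a sequence of T,
    // using Select to invoke resultSelector for every element.
    public static IEnumerable<T> GetResultsStep1<T>(
        IEnumerable<KeyValuePair<string, MyStruct>> pairs,
        Func<string, string, T> resultSelector)
    {
        foreach (var pair in pairs)
            foreach (var result in pair.Value.elements.Select(e => resultSelector(pair.Key, e)))
                yield return result;
    }

    // Step 2: the pair-to-sequence projection itself becomes the delegate.
    public static IEnumerable<T> GetResultsStep2<T>(
        IEnumerable<KeyValuePair<string, MyStruct>> pairs,
        Func<KeyValuePair<string, MyStruct>, IEnumerable<T>> resultSelector)
    {
        foreach (var pair in pairs)
            foreach (var result in resultSelector(pair))
                yield return result;
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        var dict = Sample();
        var step1 = GetResultsStep1(dict, (n, e) => n + " " + e);
        var step2 = GetResultsStep2(dict, pair => pair.Value.elements.Select(e => pair.Key + " " + e));

        // Both versions yield the same three strings.
        Console.WriteLine(step1.SequenceEqual(step2));
    }
}
```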
The call site now looks like this; notice how the lambda expression now incorporates the logic that we removed from the method body when we changed its signature:
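A sketch of such a call, assuming the method is still called GetResults (a hypothetical name) and now takes a delegate that maps a key-value pair to a sequence. The inner Select that used to live in the method body now sits inside the lambda:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct { public string name; public List<string> elements; }

public static class NewCallSiteDemo
{
    // Assumed shape of the method after the signature change.
    public static IEnumerable<T> GetResults<T>(
        IEnumerable<KeyValuePair<string, MyStruct>> pairs,
        Func<KeyValuePair<string, MyStruct>, IEnumerable<T>> resultSelector)
    {
        foreach (var pair in pairs)
            foreach (var result in resultSelector(pair))
                yield return result;
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        // The lambda projects each pair to a sequence of anonymous-type instances.
        var rows = GetResults(
            Sample(),
            pair => pair.Value.elements.Select(e => new { name = pair.Key, element = e }));

        foreach (var row in rows)
            Console.WriteLine($"{row.name} {row.element}");
    }
}
```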
To make the method more useful (and its implementation less verbose), let’s replace the type KeyValuePair<string, MyStruct> with a type parameter, TSource. We’ll change some other names at the same time:
And, just for kicks, we’ll make it an extension method:
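A sketch of where this ends up, assuming the method keeps the hypothetical name GetResults and that the renamed pieces are TSource, TResult, source and selector (names borrowed from the shape of Enumerable.SelectMany, not taken from the original snippet):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct MyStruct { public string name; public List<string> elements; }

public static class FlattenExtensions
{
    // Works on any source sequence: project each item to a sequence, then yield every result.
    public static IEnumerable<TResult> GetResults<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TResult>> selector)
    {
        foreach (var item in source)
            foreach (var result in selector(item))
                yield return result;
    }

    // The two-item dictionary from the question.
    public static Dictionary<string, MyStruct> Sample()
    {
        return new Dictionary<string, MyStruct>
        {
            ["item1"] = new MyStruct { name = "item1", elements = new List<string> { "elementA" } },
            ["item2"] = new MyStruct { name = "item2", elements = new List<string> { "elementA", "elementB" } },
        };
    }

    public static void Main()
    {
        var dict = Sample();

        // Called as an extension method, it reads just like the built-in operator.
        var mine = dict.GetResults(pair => pair.Value.elements.Select(e => new { name = pair.Key, element = e }));
        var builtIn = dict.SelectMany(pair => pair.Value.elements.Select(e => new { name = pair.Key, element = e }));

        // Anonymous types compare by value, so the two sequences are equal.
        Console.WriteLine(mine.SequenceEqual(builtIn));
    }
}
```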
And there you have it: SelectMany! Well, the function still has the wrong name, and the actual implementation includes validation that the source sequence and the selector function are non-null, but that’s the core logic.
From MSDN:
SelectMany “projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into one sequence.”