I’m analyzing data with Apache Pig and could not find a way to expand an array of items.
Here is the schema I’m working with, and an example of the desired output:
(col1:int, col2:int, items:{ARRAY_ELEM:(name:chararray, total:int)})
input = (1, 1, {("bird", 5), ("bear", 12), ("wolf", 10)})
output = (1, 1, "bird", 5, "bear", 12, "wolf", 10)
Is there any way to do this transformation?
Thanks for your help!
If you need this transformation right now, the easiest way is probably to write a UDF in Python or Java (I am not aware of any built-in solution).
However, most of the time it is better to keep the same number of columns in each record (e.g. keep your array as a bag or tuple and don’t “flatten” it into one record).
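If you do go the UDF route, here is a minimal sketch of what it could look like in Python. The function name `flatten_items` is made up for illustration, and the code assumes Pig hands the bag to the UDF as a list of `(name, total)` tuples. No output schema is declared because the width of the result depends on the number of items, so you would probably have to cast the fields downstream.

```python
def flatten_items(col1, col2, items):
    """Return (col1, col2, name1, total1, name2, total2, ...) as one tuple.

    Assumption: `items` is the bag, received as a list of (name, total)
    tuples. A null bag is treated as empty.
    """
    flat = [col1, col2]
    for name, total in (items or []):
        # Append each item's fields in order, so the record grows by two
        # columns per item.
        flat.extend([name, total])
    return tuple(flat)


if __name__ == "__main__":
    # Quick local check with the record from the question.
    print(flatten_items(1, 1, [("bird", 5), ("bear", 12), ("wolf", 10)]))
```

Saved as, say, `udfs.py` (a hypothetical file name), it could be registered with something like `REGISTER 'udfs.py' USING jython AS myudfs;` and called as `myudfs.flatten_items(col1, col2, items)` inside a `FOREACH ... GENERATE`. Note that the order of tuples in a bag is not guaranteed, so the order of the name/total pairs in the output may vary.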