There are account IDs, each with a timestamp, grouped by username. For each of these username groups I want all pairs of (oldest account, other account).
I have a Java reducer that does that; can I rewrite it as a simple Pig script?
Schema:
{group: (username), A: {(id, create_dt)}}
Input:
(batman,{(id1,100), (id2,200), (id3,50)})
(lulu ,{(id7,100), (id9,50)})
Desired output:
(batman,{(id3,id1), (id3,id2)})
(lulu ,{(id9,id7)})
Not that anyone seems to care, but here goes. You have to create a UDF:
And the UDF:
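For reference, here is a rough sketch of the core logic such a UDF might wrap, written as plain Java with no Pig dependency so it can be run on its own. The names `OldestPairs`, `Account` and `pairWithOldest` are made up for illustration. In an actual Pig UDF this would presumably sit inside the `exec(Tuple)` method of a class extending `EvalFunc<DataBag>`, reading from and writing to Pig bags instead of Java lists. It assumes "oldest" means the smallest `create_dt`, which matches the desired output in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OldestPairs {

    /** One (id, create_dt) entry from a username's bag (hypothetical helper type). */
    public static final class Account {
        final String id;
        final long createDt;

        public Account(String id, long createDt) {
            this.id = id;
            this.createDt = createDt;
        }
    }

    /**
     * Returns one [oldestId, otherId] pair for every account other than the oldest.
     * On a tie in create_dt, the first account encountered is treated as the oldest.
     */
    public static List<List<String>> pairWithOldest(List<Account> accounts) {
        List<List<String>> pairs = new ArrayList<>();
        if (accounts == null || accounts.isEmpty()) {
            return pairs;
        }

        // First pass: find the account with the smallest create_dt.
        Account oldest = accounts.get(0);
        for (Account a : accounts) {
            if (a.createDt < oldest.createDt) {
                oldest = a;
            }
        }

        // Second pass: pair the oldest account with every other account, in input order.
        for (Account a : accounts) {
            if (a != oldest) {
                pairs.add(Arrays.asList(oldest.id, a.id));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Account> batman = Arrays.asList(
                new Account("id1", 100),
                new Account("id2", 200),
                new Account("id3", 50));
        System.out.println(pairWithOldest(batman)); // prints [[id3, id1], [id3, id2]]
    }
}
```

Two passes over the bag keep it simple and avoid sorting; a group with a single account produces an empty list.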
and if you wanna play real nicely, add this: