I have the following tables in a Hive database:
table1:
id t X
1 1 a
1 4 a
2 5 a
3 10 a
table2:
id t Y
1 3 b
2 6 b
2 8 b
3 15 b
And I would like to merge them to have a table like:
id t Z
1 1 a
1 3 b
1 4 a
2 5 a
2 6 b
2 8 b
3 10 a
3 15 b
Basically what I want to do is :
-
a join on the column
id(that part is easy) -
merge the columns
table1.tandtable2.tinto a new columnt -
have the variable
Zthat is equal totable1.Xif the correspondingtcomes fromtable1.t, andtable2.Yif it comes fromtable2.t -
order the table by
idand then byt(that shouldn’t be too hard)
I have no idea on how to do the parts 2 and 3. I tried with an outer join on
table1.id = table2.id and table1.t = table2.t, but it doesn’t merge the two columns t.
Any pointer would be appreciated. Thanks!
Although not always required, using a subquery for the union’d queries help to organize, plus you can then reference the fields from the union (e.g.
u1.id) in other parts of the query.You’ll need the alias on the 3rd column to make the schemas match. If the source table name was not already a column, you could do something like this: