I am having an issue with a protobuf message that I am using to create a Hive table.
My .proto file defines a message with 2 fields. Hive seems to pick up only 1 of them (the EMetaData header) and leaves the 'bytes' type field out of the table.
message EE {
    required EMetaData header = 1;
    optional bytes cl = 2;
}
message EMetaData {
    required uint32 version = 1;
    optional string root_pid = 2;
}
This is how the resulting table looks in Hive.
hive> desc pbtest2;
OK
key struct<header:struct<rootpid:string,version:int>> from deserializer
value struct<header:struct<rootpid:string,version:int>> from deserializer
Below is my create table statement.
create table pbtest2
row format serde 'MyProtobufDeserializer'
with serdeproperties (
    'KEY_SERIALIZE_CLASS'='CEMessages$EE',
    'VALUE_SERIALIZE_CLASS'='CEMessages$EE')
stored as
    inputformat 'MyInputFormat'
    outputformat 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
The cl field (type bytes) is not present in the table, and I am not sure what the problem is.
Has anyone run into this issue? Please let me know if you have any suggestions.
Update: I figured out that my SerDe needed some changes. It was not handling the 'bytes' type from the .proto file. After adding handling for it, I can see the field created in the table with the 'binary' type.
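
For anyone hitting the same thing, here is a rough sketch of the kind of type mapping involved. It is not my actual SerDe code: the ProtoType enum and the ProtoToHiveTypes class are hypothetical stand-ins, and a real SerDe would read field types from the protobuf descriptors and build Hive ObjectInspectors instead of type-name strings. The point is only that 'bytes' needs its own case mapping to Hive's 'binary', and that an unhandled type is better reported loudly than silently skipped.

```java
// Hypothetical stand-in for the protobuf scalar types used in the .proto above.
enum ProtoType { UINT32, STRING, BYTES }

class ProtoToHiveTypes {
    // Returns the Hive type name for a protobuf scalar type (illustrative only).
    static String hiveTypeFor(ProtoType type) {
        switch (type) {
            case UINT32:
                return "int";     // matches version:int in the desc output
            case STRING:
                return "string";
            case BYTES:
                return "binary";  // the case that was missing in my SerDe
            default:
                // Fail loudly so a missing mapping does not drop a column unnoticed.
                throw new IllegalArgumentException("Unhandled protobuf type: " + type);
        }
    }

    public static void main(String[] args) {
        // Print the Hive type chosen for the cl field.
        System.out.println("cl: " + hiveTypeFor(ProtoType.BYTES));
    }
}
```

With a mapping like this, a field of a type that has no case would cause an error at table creation time rather than quietly disappearing from the schema.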