I have a custom load function that simply extends Pig's PigStorage load func. I want to do some work with type casting, which requires access to the schema, but I'm not sure how or where to access the Pig schema from inside the loader. If you need any additional information, please let me know and I'll be happy to provide it.
Pig doesn't reliably provide the user-defined schema to a LoadFunc. If you implement LoadPushdown and only some of the fields are needed, Pig calls pushProjection with the list of required fields; but that only happens when a projection is actually pushed down, so you can't rely on it for 100% of the use cases.
To play with typecasting, you can write your own implementation of the LoadCaster interface and return it from your loader's getLoadCaster() method; Pig uses it to translate bytearrays into specific types, and you can do your conversions there.
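For illustration, here is a rough sketch of what such conversion logic could look like. To keep it compilable without the Pig jars it is a stand-alone class, so treat it as an assumption-laden outline rather than a drop-in implementation: in a real job the class would declare `implements LoadCaster` (or, probably less work, extend Pig's default `Utf8StorageConverter` and override only the methods you care about), and your PigStorage subclass would return an instance from `getLoadCaster()`. The class name `LenientCaster` and the parsing rules (strip thousands separators, accept "yes"/"no") are made up for the example; the method names mirror the `bytesToInteger` and `bytesToBoolean` methods of LoadCaster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-alone sketch. With Pig on the classpath this would be
// declared as "implements org.apache.pig.LoadCaster" and would also need the
// remaining bytesToXxx methods of that interface.
class LenientCaster {

    // Decode the raw field bytes as UTF-8 text and trim whitespace; null stays null.
    private static String text(byte[] b) {
        if (b == null) {
            return null;
        }
        return new String(b, StandardCharsets.UTF_8).trim();
    }

    // Assumed rule: tolerate thousands separators such as "1,234".
    // Returns null when the field is empty or not a valid integer.
    public Integer bytesToInteger(byte[] b) {
        String s = text(b);
        if (s == null || s.isEmpty()) {
            return null;
        }
        try {
            return Integer.valueOf(s.replace(",", ""));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Assumed rule: accept true/false and yes/no, case-insensitively.
    // Anything else becomes null.
    public Boolean bytesToBoolean(byte[] b) {
        String s = text(b);
        if (s == null) {
            return null;
        }
        switch (s.toLowerCase(Locale.ROOT)) {
            case "true":
            case "yes":
                return Boolean.TRUE;
            case "false":
            case "no":
                return Boolean.FALSE;
            default:
                return null;
        }
    }
}
```

Returning null for unparseable input is a design choice in this sketch; you could throw an IOException instead (the LoadCaster methods are declared to allow it) if you would rather fail loudly on bad data.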