The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T18:14:38+00:00)

I want to use Apache Pig, but until now I have just parsed

I want to use Apache Pig, but so far I have only parsed simply formatted data such as CSV (comma-separated) files.

If my data is separated by ';' and '@&@' instead, how can I work with it?

When I used MapReduce, I split the data by ";" in the map step and then again by "@&@" in the reduce step.

Also, suppose we have a CSV file whose first field is a username in "FirstnameLastname" format:

raw = LOAD 'log.csv' USING PigStorage(',') AS (username: chararray, site: chararray, views: int);

With the example above we only get the whole username. How can I get the first name and the last name as separate fields?


1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 6:14 pm (2026-05-27T18:14:39+00:00)

    You can do just about anything Java or Python can do by writing UDFs (user-defined functions) for Pig. Pig is not intended to ship an exhaustive set of processing functions, only basic functionality. Piggybank fills part of that gap by collecting community-contributed UDFs. Sometimes Piggybank just doesn't have what you need, but UDFs are fairly simple to write yourself.
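To give a feel for how small such a UDF can be, here is a minimal Python sketch for the "FirstnameLastname" case from the question. It assumes the last name starts at the first uppercase letter after the first character, which will not hold for every real name. The function name `split_name`, the file name `name_udfs.py`, and the output schema are hypothetical, and the fallback decorator is only there so the file also runs outside Pig, where `outputSchema` is not defined.

```python
import re

try:
    # Assumption: Pig's Jython engine makes this decorator available
    # to scripts registered with `USING jython`.
    outputSchema
except NameError:
    # Fallback so the module can be imported and tested outside Pig.
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap

# First character, then everything up to the next uppercase letter,
# then the remainder starting at that uppercase letter.
_NAME_PATTERN = re.compile(r'^(.[^A-Z]*)([A-Z].*)$')


@outputSchema("name:tuple(first:chararray,last:chararray)")
def split_name(username):
    """Split 'FirstnameLastname' into a (first, last) tuple."""
    if username is None:
        return (None, None)
    match = _NAME_PATTERN.match(username)
    if match is None:
        # No second capitalised part found: keep the whole value as first name.
        return (username, None)
    return (match.group(1), match.group(2))

# Hypothetical usage from a Pig script, with `raw` loaded as in the question:
#   REGISTER 'name_udfs.py' USING jython AS name_udfs;
#   names = FOREACH raw GENERATE FLATTEN(name_udfs.split_name(username)), site;
```

Names with more than one internal capital (for example "McDonaldSmith") are split at the first one, so a rule like this would need adjusting to match the actual data.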

    • You could write a custom loader that handles the unique structure of your data at load time. The custom load function manipulates the data with Java code and outputs the structured, columnar format that Pig expects. Another nice thing about custom loaders is that they can specify the load schema, so you don't have to write out the AS (...) clause:

      A = LOAD 'log.csv' USING MyCustomLoader('whatever', 'parameters');
      
    • You could write a custom evaluation function. Sometimes a built-in function like STRSPLIT or TOKENIZE just isn't good enough. Use TextLoader to read your data in line by line, then follow up with a UDF that parses each line and outputs a tuple (which can then be flattened into columns):

      A = LOAD 'log.csv' USING TextLoader() AS (line:chararray);
      B = FOREACH A GENERATE FLATTEN(CustomLineParser(line));
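As a rough illustration of what a line parser such as CustomLineParser might look like, here is a Python sketch for data that mixes the ';' and '@&@' separators. The record layout `username;site@&@views` is purely an assumption made for this example, since the question does not show the real format. The function name `parse_line`, the file name `line_udfs.py`, and the schema are likewise hypothetical.

```python
try:
    # Assumption: Pig's Jython engine makes this decorator available
    # to scripts registered with `USING jython`.
    outputSchema
except NameError:
    # Fallback so the module can be imported and tested outside Pig.
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap

FIELD_SEP = ';'      # outer separator (split in the map step in the question)
VALUE_SEP = '@&@'    # inner separator (split in the reduce step in the question)
EMPTY = (None, None, None)


@outputSchema("record:tuple(username:chararray,site:chararray,views:int)")
def parse_line(line):
    """Parse 'username;site@&@views' into a (username, site, views) tuple.

    Lines that do not match the assumed layout yield a tuple of Nones.
    """
    if line is None:
        return EMPTY
    outer = line.split(FIELD_SEP, 1)
    if len(outer) != 2:
        return EMPTY
    username, rest = outer
    inner = rest.split(VALUE_SEP)
    if len(inner) != 2:
        return EMPTY
    site, views_text = inner
    try:
        views = int(views_text)
    except ValueError:
        return EMPTY
    return (username, site, views)

# Hypothetical usage from a Pig script, with A loaded via TextLoader as above:
#   REGISTER 'line_udfs.py' USING jython AS line_udfs;
#   B = FOREACH A GENERATE FLATTEN(line_udfs.parse_line(line));
```

Returning a tuple of empty values for malformed lines, rather than raising an error, is a design choice for this sketch: it keeps one bad record from failing the whole job and leaves the decision to filter those rows to a later step.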
      