I have the data set below. I want to get a unique list of

Question

Asked: June 12, 2026 (2026-06-12T07:27:40+00:00)

I have the data set below. I want to get a unique list of the values in the first column as the output, e.g. {9719, 382 ..}. There are also integers at the end of each line, so checking whether a token starts and ends with a digit is not enough, and I couldn't think of a solution. Can you show me how to do it? I'd really appreciate it if you could show it in detail (what to do in the map step and what to do in the reduce step).

id  - - [date] "URL"


1 Answer

Editorial Team · Answer 1 · 2026-06-12T07:27:41+00:00

In your mapper, parse each line and write out the token you are interested in from the beginning of the line (e.g. 9719) as the key of a key-value pair; the value is irrelevant in this case. Because the keys are sorted and grouped before they reach the reducer, the reducer only has to emit each distinct key once. With the Java API, reduce() is called once per key, so you can write the key and ignore its values. With Hadoop Streaming, the reducer reads the sorted pairs line by line, so output a key each time it changes.

The WordCount example app that is packaged with Hadoop is very close to what you need.
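To make the two steps concrete, here is a minimal sketch in Python, written in the style of a Hadoop Streaming job but run in a single process. It assumes the id is the first whitespace-separated token on each line, as in the format shown in the question. The function names (map_lines, reduce_sorted, run_local) are made up for illustration, and the call to sorted() only stands in for the shuffle and sort that Hadoop performs between the two phases.

```python
from itertools import groupby


def map_lines(lines):
    """Map step: emit the first token of each line as the key."""
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            # The value is irrelevant here, so an empty string is used.
            yield fields[0], ""


def reduce_sorted(pairs):
    """Reduce step: pairs arrive sorted by key; emit each distinct key once."""
    for key, _values in groupby(pairs, key=lambda kv: kv[0]):
        yield key


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce locally (assumption: not a real cluster run)."""
    shuffled = sorted(map_lines(lines))  # stand-in for Hadoop's shuffle and sort
    return list(reduce_sorted(shuffled))
```

Calling run_local on a list of log lines returns the distinct ids. Note that the keys are compared as strings here, so the order of the output is lexicographic rather than numeric.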
