I’m trying to recreate Hadoop’s word count map / reduce logic in a simple

Question

Editorial Team

Asked: June 15, 2026 (2026-06-15T21:49:26+00:00)

I’m trying to recreate Hadoop’s word count map/reduce logic in a simple Scala program, as a learning exercise.

This is what I have so far:

val words1 = "Hello World Bye World"      
val words2 = "Hello Hadoop Goodbye Hadoop"

val input = List(words1,words2)           
val mapped = input.flatMap(line=>line.split(" ").map(word=>word->1))
    //> mapped  : List[(String, Int)] = List((Hello,1), (World,1), (Bye,1), 
    //                                       (World,1), (Hello,1), (Hadoop,1), 
    //                                       (Goodbye,1), (Hadoop,1))

mapped.foldLeft(Map[String,Int]())((sofar,item)=>{
    if(sofar.contains(item._1)){
        sofar.updated(item._1, item._2 + sofar(item._1))
    }else{
        sofar + item
    }
})                              
    //>Map(Goodbye -> 1, Hello -> 2, Bye -> 1, Hadoop -> 2, World -> 2)

This seems to work, but I’m sure there is a more idiomatic way to handle the reduce part (the foldLeft).

I was thinking about using a multimap, but I have a feeling Scala has an easier way to do this.

Is there? For example, a way to add to a map so that, if the key already exists, the new value is added to the existing value instead of replacing it. I’m sure I’ve seen this question somewhere, but I couldn’t find it or its answer.

I know groupBy is probably the way to do it in the real world, but I’m trying to implement it as close as possible to the original map/reduce logic in the link above.


1 Answer

Editorial Team · Answer 1 · 2026-06-15T21:49:27+00:00

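You can use Scalaz’s |+| operator, because a Map whose values have a Semigroup instance is itself a Semigroup.

The |+| operator is the Semigroup append operation, which is also what a Monoid uses as its mappend function (a Monoid is a Semigroup that additionally has an identity, or “zero”, element). Roughly speaking, it applies to any “thing” that can be “added” together: Strings, Ints, Maps, Lists, Options, etc. An example: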

scala> import scalaz._
import scalaz._

scala> import Scalaz._
import Scalaz._

scala> val map1 = Map(1 -> 3 , 2 -> 4)
map1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 3, 2 -> 4)

scala> val map2 = Map(1 -> 1, 3 -> 6)
map2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 3 -> 6)

scala> map1 |+| map2
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 4, 3 -> 6, 2 -> 4)
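For intuition, the merge that |+| performs on the two maps above can be approximated with the standard library alone. This is only a sketch for Int values: `MergeSketch` and `mergeCounts` are hypothetical names introduced here, not part of Scalaz or the Scala library.

```scala
object MergeSketch {
  // Hypothetical helper: combine two maps, adding the values of shared keys.
  // Keys present in only one map are carried over unchanged.
  def mergeCounts[K](a: Map[K, Int], b: Map[K, Int]): Map[K, Int] =
    b.foldLeft(a) { case (acc, (key, value)) =>
      acc.updated(key, acc.getOrElse(key, 0) + value)
    }

  // Same inputs as map1 and map2 in the REPL session above.
  val merged: Map[Int, Int] =
    mergeCounts(Map(1 -> 3, 2 -> 4), Map(1 -> 1, 3 -> 6))
}
```

Scalaz generalises this idea to any value type that has a Semigroup instance, instead of hard-coding Int addition.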

So in your case, rather than creating a List[(String,Int)], create a List[Map[String,Int]] and then sum them:

val mapped = input.flatMap(_.split(" ").map(word => Map(word -> 1)))
mapped.suml
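If adding a Scalaz dependency is not an option, one possible standard-library variant keeps the original (word, 1) pairs and shortens the fold with getOrElse, which removes the if/else branch. This is a sketch under that assumption rather than the approach of the answer above; the object name `WordCountFold` is made up for the example.

```scala
object WordCountFold {
  val input = List("Hello World Bye World", "Hello Hadoop Goodbye Hadoop")

  // Map phase: emit one (word, 1) pair per word, as in the question.
  val mapped: List[(String, Int)] =
    input.flatMap(_.split(" ").map(word => word -> 1))

  // Reduce phase: add each count to the running total,
  // treating a word that has not been seen yet as having count 0.
  val counts: Map[String, Int] =
    mapped.foldLeft(Map.empty[String, Int]) { case (sofar, (word, n)) =>
      sofar.updated(word, sofar.getOrElse(word, 0) + n)
    }

  def main(args: Array[String]): Unit = println(counts)
}
```

The getOrElse call supplies the “zero” for missing keys, which is the role the Monoid identity plays in the Scalaz version.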
