I’ve defined a custom input and output format over in XMLIO.scala:
import scala.xml.Node
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.hadoop.mapreduce.{ RecordReader, RecordWriter }
// ...
object XMLIO {
  class XMLInputFormat extends FileInputFormat[LongWritable, Node] { /*...*/ }
  class XMLRecordReader extends RecordReader[LongWritable, Node] { /*...*/ }
  class XMLOutputFormat extends FileOutputFormat[LongWritable, Node] { /*...*/ }
  class XMLRecordWriter extends RecordWriter[LongWritable, Node] { /*...*/ }
}
I’m trying to use these in a job that I’m defining over in Example.scala:
import XMLIO._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
object Example {
  @throws(classOf[Exception])
  def main( args : Array[String] ) {
    val job = new Job(new Configuration(), "")
    job setInputFormatClass classOf[XMLInputFormat]
  }
}
However, this is giving me a compiler error:
[ERROR] /path/to/Example.scala:8: error: type mismatch;
[INFO] found : java.lang.Class[XMLInputFormat](classOf[XMLInputFormat])
[INFO] required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.InputFormat]
[INFO] job setInputFormatClass classOf[XMLInputFormat]
[INFO] ^
This seems odd to me, given that XMLInputFormat is a subclass of FileInputFormat, which is itself a subclass of InputFormat.
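To show what I'd expect, here is a minimal sketch that needs no Hadoop at all. The classes are hypothetical stand-ins that only mirror the inheritance chain (they are not the real Hadoop types), and `accept` is a toy method whose parameter is merely shaped like the one in the error message. As far as I can tell, a `Class[XMLInputFormat]` should conform to `Class[_ <: InputFormat[_, _]]`:

```scala
// Hypothetical stand-ins for the Hadoop class hierarchy (not the real types).
abstract class InputFormat[K, V]
abstract class FileInputFormat[K, V] extends InputFormat[K, V]
class XMLInputFormat extends FileInputFormat[Long, String]

object ConformanceSketch {
  // Toy method: takes any Class whose type argument is a subtype of InputFormat,
  // and returns the class's simple name so the call has an observable result.
  def accept(cls: Class[_ <: InputFormat[_, _]]): String = cls.getSimpleName

  def main(args: Array[String]): Unit = {
    // XMLInputFormat <: FileInputFormat[Long, String] <: InputFormat[Long, String],
    // so this call is expected to type-check.
    println(accept(classOf[XMLInputFormat]))
  }
}
```

This only illustrates the subtyping I'm relying on. It does not reproduce the error, which shows up only with the real Hadoop classes.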
Playing around a bit in the REPL, I found a weird workaround. If I create an instance of XMLInputFormat before I set the input format class, there’s no compiler error. That is, the following compiles fine:
import XMLIO._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.Job
object Example {
  @throws(classOf[Exception])
  def main( args : Array[String] ) {
    val x = new XMLInputFormat()
    val job = new Job(new Configuration(), "")
    job setInputFormatClass classOf[XMLInputFormat]
  }
}
What’s going on here? Is there a fix for this that doesn’t seem like so much of a hack?
It looks like this was a bug in Scala 2.9.0 (which is what I was using). When I upgraded to Scala 2.9.1, the problem went away.