I am trying to connect R to a Hadoop cluster. The cluster has HDFS, MapReduce, Hive, Pig and Sqoop installed on it.
R will be running in a Windows environment. I know that rhdfs, RHadoop and rmr exist for Linux, but I can’t find anything for Windows.
Does anyone know of a library to use?
Thank you
Revolution Analytics is trying to make a name for itself in this space. They have a couple of nice packages (some of which are open-source and/or free for non-commercial use) that let you interact fluidly with Hadoop from R in a Windows environment.