Hadoop cluster: HDP, RHEL OS
R/RStudio: Windows 7 Enterprise
Packages: rhdfs, rmr
I am working on a project where I need to load some data from HDFS into R data frames. I thought things would be much easier if I just got the RHadoop packages from GitHub.
But the rhdfs package has a prerequisite: the HADOOP_CMD environment variable must be set and must point to the “full path for the hadoop binary”.
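For reference, this is roughly what that prerequisite looks like on a machine that has Hadoop installed locally. It is only a minimal sketch: the path /usr/bin/hadoop is an assumed example, not a value from my setup, and the requireNamespace guard is there only so the snippet does not fail where rhdfs is not installed.

```r
# Assumed example path; the real location depends on the Hadoop install.
Sys.setenv(HADOOP_CMD = "/usr/bin/hadoop")

# rhdfs needs HADOOP_CMD to be set before it is loaded and initialized.
if (requireNamespace("rhdfs", quietly = TRUE)) {
  library(rhdfs)
  hdfs.init()
}
```

On my Windows machine there is no such binary to point to, which is the root of the problem.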
As I mentioned, my Hadoop cluster runs on Linux and R runs on Windows. How do I establish communication between the two?
The options I have considered so far:
Install Hadoop on a VM in the Windows environment: I can do this for testing purposes, but I am looking for a production solution.
Install R on one of the cluster edge nodes: I can do this for testing purposes, but I am looking for a production solution.
Install the Hadoop binaries on Windows: how?