Channel: Hortonworks » All Replies

Local Spark talking to remote HDFS?


I have a file in HDFS inside my Hortonworks HDP 2.3_1 VirtualBox VM.

If I go into the guest spark-shell and refer to the file like this, it works fine:

val words = sc.textFile("hdfs:///tmp/people.txt")
words.count

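However, if I try to access it from a local Spark app on my Windows host, it doesn't work:

import org.apache.spark.{SparkConf, SparkContext}

// Run Spark inside the local JVM on the Windows host
val conf = new SparkConf().setMaster("local").setAppName("My App")
val sc = new SparkContext(conf)

// localhost:8020 is the NameNode port forwarded from the guest VM
val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
words.count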

It emits:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-452094660-10.0.2.15-1437494483194:blk_1073742905_2098 file=/tmp/people.txt
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
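
One detail in the trace that may be worth a look is the address embedded in the block pool ID. The sketch below pulls it out, assuming the ID follows a BP-<random>-<address>-<timestamp> layout (an assumption, not something the post confirms); the BlockPoolId object and its address helper are hypothetical names used only for illustration.

```scala
object BlockPoolId {
  // Assumed layout: BP-<random>-<address>-<timestamp>
  private val Pattern = """BP-(\d+)-(.+)-(\d+)""".r

  /** Returns the address part of a block pool ID, or None if the layout does not match. */
  def address(id: String): Option[String] = id match {
    case Pattern(_, addr, _) => Some(addr)
    case _                   => None
  }

  def main(args: Array[String]): Unit = {
    // ID copied from the exception in the post
    println(address("BP-452094660-10.0.2.15-1437494483194"))
  }
}
```

For the ID above this yields 10.0.2.15, which is an address inside the VM's NAT network rather than one the Windows host can necessarily reach directly. Whether that is related to the failure is not established here.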

Port 8020 is open, and if I give it a wrong file name, it tells me:

Input path does not exist: hdfs://localhost:8020/tmp/people.txt!!

localhost:8020 should be correct, as the guest HDP VM has NAT port forwarding to my Windows host.

I also opened up some extra ports, following the list at:

http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_reference/content/reference_chap2_1.html
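
A quick way to see which forwarded ports actually accept connections from the host is a plain TCP probe. This is a minimal sketch using only the JDK socket API; PortProbe and isPortOpen are hypothetical names, 8020 is the port from the post, and 50010 is an assumed DataNode data-transfer port that may differ on a given sandbox.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object PortProbe {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def isPortOpen(host: String, port: Int, timeoutMs: Int): Boolean = {
    val socket = new Socket()
    try {
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // 8020 comes from the post; 50010 is an assumption about the DataNode port.
    for (port <- Seq(8020, 50010)) {
      val state = if (isPortOpen("localhost", port, 2000)) "open" else "closed"
      println(s"localhost:$port is $state")
    }
  }
}
```

A successful connect only shows that something on the host side accepts the connection; it does not prove that the HDFS service behind the forwarding rule is healthy.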

It seems telling that a wrong file name produces the appropriate exception, so the NameNode itself must be reachable.

My pom has:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>1.4.1</version>
  <scope>provided</scope>
</dependency>

Am I doing something wrong? And what is the BlockMissingException trying to tell me?

And how do I fix it, if at all?

I just want to be able to do my Spark development on my Windows box in Scala IDE while talking to my HDP sandbox.

Thanks.

