Channel: Hortonworks » All Replies

Corrupt file recovery


Hypothetical situation: let us suppose I have a four-node Hadoop cluster, one NameNode and three DataNodes.

I load a 3 GB data file, expecting HDFS to split the data among the three DataNodes, say 1 GB each.

Then, sometime later, let us suppose I get corruption errors and am able to narrow the corrupt data down to node 3, in the same file loaded in the first step.

Let us suppose I don’t have another copy of that file anywhere.

So how do I recover only that part of the data, the 1 GB on node 3?

Second hypothetical situation: instead of the 3 GB data file, let us suppose the data file is 300 GB.

And the same corruption happens only on node 3, which hosts one third of the data, i.e. 100 GB.

And I HAVE the original data file.

So how do I load only that 100 GB of data into node 3 on HDFS?

In short, I am looking for solutions to address partial DataNode corruption issues.
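As background for the question: HDFS stores a file as replicated blocks spread across DataNodes rather than one contiguous slice per node, so corruption is normally diagnosed and repaired at the block level, not the node level. A minimal sketch of how that check usually looks, using the standard `hdfs` CLI (the path `/data/bigfile.dat` and the local source path are hypothetical):

```shell
# List files in HDFS that currently have corrupt or missing blocks.
hdfs fsck / -list-corruptFileBlocks

# Detailed report for one file: which blocks it has and which DataNodes
# hold each replica. (/data/bigfile.dat is a hypothetical path.)
hdfs fsck /data/bigfile.dat -files -blocks -locations

# If every replica of a block is lost and the original source file is still
# available locally, the usual fix is to remove the damaged HDFS copy and
# re-upload the whole file; the NameNode redistributes the blocks.
hdfs dfs -rm /data/bigfile.dat
hdfs dfs -put /local/bigfile.dat /data/bigfile.dat
```

With the default replication factor of 3, a single bad DataNode would not normally lose data at all, since healthy replicas on the other nodes are used to re-replicate the affected blocks automatically.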

Appreciate the insights.

