I have a strange problem with a number of different MapReduce jobs. When a job is submitted, its “job.jar” and “job.split” files are created with a target replication factor of 10.
I cannot see this value anywhere in the config.
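For reference, grepping the whole config directory turns up nothing that would explain the 10 (assuming the stock /etc/hadoop/conf layout; adjust the path for your distribution):

# search every Hadoop config file for replication-related settings
grep -ri "replication" /etc/hadoop/conf/
# dfs.replication and dfs.replication.max show up, but nothing is set to 10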
It is a pain because I don’t have 10 data nodes in my cluster, so those blocks are temporarily under-replicated. That would be fine if they disappeared once the job completed, but when a job fails they sometimes stay around.
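When they do linger, I can at least clear the warnings by hand; a sketch, assuming the default /user/&lt;user&gt;/.staging location and a job that is definitely dead:

# look for leftover staging directories from failed jobs
hdfs dfs -ls /user/example/.staging
# drop the target replication to match the 5 data nodes I actually have
hdfs dfs -setrep -w 5 /user/example/.staging/job_1430325670718_7300
# or, if the job is definitely gone, delete its staging directory outright
hdfs dfs -rm -r /user/example/.staging/job_1430325670718_7300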
I have tried setting dfs.replication.max to 5 (the number of data nodes I have), but then the MapReduce jobs all refuse to start! (It was previously set to 50.)
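What I’d really like is a way to force the submit-time replication down per job. If the relevant knob is mapreduce.client.submit.file.replication (its default is 10, which would match what I’m seeing, though I haven’t confirmed it is the cause here), something like this should do it, assuming the job’s driver uses ToolRunner so generic -D options are parsed (the jar and class names below are placeholders):

# override the submit-file replication for a single job
hadoop jar my-job.jar com.example.MyJob \
    -D mapreduce.client.submit.file.replication=5 \
    /input /output
# or set it cluster-wide in mapred-site.xml:
#   <property>
#     <name>mapreduce.client.submit.file.replication</name>
#     <value>5</value>
#   </property>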
In the meantime, this is what fsck shows while the blocks are under-replicated:

hdfs fsck / | grep Under
Connecting to namenode via http://mynamenode.local:50070
/user/example/.staging/job_1430325670718_7300/job.jar: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575400_3836552. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7300/job.split: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575404_3836556. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7302/job.jar: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575414_3836566. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7302/job.split: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575417_3836569. Target Replicas is 10 but found 5 replica(s).
Under-replicated blocks: 4 (4.899343E-4 %)
And a few minutes later, the same check comes back clean:
hdfs fsck / | grep Under
Connecting to namenode via http://mynamenode.local:50070
Under-replicated blocks: 0 (0.0 %)