I tried playing with backing up namenode metadata today.
For some reason this did not work: curl -o fsimage.<dt> 'http://<host>:50070/getimage?getimage=1&txid=latest' 2>/dev/null
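One guess as to why it failed, since the txid=latest parameter suggests a Hadoop 2.x cluster: I believe the image servlet moved from /getimage to /imagetransfer in 2.x, so the old path returns nothing useful. A sketch of what I'd try instead (the /imagetransfer path, and whether txid=latest is accepted, are assumptions on my part; dropping the 2>/dev/null also lets you see the actual error):

    # Hadoop 1.x style servlet
    curl -o fsimage.<dt> 'http://<host>:50070/getimage?getimage=1'
    # Hadoop 2.x style servlet (path is my assumption; verify against your version)
    curl -o fsimage.<dt> 'http://<host>:50070/imagetransfer?getimage=1&txid=latest'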
Then I tried: hdfs dfsadmin -fetchImage <local dir>. This fetched the latest fsimage_<xxx> file but not the matching fsimage_<xxx>.md5 file, and because of the missing .md5 file the namenode did not start.
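A workaround I'd consider: regenerate the checksum sidecar myself. This assumes the NameNode's .md5 file uses standard md5sum output format, which is my assumption from how the two files pair up; compare against an existing .md5 in your own name directory before relying on it. The /backup/nn path below is a hypothetical placeholder:

    # fetch the latest fsimage into a backup directory
    hdfs dfsadmin -fetchImage /backup/nn
    # regenerate the missing checksum sidecar; cd first so the
    # recorded filename carries no path component
    cd /backup/nn
    md5sum fsimage_<xxx> > fsimage_<xxx>.md5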
Then I tried a plain OS-level copy: while the cluster was up and running, I copied fsimage_<xxx> and fsimage_<xxx>.md5 from the namenode directory to a backup directory. After shutting down the cluster I deleted the two files from the namenode directory, copied them back from the backup directory, and ran start-dfs.sh. All the nodes came up fine.
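For reference, the sequence I followed as a sketch. I'm assuming dfs.namenode.name.dir points at /data/nn (the fsimage files live under its current/ subdirectory in Hadoop 2.x); both /data/nn and /backup/nn are placeholders for my setup:

    # 1. with the cluster running, copy the image plus its checksum aside
    cp /data/nn/current/fsimage_<xxx> /data/nn/current/fsimage_<xxx>.md5 /backup/nn/
    # 2. stop HDFS, remove the originals, restore from backup
    stop-dfs.sh
    rm /data/nn/current/fsimage_<xxx> /data/nn/current/fsimage_<xxx>.md5
    cp /backup/nn/fsimage_<xxx> /backup/nn/fsimage_<xxx>.md5 /data/nn/current/
    # 3. bring everything back up
    start-dfs.sh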
So if a simple OS copy suffices (does it?) to back up the namenode metadata files, why bother with the first two kinds of copies, curl and hdfs dfsadmin -fetchImage?
Appreciate the insights.