Hello,
We’re doing a POC with one name node and three data nodes, and I have a newbie question about disk space. Suppose I need to import 300 GB of data. Does each data node need to be able to hold the full 300 GB, or will the data be distributed evenly among the three data nodes? In other words, will those 300 GB be mirrored to every node, or are only particular blocks replicated (in which case redundancy is achieved by syncing blocks between two data nodes)?
Many thanks.
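
For anyone sizing a similar setup, here is a rough back-of-the-envelope sketch of the arithmetic behind the question. It assumes HDFS replicates individual blocks, that the replication factor is the stock default of 3 (the `dfs.replication` setting), that blocks end up spread roughly evenly, and that no data node holds more than one replica of a given block. The function name and parameters are made up for illustration, and the result ignores temporary space, logs, and non-HDFS usage, so treat it as a lower bound rather than a sizing rule.

```python
def hdfs_storage_estimate(data_gb, replication, num_datanodes):
    """Rough raw-disk estimate for storing data_gb of data in HDFS.

    Illustrative helper, not part of any Hadoop API.
    Assumes even block distribution and ignores all overhead.
    """
    # A block cannot have more replicas than there are data nodes,
    # because each data node holds at most one copy of a given block.
    effective_replicas = min(replication, num_datanodes)
    total_raw_gb = data_gb * effective_replicas
    per_node_gb = total_raw_gb / num_datanodes
    return total_raw_gb, per_node_gb


# The case from the question: 300 GB, three data nodes.
for replication in (3, 2):
    total, per_node = hdfs_storage_estimate(300, replication, 3)
    print(f"replication={replication}: total {total} GB, about {per_node:.0f} GB per node")
```

Under these assumptions, a replication factor of 3 on three data nodes puts a copy of every block on every node, so each node needs room for about 300 GB. With a replication factor of 2, the cluster stores about 600 GB in total, or roughly 200 GB per node.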