Channel: Hortonworks » All Replies

HDFS Rebalancing


Hi,

Does the latest version of HDFS rebalance itself? The only thing I can find about this is the Hadoop FAQ (https://wiki.apache.org/hadoop/FAQ#If_I_add_new_DataNodes_to_the_cluster_will_HDFS_move_the_blocks_to_the_newly_added_nodes_in_order_to_balance_disk_space_utilization_between_the_nodes.3F), which indicates that I will have to take manual action to rebalance.

I’m asking because of autoscaling on EC2. If I use a CloudWatch alarm to spin up another DataNode at 75% capacity and HDFS doesn’t rebalance the data, I will be stuck in a scale-up loop until I rebalance, because my original nodes will still be at around 75% capacity.
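To make the loop concrete, here is a rough sketch of the arithmetic. It assumes the alarm fires when any single DataNode reaches the limit, and that a rebalance spreads data perfectly evenly; the function names and the three 1000 GB nodes are made up for illustration.

```python
def utilization_pct(used_gb, capacity_gb):
    """Percent of a node's capacity that is in use."""
    return 100.0 * used_gb / capacity_gb


def alarm_fires(nodes, alarm_pct=75.0):
    """nodes is a list of (used_gb, capacity_gb) pairs.

    Hypothetical per-node alarm: fires if any node is at or above alarm_pct.
    """
    return any(utilization_pct(u, c) >= alarm_pct for u, c in nodes)


def add_empty_node(nodes, capacity_gb=1000.0):
    """Scale up by one empty DataNode (no blocks are moved onto it)."""
    return nodes + [(0.0, capacity_gb)]


def rebalanced(nodes):
    """Idealised rebalance: every node ends up at the cluster-wide utilization."""
    total_used = sum(u for u, _ in nodes)
    total_cap = sum(c for _, c in nodes)
    return [(total_used * c / total_cap, c) for _, c in nodes]


# Three nodes at 75%, then one empty node is added by the autoscaler.
cluster = [(750.0, 1000.0)] * 3
scaled = add_empty_node(cluster)

print(alarm_fires(scaled))              # original nodes are still at 75%
print(alarm_fires(rebalanced(scaled)))  # every node is at 56.25% after an even spread
```

Under these assumptions the alarm keeps firing after the scale-up and only clears once the data has been spread out. An alarm on cluster-wide utilization rather than per-node utilization would behave differently.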

Is this handled automatically now? If it is, or if there is a flag I can set to enable it, that would resolve my concern. Otherwise, I will have to script the rebalancing of the data, leave a long wait period between autoscaling events, and plan for the performance impact of the rebalancing.
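If it does come down to scripting, I imagine something roughly like the sketch below, which wraps the `hdfs balancer` tool and optionally caps its bandwidth first with `hdfs dfsadmin -setBalancerBandwidth`. It assumes the `hdfs` CLI is on the PATH of the machine running the script; the function names, the default threshold of 10 and the `runner` parameter are my own choices for illustration, not anything HDFS prescribes.

```python
import subprocess


def balancer_command(threshold_pct=10.0):
    """Build the argv for the HDFS balancer.

    -threshold is how far (in percent) each DataNode's utilization may
    differ from the cluster's overall utilization before blocks are moved.
    """
    return ["hdfs", "balancer", "-threshold", str(threshold_pct)]


def bandwidth_command(bytes_per_sec):
    """Build the argv that caps per-DataNode balancing bandwidth."""
    return ["hdfs", "dfsadmin", "-setBalancerBandwidth", str(int(bytes_per_sec))]


def run_rebalance(threshold_pct=10.0, bytes_per_sec=None, runner=subprocess.run):
    """Optionally cap bandwidth, then run the balancer until it exits.

    runner is injectable so the command sequence can be checked without
    a cluster; by default it really invokes the hdfs CLI.
    """
    if bytes_per_sec is not None:
        runner(bandwidth_command(bytes_per_sec), check=True)
    return runner(balancer_command(threshold_pct), check=True)
```

The autoscaling hook would call `run_rebalance(...)` after the new DataNode has joined, and the cooldown between scaling events would need to be at least as long as the balancer takes to finish. Lowering the bandwidth cap reduces the impact on running jobs at the cost of a longer rebalance.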

Thanks for any ideas you may have.

Thanks,
Scott Edwards

