Hello,
In its container allocation logic, how does YARN take into account the locality of the data required for a given job? When a client submits a job, it communicates with the ResourceManager, which, as far as I understand, does not hold any data metadata (that is held by the NameNode). The ApplicationMaster then requests containers from the ResourceManager, and apparently data locality can be taken into account as part of this request. How does the ApplicationMaster come to know the location of the data required by the job, given that job submission does not involve communicating with the NameNode at any stage? In other words, is there any interaction with the NameNode, which holds the data locality information, during the execution of a job on YARN?