Hi all!
I’m setting up a test cluster of two machines. The installation completed successfully: I installed all the master services (HDFS, YARN, HBase, ZooKeeper) on the machine running the Ambari server, and all the worker services and clients on the other (worker) machine.
After restarting both machines, I could start all the master services except HBase, but the ambari-agent can’t seem to register.
This is part of the ambari-server log just after restarting the ambari-agent:
23 Apr 2015 12:06:03,272 WARN [qtp-ambari-agent-242] SecurityFilter:103 - Request https://ambariserver2:8440/ca doesn't match any pattern.
23 Apr 2015 12:06:03,272 WARN [qtp-ambari-agent-242] SecurityFilter:62 - This request is not allowed on this port: https://ambariserver2:8440/ca
23 Apr 2015 12:06:06,281 INFO [qtp-ambari-agent-242] HeartBeatHandler:877 - agentOsType = centos6
23 Apr 2015 12:06:06,306 INFO [qtp-ambari-agent-242] HostImpl:277 - Received host registration, host=[hostname=datanode04,fqdn=datanode04.bi.internal,domain=bi.internal,architecture=x86_64,processorcount=2,physicalprocessorcount=2,osname=centos,osversion=6.6,osfamily=redhat,memory=3914964,uptime_hours=1,mounts=(available=41691044,mountpoint=/,used=2883644,percent=7%,size=46967160,device=/dev/mapper/vg_datanode04-lv_root,type=ext4)(available=1957480,mountpoint=/dev/shm,used=0,percent=0%,size=1957480,device=tmpfs,type=tmpfs)(available=436586,mountpoint=/boot,used=25466,percent=6%,size=487652,device=/dev/vda1,type=ext4)] , registrationTime=1429783566280, agentVersion=2.0.0
23 Apr 2015 12:06:48,709 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:146 - Heartbeat lost from host datanode04
23 Apr 2015 12:06:48,710 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component METRICS_MONITOR on datanode04
23 Apr 2015 12:06:48,712 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component HBASE_REGIONSERVER on datanode04
23 Apr 2015 12:06:48,714 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component DATANODE on datanode04
23 Apr 2015 12:06:48,715 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component NODEMANAGER on datanode04
23 Apr 2015 12:07:06,945 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert datanode_webui for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:07:06,949 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert datanode_process for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:07:06,953 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert hbase_regionserver_process for an invalid service HBASE and component HBASE_REGIONSERVER on host datanode04.bi.internal
23 Apr 2015 12:07:06,956 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_webui for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
23 Apr 2015 12:07:06,961 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert ams_metrics_monitor_process for an invalid service AMBARI_METRICS and component METRICS_MONITOR on host datanode04.bi.internal
23 Apr 2015 12:07:06,964 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_health for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
23 Apr 2015 12:07:48,772 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:146 - Heartbeat lost from host datanode04
23 Apr 2015 12:07:48,773 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component METRICS_MONITOR on datanode04
23 Apr 2015 12:07:48,775 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component HBASE_REGIONSERVER on datanode04
23 Apr 2015 12:07:48,777 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component DATANODE on datanode04
23 Apr 2015 12:07:48,779 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component NODEMANAGER on datanode04
23 Apr 2015 12:08:07,636 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert datanode_webui for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:08:07,639 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert datanode_process for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:08:07,643 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert hbase_regionserver_process for an invalid service HBASE and component HBASE_REGIONSERVER on host datanode04.bi.internal
23 Apr 2015 12:08:07,645 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert datanode_storage for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:08:07,653 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_webui for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
23 Apr 2015 12:08:07,659 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert ams_metrics_monitor_process for an invalid service AMBARI_METRICS and component METRICS_MONITOR on host datanode04.bi.internal
23 Apr 2015 12:08:07,661 WARN [alert-event-bus-2] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_health for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
23 Apr 2015 12:08:48,840 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:146 - Heartbeat lost from host datanode04
23 Apr 2015 12:08:48,841 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component METRICS_MONITOR on datanode04
23 Apr 2015 12:08:48,843 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component HBASE_REGIONSERVER on datanode04
23 Apr 2015 12:08:48,845 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component DATANODE on datanode04
23 Apr 2015 12:08:48,847 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:161 - Setting component state to UNKNOWN for component NODEMANAGER on datanode04
23 Apr 2015 12:09:08,301 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert datanode_process for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:09:08,304 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert hbase_regionserver_process for an invalid service HBASE and component HBASE_REGIONSERVER on host datanode04.bi.internal
23 Apr 2015 12:09:08,308 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert ams_metrics_monitor_process for an invalid service AMBARI_METRICS and component METRICS_MONITOR on host datanode04.bi.internal
23 Apr 2015 12:09:08,312 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert datanode_webui for an invalid service HDFS and component DATANODE on host datanode04.bi.internal
23 Apr 2015 12:09:08,317 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_webui for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
23 Apr 2015 12:09:08,319 WARN [alert-event-bus-1] AlertReceivedListener:302 - Unable to process alert yarn_nodemanager_health for an invalid service YARN and component NODEMANAGER on host datanode04.bi.internal
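To see how regularly the server declares the heartbeat lost, the relevant entries can be pulled out of the ambari-server log with a small script. This is only a rough Python sketch, not anything shipped with Ambari: the function name `heartbeat_losses` and the regular expression are assumptions based on the message format in the excerpt above.

```python
import re
from datetime import datetime

# Pattern assumed from the ambari-server log lines above, e.g.
# "23 Apr 2015 12:06:48,709 WARN [...] HeartbeatMonitor:146 - Heartbeat lost from host datanode04"
LOST_RE = re.compile(
    r"(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}),\d{3} WARN \[[^\]]+\] "
    r"HeartbeatMonitor:\d+ - Heartbeat lost from host (\S+)"
)


def heartbeat_losses(log_text):
    """Return (timestamp, host) pairs for every 'Heartbeat lost' entry."""
    # %b expects English month abbreviations (C/English locale).
    return [
        (datetime.strptime(ts, "%d %b %Y %H:%M:%S"), host)
        for ts, host in LOST_RE.findall(log_text)
    ]


# Two entries copied from the server log excerpt above.
sample = (
    "23 Apr 2015 12:06:48,709 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:146 - Heartbeat lost from host datanode04\n"
    "23 Apr 2015 12:07:48,772 WARN [ambari-hearbeat-monitor] HeartbeatMonitor:146 - Heartbeat lost from host datanode04\n"
)
losses = heartbeat_losses(sample)
for when, host in losses:
    print(when.time(), host)
```

In the excerpt above, the "Heartbeat lost" entries appear once a minute (12:06:48, 12:07:48, 12:08:48).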
And this is the ambari-agent log; the agent seems to send heartbeats and receive acknowledgements:
INFO 2015-04-23 12:18:55,305 HostCheckReportFileHandler.py:91 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:18:55,305 HostCheckReportFileHandler.py:141 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:18:55,306 HostCheckReportFileHandler.py:146 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:18:55,311 Controller.py:125 - Registering with datanode04.bi.internal (10.0.10.30) (agent='{"hardwareProfile": {"kernel": "Linux", "domain": "bi.internal", "physicalprocessorcount": 2, "kernelrelease": "2.6.32-504.el6.x86_64", "uptime_days": "0", "memorytotal": 3914964, "swapfree": "3.87 GB", "memorysize": 3914964, "osfamily": "redhat", "swapsize": "3.87 GB", "processorcount": 2, "netmask": "255.255.255.0", "timezone": "CET", "hardwareisa": "x86_64", "memoryfree": 3682172, "operatingsystem": "centos", "kernelmajversion": "2.6", "kernelversion": "2.6.32", "macaddress": "02:01:2B:0C:B7:3E", "operatingsystemrelease": "6.6", "ipaddress": "10.0.10.30", "hostname": "datanode04", "uptime_hours": "1", "fqdn": "datanode04.bi.internal", "id": "root", "architecture": "x86_64", "selinux": true, "mounts": [{"available": "41690904", "used": "2883784", "percent": "7%", "device": "/dev/mapper/vg_datanode04-lv_root", "mountpoint": "/", "type": "ext4", "size": "46967160"}, {"available": "1957480", "used": "0", "percent": "0%", "device": "tmpfs", "mountpoint": "/dev/shm", "type": "tmpfs", "size": "1957480"}, {"available": "436586", "used": "25466", "percent": "6%", "device": "/dev/vda1", "mountpoint": "/boot", "type": "ext4", "size": "487652"}], "hardwaremodel": "x86_64", "uptime_seconds": "4480", "interfaces": "eth0,lo"}, "currentPingPort": 8670, "prefix": "/var/lib/ambari-agent/data", "agentVersion": "2.0.0", "agentEnv": {"transparentHugePage": "always", "hostHealth": {"agentTimeStampAtReporting": 1429784335306, "activeJavaProcs": [], "liveServices": [{"status": "Unhealthy", "name": "ntpd", "desc": "ntpd is stopped\\n"}]}, "iptablesIsRunning": true, "reverseLookup": true, "alternatives": [], "umask": "18", "stackFoldersAndFiles": [{"type": "directory", "name": "/etc/hadoop"}, {"type": "directory", "name": "/etc/hbase"}, {"type": "directory", "name": "/etc/zookeeper"}, {"type": "directory", "name": "/var/run/hadoop"}, {"type": "directory", "name": "/var/run/hbase"}, {"type": "directory", "name": "/var/run/zookeeper"}, {"type": "directory", "name": "/var/run/hadoop-yarn"}, {"type": "directory", "name": "/var/run/hadoop-mapreduce"}, {"type": "directory", "name": "/var/log/hadoop"}, {"type": "directory", "name": "/var/log/hbase"}, {"type": "directory", "name": "/var/log/zookeeper"}, {"type": "directory", "name": "/var/log/hadoop-yarn"}, {"type": "directory", "name": "/var/log/hadoop-mapreduce"}, {"type": "directory", "name": "/usr/lib/hadoop"}, {"type": "directory", "name": "/usr/lib/flume"}, {"type": "directory", "name": "/usr/lib/storm"}, {"type": "directory", "name": "/var/lib/hadoop-hdfs"}, {"type": "directory", "name": "/var/lib/hadoop-yarn"}, {"type": "directory", "name": "/var/lib/hadoop-mapreduce"}, {"type": "directory", "name": "/tmp/hadoop-hdfs"}, {"type": "directory", "name": "/hadoop/hbase"}, {"type": "directory", "name": "/hadoop/zookeeper"}, {"type": "directory", "name": "/hadoop/hdfs"}, {"type": "directory", "name": "/hadoop/yarn"}], "existingUsers": [{"status": "Available", "name": "mapred", "homeDir": "/home/mapred"}, {"status": "Available", "name": "hbase", "homeDir": "/home/hbase"}, {"status": "Available", "name": "ambari-qa", "homeDir": "/home/ambari-qa"}, {"status": "Available", "name": "zookeeper", "homeDir": "/home/zookeeper"}, {"status": "Available", "name": "hdfs", "homeDir": "/home/hdfs"}, {"status": "Available", "name": "yarn", "homeDir": "/home/yarn"}]}, "timestamp": 1429784335182, "hostname": "datanode04.bi.internal", "responseId": -1, "publicHostname": "datanode04.bi.internal"}')
INFO 2015-04-23 12:18:55,312 NetUtil.py:60 - Connecting to https://ambariserver2:8440/connection_info
INFO 2015-04-23 12:18:55,704 security.py:93 - SSL Connect being called.. connecting to the server
INFO 2015-04-23 12:18:55,939 security.py:55 - SSL connection established. Two-way SSL authentication is turned off on the server.
INFO 2015-04-23 12:18:55,974 Controller.py:149 - Registration Successful (response id = 0)
INFO 2015-04-23 12:18:55,974 Controller.py:153 - Got status commands on registration.
WARNING 2015-04-23 12:18:55,974 AlertSchedulerHandler.py:92 - There are no alert definition commands in the heartbeat; unable to update definitions
INFO 2015-04-23 12:18:55,974 Controller.py:350 - Registration response from ambariserver2 was OK
INFO 2015-04-23 12:18:55,975 Controller.py:355 - Resetting ActionQueue...
INFO 2015-04-23 12:19:05,986 Heartbeat.py:75 - Building Heartbeat: {responseId = 0, timestamp = 1429784345986, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:06,109 HostCheckReportFileHandler.py:91 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:19:06,110 HostCheckReportFileHandler.py:141 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:19:06,110 HostCheckReportFileHandler.py:146 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:19:06,392 Controller.py:239 - Heartbeat response received (id = 1)
INFO 2015-04-23 12:19:06,393 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:19:16,394 Heartbeat.py:75 - Building Heartbeat: {responseId = 1, timestamp = 1429784356394, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:16,442 Controller.py:239 - Heartbeat response received (id = 2)
INFO 2015-04-23 12:19:16,443 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:19:26,443 Heartbeat.py:75 - Building Heartbeat: {responseId = 2, timestamp = 1429784366443, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:26,490 Controller.py:239 - Heartbeat response received (id = 3)
INFO 2015-04-23 12:19:26,490 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:19:36,491 Heartbeat.py:75 - Building Heartbeat: {responseId = 3, timestamp = 1429784376491, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:36,543 Controller.py:239 - Heartbeat response received (id = 4)
INFO 2015-04-23 12:19:36,543 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:19:46,544 Heartbeat.py:75 - Building Heartbeat: {responseId = 4, timestamp = 1429784386544, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:46,598 Controller.py:239 - Heartbeat response received (id = 5)
INFO 2015-04-23 12:19:46,599 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:19:52,816 scheduler.py:509 - Running job "14372191-75df-4c7a-8825-9c849a8f57e8 (trigger: interval[0:01:00], next run at: 2015-04-23 12:19:52.813771)" (scheduled at 2015-04-23 12:19:52.813771)
INFO 2015-04-23 12:19:52,821 scheduler.py:509 - Running job "03028a48-e699-43f5-b6c4-ff2ea805da03 (trigger: interval[0:01:00], next run at: 2015-04-23 12:19:52.815123)" (scheduled at 2015-04-23 12:19:52.815123)
INFO 2015-04-23 12:19:52,823 scheduler.py:527 - Job "14372191-75df-4c7a-8825-9c849a8f57e8 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.813771)" executed successfully
INFO 2015-04-23 12:19:52,823 scheduler.py:509 - Running job "4c96efd5-3c3c-404b-bf7a-1b3ffaef276c (trigger: interval[0:01:00], next run at: 2015-04-23 12:19:52.815760)" (scheduled at 2015-04-23 12:19:52.815760)
INFO 2015-04-23 12:19:52,831 scheduler.py:509 - Running job "3de93e99-bedb-4de9-8039-96f713cff992 (trigger: interval[0:01:00], next run at: 2015-04-23 12:19:52.816300)" (scheduled at 2015-04-23 12:19:52.816300)
INFO 2015-04-23 12:19:52,842 scheduler.py:509 - Running job "3e2ce1a4-4b87-43d8-ab01-34be1c07e6e9 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.816938)" (scheduled at 2015-04-23 12:19:52.816938)
INFO 2015-04-23 12:19:52,843 scheduler.py:509 - Running job "56ac6ab3-d457-49fa-b3b8-717a7a052458 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.817534)" (scheduled at 2015-04-23 12:19:52.817534)
INFO 2015-04-23 12:19:52,845 scheduler.py:509 - Running job "59196fab-5960-4573-b51d-3553a0cf04ef (trigger: interval[0:01:00], next run at: 2015-04-23 12:19:52.818116)" (scheduled at 2015-04-23 12:19:52.818116)
INFO 2015-04-23 12:19:52,851 scheduler.py:527 - Job "3e2ce1a4-4b87-43d8-ab01-34be1c07e6e9 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.816938)" executed successfully
INFO 2015-04-23 12:19:52,852 scheduler.py:527 - Job "4c96efd5-3c3c-404b-bf7a-1b3ffaef276c (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.815760)" executed successfully
INFO 2015-04-23 12:19:52,853 scheduler.py:527 - Job "3de93e99-bedb-4de9-8039-96f713cff992 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.816300)" executed successfully
INFO 2015-04-23 12:19:52,858 scheduler.py:527 - Job "03028a48-e699-43f5-b6c4-ff2ea805da03 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.815123)" executed successfully
INFO 2015-04-23 12:19:52,861 scheduler.py:527 - Job "56ac6ab3-d457-49fa-b3b8-717a7a052458 (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.817534)" executed successfully
INFO 2015-04-23 12:19:52,863 scheduler.py:527 - Job "59196fab-5960-4573-b51d-3553a0cf04ef (trigger: interval[0:01:00], next run at: 2015-04-23 12:20:52.818116)" executed successfully
INFO 2015-04-23 12:19:56,599 Heartbeat.py:75 - Building Heartbeat: {responseId = 5, timestamp = 1429784396599, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:19:56,617 Controller.py:239 - Heartbeat response received (id = 6)
INFO 2015-04-23 12:19:56,617 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:20:06,618 Heartbeat.py:75 - Building Heartbeat: {responseId = 6, timestamp = 1429784406618, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:20:06,771 HostCheckReportFileHandler.py:91 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:20:06,772 HostCheckReportFileHandler.py:141 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:20:06,773 HostCheckReportFileHandler.py:146 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-23 12:20:07,149 Controller.py:239 - Heartbeat response received (id = 7)
INFO 2015-04-23 12:20:07,150 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:20:17,151 Heartbeat.py:75 - Building Heartbeat: {responseId = 7, timestamp = 1429784417150, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:20:17,198 Controller.py:239 - Heartbeat response received (id = 8)
INFO 2015-04-23 12:20:17,198 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:20:27,199 Heartbeat.py:75 - Building Heartbeat: {responseId = 8, timestamp = 1429784427199, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:20:27,246 Controller.py:239 - Heartbeat response received (id = 9)
INFO 2015-04-23 12:20:27,246 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:20:37,247 Heartbeat.py:75 - Building Heartbeat: {responseId = 9, timestamp = 1429784437247, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:20:37,295 Controller.py:239 - Heartbeat response received (id = 10)
INFO 2015-04-23 12:20:37,296 Controller.py:283 - No commands sent from ambariserver2
INFO 2015-04-23 12:20:47,297 Heartbeat.py:75 - Building Heartbeat: {responseId = 10, timestamp = 1429784447296, commandsInProgress = False, componentsMapped = False}
INFO 2015-04-23 12:20:47,345 Controller.py:239 - Heartbeat response received (id = 11)
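One way to check my impression that the heartbeats are being acknowledged is to pull the response ids out of the agent log and confirm that each one is exactly one greater than the last. The following is a minimal Python sketch, not part of Ambari; the names `heartbeat_ids` and `is_consecutive` and the regular expression are my own assumptions based on the message format shown above.

```python
import re

# Pattern assumed from the agent log lines above, e.g.
# "Controller.py:239 - Heartbeat response received (id = 6)"
HEARTBEAT_RE = re.compile(r"Heartbeat response received \(id = (\d+)\)")


def heartbeat_ids(log_text):
    """Return the heartbeat response ids in the order they appear."""
    return [int(m) for m in HEARTBEAT_RE.findall(log_text)]


def is_consecutive(ids):
    """True if every id is exactly one greater than the previous one."""
    return all(b - a == 1 for a, b in zip(ids, ids[1:]))


# Three entries copied from the agent log excerpt above.
sample = (
    "INFO 2015-04-23 12:19:06,392 Controller.py:239 - Heartbeat response received (id = 1) "
    "INFO 2015-04-23 12:19:16,442 Controller.py:239 - Heartbeat response received (id = 2) "
    "INFO 2015-04-23 12:19:26,490 Controller.py:239 - Heartbeat response received (id = 3)"
)
ids = heartbeat_ids(sample)
print(ids, is_consecutive(ids))
```

In the agent log above the ids run from 1 to 11 without a gap, roughly ten seconds apart.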