Hi Orlando,
After upgrading to Ambari 2.0.0, the new stack version (HDP 2.2.0.0-2041) becomes CURRENT once every component on every host has gone through a full Restart. You can do this host by host, or by doing a full Restart on each service. Ambari will then mark all host_version records for the new stack as CURRENT.
You can confirm this via the API, e.g., a GET on http://server:8080/api/v1/clusters/c1/stack_versions/1 returns something like:
ClusterStackVersions: {
  cluster_name: "c1",
  id: 1,
  repository_version: 1,
  stack: "HDP",
  state: "CURRENT",
  version: "2.3",
  host_states: {
    CURRENT: [
      "c6408.ambari.apache.org",
      "c6409.ambari.apache.org"
    ],
    INSTALLED: [ ],
    INSTALLING: [ ],
    INSTALL_FAILED: [ ],
    OUT_OF_SYNC: [ ],
    UPGRADED: [ ],
    UPGRADE_FAILED: [ ],
    UPGRADING: [ ]
  }
}
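If you would rather check this from a script than by eye, something like the following could work. It is only a sketch: it assumes the endpoint returns strict JSON shaped like the output above (a ClusterStackVersions object containing a host_states map), and the function name hosts_not_in_state and the sample data are made up for illustration.

```python
import json


def hosts_not_in_state(response_text, wanted="CURRENT"):
    """Return {state: [hosts]} for every non-empty state other than `wanted`.

    Assumes the response body looks like the ClusterStackVersions
    output shown above; this is not taken from the Ambari API docs.
    """
    body = json.loads(response_text)
    host_states = body["ClusterStackVersions"]["host_states"]
    return {
        state: sorted(hosts)
        for state, hosts in host_states.items()
        if state != wanted and hosts
    }


# Hypothetical response in which one host is still UPGRADING.
sample = json.dumps({
    "ClusterStackVersions": {
        "cluster_name": "c1",
        "state": "UPGRADING",
        "host_states": {
            "CURRENT": ["c6408.ambari.apache.org"],
            "UPGRADING": ["c6409.ambari.apache.org"],
            "UPGRADED": [],
        },
    }
})

print(hosts_not_in_state(sample))
```

An empty result means every host listed in the response is in the wanted state.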
Installing the bits is straightforward; once it finishes, all host_version records for the new stack should have a state of INSTALLED.
After finishing the Rolling Upgrade, all components on all hosts will advertise the version that they are on, which should be the new version. The “Save Cluster State” step depends on this in order to save the state in the database. If for whatever reason a host/component is not on the new version (perhaps that host encountered an error), its state at the API endpoint above for the new stack version will typically be UPGRADING instead of UPGRADED. To remedy this:
1. Find the list of hosts or components that have not been upgraded. I personally prefer to query the DB directly:
SELECT repo_version_id, version FROM repo_version ORDER BY repo_version_id ASC;
SELECT * FROM host_version WHERE repo_version_id = ? AND state <> 'UPGRADED' ORDER BY host_name;
-- Can further find which components are not on the new version, e.g., 2.2.4.2-1234
-- Note that 'UNKNOWN' is allowed for components like ZKFC and AMS that do not advertise a version.
SELECT version, host_name, component_name FROM hostcomponentstate WHERE version NOT IN ('2.2.4.2-1234', 'UNKNOWN') ORDER BY version, host_name, component_name;
2. SSH onto each problematic host and run "hdp-select status".
If any of the components that should be on the new version are not, fix that host by running
"hdp-select set all <new_version>", e.g., "hdp-select set all 2.2.4.2-1234"
3. Restart the affected components on that host by navigating to the Hosts Details page and restarting each component individually. IMPORTANT: the Restart command is not the same as “Stop, then Start”. If the component is a master/slave, then when it restarts it will advertise the correct version to Ambari, which will then update the value in the hostcomponentstate table.
Once all of the components on a host are on the new version, the record in the host_version table for that host should transition from UPGRADING -> UPGRADED.
4. Re-try the “Save Cluster State” step, which should succeed.
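If there are many hosts to check in step 2, the "hdp-select status" output can be filtered with a small script. This is a rough sketch, not part of Ambari or hdp-select: it assumes the command prints one "<package> - <version>" line per package, with "None" for packages that are not installed, so please compare that against the output on your own hosts first. The function name components_off_version and the sample output are hypothetical.

```python
def components_off_version(status_output, new_version):
    """Return {package: version} for packages not on `new_version`.

    Assumes each line looks like "<package> - <version>" and that
    packages which are not installed report "None" (an assumption
    about the hdp-select output format).
    """
    stale = {}
    for line in status_output.splitlines():
        name, sep, version = line.partition(" - ")
        if not sep:
            continue  # skip blank or unrecognised lines
        name, version = name.strip(), version.strip()
        if version in ("None", new_version):
            continue  # not installed, or already on the new version
        stale[name] = version
    return stale


# Hypothetical output captured from "hdp-select status" on one host.
sample_status = """\
hadoop-hdfs-datanode - 2.2.0.0-2041
hadoop-hdfs-namenode - 2.2.4.2-1234
accumulo-client - None
"""

print(components_off_version(sample_status, "2.2.4.2-1234"))
```

Any package that shows up in the result is a candidate for the "hdp-select set all <new_version>" fix followed by a Restart of the affected component.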