Updating clusters can be complex. When there's a problem, you have to determine which alerts are unimportant, which logs to focus on, and what the real issue behind the symptoms is. To make the update process smoother and more transparent, the Red Hat OpenShift team has developed a new tool: oc adm upgrade status. This new command helps you obtain detailed information about the state of your cluster during an update. The functionality, initially available in the oc command-line tool, is set to evolve into a full-fledged API provided by the cluster itself.
Technology preview in Red Hat OpenShift 4.16
The oc adm upgrade status command debuts as a technology preview in Red Hat OpenShift 4.16. This new subcommand is entirely client-side in its current version; you only need the 4.16 oc binary. It's safe to use in production environments as it only reads data without modifying any configurations.
How to use oc adm upgrade status
To use this new feature, you must first enable it by setting the OC_ENABLE_CMD_UPGRADE_STATUS environment variable to true:
export OC_ENABLE_CMD_UPGRADE_STATUS=true
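If you would rather not export the variable for the whole shell session, standard shell syntax lets you set it for a single invocation instead. The sketch below wraps that in a small function. The function name upgrade_status and the OC_BIN override are hypothetical conveniences for illustration and are not part of oc.

```shell
# Hypothetical wrapper (not part of oc): enable the technology preview
# only for the single oc process started here.
upgrade_status() {
  # OC_BIN is an assumed override so the binary can be swapped out
  # (for example in a dry run); it defaults to "oc".
  OC_ENABLE_CMD_UPGRADE_STATUS=true "${OC_BIN:-oc}" adm upgrade status "$@"
}

# Usage, during an update:
#   upgrade_status
#   upgrade_status --details=health
```

Any arguments are passed through unchanged, so the --details options shown in the command output work the same way.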
Then run the command during an update:
$ oc adm upgrade status
= Control Plane =
Assessment: Progressing
Completion: 12%
Duration: 12m5s
Operator Status: 33 Healthy
Control Plane Nodes
NAME                                        ASSESSMENT   PHASE     VERSION   EST   MESSAGE
ip-10-0-30-217.us-east-2.compute.internal   Outdated     Pending   4.14.0    ?
ip-10-0-53-40.us-east-2.compute.internal    Outdated     Pending   4.14.0    ?
ip-10-0-92-180.us-east-2.compute.internal   Outdated     Pending   4.14.0    ?
= Worker Upgrade =
= Worker Pool =
Worker Pool: worker
Assessment: Excluded
Completion: 0%
Worker Status: 3 Total, 3 Available, 0 Progressing, 3 Outdated, 0 Draining, 3 Excluded, 0 Degraded
Worker Pool Nodes
NAME                                        ASSESSMENT   PHASE    VERSION   EST   MESSAGE
ip-10-0-20-162.us-east-2.compute.internal   Excluded     Paused   4.14.0    -
ip-10-0-4-159.us-east-2.compute.internal    Excluded     Paused   4.14.0    -
ip-10-0-99-40.us-east-2.compute.internal    Excluded     Paused   4.14.0    -
= Update Health =
SINCE   LEVEL     IMPACT           MESSAGE
-       Warning   Update Stalled   Outdated nodes in a paused pool 'worker' will not be updated
Run with --details=health for additional description and links to related online documentation
The command provides the following information:
- Control plane: Shows the progress, completion percentage, and health status of operators
- Worker upgrade: Details on worker nodes, including their current state and any issues affecting their upgrade process
- Update health: Offers insights into the health of the update process, highlighting any critical issues and providing actionable advice
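Because each part of the report starts with a "= Name =" header line, a saved copy of the output can be split by section with ordinary text tools. The following is a minimal sketch that assumes the header format shown in the sample output above; the function name extract_section and the file name status.txt are hypothetical. As a technology preview, the output format may change between releases, so treat this as a convenience rather than a stable interface.

```shell
# extract_section NAME: read saved status output on stdin and print only
# the lines that follow the "= NAME =" header, up to the next header.
# Hypothetical helper, not part of oc.
extract_section() {
  awk -v want="= $1 =" '
    /^= .* =$/ { in_section = ($0 == want); next }  # a new section header starts
    in_section { print }                            # body line of the wanted section
  '
}

# Usage, assuming the output was saved first:
#   oc adm upgrade status > status.txt
#   extract_section "Update Health" < status.txt
```

In the sample output, the worker details sit under the "= Worker Pool =" header, so that is the name to pass for worker information.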
Here's example output when the control plane nodes are updated first during an OpenShift update:
= Control Plane =
Assessment: Progressing
Completion: 97%
Duration: 1h58m50s
Operator Status: 28 Healthy, 1 Unavailable, 4 Available but degraded
Control Plane Nodes
NAME                                        ASSESSMENT   PHASE     VERSION       EST   MESSAGE
ip-10-0-30-217.us-east-2.compute.internal   Outdated     Pending   4.14.0-rc.3   ?
ip-10-0-53-40.us-east-2.compute.internal    Outdated     Pending   4.14.0-rc.3   ?
ip-10-0-92-180.us-east-2.compute.internal   Outdated     Pending   4.14.0-rc.3   ?
This next example shows the worker nodes updating after the control plane has updated during an OpenShift update:
= Worker Upgrade =
= Worker Pool =
Worker Pool: worker
Assessment: Degraded
Completion: 39%
Worker Status: 59 Total, 46 Available, 5 Progressing, 36 Outdated, 12 Draining, 0 Excluded, 7 Degraded
Worker Pool Nodes
NAME                                      ASSESSMENT    PHASE      VERSION       EST    MESSAGE
build0-gstfj-ci-prowjobs-worker-b-9lztv   Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-prowjobs-worker-b-bg9f5   Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-prowjobs-worker-b-mrxwn   Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-tests-worker-b-4h7pn      Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-tests-worker-b-jv5bg      Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-tests-worker-b-kj6gk      Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-tests-worker-c-dcz9p      Degraded      Draining   4.16.0-ec.2   ?      failed to drain node: <node> after 1 hour ...
build0-gstfj-ci-tests-worker-c-jq5rk      Unavailable   Updated    4.16.0-ec.3   -      Node is unavailable
build0-gstfj-ci-tests-worker-c-2kz4m      Progressing   Draining   4.16.0-ec.2   +30m
build0-gstfj-ci-tests-worker-c-55hpj      Progressing   Draining   4.16.0-ec.2   +30m
...
Omitted additional 49 Total, 22 Completed, 46 Available, 3 Progressing, 27 Outdated, 3 Draining, 0 Excluded, and 0 Degraded nodes.
Pass along --details=nodes to see all information.
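The "Worker Status:" summary line is convenient when you only want one number, such as how many workers are degraded. The sketch below pulls a single count out of that line. It assumes the comma-separated "count Label" layout shown in the sample output, and the function name worker_count is hypothetical.

```shell
# worker_count LABEL: print the number that precedes LABEL on the first
# "Worker Status:" line read from stdin. Hypothetical helper, not part of oc.
worker_count() {
  awk -v label="$1" '
    /^Worker Status:/ {
      sub(/^Worker Status:[ ]*/, "")        # drop the line prefix
      n = split($0, parts, ", ")            # one "count Label" item per entry
      for (i = 1; i <= n; i++) {
        split(parts[i], kv, " ")            # kv[1] = count, kv[2] = label
        if (kv[2] == label) { print kv[1]; exit }
      }
    }
  '
}

# Usage, assuming the output was saved to a file named status.txt:
#   worker_count Degraded < status.txt
```

The function stops at the first matching line, so with several worker pools it reports only the first pool that appears in the output.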
You can see insights about an ongoing update:
= Update Health =
SINCE   LEVEL   IMPACT   MESSAGE
14m4s   Info    None     Upgrade is proceeding well
Of course, sometimes not everything goes as planned. Here's an example of the kind of output you get when there's a problem:
= Update Health =
SINCE    LEVEL     IMPACT             MESSAGE
20m24s   Error     API Availability   Cluster Operator machine-config is unavailable (MachineConfigControllerFailed)
-        Error     Update Stalled     Node build0-gstfj-ci-prowjobs-worker-b-9lztv is degraded
-        Error     Update Stalled     Node build0-gstfj-ci-prowjobs-worker-b-bg9f5 is degraded
-        Error     Update Stalled     Node build0-gstfj-ci-prowjobs-worker-b-mrxwn is degraded
-        Warning   Update Speed       Node build0-gstfj-ci-tests-worker-c-jq5rk is unavailable
Run with --details=health for additional description and links to related online documentation
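When the health table grows long, it can help to filter it by level, for example to list only the Error rows or to make a script stop when any are present. The sketch below assumes the summary table layout shown above, where the level is the second whitespace-separated column. The function names health_rows and has_errors are hypothetical, and the expanded --details=health layout is not handled.

```shell
# health_rows LEVEL: print rows of the Update Health summary table whose
# LEVEL column equals LEVEL. Hypothetical helper, not part of oc.
health_rows() {
  awk -v level="$1" '
    /^= .* =$/ { in_health = ($0 == "= Update Health ="); next }
    in_health && $1 != "SINCE" && $2 == level { print }  # skip the header row
  '
}

# has_errors: exit status 0 if at least one Error-level insight is present.
has_errors() {
  health_rows Error | grep -q .
}

# Usage, assuming the output was saved to a file named status.txt:
#   health_rows Warning < status.txt
#   if has_errors < status.txt; then echo "update needs attention"; fi
```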
Use the --details=health option to see individual insights expanded with further information about the update issue:
$ oc adm upgrade status --details=health
...
= Update Health =
Message: Node build0-gstfj-ci-prowjobs-worker-b-9lztv is degraded
Since: -
Level: Error
Impact: Update Stalled
Reference: https://docs.openshift.com/container-platform/latest/post_installation_configuration/machine-configuration-tasks.html#understanding-the-machine-config-operator
Resources:
nodes: build0-gstfj-ci-prowjobs-worker-b-9lztv
Description: failed to drain node: build0-gstfj-ci-prowjobs-worker-b-9lztv after 1 hour. Please see machine-config-controller logs for more information
Message: Cluster Operator machine-config is unavailable (MachineConfigControllerFailed)
Since: 20m24s
Level: Error
Impact: API Availability
Reference: https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/ClusterOperatorDown.md
Resources:
clusteroperators.config.openshift.io: machine-config
Description: Cluster not available for [{operator 4.14.0-rc.3}]: ControllerConfig.machineconfiguration.openshift.io "machine-config-controller" is invalid: [status.controllerCertificates[0].notAfter: Required value ... some validation rules were not checked because the object was invalid; correct the existing errors to complete validation]
Future roadmap
Our roadmap includes transitioning this functionality to a cluster-provided API in OpenShift 4.17, allowing for broad consumption and integration across different tools and platforms. This initial release in the client helps us gather feedback to refine the feature.
We encourage everyone involved in cluster updates to try this new feature and share feedback, which helps us better serve your needs in future releases.
About the author
Subin Modeel is a principal technical product manager at Red Hat.