# Review Apps

> Source: [https://docs.gitlab.com/ee/development/testing_guide/review_apps.html](https://docs.gitlab.com/ee/development/testing_guide/review_apps.html)

* [How does it work?](#how-does-it-work)
  * [CI/CD architecture diagram](#cicd-architecture-diagram)
  * [Detailed explanation](#detailed-explanation)
  * [Auto-stopping of Review Apps](#auto-stopping-of-review-apps)
* [QA runs](#qa-runs)
* [Performance Metrics](#performance-metrics)
* [Cluster configuration](#cluster-configuration)
  * [Node pools](#node-pools)
  * [Helm](#helm)
* [How to](#how-to)
  * [Get access to the GCP Review Apps cluster](#get-access-to-the-gcp-review-apps-cluster)
  * [Log into my Review App](#log-into-my-review-app)
  * [Enable a feature flag for my Review App](#enable-a-feature-flag-for-my-review-app)
  * [Find my Review App slug](#find-my-review-app-slug)
  * [Run a Rails console](#run-a-rails-console)
  * [Dig into a Pod’s logs](#dig-into-a-pods-logs)
* [Diagnosing unhealthy Review App releases](#diagnosing-unhealthy-review-app-releases)
  * [Release failed with `ImagePullBackOff`](#release-failed-with-imagepullbackoff)
  * [Node count is always increasing (i.e. never stabilizing or decreasing)](#node-count-is-always-increasing-ie-never-stabilizing-or-decreasing)
  * [p99 CPU utilization is at 100% for most of the nodes and/or many components](#p99-cpu-utilization-is-at-100-for-most-of-the-nodes-andor-many-components)
  * [The `logging/user/events/FailedMount` chart is going up](#the-loggingusereventsfailedmount-chart-is-going-up)
  * [Using K9s](#using-k9s)
  * [Troubleshoot a pending `dns-gitlab-review-app-external-dns` Deployment](#troubleshoot-a-pending-dns-gitlab-review-app-external-dns-deployment)
    * [Finding the problem](#finding-the-problem)
    * [Solving the problem](#solving-the-problem)
    * [Mitigation steps taken to avoid this problem in the future](#mitigation-steps-taken-to-avoid-this-problem-in-the-future)
* [Frequently Asked Questions](#frequently-asked-questions)
* [Other resources](#other-resources)
  * [Helpful command line tools](#helpful-command-line-tools)

Review Apps are deployed automatically by [the pipeline](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/6665).

## How does it work?

### CI/CD architecture diagram

The diagram shows the following flow:

1. gitlab `prepare` stage: `build-qa-image` and `compile-production-assets` (canonical default refs only).
2. gitlab `review-prepare` stage: `review-build-cng` starts once the `prepare` stage is done.
   * It triggers a CNG-mirror pipeline and polls until that pipeline has completed.
   * CNG-mirror pipeline: the Cloud Native images are built.
3. gitlab `review` stage: `review-deploy` starts once the `review-build-cng` job is done. Helm deploys the Review App using the Cloud Native images built by the CNG-mirror pipeline. The Cloud Native images are deployed to the `review-apps` Kubernetes (GKE) cluster, in the GCP `gitlab-review-apps` project.
4. gitlab `qa` stage: `review-qa-smoke` starts once the `review-deploy` job is done. gitlab-qa runs the smoke suite against the Review App.

### Detailed explanation

1. On every [pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730), during the `prepare` stage, the [`compile-production-assets`](https://gitlab.com/gitlab-org/gitlab/-/jobs/641770154) job is started automatically.
   * Once it is done, the [`review-build-cng`](https://gitlab.com/gitlab-org/gitlab/-/jobs/467724808) job starts, because the [`CNG-mirror`](https://gitlab.com/gitlab-org/build/CNG-mirror) pipeline triggered in the following step depends on the compiled assets.
2. Once `compile-production-assets` is done, the [`review-build-cng`](https://gitlab.com/gitlab-org/gitlab/-/jobs/467724808) job [triggers a pipeline](https://gitlab.com/gitlab-org/build/CNG-mirror/pipelines/44364657) in the [`CNG-mirror`](https://gitlab.com/gitlab-org/build/CNG-mirror) project.
   * The `review-build-cng` job starts automatically only if your MR includes [CI or frontend changes](../pipelines.html#changes-patterns). In other cases, the job is manual.
   * The [`CNG-mirror`](https://gitlab.com/gitlab-org/build/CNG-mirror/pipelines/44364657) pipeline creates the Docker images of each component (e.g. `gitlab-rails-ee`, `gitlab-shell`, `gitaly` etc.) based on the commit from the [GitLab pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730) and stores them in its [registry](https://gitlab.com/gitlab-org/build/CNG-mirror/container_registry).
   * We use the [`CNG-mirror`](https://gitlab.com/gitlab-org/build/CNG-mirror) project so that the registry of the `CNG` (Cloud Native GitLab) project is not overloaded with a lot of transient Docker images.
   * Note that the official CNG images are built by the `cloud-native-image` job, which runs only for tags and itself triggers a [`CNG`](https://gitlab.com/gitlab-org/build/CNG) pipeline.
3. Once `review-build-cng` is done, the [`review-deploy`](https://gitlab.com/gitlab-org/gitlab/-/jobs/467724810) job deploys the Review App to the [`review-apps`](https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/review-apps?project=gitlab-review-apps) Kubernetes cluster on GCP, using [the official GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab/).
   * The actual scripts used to deploy the Review App can be found at [`scripts/review_apps/review-apps.sh`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/scripts/review_apps/review-apps.sh).
   * These scripts are basically [our official Auto DevOps scripts](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Auto-DevOps.gitlab-ci.yml), where the default CNG images are overridden with the images built and stored in the [`CNG-mirror` project's registry](https://gitlab.com/gitlab-org/build/CNG-mirror/container_registry).
   * Since we're using [the official GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab/), you get a dedicated environment for your branch that is very close to what it would look like in production.
4. Once the [`review-deploy`](https://gitlab.com/gitlab-org/gitlab/-/jobs/467724810) job succeeds, you should be able to use your Review App thanks to the direct link to it from the MR widget. To log into the Review App, see [Log into my Review App](#log-into-my-review-app) below.

**Additional notes:**

* If the `review-deploy` job keeps failing (note that we already retry it twice), please post a message in the `#g_qe_engineering_productivity` channel and/or create a `~"Engineering Productivity"` `~"ep::review apps"` `~bug` issue with a link to your merge request. Note that a deployment failure can reveal an actual problem introduced in your merge request (i.e. it isn't necessarily a transient failure)!
* If the `review-qa-smoke` job keeps failing (note that we already retry it twice), please check the job's logs: you may discover an actual problem introduced in your merge request. You can also download the artifacts to see screenshots of the page at the time the failures occurred. If you can't find the cause of the failure, or if it seems unrelated to your change, please post a message in the `#quality` channel and/or open a quality issue with a link to your merge request.
* The manual `review-stop` job can be used to stop a Review App manually. GitLab also starts it once a merge request's branch is deleted after being merged.
* The Kubernetes cluster is connected to the `gitlab` projects using [GitLab's Kubernetes integration](../../user/project/clusters/index.html). This is what allows a link to the Review App directly from the merge request widget.

### Auto-stopping of Review Apps

Review Apps are automatically stopped 2 days after the last deployment, thanks to the [Environment auto-stop](../../ci/environments/index.html#environments-auto-stop) feature.

If you need your Review App to stay up for a longer time, you can [pin its environment](../../ci/environments/index.html#auto-stop-example) or retry the `review-deploy` job to update the “latest deployed at” time.

The `review-cleanup` job that automatically runs in scheduled pipelines (and is manual in merge requests) stops stale Review Apps after 5 days, deletes their environment after 6 days, and cleans up any dangling Helm releases and Kubernetes resources after 7 days.

The `review-gcp-cleanup` job that automatically runs in scheduled pipelines (and is manual in merge requests) removes any dangling GCP network resources that were not removed along with the Kubernetes resources.

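
The stop and cleanup timeline above can be summarized as a small lookup. The sketch below is illustrative only: the function name `review_app_lifecycle` and its output wording are hypothetical, and it assumes that "after N days" means "N or more whole days since the last deployment" and that the Review App was neither pinned nor redeployed in the meantime.

```shell
# Hypothetical helper (not part of GitLab's scripts): map the number of whole
# days since the last deployment to the lifecycle step described in the text.
# Assumption: "after N days" is read as "N days or more".
review_app_lifecycle() {
  days="$1"
  if [ "$days" -ge 7 ]; then
    echo "helm releases and kubernetes resources cleaned up"
  elif [ "$days" -ge 6 ]; then
    echo "environment deleted"
  elif [ "$days" -ge 5 ]; then
    echo "stopped by review-cleanup"
  elif [ "$days" -ge 2 ]; then
    echo "auto-stopped"
  else
    echo "running"
  fi
}

review_app_lifecycle 3   # prints: auto-stopped
```

Retrying the `review-deploy` job resets the clock, so in terms of this sketch the day count goes back to 0.
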
## QA runs

On every [pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730), in the `qa` stage (which comes after the `review` stage), the `review-qa-smoke` job is started automatically and runs the QA smoke suite.

You can also start `review-qa-all` manually: it runs the full QA suite.

## Performance Metrics

On every [pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730), in the `qa` stage, the `review-performance` job is started automatically: this job runs basic browser performance testing using a [Sitespeed.io container](../../user/project/merge_requests/browser_performance_testing.html).

## Cluster configuration

### Node pools

The `review-apps` cluster is currently set up with the following node pool:

* `e2-highcpu-16` (16 vCPU, 16 GB memory) pre-emptible nodes with autoscaling

### Helm

The Helm version used is defined in the [`registry.gitlab.com/gitlab-org/gitlab-build-images:gitlab-helm3-kubectl1.14` image](https://gitlab.com/gitlab-org/gitlab-build-images/-/blob/master/Dockerfile.gitlab-helm3-kubectl1.14#L7) used by the `review-deploy` and `review-stop` jobs.

## How to

### Get access to the GCP Review Apps cluster

You need to [open an access request (internal link)](https://gitlab.com/gitlab-com/access-requests/-/issues/new) for the `gcp-review-apps-sg` GCP group. In order to join the group, you must specify the desired GCP role in your access request. The role is what grants you the specific permissions needed to interact with Review App containers.

Here are some permissions you may want to have, and the roles that grant them:

* `container.pods.getLogs` - Required to [retrieve pod logs](#dig-into-a-pods-logs). Granted by [Viewer (`roles/viewer`)](https://cloud.google.com/iam/docs/understanding-roles#kubernetes-engine-roles).
* `container.pods.exec` - Required to [run a Rails console](#run-a-rails-console). Granted by [Kubernetes Engine Developer (`roles/container.developer`)](https://cloud.google.com/iam/docs/understanding-roles#kubernetes-engine-roles).

### Log into my Review App

The default username is `root`, and its password can be found in the 1Password secure note named `gitlab-{ce,ee} Review App's root password`.

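
Once the access request for the cluster has been granted, a local `kubectl` can be pointed at the cluster and the two permissions listed above can be checked. The following is a minimal sketch, not an official procedure. The cluster name, zone and project are taken from the cluster console URL in this document, and the `review-apps` namespace from the `kubectl exec` example. The `run` helper is hypothetical and only prints each command unless `RUN=1` is set. Mapping `container.pods.getLogs` and `container.pods.exec` to the `pods/log` and `pods/exec` subresources is an assumption.

```shell
# Values taken from the cluster console URL referenced in this document.
CLUSTER="review-apps"
ZONE="us-central1-b"
PROJECT="gitlab-review-apps"
NAMESPACE="review-apps"

# Hypothetical dry-run wrapper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Fetch kubeconfig credentials for the Review Apps cluster.
run gcloud container clusters get-credentials "$CLUSTER" --zone "$ZONE" --project "$PROJECT"

# Assumed equivalents of container.pods.getLogs and container.pods.exec.
run kubectl auth can-i get pods --subresource=log --namespace "$NAMESPACE"
run kubectl auth can-i create pods --subresource=exec --namespace "$NAMESPACE"
```

With `RUN=1` and the `gcloud` and `kubectl` CLIs installed, the last two commands answer `yes` or `no` for the current user.
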
### Enable a feature flag for my Review App

1. Open your Review App and log in as documented above.
2. Create a personal access token.
3. Enable the feature flag using the [Feature flag API](../../api/features.html).

### Find my Review App slug

1. Open the `review-deploy` job.
2. Look for `Checking for previous deployment of review-*`.
3. For instance, for `Checking for previous deployment of review-qa-raise-e-12chm0`, your Review App slug would be `review-qa-raise-e-12chm0`.

### Run a Rails console

1. Make sure you [have access to the cluster](#get-access-to-the-gcp-review-apps-cluster) and the `container.pods.exec` permission first.
2. [Filter Workloads by your Review App slug](https://console.cloud.google.com/kubernetes/workload?project=gitlab-review-apps), e.g. `review-qa-raise-e-12chm0`.
3. Find and open the `task-runner` Deployment, e.g. `review-qa-raise-e-12chm0-task-runner`.
4. Click the Pod in the "Managed pods" section, e.g. `review-qa-raise-e-12chm0-task-runner-d5455cc8-2lsvz`.
5. Click the `KUBECTL` dropdown, then `Exec` -> `task-runner`.
6. In the default command, replace `-c task-runner -- ls` with `-it -- gitlab-rails console`, or
   * Run `kubectl exec --namespace review-apps review-qa-raise-e-12chm0-task-runner-d5455cc8-2lsvz -it -- gitlab-rails console` and
   * Replace `review-qa-raise-e-12chm0-task-runner-d5455cc8-2lsvz` with your Pod's name.

### Dig into a Pod’s logs

1. Make sure you [have access to the cluster](#get-access-to-the-gcp-review-apps-cluster) and the `container.pods.getLogs` permission first.
2. [Filter Workloads by your Review App slug](https://console.cloud.google.com/kubernetes/workload?project=gitlab-review-apps), e.g. `review-qa-raise-e-12chm0`.
3. Find and open the `migrations` Deployment, e.g. `review-qa-raise-e-12chm0-migrations.1`.
4. Click the Pod in the "Managed pods" section, e.g. `review-qa-raise-e-12chm0-migrations.1-nqwtx`.
5. Click the `Container logs` link.

## Diagnosing unhealthy Review App releases

If [Review App Stability](https://app.periscopedata.com/app/gitlab/496118/Engineering-Productivity-Sandbox?widget=6690556&udv=785399) dips, this may be a signal that the `review-apps-ce/ee` cluster is unhealthy. Leading indicators may be health check failures that lead to restarts, or a majority of Review App deployments failing.

The [Review Apps Overview dashboard](https://console.cloud.google.com/monitoring/classic/dashboards/6798952013815386466?project=gitlab-review-apps&timeDomain=1d) helps identify load spikes on the cluster, and whether individual nodes are problematic or the entire cluster is trending towards unhealthy.

### Release failed with `ImagePullBackOff`

**Potential cause:**

If you see an `ImagePullBackOff` status, check for a missing Docker image.

**Where to look for further debugging:**

To check that the Docker images were created, run the following Docker command:

```
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect repository:tag
```

The output of this command indicates whether the Docker image exists. For example:

```
DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect registry.gitlab.com/gitlab-org/build/cng-mirror/gitlab-rails-ee:39467-allow-a-release-s-associated-milestones-to-be-edited-thro
```

If the Docker image does not exist:

* Verify that the `image.repository` and `image.tag` options in the `helm upgrade --install` command match the repository names used by the CNG-mirror pipeline.
* Look further in the corresponding downstream CNG-mirror pipeline, in the `review-build-cng` job.

### Node count is always increasing (i.e. never stabilizing or decreasing)

**Potential cause:**

This could be a sign that the `review-cleanup` job is failing to clean up stale Review Apps and Kubernetes resources.

**Where to look for further debugging:**

Look at the latest `review-cleanup` job log and check for any unexpected failure.

### p99 CPU utilization is at 100% for most of the nodes and/or many components

**Potential cause:**

This could be a sign that Helm is failing to deploy Review Apps. When Helm has a lot of `FAILED` releases, CPU utilization seems to increase, probably because Helm or Kubernetes is trying to recreate the components.

**Where to look for further debugging:**

Look at a recent `review-deploy` job log.

**Useful commands:**

```
# Identify if node spikes are common or load on specific nodes which may get rebalanced by the Kubernetes scheduler
kubectl top nodes | sort --key 3 --numeric

# Identify pods under heavy CPU load
kubectl top pods | sort --key 2 --numeric
```

### The `logging/user/events/FailedMount` chart is going up

**Potential cause:**

This could be a sign that there are too many stale secrets and/or config maps.

**Where to look for further debugging:**

Look at [the list of Configurations](https://console.cloud.google.com/kubernetes/config?project=gitlab-review-apps) or run `kubectl get secret,cm --sort-by='{.metadata.creationTimestamp}' | grep 'review-'`.

Any secret or config map older than 5 days is suspect and should be deleted.

**Useful commands:**

```
# List secrets and config maps ordered by created date
kubectl get secret,cm --sort-by='{.metadata.creationTimestamp}' | grep 'review-'

# Delete all secrets that are 5 to 9 days old
kubectl get secret --sort-by='{.metadata.creationTimestamp}' | grep '^review-' | grep '[5-9]d$' | cut -d' ' -f1 | xargs kubectl delete secret

# Delete all secrets that are 10 to 99 days old
kubectl get secret --sort-by='{.metadata.creationTimestamp}' | grep '^review-' | grep '[1-9][0-9]d$' | cut -d' ' -f1 | xargs kubectl delete secret

# Delete all config maps that are 5 to 9 days old
kubectl get cm --sort-by='{.metadata.creationTimestamp}' | grep 'review-' | grep -v 'dns-gitlab-review-app' | grep '[5-9]d$' | cut -d' ' -f1 | xargs kubectl delete cm

# Delete all config maps that are 10 to 99 days old
kubectl get cm --sort-by='{.metadata.creationTimestamp}' | grep 'review-' | grep -v 'dns-gitlab-review-app' | grep '[1-9][0-9]d$' | cut -d' ' -f1 | xargs kubectl delete cm
```

### Using K9s

[K9s](https://github.com/derailed/k9s) is a powerful command line dashboard that lets you filter by labels. This can help identify trends with apps exceeding the [review-app resource requests](https://gitlab.com/gitlab-org/gitlab/-/blob/master/scripts/review_apps/base-config.yaml). Kubernetes schedules pods to nodes based on resource requests and allows CPU usage up to the limits.

* In K9s you can sort or add filters by typing the `/` character
  * `-lrelease=<review-app-slug>` - filters down to all pods for a release. This helps determine what is having issues in a single deployment
  * `-lapp=<app>` - filters down to all pods for a specific app. This helps determine resource usage by app.
* You can scroll to a Kubernetes resource and hit `d` (describe), `s` (shell), `l` (logs) for a deeper inspection

[![K9s](https://img.kancloud.cn/f3/08/f3086f4caac9ff5d177ecff25b046346_2878x742.png)](img/k9s.png)

### Troubleshoot a pending `dns-gitlab-review-app-external-dns` Deployment

#### Finding the problem

[In the past](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/62834), the `dns-gitlab-review-app-external-dns` Deployment ended up in a pending state, effectively preventing all Review Apps from getting a DNS record assigned and making them unreachable via their domain name.

This in turn prevented other components of the Review App from starting properly (e.g. `gitlab-runner`).

After some digging, we found that new mounts were failing when being performed with transient scopes (e.g. pods) of `systemd-mount`:

```
MountVolume.SetUp failed for volume "dns-gitlab-review-app-external-dns-token-sj5jm" : mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/06add1c3-87b4-11e9-80a9-42010a800107/volumes/kubernetes.io~secret/dns-gitlab-review-app-external-dns-token-sj5jm --scope -- mount -t tmpfs tmpfs /var/lib/kubelet/pods/06add1c3-87b4-11e9-80a9-42010a800107/volumes/kubernetes.io~secret/dns-gitlab-review-app-external-dns-token-sj5jm
Output: Failed to start transient scope unit: Connection timed out
```

This probably happened because the GitLab chart creates 67 resources, leading to a lot of mount points being created on the underlying GCP node.

The [underlying issue seems to be a `systemd` bug](https://github.com/kubernetes/kubernetes/issues/57345#issuecomment-359068048) that was fixed in `systemd` `v237`. Unfortunately, our GCP nodes are currently using `v232`.

For the record, the debugging steps used to find this issue were:

1. Switch the kubectl context to review-apps-ce (we recommend using [kubectx](https://github.com/ahmetb/kubectx/))
2. `kubectl get pods | grep dns`
3. `kubectl describe pod <pod name>` and confirm the exact error message
4. Search the web for the exact error message, following the rabbit hole to [a relevant Kubernetes bug report](https://github.com/kubernetes/kubernetes/issues/57345)
5. Access the node over SSH via the GCP console (**Compute Engine > VM instances**, then click the "SSH" button for the node where the `dns-gitlab-review-app-external-dns` pod runs)
6. In the node: `systemctl --version` => `systemd 232`
7. Gather some more information:
   * `mount | grep kube | wc -l` => e.g. 290
   * `systemctl list-units --all | grep -i var-lib-kube | wc -l` => e.g. 142
8. Check how many pods are in a bad state:
   * Get all pods running on a given node: `kubectl get pods --field-selector=spec.nodeName=NODE_NAME`
   * Get all the `Running` pods on a given node: `kubectl get pods --field-selector=spec.nodeName=NODE_NAME | grep Running`
   * Get all the pods in a bad state on a given node: `kubectl get pods --field-selector=spec.nodeName=NODE_NAME | grep -v 'Running' | grep -v 'Completed'`

#### Solving the problem

To resolve the problem, we needed to (forcibly) drain some nodes:

1. Try a normal drain on the node where the `dns-gitlab-review-app-external-dns` pod runs, so that Kubernetes automatically moves it to another node: `kubectl drain NODE_NAME`
2. If that doesn't work, you can also forcibly "drain" the node by removing all its pods: `kubectl delete pods --field-selector=spec.nodeName=NODE_NAME`
3. In the node:
   * Run `systemctl daemon-reload` to remove the dead/inactive units
   * If that doesn't solve the problem, perform a hard reboot: `sudo systemctl reboot`
4. Uncordon any cordoned nodes: `kubectl uncordon NODE_NAME`

In the meantime, since most Review Apps were in a broken state, we deleted them to clean up the list of non-`Running` pods. The following command deletes Review Apps based on their last deployment date (the current date was June 6th at the time):

```
helm ls -d | grep "Jun 4" | cut -f1 | xargs helm delete --purge
```

#### Mitigation steps taken to avoid this problem in the future

We've created a new node pool with smaller machines, so that a machine is less likely to hit the "too many mount points" problem in the future.

## Frequently Asked Questions

**Isn't it too much to trigger CNG image builds on every test run? This creates thousands of unused Docker images.**

> We have to start somewhere and improve later. Also, we're using the CNG-mirror project to store these Docker images so that we can just wipe out the registry at some point and use a new, empty one.

**How do we secure this from abuse? Apps are open to the world, so we need to find a way to limit them to only us.**

> This isn’t enabled for forks.

## Other resources

* [Review Apps integration for CE/EE (presentation)](https://docs.google.com/presentation/d/1QPLr6FO4LduROU8pQIPkX1yfGvD13GEJIBOenqoKxR8/edit?usp=sharing)
* [Stability issues](https://gitlab.com/gitlab-org/quality/team-tasks/-/issues/212)

### Helpful command line tools

* [K9s](https://github.com/derailed/k9s) - provides a CLI dashboard across pods and filtering by labels
* [Stern](https://github.com/wercker/stern) - provides cross-pod log tailing based on label/field selectors

* * *

[Return to Testing documentation](index.html)