Understanding EXPLAIN plans · Gitlab 中文文檔

# Understanding EXPLAIN plans > 原文：[https://docs.gitlab.com/ee/development/understanding_explain_plans.html](https://docs.gitlab.com/ee/development/understanding_explain_plans.html) * [Nodes](#nodes) * [Node statistics](#node-statistics) * [Node types](#node-types) * [Seq Scan](#seq-scan) * [Index Only Scan](#index-only-scan) * [Index Scan](#index-scan) * [Bitmap Index Scan and Bitmap Heap scan](#bitmap-index-scan-and-bitmap-heap-scan) * [Limit](#limit) * [Sort](#sort) * [Nested Loop](#nested-loop) * [Optimising queries](#optimising-queries) * [Queries that can’t be optimised](#queries-that-cant-be-optimised) * [Cardinality and selectivity](#cardinality-and-selectivity) * [Rewriting queries](#rewriting-queries) * [What makes a bad plan](#what-makes-a-bad-plan) * [Producing query plans](#producing-query-plans) * [Rails console](#rails-console) * [ChatOps](#chatops) * [`#database-lab`](#database-lab) * [Tips & Tricks](#tips--tricks) * [Further reading](#further-reading) # Understanding EXPLAIN plans[](#understanding-explain-plans "Permalink") PostgreSQL 允許您使用`EXPLAIN`命令獲得查詢計劃. 嘗試確定查詢的執行方式時，此命令非常有用. 您可以在 SQL 查詢中直接使用此命令，只要查詢以它開頭即可： ``` EXPLAIN SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20); ``` 在 GitLab.com 上運行時，將顯示以下輸出： ``` Aggregate (cost=922411.76..922411.77 rows=1 width=8) -> Seq Scan on projects (cost=0.00..908044.47 rows=5746914 width=0) Filter: (visibility_level = ANY ('{0,20}'::integer[])) ``` 當*僅*使用`EXPLAIN` ，PostgreSQL 實際上不會執行我們的查詢，而是會根據可用的統計信息生成一個*估計的*執行計劃. 這意味著實際計劃可能相差很大. 幸運的是，PostgreSQL 還為我們提供了執行查詢的選項. 為此，我們需要使用`EXPLAIN ANALYZE`而不是`EXPLAIN` ： ``` EXPLAIN ANALYZE SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20); ``` 這將產生： ``` Aggregate (cost=922420.60..922420.61 rows=1 width=8) (actual time=3428.535..3428.535 rows=1 loops=1) -> Seq Scan on projects (cost=0.00..908053.18 rows=5746969 width=0) (actual time=0.041..2987.606 rows=5746940 loops=1) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 65677 Planning time: 2.861 ms Execution time: 3428.596 ms ``` 如我們所見，該計劃是完全不同的，并且包含許多數據. 讓我們逐步討論一下. 由于`EXPLAIN ANALYZE`執行查詢，因此在使用會寫入數據或可能會超時的查詢時應`EXPLAIN ANALYZE`小心. 如果查詢修改了數據，請考慮將其包裝在自動回滾的事務中，如下所示： ``` BEGIN; EXPLAIN ANALYZE DELETE FROM users WHERE id = 1; ROLLBACK; ``` `EXPLAIN`命令還包含其他選項，例如`BUFFERS` ： ``` EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20); ``` 然后將產生： ``` Aggregate (cost=922420.60..922420.61 rows=1 width=8) (actual time=3428.535..3428.535 rows=1 loops=1) Buffers: shared hit=208846 -> Seq Scan on projects (cost=0.00..908053.18 rows=5746969 width=0) (actual time=0.041..2987.606 rows=5746940 loops=1) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 65677 Buffers: shared hit=208846 Planning time: 2.861 ms Execution time: 3428.596 ms ``` 有關更多信息，請參閱官方的[`EXPLAIN`文檔](https://s0www0postgresql0org.icopy.site/docs/current/sql-explain.html)并[使用`EXPLAIN`指南](https://s0www0postgresql0org.icopy.site/docs/current/using-explain.html) . ## Nodes[](#nodes "Permalink") 每個查詢計劃都由節點組成. 節點可以嵌套，并且可以由內而外執行. 這意味著最內部的節點在外部節點之前執行. 最好將其視為嵌套函數調用，并在展開時返回其結果. 例如，一個計劃以`Aggregate`開頭，然后是`Nested Loop` ，然后是`Index Only scan`可以認為是以下 Ruby 代碼： ``` aggregate( nested_loop( index_only_scan() index_only_scan() ) ) ``` 節點使用`->`表示，后跟所采用的節點類型. 例如： ``` Aggregate (cost=922411.76..922411.77 rows=1 width=8) -> Seq Scan on projects (cost=0.00..908044.47 rows=5746914 width=0) Filter: (visibility_level = ANY ('{0,20}'::integer[])) ``` 這里執行的第一個節點是`Seq scan on projects` . `Filter:`是應用于節點結果的附加過濾器. 過濾器與 Ruby 的`Array#select`非常相似：它接受輸入行，應用過濾器，并生成新的行列表. 節點完成后，我們將在其上方執行`Aggregate` . 嵌套節點將如下所示： ``` Aggregate (cost=176.97..176.98 rows=1 width=8) (actual time=0.252..0.252 rows=1 loops=1) Buffers: shared hit=155 -> Nested Loop (cost=0.86..176.75 rows=87 width=0) (actual time=0.035..0.249 rows=36 loops=1) Buffers: shared hit=155 -> Index Only Scan using users_pkey on users users_1 (cost=0.43..4.95 rows=87 width=4) (actual time=0.029..0.123 rows=36 loops=1) Index Cond: (id < 100) Heap Fetches: 0 -> Index Only Scan using users_pkey on users (cost=0.43..1.96 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=36) Index Cond: (id = users_1.id) Heap Fetches: 0 Planning time: 2.585 ms Execution time: 0.310 ms ``` 在這里，我們首先執行兩次單獨的"僅索引"掃描，然后對這兩次掃描的結果執行"嵌套循環". ## Node statistics[](#node-statistics "Permalink") 計劃中的每個節點都有一組關聯的統計信息，例如成本，產生的行數，執行的循環數等等. 例如： ``` Seq Scan on projects (cost=0.00..908044.47 rows=5746914 width=0) ``` 在這里，我們可以看到我們的成本范圍為`0.00..908044.47` （稍后我們將對此進行介紹），并且我們估計（因為我們使用的是`EXPLAIN`而不是`EXPLAIN ANALYZE` ），因此此節點將產生總計 5,746,914 行. `width`統計信息描述了每行的估計寬度，以字節為單位. `costs`字段指定節點的價格. 成本以查詢計劃者的成本參數確定的任意單位衡量. 影響成本的因素取決于各種設置，例如`seq_page_cost` ， `cpu_tuple_cost`和其他各種設置. 費用字段的格式如下： ``` STARTUP COST..TOTAL COST ``` The startup cost states how expensive it was to start the node, with the total cost describing how expensive the entire node was. In general: the greater the values, the more expensive the node. 使用`EXPLAIN ANALYZE` ，這些統計信息還將包括實際花費的時間（以毫秒為單位）以及其他運行時統計信息（例如，產生的行的實際數量）： ``` Seq Scan on projects (cost=0.00..908053.18 rows=5746969 width=0) (actual time=0.041..2987.606 rows=5746940 loops=1) ``` 在這里我們可以看到估計返回了 5,746,969 行，但實際上，我們返回了 5,746,940 行. 我們還可以看到， *僅*此順序掃描就花費了 2.98 秒. 使用`EXPLAIN (ANALYZE, BUFFERS)`還將為我們提供有關由過濾器刪除的行數，使用的緩沖區數等信息. 例如： ``` Seq Scan on projects (cost=0.00..908053.18 rows=5746969 width=0) (actual time=0.041..2987.606 rows=5746940 loops=1) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 65677 Buffers: shared hit=208846 ``` 在這里，我們可以看到我們的過濾器必須刪除 65,677 行，并使用 208,846 個緩沖區. PostgreSQL 中的每個緩沖區都是 8 KB（8192 字節），這意味著我們上面的節點使用*1.6 GB 的緩沖區* . 好多啊！ ## Node types[](#node-types "Permalink") 有很多不同類型的節點，因此我們在這里僅介紹一些較常見的節點. 可以在[PostgreSQL 源文件`plannodes.h`](https://gitlab.com/postgres/postgres/blob/master/src/include/nodes/plannodes.h)找到所有可用節點及其描述的完整列表. ### Seq Scan[](#seq-scan "Permalink") 對數據庫表（的一部分）進行順序掃描. 這類似于使用`Array#each` ，但是在數據庫表上. 檢索大量行時，順序掃描可能會非常慢，因此，對于大型表，最好避免使用這些掃描. ### Index Only Scan[](#index-only-scan "Permalink") 對不需要從表中獲取任何內容的索引進行的掃描. 在某些情況下，僅索引掃描仍可能從表中獲取數據，在這種情況下，節點將包含" `Heap Fetches:`統計信息. ### Index Scan[](#index-scan "Permalink") 對索引的掃描，該索引需要從表中檢索一些數據. ### Bitmap Index Scan and Bitmap Heap scan[](#bitmap-index-scan-and-bitmap-heap-scan "Permalink") 位圖掃描介于順序掃描和索引掃描之間. 當我們從索引掃描中讀取太多數據但執行順序掃描時讀取的數據太少時，通常會使用它們. 位圖掃描使用所謂的[位圖索引](https://en.wikipedia.org/wiki/Bitmap_index)來執行其工作. [PostgreSQL](https://gitlab.com/postgres/postgres/blob/REL_11_STABLE/src/include/nodes/plannodes.h#L441)的[源代碼在位](https://gitlab.com/postgres/postgres/blob/REL_11_STABLE/src/include/nodes/plannodes.h#L441)圖掃描中指出以下內容： > 位圖索引掃描提供潛在元組位置的位圖；它不訪問堆本身. 該位圖可能由祖先位圖堆掃描節點使用，可能是在經過中間位圖和和/或位圖或節點之后將其與其他位圖索引掃描的結果組合在一起的. ### Limit[](#limit "Permalink") 在輸入行上應用`LIMIT` . ### Sort[](#sort "Permalink") 使用`ORDER BY`語句對輸入行進行排序. ### Nested Loop[](#nested-loop "Permalink") 嵌套循環將針對其前面的節點產生的每一行執行其子節點. 例如： ``` -> Nested Loop (cost=0.86..176.75 rows=87 width=0) (actual time=0.035..0.249 rows=36 loops=1) Buffers: shared hit=155 -> Index Only Scan using users_pkey on users users_1 (cost=0.43..4.95 rows=87 width=4) (actual time=0.029..0.123 rows=36 loops=1) Index Cond: (id < 100) Heap Fetches: 0 -> Index Only Scan using users_pkey on users (cost=0.43..1.96 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=36) Index Cond: (id = users_1.id) Heap Fetches: 0 ``` 在這里，第一個子節點（ `Index Only Scan using users_pkey on users users_1` ）產生 36 行，并執行一次（ `rows=36 loops=1` ）. 下一個節點產生 1 行（ `rows=1` ），但重復 36 次（ `loops=36` ）. 這是因為前一個節點產生了 36 行. 這意味著，如果各個子節點繼續產生許多行，則嵌套循環會迅速降低查詢速度. ## Optimising queries[](#optimising-queries "Permalink") 順便說一句，讓我們看看如何優化查詢. 讓我們以以下查詢為例： ``` SELECT COUNT(*) FROM users WHERE twitter != ''; ``` 該查詢僅計算設置了 Twitter 個人資料的用戶數. 讓我們使用`EXPLAIN (ANALYZE, BUFFERS)`運行它： ``` EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM users WHERE twitter != ''; ``` 這將產生以下計劃： ``` Aggregate (cost=845110.21..845110.22 rows=1 width=8) (actual time=1271.157..1271.158 rows=1 loops=1) Buffers: shared hit=202662 -> Seq Scan on users (cost=0.00..844969.99 rows=56087 width=0) (actual time=0.019..1265.883 rows=51833 loops=1) Filter: ((twitter)::text <> ''::text) Rows Removed by Filter: 2487813 Buffers: shared hit=202662 Planning time: 0.390 ms Execution time: 1271.180 ms ``` 從該查詢計劃中，我們可以看到以下內容： 1. 我們需要對`users`表執行順序掃描. 2. 此順序掃描使用`Filter`過濾掉 2,487,813 行. 3. 我們使用 202,622 個緩沖區，相當于 1.58 GB 的內存. 4. 完成所有這些操作需要 1.2 秒. 考慮到我們只是在計算用戶，這是相當昂貴的！在開始進行任何更改之前，讓我們看一下`users`表上是否有可以使用的現有索引. 我們可以通過在`psql`控制臺中運行`\d users` ，然后向下滾動至`Indexes:`部分來獲取此信息： ``` Indexes: "users_pkey" PRIMARY KEY, btree (id) "index_users_on_confirmation_token" UNIQUE, btree (confirmation_token) "index_users_on_email" UNIQUE, btree (email) "index_users_on_reset_password_token" UNIQUE, btree (reset_password_token) "index_users_on_static_object_token" UNIQUE, btree (static_object_token) "index_users_on_unlock_token" UNIQUE, btree (unlock_token) "index_on_users_name_lower" btree (lower(name::text)) "index_users_on_accepted_term_id" btree (accepted_term_id) "index_users_on_admin" btree (admin) "index_users_on_created_at" btree (created_at) "index_users_on_email_trigram" gin (email gin_trgm_ops) "index_users_on_feed_token" btree (feed_token) "index_users_on_group_view" btree (group_view) "index_users_on_incoming_email_token" btree (incoming_email_token) "index_users_on_managing_group_id" btree (managing_group_id) "index_users_on_name" btree (name) "index_users_on_name_trigram" gin (name gin_trgm_ops) "index_users_on_public_email" btree (public_email) WHERE public_email::text <> ''::text "index_users_on_state" btree (state) "index_users_on_state_and_user_type" btree (state, user_type) "index_users_on_unconfirmed_email" btree (unconfirmed_email) WHERE unconfirmed_email IS NOT NULL "index_users_on_user_type" btree (user_type) "index_users_on_username" btree (username) "index_users_on_username_trigram" gin (username gin_trgm_ops) "tmp_idx_on_user_id_where_bio_is_filled" btree (id) WHERE COALESCE(bio, ''::character varying)::text IS DISTINCT FROM ''::text ``` 在這里，我們可以看到`twitter`列上沒有索引，這意味著 PostgreSQL 在這種情況下必須執行順序掃描. 讓我們嘗試通過添加以下索引來解決此問題： ``` CREATE INDEX CONCURRENTLY twitter_test ON users (twitter); ``` 如果現在使用`EXPLAIN (ANALYZE, BUFFERS)`重新運行查詢`EXPLAIN (ANALYZE, BUFFERS)`得到以下計劃： ``` Aggregate (cost=61002.82..61002.83 rows=1 width=8) (actual time=297.311..297.312 rows=1 loops=1) Buffers: shared hit=51854 dirtied=19 -> Index Only Scan using twitter_test on users (cost=0.43..60873.13 rows=51877 width=0) (actual time=279.184..293.532 rows=51833 loops=1) Filter: ((twitter)::text <> ''::text) Rows Removed by Filter: 2487830 Heap Fetches: 26037 Buffers: shared hit=51854 dirtied=19 Planning time: 0.191 ms Execution time: 297.334 ms ``` 現在獲取數據只需不到 300 毫秒，而不是 1.2 秒. 但是，我們仍然使用 51,854 個緩沖區，這大約是 400 MB 的內存. 對于這種簡單的查詢，300 毫秒的速度也很慢. 要了解為什么此查詢仍然昂貴，讓我們看一下以下內容： ``` Index Only Scan using twitter_test on users (cost=0.43..60873.13 rows=51877 width=0) (actual time=279.184..293.532 rows=51833 loops=1) Filter: ((twitter)::text <> ''::text) Rows Removed by Filter: 2487830 ``` 我們從僅對索引進行索引掃描開始，但是我們仍然以某種方式應用了`Filter` ，該過濾`Filter`可過濾掉 2,487,830 行. 這是為什么？好，讓我們看一下如何創建索引： ``` CREATE INDEX CONCURRENTLY twitter_test ON users (twitter); ``` 我們只是簡單地告訴 PostgreSQL 索引`twitter`列的所有可能值，甚至是空字符串. 我們的查詢反過來使用`WHERE twitter != ''` . 這意味著索引確實可以改善事情，因為我們不需要進行順序掃描，但是我們仍然可能會遇到空字符串. 這意味著 PostgreSQL *必須*對索引結果應用過濾器以擺脫這些值. 幸運的是，我們可以使用"部分索引"來進一步改善它. 部分索引是在索引數據時具有`WHERE`條件的索引. 例如： ``` CREATE INDEX CONCURRENTLY some_index ON users (email) WHERE id < 100 ``` 該索引僅索引與`WHERE id < 100`相匹配的行的`email`值. 我們可以使用部分索引將我們的 Twitter 索引更改為以下內容： ``` CREATE INDEX CONCURRENTLY twitter_test ON users (twitter) WHERE twitter != ''; ``` 創建后，如果再次運行查詢，將得到以下計劃： ``` Aggregate (cost=1608.26..1608.27 rows=1 width=8) (actual time=19.821..19.821 rows=1 loops=1) Buffers: shared hit=44036 -> Index Only Scan using twitter_test on users (cost=0.41..1479.71 rows=51420 width=0) (actual time=0.023..15.514 rows=51833 loops=1) Heap Fetches: 1208 Buffers: shared hit=44036 Planning time: 0.123 ms Execution time: 19.848 ms ``` *好多了* ！現在僅需要 20 毫秒即可獲取數據，并且我們僅使用約 344 MB 的緩沖區（而不是原始的 1.58 GB）. 之所以可行，是因為現在 PostgreSQL 不再需要應用`Filter` ，因為索引僅包含不為空的`twitter`值. 請記住，每次您要優化查詢時，不應該只添加部分索引. 每個索引都必須為每次寫入更新，并且它們可能需要相當多的空間，具體取決于索引數據的數量. 結果，首先檢查是否存在可以重用的現有索引. 如果沒有，請檢查是否可以略微更改現有查詢以適合現有查詢和新查詢. 僅當現有索引無法以任何方式使用時，才添加新索引. ## Queries that can’t be optimised[](#queries-that-cant-be-optimised "Permalink") 既然我們已經了解了如何優化查詢，讓我們看一下可能無法優化的另一個查詢： ``` EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20); ``` `EXPLAIN (ANALYZE, BUFFERS)`的輸出如下： ``` Aggregate (cost=922420.60..922420.61 rows=1 width=8) (actual time=3428.535..3428.535 rows=1 loops=1) Buffers: shared hit=208846 -> Seq Scan on projects (cost=0.00..908053.18 rows=5746969 width=0) (actual time=0.041..2987.606 rows=5746940 loops=1) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 65677 Buffers: shared hit=208846 Planning time: 2.861 ms Execution time: 3428.596 ms ``` 查看輸出，我們看到以下過濾器： ``` Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 65677 ``` 查看過濾器刪除的行數，我們可能會想在`projects.visibility_level`上添加索引，以某種方式將此順序掃描+過濾器轉換為僅索引掃描. 不幸的是，這樣做不可能改善任何事情. 與某些人的看法相反，存在索引*并不能保證* PostgreSQL 會實際使用它. 例如，在執行`SELECT * FROM projects`時，僅掃描整個表而不是使用索引然后從表中獲取數據要便宜得多. 在這種情況下，PostgreSQL 可能會決定不使用索引. 其次，讓我們考慮一下查詢的作用：它使所有項目的可見性級別為 0 或 20.在上面的計劃中，我們可以看到生成了很多行（5,745,940），但相對于總行數是多少？通過運行以下查詢來找出答案： ``` SELECT visibility_level, count(*) AS amount FROM projects GROUP BY visibility_level ORDER BY visibility_level ASC; ``` 對于 GitLab.com，這將產生： ``` visibility_level | amount ------------------+--------- 0 | 5071325 10 | 65678 20 | 674801 ``` 這里的項目總數為 5,811,804，其中 5,746,126 為 0 或 20 級.這是整個表的 98％！因此，無論我們做什么，此查詢都將檢索整個表的 98％. 由于大部分時間都花在做到這一點上，因此除了*完全不*運行查詢之外，我們幾乎沒有其他方法可以改進此查詢. 這里重要的是，盡管有些人可能建議您在看到順序掃描時立即直接添加索引，但*更重要的*是首先了解查詢的功能，檢索的數據量等等. 畢竟，您無法優化您不了解的內容. ### Cardinality and selectivity[](#cardinality-and-selectivity "Permalink") 早先我們看到查詢必須檢索表中 98％的行. 數據庫通常使用兩個術語：基數和選擇性. 基數是指表中特定列中唯一值的數量. 選擇性是相對于總行數，操作（例如索引掃描或過濾器）產生的唯一值的數量. 選擇性越高，PostgreSQL 使用索引的可能性就越大. In the above example, there are only 3 unique values: 0, 10, and 20\. This means the cardinality is 3\. The selectivity in turn is also very low: 0.0000003% (2 / 5,811,804), because our `Filter` only filters using two values (`0` and `20`). With such a low selectivity value it’s not surprising that PostgreSQL decides using an index is not worth it, because it would produce almost no unique rows. ## Rewriting queries[](#rewriting-queries "Permalink") 因此，上述查詢無法真正按原樣進行優化，或者至少沒有太多優化. 但是，如果我們稍微改變它的目的怎么辦？如果不是檢索`visibility_level` 0 或 20 的所有項目，而是檢索用戶與之交互的那些項目，該怎么辦？幸運的是，GitLab 對此有一個答案，它是一個名為`user_interacted_projects`的表. 該表具有以下架構： ``` Table "public.user_interacted_projects" Column | Type | Modifiers ------------+---------+----------- user_id | integer | not null project_id | integer | not null Indexes: "index_user_interacted_projects_on_project_id_and_user_id" UNIQUE, btree (project_id, user_id) "index_user_interacted_projects_on_user_id" btree (user_id) Foreign-key constraints: "fk_rails_0894651f08" FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE "fk_rails_722ceba4f7" FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE ``` 讓我們重寫查詢以將該表加入到我們的項目中，并獲取特定用戶的項目： ``` EXPLAIN ANALYZE SELECT COUNT(*) FROM projects INNER JOIN user_interacted_projects ON user_interacted_projects.project_id = projects.id WHERE projects.visibility_level IN (0, 20) AND user_interacted_projects.user_id = 1; ``` 我們在這里做的是以下幾點： 1. 獲得我們的項目. 2. INNER JOIN `user_interacted_projects` ，這意味著我們只剩下在`user_interacted_projects`中有對應行的`projects`中的行. 3. 將其限制為`visibility_level`為 0 或 20 的項目以及 ID 為 1 的用戶與之交互的項目. 如果運行此查詢，則會得到以下計劃： ``` Aggregate (cost=871.03..871.04 rows=1 width=8) (actual time=9.763..9.763 rows=1 loops=1) -> Nested Loop (cost=0.86..870.52 rows=203 width=0) (actual time=1.072..9.748 rows=143 loops=1) -> Index Scan using index_user_interacted_projects_on_user_id on user_interacted_projects (cost=0.43..160.71 rows=205 width=4) (actual time=0.939..2.508 rows=145 loops=1) Index Cond: (user_id = 1) -> Index Scan using projects_pkey on projects (cost=0.43..3.45 rows=1 width=4) (actual time=0.049..0.050 rows=1 loops=145) Index Cond: (id = user_interacted_projects.project_id) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 0 Planning time: 2.614 ms Execution time: 9.809 ms ``` 在這里，僅花了不到 10 毫秒的時間即可獲取數據. 我們還可以看到我們正在檢索的項目要少得多： ``` Index Scan using projects_pkey on projects (cost=0.43..3.45 rows=1 width=4) (actual time=0.049..0.050 rows=1 loops=145) Index Cond: (id = user_interacted_projects.project_id) Filter: (visibility_level = ANY ('{0,20}'::integer[])) Rows Removed by Filter: 0 ``` 在這里，我們看到執行了 145 個循環（ `loops=145` ），每個循環產生 1 行（ `rows=1` ）. 這比以前少了很多，我們的查詢執行得更好！如果我們看一下計劃，我們還會發現我們的成本非常低： ``` Index Scan using projects_pkey on projects (cost=0.43..3.45 rows=1 width=4) (actual time=0.049..0.050 rows=1 loops=145) ``` 在這里，我們的成本僅為 3.45，而僅需 0.050 毫秒即可完成. 下一次索引掃描會貴一些： ``` Index Scan using index_user_interacted_projects_on_user_id on user_interacted_projects (cost=0.43..160.71 rows=205 width=4) (actual time=0.939..2.508 rows=145 loops=1) ``` 此處的成本為 160.71（ `cost=0.43..160.71` ），大約需要 2.5 毫秒（基于`actual time=....`的輸出`actual time=....` ）. 這里最昂貴的部分是對這兩個索引掃描的結果起作用的"嵌套循環"： ``` Nested Loop (cost=0.86..870.52 rows=203 width=0) (actual time=1.072..9.748 rows=143 loops=1) ``` 在這里，我們必須對 203 行（9.748 毫秒）執行 870.52 磁盤頁讀取，在單個循環中產生 143 行. 這里的主要要點是，有時您必須重寫（部分）查詢以使其更好. 有時，這意味著必須稍微更改功能以適應更好的性能. ## What makes a bad plan[](#what-makes-a-bad-plan "Permalink") 這是一個很難回答的問題，因為"壞"的定義與您要解決的問題有關. 但是，在大多數情況下最好避免某些模式，例如： * 大表上的順序掃描 * 刪除大量行的過濾器 * 執行某個步驟（例如索引掃描）需要*很多*緩沖區（例如，GitLab.com 大于 512 MB）. 作為一般準則，請針對以下查詢： 1. 不超過 10 毫秒. 我們每個請求在 SQL 中花費的目標時間約為 100 毫秒，因此每個查詢應盡可能快. 2. 相對于工作負載，不使用過多的緩沖區. 例如，檢索十行應該不需要 1 GB 的緩沖區. 3. 不花費大量時間執行磁盤 IO 操作. 必須啟用設置`track_io_timing` ，此數據才能包含在`EXPLAIN ANALYZE`的輸出中. 4. 在檢索行而不對其進行匯總時應用`LIMIT` ，例如`SELECT * FROM users` . 5. 不使用`Filter`過濾掉過多的行，尤其是在查詢不使用`LIMIT`限制返回的行數的情況下. 通常可以通過添加（部分）索引來刪除過濾器. 這些是*準則* ，不是硬性要求，因為不同的需求可能需要不同的查詢. 唯一的*規則*是，您*必須始終*使用`EXPLAIN (ANALYZE, BUFFERS)`和相關工具*來衡量*您的查詢（最好使用類似生產的數據庫）： * [`explain.depesz.com`](https://explain.depesz.com/). * [`explain.dalibo.com/`](https://explain.dalibo.com/). ## Producing query plans[](#producing-query-plans "Permalink") 有幾種獲取查詢計劃輸出的方法. 當然，您可以直接在`psql`控制臺中運行`EXPLAIN`查詢，或者可以遵循以下其他選項之一. ### Rails console[](#rails-console "Permalink") Using the [`activerecord-explain-analyze`](https://github.com/6/activerecord-explain-analyze) you can directly generate the query plan from the Rails console: ``` pry(main)> require 'activerecord-explain-analyze' => true pry(main)> Project.where('build_timeout > ?', 3600).explain(analyze: true) Project Load (1.9ms) SELECT "projects".* FROM "projects" WHERE (build_timeout > 3600) ? (pry):12 => EXPLAIN for: SELECT "projects".* FROM "projects" WHERE (build_timeout > 3600) Seq Scan on public.projects (cost=0.00..2.17 rows=1 width=742) (actual time=0.040..0.041 rows=0 loops=1) Output: id, name, path, description, created_at, updated_at, creator_id, namespace_id, ... Filter: (projects.build_timeout > 3600) Rows Removed by Filter: 14 Buffers: shared hit=2 Planning time: 0.411 ms Execution time: 0.113 ms ``` ### ChatOps[](#chatops "Permalink") [GitLab 員工還可以使用`/chatops` slash 命令在 Slack 中使用我們的 ChatOps 解決方案](chatops_on_gitlabcom.html) . 您可以使用 ChatOps 通過運行以下命令來獲取查詢計劃： ``` /chatops run explain SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20) ``` 還支持使用[https://explain.depesz.com/](https://explain.depesz.com/)可視化計劃： ``` /chatops run explain --visual SELECT COUNT(*) FROM projects WHERE visibility_level IN (0, 20) ``` 不需要查詢. 有關可用選項的更多信息，請運行： ``` /chatops run explain --help ``` ### `#database-lab`[](#database-lab "Permalink") GitLab 員工可以使用的另一種工具是由[Joe](https://gitlab.com/postgres-ai/joe)支持的聊天機器人，該機器人使用[Database Lab](https://gitlab.com/postgres-ai/database-lab)立即為開發人員提供他們自己的生產數據庫克隆. Joe 在 Slack 的[`#database-lab`](https://gitlab.slack.com/archives/CLJMDRD8C)頻道中可用. 與 ChatOps 不同，它為您提供了一種執行 DDL 語句（如創建索引和表）并獲取查詢計劃的方法，該查詢計劃不僅適用于`SELECT`而且適用于`UPDATE`和`DELETE` . 例如，為了測試新索引，您可以執行以下操作：創建索引： ``` exec CREATE INDEX index_projects_last_activity ON projects (last_activity_at) WHERE last_activity_at IS NOT NULL ``` 分析表以更新其統計信息： ``` exec ANALYZE projects ``` 獲取查詢計劃： ``` explain SELECT * FROM projects WHERE last_activity_at < CURRENT_DATE ``` 完成后，您可以回滾您的更改： ``` reset ``` 有關可用選項的更多信息，請運行： ``` help ``` #### Tips & Tricks[](#tips--tricks "Permalink") 現在，在整個會話期間都將維護數據庫連接，因此您可以對任何會話變量（例如`enable_seqscan`或`work_mem` ）使用`exec set ...` 這些設置將應用于所有后續命令，直到您重置它們. 也可以使用事務. 當您處理修改數據的語句（例如 INSERT，UPDATE 和 DELETE）時，這可能很有用. `explain`命令將執行`EXPLAIN ANALYZE` ，該語句執行該語句. 為了從干凈狀態開始運行每個`explain` ，您可以將其包裝在事務中，例如： ``` exec BEGIN explain UPDATE some_table SET some_column = TRUE exec ROLLBACK ``` ## Further reading[](#further-reading "Permalink") 在[Dalibo.org](https://www.dalibo.com/en/)的[演示中](https://public.dalibo.com/exports/conferences/_archives/_2012/201211_explain/understanding_explain.pdf)可以找到關于理解查詢計劃的更廣泛的指南.