<ruby id="bdb3f"></ruby>

    <p id="bdb3f"><cite id="bdb3f"></cite></p>

      <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
        <p id="bdb3f"><cite id="bdb3f"></cite></p>

          <pre id="bdb3f"></pre>
          <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

          <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
          <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

          <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                <ruby id="bdb3f"></ruby>

                ThinkChat2.0新版上線,更智能更精彩,支持會話、畫圖、視頻、閱讀、搜索等,送10W Token,即刻開啟你的AI之旅 廣告
                <div id="article_content" class="article_content clearfix"> <link rel="stylesheet" href="https://csdnimg.cn/release/blogv2/dist/mdeditor/css/editerView/kdoc_html_views-1a98987dfd.css"> <link rel="stylesheet" href="https://csdnimg.cn/release/blogv2/dist/mdeditor/css/editerView/ck_htmledit_views-25cebea3f9.css"> <div id="content_views" class="htmledit_views"> <h1><a name="t0"></a>1、ClicHouse語法優化規則</h1> <p></p> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="https://so.csdn.net/so/search?q=ClickHouse&amp;spm=1001.2101.3001.7020" target="_blank" class="hl hl-1" data-report-click="{&quot;spm&quot;:&quot;1001.2101.3001.7020&quot;,&quot;dest&quot;:&quot;https://so.csdn.net/so/search?q=ClickHouse&amp;spm=1001.2101.3001.7020&quot;,&quot;extra&quot;:&quot;{\&quot;searchword\&quot;:\&quot;ClickHouse\&quot;}&quot;}" data-tit="ClickHouse" data-pretit="clickhouse">ClickHouse</a>的SQL優化規則是基于RBO(Rule Based Optimization),下面是一些優化規則。</p> <h2><a name="t1"></a>1.1、COUNT優化</h2> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;在調用count函數時,如果使用的是count()或者count(*),且沒有where條件,則會直接使用system.tables的total_rows,例如:</p> <p style="text-align:center;"><img alt="" height="507" src="https://img-blog.csdnimg.cn/95c125b1508c47899eac0609d6e278df.png?x-oss-process=image/watermark,type_d3F5LXplbmhlaQ,shadow_50,text_Q1NETiBAWXVhbl9DU0RG,size_18,color_FFFFFF,t_70,g_se,x_16" width="576"></p> <p></p> <p></p> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;注意:Optimized trivial count,這是對count的優化。</p> <h2><a name="t2"></a>1.2、消除子查詢重復字段</h2> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;語句子查詢中有兩個重復的id字段,會被去重。</p> <h2><a name="t3"></a>1.3、謂詞下推</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 當group by有having子句,但是沒有with cube、with rollup或者with totals修飾的時候,having會下推到where提前過濾。</p> <pre data-index="0"><code class="language-sql hljs">EXPLAIN SYNTAX <span class="hljs-keyword">SELECT</span> UserID <span class="hljs-keyword">FROM</span> hits_v1 <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> UserID <span class="hljs-keyword">HAVING</span> UserID<span class="hljs-operator">=</span><span class="hljs-string">'123456789'</span>;</code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p style="text-align:center;"><img alt="" height="281" src="https://img-blog.csdnimg.cn/94db83a9631c4386baf1a76e9f58ff09.png?x-oss-process=image/watermark,type_d3F5LXplbmhlaQ,shadow_50,text_Q1NETiBAWXVhbl9DU0RG,size_20,color_FFFFFF,t_70,g_se,x_16" width="673"></p> <h2><a name="t4"></a>1.4、聚合計算外推</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 聚合函數內的計算,會外推,如下:</p> <pre data-index="1"><code class="language-sql hljs">EXPLAIN SYNTAX <span class="hljs-keyword">SELECT</span> <span class="hljs-built_in">sum</span>(UserID <span class="hljs-operator">*</span> <span class="hljs-number">2</span>) <span class="hljs-keyword">FROM</span> visits_v1;</code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p style="text-align:center;"><img alt="" height="223" src="https://img-blog.csdnimg.cn/655b2e0198994b2e911e70efb7b44753.png?x-oss-process=image/watermark,type_d3F5LXplbmhlaQ,shadow_50,text_Q1NETiBAWXVhbl9DU0RG,size_14,color_FFFFFF,t_70,g_se,x_16" width="463"></p> <h2><a name="t5"></a>1.5、聚合函數消除</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 如果對聚合鍵,也就是group by key使用min、max等聚合函數,則會將函數消除。</p> <pre data-index="2"><code class="language-sql hljs">EXPLAIN SYNTAX <span class="hljs-keyword">SELECT</span> <span class="hljs-built_in">sum</span>(UserID <span class="hljs-operator">*</span> <span class="hljs-number">2</span>),<span class="hljs-built_in">max</span>(VisitID),<span class="hljs-built_in">max</span>(UserID) <span class="hljs-keyword">FROM</span> visits_v1 <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> UserID;</code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p><img alt="" height="340" src="https://img-blog.csdnimg.cn/027a2031177141d599fed92891122536.png?x-oss-process=image/watermark,type_d3F5LXplbmhlaQ,shadow_50,text_Q1NETiBAWXVhbl9DU0RG,size_20,color_FFFFFF,t_70,g_se,x_16" width="745"></p> <h2><a name="t6"></a>1.6、刪除重復的order by key、刪除重復的limit by key</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 當order by、limit后有跟相同的key,SYNTAX會提示可進行優化。</p> <h2><a name="t7"></a>1.7、三元運算優化</h2> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;set optimize_if_chain_to_multiif = 1;</p> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;或者直接使用如下SQL:</p> <pre data-index="3"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-comment">-- 查看語法優化</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">EXPLAIN SYNTAX <span class="hljs-keyword">SELECT</span> number <span class="hljs-operator">=</span><span class="hljs-number">1</span>? <span class="hljs-string">'hello'</span>:(number<span class="hljs-operator">=</span><span class="hljs-number">2</span>?<span class="hljs-string">'world'</span>:<span class="hljs-string">'abcdef'</span>) <span class="hljs-keyword">from</span> numbers(<span class="hljs-number">10</span>)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">settings optimize_if_chain_to_multiif <span class="hljs-operator">=</span> <span class="hljs-number">1</span>;</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <h1><a name="t8"></a>2、查詢優化</h1> <h2><a name="t9"></a>2.1、單表查詢</h2> <h3><a name="t10"></a>2.1.1、Prewhere替代where</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; Prewhere和where語句的作用相同,用來過濾數據。不同之處在于prewhere只支持*MergeTree族系列引擎的表,首先會讀取指定的列數據,來判斷數據過濾,等待數據過濾之后再讀取select聲明的列字段來補全其余屬性。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 當查詢列明顯多于篩選列時使用PreWhere可十倍提升查詢性能,Prewhere會自動優化執行過濾階段的數據讀取方式,降低io操作。參數optimize_move_to_prewhere=1為開啟狀態,可以設置為0對其關閉。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 在某些場合下,prewhere語句比where語句處理的數據量更少性能更高。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 默認情況,我們肯定不會關閉where自動優化成prewhere,但是某些場景即使開啟優化,也不會自動轉換成prewhere,需要手動指定prewhere:</p> <ul><li>使用常量表達式</li><li>使用默認值為alias類型的字段</li><li>包含了arrayJOIN,globalIn,globalNotIn或者IndexHint的查詢</li><li>select查詢的列字段和where的謂詞相同</li><li>使用了主鍵字段</li></ul> <h3><a name="t11"></a>2.1.2、數據采樣</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; 通過采樣運算可極大提升數據分析的性能</p> <pre data-index="4"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">select</span> Title,<span class="hljs-built_in">count</span>(<span class="hljs-operator">*</span>) <span class="hljs-keyword">As</span> PageViews </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">FROM</span> hits_v1</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">SAMPLE <span class="hljs-number">01.</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">WHERE</span> CounterID <span class="hljs-operator">=</span> <span class="hljs-number">57</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> Title</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> PageViews <span class="hljs-keyword">DESC</span> LIMIT <span class="hljs-number">1000</span></div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;采樣修飾符只有在MergeTree engine表中才有效,且在創建表時需要制定采樣策略。</p> <h3><a name="t12"></a>2.1.3、列裁剪與分區裁剪</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;數據量太大應避免使用select * 操作,查詢的性能會與查詢的字段大小和數量成線性關系,字段越少,消耗的io資源越少,性能就會越高。</p> <ul><li>列裁剪表示只選取所需要的列。</li><li>分區裁剪就是只讀取需要的分區,在過濾條件中指定。</li></ul> <h3><a name="t13"></a>2.1.4、order by 結合where、limit使用</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;千萬以上數據集進行order by 查詢時需要配合where條件和limit語句一起使用。</p> <h3><a name="t14"></a>2.1.5、避免構建虛擬列</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;如非必須,不要在結果集上構建虛擬列,虛擬列非常消耗資源浪費性能,可以考慮在前端處理,或者在表中構造實際字段進行額外存儲。</p> <h3><a name="t15"></a>2.1.6、uniqueCombined替代distinct</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;性能可提升10倍以上,uniqCombined底層采用類似HyperLogLog算法實現,能接收2%左右的數據誤差,可直接使用這種去重方式提升查詢性能。Count(distinct)會使用uniqExact精確去重。</p> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;不建議在千萬級不同數據上執行distinct去重查詢,改為近似去重uniqCombined。</p> <h2><a name="t16"></a>2.2、 多表關聯</h2> <h3><a name="t17"></a>2.2.1、用in代替join</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;當多表聯查時,查詢的數據僅從其中一張表出事,可以考慮用in操作而不是join</p> <h3><a name="t18"></a>2.2.2、大小表join</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;多表join時要滿足小表在右的原則,右表關聯時被加載到內存中與左表進行比較,ClickHouse中無論是Left join、Right Join還是Inner join永遠都是拿著右表中的每一條記錄到左表中查找該記錄是否存在,所以右表必須是小表。</p> <h3><a name="t19"></a>2.2.3、注意謂詞下推(版本差異)</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ClikHouses在join查詢時不會主動發起謂詞下推的操作,需要每個子查詢提前完成過濾操作,需要注意的是,是否執行謂詞下推,對性能影響差別很大(新版本中已經不存在此問題,但是需要注意謂詞的位置不用依然有性能的差異)。</p> <h3><a name="t20"></a>2.2.4、分布式表使用GLOBAL</h3> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;兩張分布式表的in和join之前必須加上GLOBAL關鍵字,右表只會在接收查詢請求的那個節點查詢一次,并將其分發到其它節點上,如果不加GLOBAL關鍵字的話,每個節點都會單獨發起一次對右表的查詢,而右表又是分布式表,就導致右表一共會被查詢次(N是該分布式表的分片數量),這就是查詢放大,會帶來很大開銷。</p> <h3><a name="t21"></a>2.2.5、使用字典表</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; 將一些需要關聯分析的業務創建成字典表進行join操作,前提是字典表不宜太大,因為字典表會常駐內存。</p> <h3><a name="t22"></a>2.2.6、提前過濾</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; 通過增加邏輯過濾可以減少數據掃描,達到提供執行速度及降低內存消息的目的。</p> <h1><a name="t23"></a>3、數據一致性</h1> <p>&nbsp; &nbsp; &nbsp; &nbsp; ReplacingMergeeTree:數據的去重只會在數據合并期間進行。合并會在后臺一個不確定的時間進行,因此你無法預先作出計劃。有一些數據可能仍未被處理。集暖你可以調用optimize語句發起計劃外的合并,但請不要依靠它,因為OPTIMIZE語句會引發對護具的大量讀寫。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 因此,ReplacingMergeTree適用于在后臺清除重復的數據以節省空間,但是它不保證沒有重復的數據出現。</p> <h2><a name="t24"></a>3.1、案例</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 1、創建表</p> <pre data-index="5"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> test_a(</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> user_id UInt64,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> score String,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> deleted UInt8 <span class="hljs-keyword">DEFAULT</span> <span class="hljs-number">0</span>,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> create_time DateTime <span class="hljs-keyword">DEFAULT</span> toDateTime(<span class="hljs-number">0</span>)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">)ENGINE <span class="hljs-operator">=</span> ReplacingMergeTree(create_time)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> user_id;</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp; &nbsp; &nbsp; &nbsp; 其中:</p> <ul><li>user_id是數據去重更新的標識;</li><li>create_time是版本號字段,每組數據中create_time最大的一行表示最新的數據;</li><li>delete是自定義的一個標記位,比如0代表為刪除,1代表刪除數據。</li></ul> <p>&nbsp; &nbsp; &nbsp; &nbsp; 2、寫入與修改數據</p> <pre data-index="6"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> <span class="hljs-keyword">TABLE</span> test_a(user_id,score)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">WITH</span>(</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">SELECT</span> [<span class="hljs-string">'A'</span>,<span class="hljs-string">'B'</span>,<span class="hljs-string">'C'</span>,<span class="hljs-string">'D'</span>,<span class="hljs-string">'E'</span>,<span class="hljs-string">'F'</span>,<span class="hljs-string">'G'</span>]</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">)<span class="hljs-keyword">AS</span> dict</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">SELECT</span> number <span class="hljs-keyword">AS</span> user_id,dict[number<span class="hljs-operator">%</span><span class="hljs-number">7</span><span class="hljs-operator">+</span><span class="hljs-number">1</span>] <span class="hljs-keyword">FROM</span> numbers(<span class="hljs-number">10000</span>);</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> <span class="hljs-keyword">TABLE</span> test_a(user_id,score,create_time)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="9"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">WITH</span>(</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="10"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">SELECT</span> [<span class="hljs-string">'AA'</span>,<span class="hljs-string">'BB'</span>,<span class="hljs-string">'CC'</span>,<span class="hljs-string">'DD'</span>,<span class="hljs-string">'EE'</span>,<span class="hljs-string">'FF'</span>,<span class="hljs-string">'GG'</span>]</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="11"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line">)<span class="hljs-keyword">AS</span> dict</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="12"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">SELECT</span> number <span class="hljs-keyword">AS</span> user_id,dict[number<span class="hljs-operator">%</span><span class="hljs-number">7</span><span class="hljs-operator">+</span><span class="hljs-number">1</span>],now() <span class="hljs-keyword">as</span> create_time <span class="hljs-keyword">FROM</span> numbers(<span class="hljs-number">5000</span>);</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp; &nbsp; &nbsp; &nbsp; 3、此時尚未觸發分區合并,所以還未去重。</p> <h2><a name="t25"></a>3.2、手動OPTIMIZE</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 在寫入數據后,立刻執行OPTIMIZE強制觸發新寫入分區的合并動作</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 語法:OPTIMIZE TABLE [db.]name [ON CLUSTER cluster]&nbsp; [PARTITION partition | PARTITION ID 'partition_id'] [FINAL] [DEDUPLICATE [BY expression]]</p> <pre data-index="7"><code class="language-sql hljs">OPTIMIZE <span class="hljs-keyword">TABLE</span> test_a <span class="hljs-keyword">FINAL</span>;</code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <h2><a name="t26"></a>3.3、通過Group by 去重</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 1、執行去重的查詢</p> <pre data-index="8"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">SELECT</span> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> user_id,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> argMax(score,create_time) <span class="hljs-keyword">AS</span> score,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> argMax(deleted,create_time) <span class="hljs-keyword">AS</span> deleted,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-built_in">max</span>(create_time) <span class="hljs-keyword">AS</span> ctime</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">FROM</span> test_a</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> user_id</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">HAVING</span> deleted <span class="hljs-operator">=</span> <span class="hljs-number">0</span>;</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp; &nbsp; &nbsp; &nbsp; 函數說明:</p> <ul><li>argMax(field1,field2):按照field2的最大值取field1的值。</li></ul> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;當我們更新數據時,會寫入一行新的數據,例如上面語句中,通過查詢最大的create_time得到修改后的score字段值。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 2、創建視圖</p> <pre data-index="9"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">VIEW</span> view_test_a <span class="hljs-keyword">AS</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">SELECT</span> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> user_id,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> argMax(score,create_time) <span class="hljs-keyword">AS</span> score,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> argMax(deleted,create_time) <span class="hljs-keyword">AS</span> deleted,</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-built_in">max</span>(create_time) <span class="hljs-keyword">AS</span> ctime</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">FROM</span> test_a</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> user_id</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="9"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">HAVING</span> deleted <span class="hljs-operator">=</span> <span class="hljs-number">0</span>;</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp; &nbsp; &nbsp; &nbsp; 3、刪除數據測試</p> <pre data-index="10"><code class="language-sql hljs"><ol class="hljs-ln" style="width:100%"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-comment">-- 原有1W條數據</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">select</span> <span class="hljs-built_in">count</span>(<span class="hljs-operator">*</span>) <span class="hljs-keyword">from</span> test_a;</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-comment">-- 插入并查詢,原表有10001條數據,視圖表有10000條數據</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> <span class="hljs-keyword">TABLE</span> test_a(user_id,score,create_time) <span class="hljs-keyword">VALUES</span> (<span class="hljs-number">0</span>,<span class="hljs-string">'AAAA'</span>,now());</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-comment">-- 插入并查詢,原表有10002條數據,視圖表有9999條數據</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> <span class="hljs-keyword">TABLE</span> test_a(user_id,score,deleted,create_time) <span class="hljs-keyword">VALUES</span> (<span class="hljs-number">0</span>,<span class="hljs-string">'AAAA'</span>,<span class="hljs-number">1</span>,now());</div></div></li></ol></code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p style="text-align:center;"><img alt="" height="547" src="https://img-blog.csdnimg.cn/cf2faafcd5824ac4bc01b87a8f207433.png?x-oss-process=image/watermark,type_d3F5LXplbmhlaQ,shadow_50,text_Q1NETiBAWXVhbl9DU0RG,size_20,color_FFFFFF,t_70,g_se,x_16" width="829"></p> <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;這行數據并沒有真正的被刪除,而是被過濾掉了。在一些合適場景下,可以結合表級別的TTL最終將物理數據刪除。</p> <h2><a name="t27"></a>3.4、通過FINAL查詢</h2> <p>&nbsp; &nbsp; &nbsp; &nbsp; 在查詢語句后增加FINAL修飾符,這樣在查詢的過程中將會執行Merge的特殊邏輯(例如數據去重,預聚合等)。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 但是這種方法在早期版本基本沒有人使用,因為在增加FINAL之后,我們的查詢將會變成一個單線程的執行過程,查詢速度非常慢。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color:#fe2c24;"><strong>在v20.5.2.7-stable版本中,FINAL查詢是支持多線程執行</strong>,</span><span style="color:#0d0016;">并且可以通過</span><span style="color:#fe2c24;">max_final_threads參數</span><span style="color:#0d0016;">控制單個查詢的線程數。但是目前讀取part部分的動作依然是串行的。</span></p> <p><span style="color:#0d0016;">&nbsp; &nbsp; &nbsp; &nbsp; FIANL查詢最終的性能和很多因素相關,列字段的大小、分區的數量等等都會影響到最終的查詢時間,所以還要結合實際場景取舍。</span></p> <p><span style="color:#0d0016;">&nbsp; &nbsp; &nbsp; &nbsp; 參考鏈接:https://github.com/ClickHouse/ClickHouse/pull/10463</span></p> <h1><a name="t28"></a><span style="color:#0d0016;">4、物化視圖</span></h1> <p><span style="color:#0d0016;">&nbsp; &nbsp; &nbsp; &nbsp; ClickHouse的物化視圖是一種查詢結果的持久化,它確實是給我們帶來了查詢效率的提升。用戶查起來跟表沒有quiet,它就會一張表,它也像是一張時刻在預計算的表,創建的過程它是用了一個特殊引擎,加上后來as select,就是create一個table as select的寫法。</span></p> <p><span style="color:#0d0016;">&nbsp; &nbsp; &nbsp; &nbsp; “查詢結果集”的范圍很寬泛,可以是基礎表中部分數據的一份簡單拷貝,也可以使多表Join之后產生的結果或其子集,或者原始數據的聚合指標等等。所以,物化視圖不會隨著基礎表的變化而變化,所以它也稱為快照(snapshot)。</span></p> <h2><a name="t29"></a><span style="color:#0d0016;">4.1、概述</span></h2> <h3><a name="t30"></a><span style="color:#0d0016;">4.1.1、物化視圖與普通視圖的區別</span></h3> <p><span style="color:#0d0016;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color:#fe2c24;">&nbsp;普通視圖不保存數據,保存的僅僅是查詢語句,</span>查詢的時候還是從原表讀取數據,可以將普通視圖理解為是個子查詢。<span style="color:#fe2c24;">物化視圖是把查詢的結果根據相應的引擎存入到了磁盤或內存中,</span>對數據重新進行了組織,你可以理解物化視圖是完全的一張新表。</p> <h3><a name="t31"></a>4.1.2、優缺點</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; 優點:查詢速度快,要是把物化視圖這些規則全部寫好, 它比元數據查詢快樂很多,總的行數少了,因為都預計算好了。</p> <p>&nbsp; &nbsp; &nbsp; &nbsp; 缺點:它的本質是一個流式數據的使用場景,是累加式的技術,所以要用歷史數據做去重,去核這樣的分析,在物化視圖里面是不太好用的。在某些場景的使用也是有限的。而且如果一戰表加了很多物化視圖,在寫這張表的時候,就會消耗很多機器的資源,比如護具帶寬占滿、存儲一下子增加了很多。</p> <h3><a name="t32"></a>4.1.3、基本語法</h3> <p>&nbsp; &nbsp; &nbsp; &nbsp; 也是create語法,會創建一個隱藏的目標表來保存視圖數據。也可以TO表名,保存到一張顯示的表。沒有加TO表名,表名默認就會.inner.物化視圖名。</p> <pre data-index="11"><code class="language-sql hljs"><span class="hljs-keyword">CREATE</span> [MATERIALIZED] <span class="hljs-keyword">VIEW</span> [IF <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span>] [db.]table_name [<span class="hljs-keyword">TO</span>[db.]name] [ENGINE <span class="hljs-operator">=</span> engine] [POPULATE] <span class="hljs-keyword">AS</span> <span class="hljs-keyword">SELECT</span> ...</code><div class="hljs-button {2}" data-title="復制" onclick="hljs.copyCode(event)"></div></pre> <p>&nbsp; &nbsp; &nbsp; &nbsp; 1、創建物化視圖的限制</p> <ol><li>必須指定物化視圖的engine用于數據存儲</li><li>TO [db].[table]語法的使用,不得使用POPULATE。</li><li>查詢語句(select)可以包含下面的子句:DISTINCT,GROUP BY,ORDER BY,LIMIT...</li><li>物化視圖的alter操作有些限制,操作起來不大方便。</li><li>如物化視圖的定義使用了 TO [db.]name子語句,則可以將目標表的視圖卸載DETACH再裝載ATTACH。</li></ol> <p>&nbsp; &nbsp; &nbsp; &nbsp; 2、物化視圖的數據更新</p> <ol><li>物化視圖創建好之后,若源表被寫入新數據則物化視圖也會同步更新。</li><li>POPULATE關鍵字決定了物化視圖的更新策略: <ol><li>若有POPULATE則在創建視圖的過程會將源表已經存在的數據一并導入,類似于create table ... as</li><li>若無POPULATE則物化視圖在創建之后沒有數據,只會在創建只有同步之后寫入源表的數據。</li><li>clickhouse官方并不推薦使用POPULATE,因為在創建物化視圖的過程中同時寫入的數據不能被插入物化視圖。</li></ol></li><li>物化視圖不支持同步刪除,若源表的數據不存在(刪除了)則物化視圖的數據仍然保留。</li><li>物化視圖是一種特殊的數據表,可以用show table查看。</li></ol> <h2><a name="t33"></a></h2> <p></p> <p></p> <p></p> </div><div data-report-view="{&quot;mod&quot;:&quot;1585297308_001&quot;,&quot;spm&quot;:&quot;1001.2101.3001.6548&quot;,&quot;dest&quot;:&quot;https://blog.csdn.net/Yuan_CSDF/article/details/122293843&quot;,&quot;extend1&quot;:&quot;pc&quot;,&quot;ab&quot;:&quot;new&quot;}"><div></div></div> </div>
                  <ruby id="bdb3f"></ruby>

                  <p id="bdb3f"><cite id="bdb3f"></cite></p>

                    <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
                      <p id="bdb3f"><cite id="bdb3f"></cite></p>

                        <pre id="bdb3f"></pre>
                        <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

                        <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
                        <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

                        <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                              <ruby id="bdb3f"></ruby>

                              哎呀哎呀视频在线观看