[TOC]

## select

* Basic SELECT operations
* Syntax

~~~
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list [HAVING condition]]
[CLUSTER BY col_list
  | [DISTRIBUTE BY col_list] [SORT BY | ORDER BY col_list]
]
[LIMIT number]
~~~

**Notes:**

1. **`order by` performs a global sort, so it runs with a single reducer; on large inputs this can take a long time.**
2. **`sort by` is not a global sort; it sorts the data before it enters each reducer. If you sort with `sort by` and set `mapred.reduce.tasks>1`, `sort by` only guarantees that each reducer's output is sorted, not that the result is globally sorted.**
3. `distribute by` (column) sends rows to reducers according to the given column, using a hash of its value.
4. `cluster by` (column) does everything `distribute by` does and additionally sorts on that column. So when the distribution column and the sort column are the same, `cluster by = distribute by + sort by`.

The main purpose of bucketed tables is to make join operations more efficient.

(Think about this: for `select a.id,a.name,b.addr from a join b on a.id = b.id;`, if table a and table b are already bucketed tables, both bucketed on the id column, does the join still need a Cartesian product over the full tables? See the bucketed-join sketch at the end of this section.)

**Note: Hive provides a "strict mode" setting that stops users from running queries that may have unknown, harmful effects.**

Setting the property `hive.mapred.mode` to `strict` blocks the following three kinds of queries (a short demo follows at the end of this section):

1. Queries on a partitioned table without a partition filter in the `where` clause. Partitioned tables usually hold a lot of data, and without a partition filter the query scans every partition and wastes resources.
   Not allowed: `select * from logs;`
   Allowed: `select * from logs where day=20151212;`
2. Queries that contain `order by` but no `limit` clause. `order by` sends all results to a single reducer for sorting, which is very slow.
3. Cartesian products; in simple terms, a JOIN written without `on`, with the join condition in `where` instead.

**Example**

~~~
create external table student_ext(Sno int,Sname string,Sex string,Sage int,Sdept string)
row format delimited fields terminated by ','
location '/stu';
~~~

~~~
-- where query
select * from student_ext where sno=95020;
-- group by
select sex,count(*) from student_ext group by sex;
~~~

~~~
-- distribute + sort, but with only 1 reducer this is not very useful
select * from student_ext cluster by sex;
~~~

~~~
-- set 4 reducers
-- each reducer then sorts its own output
hive> set mapred.reduce.tasks=4;
hive> create table tt_1 as select * from student_ext cluster by sno;
-- check the result: the tt_1 directory contains 4 files
dfs -cat /user/hive/warehouse/db1.db/tt_1/000000_0;

-- same result as above, split across 4 reducers
create table tt_2 as select * from student_ext distribute by sno sort by sno;
-- the sort column can be different from the distribution column
create table tt_3 as select * from student_ext distribute by sno sort by sage;
~~~
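To make the bucketed-join question above concrete, here is a minimal sketch. The tables `a` and `b`, their columns, and the bucket count of 4 are assumptions for illustration only. The point is that when both sides are bucketed on the join key, Hive can match bucket *i* of `a` only against bucket *i* of `b` instead of comparing every row of one table against every row of the other.

~~~
-- Hypothetical bucketed tables; names, columns and bucket count are
-- illustrative, not from the original notes.
create table a(id int, name string)
clustered by (id) into 4 buckets
row format delimited fields terminated by ',';

create table b(id int, addr string)
clustered by (id) into 4 buckets
row format delimited fields terminated by ',';

-- On older Hive versions bucketing must be enforced when loading data.
set hive.enforce.bucketing=true;

-- Allow the optimizer to use a bucket map join: rows with the same id
-- hash to the same bucket number, so only matching buckets are joined.
set hive.optimize.bucketmapjoin=true;

select a.id, a.name, b.addr
from a join b on a.id = b.id;
~~~

Because the bucket number is determined by hashing the join key, equal ids always land in buckets with the same index, which is what removes the need for a full cross comparison.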
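The strict-mode rules can also be tried out directly. The sketch below assumes the partitioned table `logs` with a `day` partition column (as in the examples above) and the hypothetical tables `a` and `b` from the join sketch; the statements that strict mode rejects are left commented out.

~~~
set hive.mapred.mode=strict;

-- 1. rejected: partitioned table queried without a partition filter
-- select * from logs;
select * from logs where day=20151212;

-- 2. rejected: order by without limit (all rows go to one reducer)
-- select * from logs where day=20151212 order by day;
select * from logs where day=20151212 order by day limit 10;

-- 3. rejected: Cartesian product (join condition in where, not in on)
-- select * from a join b where a.id = b.id;
select * from a join b on a.id = b.id;
~~~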