# Importing data into Hive

Data can be imported into Hive tables in several ways.

## 1. Importing via an external table

Create an `EXTERNAL` table in Hive, specifying an HDFS path at creation time. Copying data into that HDFS path then simultaneously completes the insert into the external table.

For example, prepare the file test.txt:

~~~
$ cat test.txt
1       hello
2       world
3       test
4       case
~~~

Fields are separated by '\t'.

Start Hive:

`$ hive`

Create the external table:

~~~
hive> CREATE EXTERNAL TABLE MYTEST(num INT, name STRING)
    > COMMENT 'this is a test'
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE
    > LOCATION '/data/test';
OK
Time taken: 0.714 seconds
hive> show tables;
OK
mytest
partition_test
partition_test_input
test
Time taken: 0.07 seconds
hive> desc mytest;
OK
num     int
name    string
Time taken: 0.121 seconds
~~~

Copy the data to HDFS:

`$ hadoop fs -put test.txt /data/test`

View the table data:

~~~
hive> select * from mytest;
OK
1       hello
2       world
3       test
4       case
Time taken: 0.375 seconds
hive> select num from mytest;
Total MapReduce jobs = 1
Launching Job 1 out of 1
......
Total MapReduce CPU Time Spent: 510 msec
OK
1
2
3
4
Time taken: 27.157 seconds
~~~

This approach is commonly used when historical data already sits on HDFS and we want to run Hive operations over it; it avoids the overhead of copying the data.

## 2. Importing from the local filesystem

When the data is not on HDFS, it can be loaded into a Hive table directly from the local filesystem. The file /home/work/test.txt has the same content as above.

Create the table:

~~~
hive> CREATE TABLE MYTEST2(num INT, name STRING)
    > COMMENT 'this is a test2'
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;
OK
Time taken: 0.077 seconds
~~~

Load the data into the table:

~~~
hive> LOAD DATA LOCAL INPATH '/home/work/test.txt' INTO TABLE MYTEST2;
Copying data from file:/home/work/test.txt
Copying file: file:/home/work/test.txt
Loading data to table default.mytest2
OK
Time taken: 0.24 seconds
~~~

View the data:

~~~
hive> select * from MYTEST2;
OK
1       hello
2       world
3       test
4       case
Time taken: 0.11 seconds
~~~

The local data imported this way can be a single file, a directory, or a wildcard pattern. Note that if it is a directory, the directory must not contain subdirectories; likewise, a wildcard can only match files.

## 3. Importing from HDFS

With test.txt already uploaded to /data/test as above, the following commands load the data into a Hive table directly:

~~~
hive> CREATE TABLE MYTEST3(num INT, name STRING)
    > COMMENT "this is a test3"
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    > STORED AS TEXTFILE;
OK
Time taken: 4.735 seconds
hive> LOAD DATA INPATH '/data/test/test.txt' INTO TABLE MYTEST3;
Loading data to table default.mytest3
OK
Time taken: 0.337 seconds
hive> select * from MYTEST3;
OK
1       hello
2       world
3       test
4       case
Time taken: 0.227 seconds
~~~

## 4. Importing data from another table

~~~
hive> CREATE EXTERNAL TABLE MYTEST4(num INT);
OK
Time taken: 0.091 seconds
hive> FROM MYTEST3 test3
    > INSERT OVERWRITE TABLE MYTEST4
    > select test3.num where name='world';
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201207230024_0002, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201207230024_0002
Kill Command = /home/work/hadoop/hadoop-1.0.3/libexec/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201207230024_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-07-23 18:59:02,365 Stage-1 map = 0%, reduce = 0%
2012-07-23 18:59:08,417 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.62 sec
2012-07-23 18:59:14,508 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.62 sec
MapReduce Total cumulative CPU time: 620 msec
Ended Job = job_201207230024_0002
Ended Job = -174856900, job is filtered out (removed at runtime).
Moving data to: hdfs://localhost:9000/tmp/hive-work/hive_2012-07-23_18-58-44_166_189728317691010041/-ext-10000
Loading data to table default.mytest4
Deleted hdfs://localhost:9000/user/hive/warehouse/mytest4
Table default.mytest4 stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 2, raw_data_size: 0]
1 Rows loaded to mytest4
MapReduce Jobs Launched:
Job 0: Map: 1   Accumulative CPU: 0.62 sec   HDFS Read: 242 HDFS Write: 2 SUCESS
Total MapReduce CPU Time Spent: 620 msec
OK
Time taken: 30.663 seconds
hive> select * from mytest4;
OK
2
Time taken: 0.103 seconds
~~~
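All four methods above depend on the input file matching the declared `ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'`. As a minimal sketch (plain Python, not part of Hive; the file name is illustrative), the following writes a tab-separated file shaped like test.txt and parses it back with the same '\t' convention the DDL declares:

```python
import csv
import os
import tempfile

# Rows matching the test.txt sample above.
rows = [(1, "hello"), (2, "world"), (3, "test"), (4, "case")]

path = os.path.join(tempfile.mkdtemp(), "test.txt")

# Write tab-delimited lines, mirroring FIELDS TERMINATED BY '\t'.
with open(path, "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)

# Parse the file back the way a TEXTFILE table splits columns.
with open(path) as f:
    parsed = [(int(num), name) for num, name in csv.reader(f, delimiter="\t")]

print(parsed)  # → [(1, 'hello'), (2, 'world'), (3, 'test'), (4, 'case')]
```

If the delimiter in the file does not match the table definition, Hive does not raise an error; it simply yields NULL columns, so a round-trip check like this is a cheap sanity test before loading.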
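A behavioral difference between methods 2 and 3 is worth noting: `LOAD DATA LOCAL INPATH` copies the local file into the warehouse, while `LOAD DATA INPATH` with an HDFS source moves it, so the source file disappears from its original HDFS location. A minimal sketch of that distinction using ordinary files (hypothetical paths, with `shutil` standing in for the Hadoop filesystem):

```python
import os
import shutil
import tempfile

src_dir = tempfile.mkdtemp()    # stands in for the local FS / HDFS source dir
warehouse = tempfile.mkdtemp()  # stands in for /user/hive/warehouse/<table>

local_file = os.path.join(src_dir, "local.txt")
hdfs_file = os.path.join(src_dir, "hdfs.txt")
for p in (local_file, hdfs_file):
    with open(p, "w") as f:
        f.write("1\thello\n")

# LOAD DATA LOCAL INPATH: the source file is copied; the original remains.
shutil.copy(local_file, warehouse)

# LOAD DATA INPATH: the source file is moved into the warehouse directory.
shutil.move(hdfs_file, warehouse)

print(os.path.exists(local_file))  # True: local source untouched
print(os.path.exists(hdfs_file))   # False: HDFS source relocated
```

This is why, after method 3's `LOAD DATA INPATH`, /data/test/test.txt is no longer at its original path, while method 2 leaves /home/work/test.txt in place.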
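The `FROM ... INSERT OVERWRITE ... SELECT` in method 4 is logically just a filter-and-project over the rows of MYTEST3, which is what the single-mapper job in the transcript computes. A plain-Python sketch of that logic:

```python
# Rows of MYTEST3, as loaded from test.txt above.
mytest3 = [(1, "hello"), (2, "world"), (3, "test"), (4, "case")]

# select test3.num where name='world'  →  filter on name, project num.
mytest4 = [num for num, name in mytest3 if name == "world"]

print(mytest4)  # → [2]
```

The result matches the final `select * from mytest4` in the transcript: the single value 2.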