(1) Write the Streaming program

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
 * @Date 2021/01/20
 */
object HDFSWordCount {
  def main(args: Array[String]): Unit = {
    /** ********* 1. Create the StreamingContext ************/
    val conf: SparkConf = new SparkConf().setMaster("local[4]").setAppName(this.getClass.getName)
    // Seconds(5) is the batch interval: data collected during each 5-second window is processed as one unit
    val ssc: StreamingContext = new StreamingContext(conf, Seconds(5))

    /** ********* 2. Load the data ************/
    // Note: textFileStream takes a directory, not a specific file
    // Each line of input becomes one element of the DStream
    val lines: DStream[String] = ssc.textFileStream("hdfs://hadoop101:9000/spark-streaming/input")
    // val lines: DStream[String] = ssc.textFileStream("file:///E:\\hadoop\\input")
    // lines.count().print()

    /** ********* 3. Compute with the DStream operators ************/
    val result: DStream[(String, Int)] = lines.flatMap(_.split(",")).map((_, 1)).reduceByKey(_ + _)
    result.print() // print the output

    /** ********* 4. Start the Streaming program ************/
    ssc.start()

    /** ********* 5. Wait for the application to terminate ************/
    // Alternatively, call ssc.stop() to stop the current Streaming program directly
    ssc.awaitTermination()
  }
}
```

(2) Start the Streaming program above.

(3) Upload a file to the HDFS `/spark-streaming/input` directory (the same path the code above reads from):

```shell
[root@hadoop101 hadoop]# bin/hdfs dfs -mkdir -p /spark-streaming/input
[root@hadoop101 hadoop]# bin/hdfs dfs -put /test-data/words1.txt /spark-streaming/input
```

(4) The program will print output like the following:

```txt
-------------------------------------------
Time: 1612769255000 ms
-------------------------------------------
(python,2)
(hadoop,1)
(hello,2)
(kafka,1)
```
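The comment in step 5 mentions `ssc.stop()` as an alternative to blocking forever in `awaitTermination()`. A minimal sketch of how that can be combined with a bounded wait (the object name and the 60-second timeout are illustrative assumptions; `awaitTerminationOrTimeout` and the `stop` flags are standard `StreamingContext` API):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// A sketch only: run the word count for a bounded time, then stop it
// gracefully so batches that were already received finish processing.
object GracefulStopSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[4]").setAppName("GracefulStopSketch")
    val ssc  = new StreamingContext(conf, Seconds(5))

    ssc.textFileStream("hdfs://hadoop101:9000/spark-streaming/input")
      .flatMap(_.split(","))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    // Wait up to 60 seconds instead of blocking indefinitely...
    ssc.awaitTerminationOrTimeout(60 * 1000)
    // ...then stop gracefully: in-flight batches complete before shutdown.
    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}
```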