Scala --> runs on the JVM --> an improved Java
Multi-core -> multi-process, distributed

Spark use cases: big data, distributed computing, batch processing

Libraries on the Spark platform:
Core: RDD
Spark SQL: offline processing
Streaming: online processing
GraphX: graph computation on RDDs (PageRank)
MLlib: machine learning

Spark is a compute framework that can be used from either Java or Scala.

Spark operators:
~~~
List.reduce((x,y)=>x+y)
// shortened to:
List.reduce(_+_)
// remove duplicates
raw.flatMap(_.split("\\W+")).distinct.collect.take(10)
~~~

Case 1: Word count

sc.textFile(file).flatMap(_.split("\\W+")).map(x=>(x,1)).reduceByKey(_+_).collect.take(100).foreach(println)

Case 2: Grouping (find the top 3 scores for each course)

Sample file:
~~~
hadoop 100
hadoop 68
spark 90
spark 80
hadoop 95
spark 100
hadoop 60
spark 92
spark 93
~~~
Code
~~~
val raw = sc.textFile("file:/home/hadoop/g1")
val frdd = raw.map(_.split(" ")).map(x => (x(0), x(1).toDouble))
frdd.groupByKey.map { x =>
  // sort each course's scores in descending order and keep the top three
  val str = x._2.toArray.sortWith(_ > _).take(3).mkString(",")
  x._1 + " top 3=" + str
}.collect.foreach(println)
~~~
Output:
~~~
spark top 3=100.0,93.0,92.0
hadoop top 3=100.0,95.0,68.0
~~~

Case 3: Log analysis
~~~
#Software: Microsoft Internet Information Services 5.0
#Version: 1.0
#Date: 2004-12-19 00:01:21
#Fields: date time c-ip cs-username s-ip s-port cs-method cs-uri-stem cs-uri-query sc-status cs(User-Agent)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /index.asp - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /all.css - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/head.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c1.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c3.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c4.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c7.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c12.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c13.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c14.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c18.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c15.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c20.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/index_r2_c22.jpg - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
2004-12-19 00:01:21 172.16.52.16 - 211.66.184.35 80 GET /images/spacer.gif - 200 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.0.3705)
~~~
Code
~~~
import org.apache.spark.{SparkConf, SparkContext}

def main(args: Array[String]): Unit = {
  // Create the SparkConf object
  val conf = new SparkConf()
  // Set the application name; it is shown in the monitoring UI while the job runs
  conf.setAppName("My First Spark App!")
  // Use local mode so the program runs locally without a Spark cluster
  conf.setMaster("local")
  val sc = new SparkContext(conf)
  val lines = sc.textFile("file:///e:/大數據資料/weblog.txt", 1)
  // Skip the "#" header lines
  val lines2 = lines.filter(!_.startsWith("#"))
  // Field 0 is the date, field 7 is cs-uri-stem (the requested path)
  val lines3 = lines2.map(_.split("\\s+")).map(x => (x(0), x(7)))
  // Count requests per "date:path" key
  val lines4 = lines3.map { x =>
    val k = x._1 + ":" + x._2
    (k, 1)
  }.reduceByKey(_ + _)
  lines4.foreach(println)
}
~~~
Output
~~~
(2004-12-19:/news/newshtml/schoolNews/20040618182152.asp,1)
(2004-12-19:/images/index_r2_c14.jpg,269)
(2004-12-19:/south/administer/images/BYTESKY-64.GIF,1)
(2004-12-19:/south/administer/ryjs/images/calling+card/cjd.jpg,1)
(2004-12-19:/south/administer/images/bytesky-69.gif,1)
(2004-12-19:/news/newshtml/insideInform/chengjiao/fu2.htm,1)
(2004-12-19:/images/zuotiao1.jpg,272)
(2004-12-19:/news/img/2004-9-22d.jpg,1)
~~~

Case 4: Regular expressions

4.1 Extract the numeric data
~~~
val numPattern = "[0-9]+".r
for (matchString <- numPattern.findAllIn("99345 Scala,22298 Spark"))
  println(matchString)
~~~

4.2 Extract the non-numeric data
~~~
// Create the SparkConf object
val conf = new SparkConf()
// Set the application name; it is shown in the monitoring UI while the job runs
conf.setAppName("My First Spark App!")
// Use local mode so the program runs locally without a Spark cluster
conf.setMaster("local")
val sc = new SparkContext(conf)
val lines = sc.textFile("file:///e:/大數據資料/weblog.txt", 1)
// Non-digit characters
val reg = """[^0-9]""".r
val lines2 = lines.map(x => reg.findAllIn(x).mkString)
lines2.foreach(println)
~~~
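
The two patterns from 4.1 and 4.2 can also be tried on a plain string, without a SparkContext or the log file. The following is a minimal sketch under that assumption; the names `RegexSketch`, `digitRuns` and `stripDigits` are hypothetical and chosen only for illustration. It shows that `[0-9]+` yields whole runs of digits, while `[^0-9]` matches one character at a time, so joining its matches with `mkString` amounts to deleting every digit from the input.

```scala
object RegexSketch {
  // Same patterns as in sections 4.1 and 4.2.
  val numPattern = "[0-9]+".r
  val nonDigit = """[^0-9]""".r

  // All maximal runs of digits, in order of appearance.
  def digitRuns(s: String): List[String] = numPattern.findAllIn(s).toList

  // Every non-digit character, concatenated back into a single string.
  def stripDigits(s: String): String = nonDigit.findAllIn(s).mkString

  def main(args: Array[String]): Unit = {
    val sample = "99345 Scala,22298 Spark"
    println(digitRuns(sample))   // prints List(99345, 22298)
    println(stripDigits(sample)) // prints " Scala, Spark" (leading space kept, no quotes)
  }
}
```

Inside a Spark job the same functions could be passed to `map`, as 4.2 does with `reg.findAllIn(x).mkString`.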