UDF (user-defined function) workflow:

1. Define the function.
2. Register the function:
   - `SparkSession.udf.register()`: the registered name is only usable inside `sql()`;
   - `functions.udf()`: the result is usable throughout the DataFrame API.
3. Call the function.

(1) Requirement: count the number of hobbies for each user.

(2) Input data format, `hobbies.txt` (columns are tab-separated, matching the `split("\t")` in the code):

```txt
alice	jogging,Coding,cooking
lina	travel,dance
```

(3) Output data format:

```txt
alice	jogging,Coding,cooking	3
lina	travel,dance	2
```

(4) Code:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object DefineFun {

  // 1. Create the case class matching the input columns
  case class Hobbies(name: String, hobbies: String)

  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .master("local[4]")
      .appName(this.getClass.getName)
      .getOrCreate()
    val sc: SparkContext = spark.sparkContext
    import spark.implicits._

    // 2. Load the data and create a DataFrame
    val infoRdd = sc.textFile("file:///E:\\hadoop\\input\\hobbies.txt")
    val hobbiesDF = infoRdd.map(_.split("\t")).map(p => Hobbies(p(0), p(1))).toDF()
    hobbiesDF.show()
    // +-----+--------------------+
    // | name|             hobbies|
    // +-----+--------------------+
    // |alice|jogging,Coding,co...|
    // | lina|        travel,dance|
    // +-----+--------------------+

    // 3. Create a temporary view
    hobbiesDF.createOrReplaceTempView("hobbies_view")

    // 4. Register via spark.udf.register(funcName, anonymous function);
    //    the registered name is only usable inside spark.sql
    spark.udf.register("hobby_num", (s: String) => s.split(',').size)

    // 5. Call the function in spark.sql
    spark.sql("select name, hobbies, hobby_num(hobbies) as hobby_num from hobbies_view").show()
    // +-----+--------------------+---------+
    // | name|             hobbies|hobby_num|
    // +-----+--------------------+---------+
    // |alice|jogging,Coding,co...|        3|
    // | lina|        travel,dance|        2|
    // +-----+--------------------+---------+

    // 6. Alternatively, register with functions.udf; the result works with the DataFrame API
    import org.apache.spark.sql.functions._
    val hobby_num2 = udf((s: String) => s.split(",").size)
    hobbiesDF.select($"name", $"hobbies", hobby_num2($"hobbies").as("hobby_num")).show()
    // +-----+--------------------+---------+
    // | name|             hobbies|hobby_num|
    // +-----+--------------------+---------+
    // |alice|jogging,Coding,co...|        3|
    // | lina|        travel,dance|        2|
    // +-----+--------------------+---------+

    spark.stop()
  }
}
```
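For a simple count like this, the same result can also be computed without a UDF at all, using Spark's built-in `split` and `size` functions; built-ins are generally preferable where they fit, since Catalyst can optimize them while a UDF is opaque. A minimal sketch, reusing `hobbiesDF`, `hobbies_view`, and the `spark` session from the example above:

```scala
import org.apache.spark.sql.functions.{size, split}

// DataFrame API: split the hobbies string on "," and take the length of the resulting array
hobbiesDF
  .select($"name", $"hobbies", size(split($"hobbies", ",")).as("hobby_num"))
  .show()

// The equivalent built-in expressions are also available in SQL
spark.sql("select name, hobbies, size(split(hobbies, ',')) as hobby_num from hobbies_view").show()
```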
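As a closing aside, not part of the original walkthrough: in recent Spark versions `spark.udf.register` returns the `UserDefinedFunction` it registered, and a UDF registered by name can also be reached from the DataFrame API through `functions.callUDF` or an `expr` string, so a single registration can serve both styles. A small sketch under those assumptions, reusing `hobbiesDF` and the `hobby_num` registration from the example above:

```scala
import org.apache.spark.sql.functions.{callUDF, expr}

// register returns the UserDefinedFunction, so the same UDF can be applied directly to Columns
val hobbyNumUdf = spark.udf.register("hobby_num", (s: String) => s.split(',').size)
hobbiesDF.select($"name", hobbyNumUdf($"hobbies").as("hobby_num")).show()

// A UDF registered by name can also be invoked from the DataFrame API...
hobbiesDF.select($"name", callUDF("hobby_num", $"hobbies").as("hobby_num")).show()

// ...or embedded in a SQL expression string
hobbiesDF.select($"name", expr("hobby_num(hobbies)").as("hobby_num")).show()
```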