* Spark SQL's built-in functions all live in `org.apache.spark.sql.functions` (note that `functions` is an object, not a package).
* The built-in functions fall roughly into the following categories:

| Category | Example functions |
| --- | --- |
| Aggregate functions | count(), countDistinct(), avg(), max(), min() |
| Collection functions | sort_array, explode |
| Date/time functions | hour, quarter, next_day |
| Math functions | asin, atan, sqrt, tan, round |
| Window functions | row_number |
| String functions | concat, format_number, regexp_extract |
| Other functions | isNaN, sha, randn, callUDF |

Here is an example of using the built-in functions:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object InnerFun {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .master("local[4]")
      .appName(this.getClass.getName)
      .getOrCreate()
    val sc: SparkContext = spark.sparkContext
    import spark.implicits._

    // Sample data: one "day,userId" record per visit
    val accessLog = Array(
      "2016-12-27,001",
      "2016-12-27,001",
      "2016-12-27,002",
      "2016-12-28,003",
      "2016-12-28,004",
      "2016-12-28,002",
      "2016-12-28,002",
      "2016-12-28,001"
    )

    // Build a DataFrame from the raw log lines
    val accessLogRDD = sc.parallelize(accessLog).map(row => {
      val splited = row.split(",")
      Row(splited(0), splited(1).toInt)
    })
    val structTypes = StructType(Array(
      StructField("day", StringType, true),
      StructField("userId", IntegerType, true)
    ))
    val accessLogDF = spark.createDataFrame(accessLogRDD, structTypes)
    accessLogDF.show()
    // +----------+------+
    // |       day|userId|
    // +----------+------+
    // |2016-12-27|     1|
    // |2016-12-27|     1|
    // |2016-12-27|     2|
    // |2016-12-28|     3|
    // |2016-12-28|     4|
    // |2016-12-28|     2|
    // |2016-12-28|     2|
    // |2016-12-28|     1|
    // +----------+------+

    // Import Spark SQL's built-in functions
    import org.apache.spark.sql.functions._

    // Total number of visits per day (pv)
    accessLogDF.groupBy("day").agg(count("userId").as("pv"))
      .show()
    // +----------+---+
    // |       day| pv|
    // +----------+---+
    // |2016-12-28|  5|
    // |2016-12-27|  3|
    // +----------+---+

    // Number of distinct visitors per day (uv)
    accessLogDF.groupBy("day").agg(countDistinct("userId").as("uv"))
      .show()
    // +----------+---+
    // |       day| uv|
    // +----------+---+
    // |2016-12-28|  4|
    // |2016-12-27|  2|
    // +----------+---+

    spark.stop()
  }
}
```
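The category table above also lists window functions such as `row_number`, which the example doesn't exercise. As a minimal sketch of how they are used (assuming the `accessLogDF` DataFrame from the example above is still in scope), ranking each day's records over a window spec looks like this:

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

// Assumes accessLogDF from the example above is still in scope.
// Partition the rows by day and order them by userId within each day.
val byDay = Window.partitionBy("day").orderBy("userId")

// row_number() assigns 1, 2, 3, ... within each partition.
accessLogDF.withColumn("rank", row_number().over(byDay))
  .show()
// Each 2016-12-27 row gets a rank from 1 to 3, and each
// 2016-12-28 row a rank from 1 to 5, ordered by userId.
```

The same pattern applies to the other window functions in `org.apache.spark.sql.functions` (e.g. `rank`, `dense_rank`): build a `WindowSpec` with `Window.partitionBy(...).orderBy(...)`, then apply the function with `.over(...)`.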