Features of Spark 2.0
Easier: ANSI SQL and Streamlined APIs
- Unifying DataFrames and Datasets in Scala/Java
- SparkSession
- Simpler, more performant Accumulator API
- DataFrame-based Machine Learning API emerges as the primary ML API
- Machine learning pipeline persistence
- Distributed algorithms in R
- User-defined functions (UDFs) in R
Faster: Apache Spark as a Compiler
- Ships with the second-generation Tungsten engine, built on a technique officially called "whole-stage code generation"
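The idea behind whole-stage code generation can be illustrated with a toy sketch in plain Python. This is not Spark's implementation. It only contrasts an iterator-per-operator pipeline with a single fused loop, and then shows a fused function being generated from source text at runtime. The query, the operator names (`scan`, `filter_op`, `project_op`), and the helper `compile_pipeline` are all hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict, Iterable, Iterator

# Hypothetical toy query: SELECT sum(x * 2) FROM rows WHERE x % 2 == 0


def scan(rows: Iterable[int]) -> Iterator[int]:
    """Leaf operator: hands out one row at a time."""
    for row in rows:
        yield row


def filter_op(child: Iterator[int], predicate: Callable[[int], bool]) -> Iterator[int]:
    """Filter operator: pulls rows from its child and keeps the matching ones."""
    for row in child:
        if predicate(row):
            yield row


def project_op(child: Iterator[int], fn: Callable[[int], int]) -> Iterator[int]:
    """Projection operator: pulls rows from its child and transforms each one."""
    for row in child:
        yield fn(row)


def volcano_sum(rows: Iterable[int]) -> int:
    """Iterator-style execution: every row crosses three operator boundaries."""
    plan = project_op(filter_op(scan(rows), lambda x: x % 2 == 0), lambda x: x * 2)
    return sum(plan)


def fused_sum(rows: Iterable[int]) -> int:
    """What a hand-written (or generated) version of the same query looks like."""
    total = 0
    for x in rows:
        if x % 2 == 0:
            total += x * 2
    return total


def compile_pipeline(predicate_src: str, projection_src: str) -> Callable[[Iterable[int]], int]:
    """Build the fused loop as source text and compile it at runtime (toy codegen)."""
    source = (
        "def generated(rows):\n"
        "    total = 0\n"
        "    for x in rows:\n"
        f"        if {predicate_src}:\n"
        f"            total += {projection_src}\n"
        "    return total\n"
    )
    namespace: Dict[str, object] = {}
    exec(source, namespace)  # acceptable for a toy; never exec untrusted input
    return namespace["generated"]  # type: ignore[return-value]


data = list(range(10))
generated_sum = compile_pipeline("x % 2 == 0", "x * 2")
print(volcano_sum(data), fused_sum(data), generated_sum(data))
```

All three functions compute the same result. The difference is only in how much per-row operator overhead stands between the data and the arithmetic, which is the overhead that collapsing a whole query stage into one generated function is meant to remove.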
Smarter: Structured Streaming
- Integrated API with batch jobs
- Transactional interaction with storage systems
- Rich integration with the rest of Spark
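One way to picture "Integrated API with batch jobs" is that the same query logic can run once over a full dataset or incrementally over arriving chunks, with both giving the same answer. The following is a conceptual toy in plain Python, not the Structured Streaming API. The names `batch_word_count` and `IncrementalWordCount` and the sample micro-batches are assumptions made up for this sketch.

```python
from collections import Counter
from typing import Iterable, List


def batch_word_count(lines: Iterable[str]) -> Counter:
    """The 'batch job': count words over a complete collection of lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


class IncrementalWordCount:
    """The 'streaming job': reuses the batch logic and keeps running state."""

    def __init__(self) -> None:
        self.state: Counter = Counter()

    def process(self, micro_batch: Iterable[str]) -> Counter:
        # Apply the same batch function to the new data only,
        # then merge its counts into the running totals.
        self.state.update(batch_word_count(micro_batch))
        return Counter(self.state)  # snapshot of the result so far


# Hypothetical input arriving in two micro-batches.
micro_batches: List[List[str]] = [
    ["spark streaming", "spark sql"],
    ["structured streaming"],
]

query = IncrementalWordCount()
results = [query.process(batch) for batch in micro_batches]

# Running the batch job over everything seen so far gives the same answer.
all_lines = [line for batch in micro_batches for line in batch]
print(results[-1] == batch_word_count(all_lines))
```

The design point of the sketch is that the user-facing logic (`batch_word_count`) is written once, and only the driver around it differs between batch and incremental execution.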
Official Spark 2.0 introduction video: Spark Summit East in New York (2016/02/16)
Release history:
Spark 2.2.0 (2017/07/11)
Spark 2.1.1 (2017/05/02)
Spark 2.1.0 (2016/12/28)
Spark 2.0.2 (2016/11/14)
Spark 2.0.1 (2016/10/03)
Spark 2.0.0 (2016/07/26)