by Ian J. Ghent | Mar 1, 2022 | Apache Spark, Apache Spark Cafe, Python, R, SAS
Back in 2015, when we set out to build SPROCKET, the world’s only SAS modernization solution, one key design question plagued our thoughts. Scalable, simple, fast and open-source, it was obvious from the early days of Apache Spark that it was analytics platform...
by Andrea Bacqué | Sep 25, 2020 | Apache Spark, Apache Spark Cafe, Customer Experience, Java, Python, R, SAS, Scala, Solutions
I have the privilege of exchanging ideas with Chief Data Officers around the world. I’m noticing a consistent trend emerging in their data science modernization efforts. Getting rid of legacy is a bigger challenge than most anticipated. CDOs all know that to...
by Mike Sun | May 20, 2020 | Apache Spark, Apache Spark Cafe, Java, Python, R, Scala
RDD, DataFrame, and Dataset are the three most common data structures in Spark, and they make processing very large data easy and convenient. Because Spark evaluates lazily, these data structures are not executed right away when they are created,...
by Ian J. Ghent | Oct 21, 2015 | Apache Spark, Java, Python, R, Scala
The Spark revolution is igniting a vigorous debate within the community of data operations, data science and analytics professionals. Spark supports 4 different programming languages (not including SQL), which makes choosing a programming language difficult. Although...