The Wise With Data team has been eagerly anticipating the release of Spark 3.0, and it is official as of today. As experts in open source data science, we’d like to share what to expect with Spark 3.0. Also, a reminder that Spark + AI Summit 2020 is next week…
What are we expecting with 3.0? Here are a few highlights from our perspective:
– 2X performance improvement over Spark 2.4, which was already more performant than legacy platforms
– Significant improvements in pandas APIs
– Simplified PySpark exceptions with better Python error handling
– Structured Streaming improved with a new user interface
Stay tuned for more news on these upgrades over the next few weeks. We plan to cover them in more depth.
For now, keep in mind there are no major code changes required to adopt this version. A published migration guide is available to support upgrading.
On to Spark + AI Summit next week: we are thrilled to see this fast-growing bi-annual event expand year over year (from 900 attendees at Summit EU in 2015 to over 2,300 at Summit EU in 2019). This year’s organizers have outdone themselves by quickly turning an in-person event into an online one on very short notice. Kudos to the whole team at Databricks.
This year’s Spark + AI Summit promises an amazing agenda, including incredibly talented speakers representing a wide range of enterprises across most sectors, all there to champion the leading platform in data science.
What are we expecting to experience at next week’s summit? A lot of learning: sessions on data and ML industry use cases, plus dedicated Data Science, Deep Learning and Machine Learning tracks.
We also hope to network and learn more about your adoption of Apache Spark, as we’ve developed the world’s only automated migration solution from SAS to PySpark. We call it SPROCKET. It’s fast, simple and accurate.
For more information, contact us @ [email protected]