I’m often called upon to explain why adopting modern analytics is so important. In 2022, all the best features and functionality are in the open-source world, especially the market-leading Databricks platform and PySpark. Those capabilities are critical to solving modern business challenges, but they are vastly underrated and underutilized. Their value is recognized only by industry-leading experts, who understand the vast capabilities of these modern marvels.
By contrast, those who have been doing analytics for decades can’t even begin to grasp what they are missing out on. Most SAS language users would be in complete awe of the things that can be done in PySpark, let alone the mind-twisting delight of using tools like Delta, Spark-NLP and GraphFrames. But among the absolute killer features of modern Python-based analytics, the speed and scalability of PySpark are king.
Just how much faster is modern analytics compared with, say, SAS, a language designed for the data volumes of the 1960s? The bigger the job, the more the highly efficient parallel-first architecture of PySpark really shines. So while every job will differ, most users experience between 10x and 100x performance improvements when migrating their SAS9 code to PySpark (with SPROCKET).
100 times faster…how is that even possible?! Long story short, a whole lot of clever open-source engineering makes this possible (fully distributed redundant computing, lazy execution, whole-stage code generation, column-oriented storage, dynamic partition pruning (DPP), adaptive query execution (AQE), etc.). All those innovations combine to give our customers incredible performance gains when modernizing. The mind struggles to even fathom improvements of that magnitude.
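Of the techniques listed above, lazy execution is perhaps the easiest to picture in a few lines. What follows is a toy sketch in plain Python, not Spark’s actual API or implementation: the `LazyPipeline` class and its `filter`, `map` and `collect` methods are hypothetical names chosen for illustration. The idea it shows is that transformations only record a plan, and no work happens until a result is actually requested.

```python
from typing import Any, Callable, Iterable, List, Tuple


class LazyPipeline:
    """Toy lazily evaluated dataset (hypothetical; not the Spark API)."""

    def __init__(self, source: Iterable[Any],
                 steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = source
        self._steps = steps  # the recorded plan; nothing has run yet

    def filter(self, predicate: Callable[[Any], bool]) -> "LazyPipeline":
        # Transformation: returns a new pipeline with one more planned step.
        return LazyPipeline(self._source, self._steps + (("filter", predicate),))

    def map(self, func: Callable[[Any], Any]) -> "LazyPipeline":
        # Transformation: also only extends the plan.
        return LazyPipeline(self._source, self._steps + (("map", func),))

    def collect(self) -> List[Any]:
        # Action: only now is the plan executed, in a single streaming pass.
        rows = iter(self._source)
        for kind, fn in self._steps:
            rows = filter(fn, rows) if kind == "filter" else map(fn, rows)
        return list(rows)


calls = []  # records every row the "expensive" step actually touches


def noisy_double(x: int) -> int:
    calls.append(x)
    return x * 2


plan = LazyPipeline(range(10)).filter(lambda x: x % 2 == 0).map(noisy_double)
calls_before_action = list(calls)  # still empty: building the plan did no work
result = plan.collect()            # the work happens here
```

Because the whole plan is known before anything runs, an engine built this way has the chance to reorder, prune or fuse steps first. In this sketch the only visible benefit is that `noisy_double` never sees the rows the filter discarded.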
There are so few parallels to that kind of improvement in our everyday life. Imagine if your phone, or better yet your car, had 100x better battery life. What if solar panels produced 100x more power? What if tunnel boring machines dug 100x faster? How would that change our quality of life? One of the best analogies comes from transportation, a favorite topic of mine. It’s not like going from walking to riding a bike (~5x faster), or to driving a car (~10-20x faster); it’s more like the difference between walking and flying in a jet airplane (but forget that whole messy airport thing). How different would our world be today without people and goods being able to fly all over the globe?
Modern open-source analytics delivers the equivalent of hundreds of years of transportation innovation, almost instantly and free of vendor lock-in. Put another way, organizations still trudging forward on foot with legacy analytics are being passed by their peers, who are drinking champagne and looking down from their private jets. To add insult to injury, those still on legacy have to “rent” their hiking boots, to the tune of millions of dollars per year, while the jet-setters have access to free planes and fuel.
What does the performance of modern analytics mean to your business? Productivity and innovation, pure and simple. It means leveraging your resources to their full potential and clearing away technical hurdles like performance and scalability. This lets your users focus on what really matters: solving business problems with data and serving your customers.
Our customers are constantly sharing with us how the improved performance has fundamentally changed their business. In one case, a customer’s marketing models went from taking 20 days to execute to just 3 hours (~160x improvement). They could now refresh the models daily instead of monthly, giving them a 5x greater chance of getting customer responses and driving a 2x revenue gain for the program.
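The ~160x figure is simple arithmetic on the two run times quoted above, and it is worth seeing how it falls out. The snippet below is a minimal sketch; the `speedup` helper and the variable names are hypothetical and used only for illustration.

```python
HOURS_PER_DAY = 24


def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times faster the new run time is than the old one."""
    return before_hours / after_hours


before_hours = 20 * HOURS_PER_DAY  # 20 days expressed in hours
after_hours = 3                    # run time after migration, as quoted above

factor = speedup(before_hours, after_hours)
print(f"{before_hours} hours -> {after_hours} hours is a {factor:.0f}x speedup")
```

A run that takes 3 hours fits comfortably inside a single day, which is what makes the daily refresh described above practical.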
While it’s exciting to talk about how this futuristic platform drives 100x improvements, those gains stay hypothetical if a migration takes many years. That’s the really exciting part! Not only is the analytics 100x faster, but with WiseWithData’s SPROCKET Migration Solution we’re going to get you there 100x faster as well.
Do you have the need for speed? Reach out to hello@wisewithdata.com to find out more about SPROCKET.