Spark Café Blog
Discover More About Analytics Modernization
The Making Of An Analytics Crisis
I recently attended a fantastic conference on Business Continuity and Disaster Recovery put on by DRI Canada. The keynote speaker, Dr. Thomas Homer-Dixon, was phenomenal. He founded the Cascade Institute, which studies crises and the factors leading up to them. One key...
Putting A Price On Technical Debt – For One Canadian Bank It May Be A $13.4B Deal
Many people have heard of TD Bank Group’s attempted acquisition of First Horizon, and the subsequent collapse of that deal. With recent turmoil in the banking system, one might assume that the collapse of the deal had something to do with those developments. It would...
What Would Happen If Oracle Buys SAS?
I often like to think in the hypothetical, thought experiments if you will. What if, at this very moment, SAS were hammering out the final details of a deal to sell itself to Oracle? This might seem a far-fetched notion to some, but I assure you that it is not only in...
When did PySpark become THE modernization path for legacy SAS?
Back in 2015, when my colleagues and I first started reading about and experimenting with Apache Spark and PySpark, we knew there was something special brewing in open analytics. Having been in the data and analytics world since 1997, I’ve seen lots of trends come and go....
The 50% Upfront Scam
We’re increasingly hearing from customers about a common scheme being perpetrated by some of the largest SIs and IT service vendors on the planet. In the sketchy world of IT body shops, it is common practice to ask for 50% upfront and 50% upon completion of a...
The Tradeoff Between Being Simple and Being Right
At WiseWithData, our engineers have developed the most sophisticated automation on the planet to transpile code from one language to another. Weighing in at over 500,000 lines, SPROCKET is simultaneously a marvel of simplicity, elegance and complexity. Complexity,...
Top 10 Ways To Spot A Modernization Imposter
My son loves Among Us, the wildly successful computer game released in 2018 by Innersloth, a team of only 3 developers at the time. Personally, I love the story of Innersloth and Among Us, as it shows how a small but passionate team of people can change the world. My...
The SAS Migration Torture Test
In the Additive Manufacturing & 3D Printing world, a “torture test” is often used to test the limits and capabilities of a 3D printer. There are many different designs out there, but they all aim to capture just how capable the automated machine is. For filament...
What does a 100x improvement really look like?
I’m often called upon to provide guidance on why adopting modern analytics is so important. In 2022, all the best features and functionality are in the open source world, especially the market-leading Databricks and the PySpark language. Those capabilities are...
My Tumultuous, Sometimes Frustrating Yet Thrilling Journey from SAS to Python and PySpark
Mental Illness and Obsolete Skills October 2019 was the beginning of a very difficult and dark period for me. In truth, the decline began like a slow-motion train wreck during Christmas of 2018, but the main crisis point and collapse was triggered in October 2019. In...
Planning a Prison Break – Breaking Free of Vendor Lock-in
It seems like everything these days is only for rent. The way we consume movies, music, books, computing, even the features in our cars, has moved to subscription-model services, not products you buy and own. Pay once, own forever has increasingly become pay...
Foolishly Automating Or Automating Fools
What an absurd title! A strange twist on the “working hard or hardly working” cliché, but read on and I promise it will all make sense. Here's a tale of a court jester and his great invention. Long ago in a vast and complex kingdom, there lived a great king. The king...
Why R is not ouR target – problems with the open source SAS competitor
Back in 2015, when we set out to build SPROCKET, the world's only SAS modernization solution, one key design question plagued our thoughts. It was obvious from the early days of Apache Spark (scalable, simple, fast and open-source) that it was the analytics platform of the...
Why the RIPL API has been 12 years in the making
Not to date myself, but I’ve been using SAS for over 25 years. I’ve worked at banks like Capital One and Citibank, Telecom companies like Verizon and Bell Canada, government institutions like Statistics Canada, and the company behind the SAS language. In all my years...
What is automation?
During engagements with our clients, I am often asked what our definition of automation is. Understanding the options available on the market today to help with a SAS code conversion can be challenging. We often hear and use the terms brute force, manual,...
What stupid things you can do in SAS but shouldn’t
SAS is the Kapil Dev (cricket) of computer programming languages. For my non-cricket friends, what I mean is that it was the best all-rounder of its time. Google says ‘the SAS language is a computer programming language used for statistical analysis’, but I bet someone has used it...
Introduction To Databricks and PySpark For SAS Developers
This is a collaborative post between Databricks and WiseWithData. We thank Founder and President Ian J. Ghent, Head of Pre-Sales Solutions R&D Bryan Chuinkam, and Head of Migration Solutions R&D Ban (Mike) Sun of WiseWithData for their contributions....
Will The SAS Language Be Regulated Out Of Existence?
The majority of the revenue derived from the SAS language comes from three highly regulated industries: Financial Services, Insurance, and Health & Life Sciences (HLS). In fact, its dominance in those industries comes in part from regulators' historical...
Technology Alliances: The Customer Value
As we announce our strategic technology alliance with Databricks, I can’t help but think about how customers see these types of announcements. Partnerships are announced every day; it is not uncommon to see a large enterprise hardware vendor announce a partnership with a...
Moving your SAS libraries into a Lake House architecture using Apache Spark, Parquet and Delta
SAS Libraries: SAS supports a wide array of different library engines that provide a semantic layer between the underlying data source and SAS. The engine provides an abstraction between SAS code and the physical location of the data. In addition to the native SAS...
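For readers who want to picture the mechanics, here is a minimal, hypothetical PySpark sketch (not SPROCKET itself) of landing a legacy SAS dataset in open lake house formats; the spark-sas7bdat connector, the Delta Lake package and the file paths are assumptions for illustration.

```python
# Illustrative sketch: copy a legacy SAS dataset into open lake house formats.
# Assumes the spark-sas7bdat connector and Delta Lake are installed on the
# cluster; all paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sas-to-lakehouse").getOrCreate()

# Read a sas7bdat file via the spark-sas7bdat data source.
customers = (spark.read
             .format("com.github.saurfang.sas.spark")
             .load("/landing/customers.sas7bdat"))

# Write an open, columnar Parquet copy.
customers.write.mode("overwrite").parquet("/lake/bronze/customers_parquet")

# Write a Delta copy, adding ACID transactions and time travel on top of Parquet.
customers.write.format("delta").mode("overwrite").save("/lake/bronze/customers_delta")
```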
WiseWithData Sparked my interest
We've all faced the job search at some point in our careers; like many, this is where I found myself in the latter part of 2020. The good news: opportunity is abundant. For all the doom and gloom 2020 has brought us, it has also brought a lot of greatness. I...
Is SAS Still Relevant? (2020 Edition)
Back in 2016, I wrote a blog article posing an interesting question: given the rise of Apache Spark as the de facto analytics platform, is SAS still relevant? The answer back then was a definitive yes. Well, a lot has changed in the past 4 years! While we are reflecting...
Apache Spark is the easiest way to migrate to the Cloud
If your organization has made the decision to move to the Cloud, chances are you are considering one of the top Cloud Service Providers. You should know they all run Apache Spark for its lightning-fast cluster computing and incomparable data processing performance - @...
Solving the CDO's challenge with legacy SAS
I have the privilege of exchanging ideas with Chief Data Officers around the world. I'm noticing a consistent trend emerge in their data science modernization efforts. Getting rid of legacy is a bigger challenge than most anticipated. CDOs all know that to make real-time...
ELEVATE Humanitarian & OSS Award Winners
At WiseWithData, we believe we can be, and help support, agents of positive change in our communities and around the world. We see Open Source Software (OSS) as a truly shining example of the power of global communities coming together to solve common challenges. Thanks...
A new (and old) direction for SPROCKET
It was 5 years ago that we announced the development of the SPROCKET conversion service. One of the consistent messages that we've been hearing from our clients is that they love the SAS to PySpark migration solution, but they want more. Many of our customers are...
Spark 3.0 Just in Time for Next Week’s Spark Summit!
Spark 3.0 is here – just in time for Spark + AI Summit next week
Brute Force Unwise for SAS to PySpark Conversion
As Enterprises look to rationalize investments and re-architect technology stacks, their data needs to be open and accessible - not locked into closed formats or proprietary systems. In the world of data science, Apache Spark stands out as the clear leader with its...
RDDs vs DataFrames vs Datasets: The Three Data Structures of Spark
RDD, DataFrame, and Dataset are the three most common data structures in Spark, and they make processing very large data easy and convenient. Because of Spark's lazy evaluation, these data structures are not executed right away when they are created,...
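As a quick taste of that laziness, here is a minimal PySpark sketch (names and data are invented; the typed Dataset API is specific to Scala and Java, so Python code works with RDDs and DataFrames):

```python
# Minimal sketch of RDDs vs DataFrames and lazy evaluation in PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lazy-eval-demo").getOrCreate()

# RDD: a low-level distributed collection of Python objects.
rdd = spark.sparkContext.parallelize([("a", 1), ("b", 2), ("a", 3)])
rdd_sums = rdd.reduceByKey(lambda x, y: x + y)      # transformation: nothing runs yet

# DataFrame: named columns plus the Catalyst query optimizer.
df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])
df_sums = df.groupBy("key").sum("value")            # still lazy: only a plan is built

# Execution happens only when an action is called.
print(rdd_sums.collect())
df_sums.show()
```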
Introduction to Koalas
What is Koalas? Koalas is an implementation of the pandas DataFrame API on top of Apache Spark. Pandas is the go-to Python library for data analysis, while Apache Spark is becoming the go-to for big data processing. Koalas allows you to leverage the simplicity of Pandas...
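As a small illustration, here is a minimal sketch (column names and data are invented; it assumes the databricks-koalas package is installed alongside PySpark):

```python
# Minimal Koalas sketch: pandas-style syntax, Spark-scale execution.
import databricks.koalas as ks

# Construct a Koalas DataFrame with the familiar pandas-style constructor;
# under the hood it is backed by a distributed Spark DataFrame.
kdf = ks.DataFrame({
    "region": ["east", "west", "east", "west"],
    "sales": [100, 250, 175, 300],
})

# pandas-style operations are translated into Spark transformations.
print(kdf.groupby("region")["sales"].sum().sort_index())
```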