Friday, October 12, 2018

Data Virtualization - SQL Server 2019 Enhanced PolyBase

Microsoft recently released the white paper for SQL Server 2019. Many features were announced, but I am most excited about Enhanced PolyBase. PolyBase lets a user pull data from a wide array of data sources using only T-SQL queries. Using SQL Server PolyBase as your single point of entry also allows data security to be administered via Active Directory. Top it off with the ability to scale out easily by adding compute nodes as data volume or the number of data sources grows. Are you bouncing in your chair yet? I certainly am!

First introduced in SQL Server 2016, PolyBase could originally pull data from only a limited number of data sources. In SQL Server 2019, Microsoft added a multitude of data sources, including Oracle, MongoDB, Cassandra, Spark, and many more! I work for a company that stores a large amount of data in many different forms, so I'm getting my hopes up here... Dear Microsoft, please don't disappoint me!
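To make "using only T-SQL queries" a little more concrete, here is a rough sketch of what querying an Oracle table through PolyBase might look like. I have not run this against a real server yet, so treat it as an illustration rather than a tested recipe. Every name in it (OracleCred, OracleSales, dbo.Orders_Ext, the host, the XE.SALES.ORDERS table and its columns) is a placeholder I made up, and the external table's column types would need to match the real Oracle columns.

```sql
-- Sketch only: all object names, hosts, and columns below are hypothetical.

-- A database master key is needed before a scoped credential can be created
-- (skip this statement if the database already has one).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Credential that SQL Server uses to log in to the Oracle instance.
CREATE DATABASE SCOPED CREDENTIAL OracleCred
WITH IDENTITY = '<oracle user>', SECRET = '<oracle password>';

-- Points PolyBase at the Oracle server.
CREATE EXTERNAL DATA SOURCE OracleSales
WITH (
    LOCATION = 'oracle://oracle-host:1521',
    CREDENTIAL = OracleCred
);

-- Describes the remote table; the data itself stays in Oracle.
-- The LOCATION value (database.schema.table) is an assumed example.
CREATE EXTERNAL TABLE dbo.Orders_Ext (
    ORDER_ID    DECIMAL(38) NOT NULL,
    ORDER_TOTAL DECIMAL(18, 2)
)
WITH (
    LOCATION = 'XE.SALES.ORDERS',
    DATA_SOURCE = OracleSales
);

-- From here on it is plain T-SQL, including joins to local tables.
SELECT TOP (10) ORDER_ID, ORDER_TOTAL
FROM dbo.Orders_Ext
ORDER BY ORDER_TOTAL DESC;
```

The appeal is in the last statement: once the external table exists, the Oracle data is queried like any other table, with no copy step in between.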

Imagine a world where ETL processing becomes a thing of the past. You can leave your data in its original location and let SQL Server PolyBase do the rest. Okay... too good to be true? I honestly don't know yet! Over the next couple of months I will be taking a deep dive into Enhanced PolyBase with my colleague, Mongo Mike, and I will post updates throughout our journey. Stay tuned!

