In this series of blog posts, I will explain how you can connect MongoDB to SQL Server 2019 using PolyBase, so that you get the benefit of schemaless and relational database technologies integrated and working together as a modern data ecosystem that can handle both traditional data and “big data”.
In my previous post I explained how to install and configure MongoDB on an Azure VM running Linux. In this post I will walk you through the process of setting up SQL Server 2019, which is the area on the right of the diagram below.
Spin up an Azure VM with SQL Server 2019
Microsoft has a pre-built VM with the latest release of SQL Server 2019 (at the time of this writing it is CTP2.3), which makes it really quick and easy to set up. Simply navigate to your Azure portal and search for SQL Server 2019. You should see a Free SQL Server : (CTP2.3) SQL 2019 Developer option. Once you select it, you should see the following.
Click Create and fill out the subsequent screens as follows.
Step 1 Basics
NOTE: Be sure to remember the Username and Password because they will be required later when we connect to the VM and install PolyBase.
Step 2 Size
I went for a DS2_v2, but you are free to pick a size that suits your needs. NOTE: Microsoft recommends a DS2 or higher for development and functional testing.
Step 3 Settings
You will need to open a public inbound port (3389, RDP) so that you can remotely connect to the VM and install PolyBase.
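Once the VM is up, a quick way to confirm that the inbound rule took effect is to try a plain TCP connection to the port from your own machine. The following is a minimal sketch using only the Python standard library; the function name `is_port_open` is illustrative and not part of any Azure tooling, and a successful connection only shows that something is listening on the port, not that RDP login will succeed.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `is_port_open("<your VM's public IP>", 3389)` should return True once the rule is active. The same check works for any other inbound port you choose to open on the VM.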
Step 4 SQL Server settings
This last step is optional. You can enable external connections directly into the SQL Server database, which is handy if you want to connect to the database without having to log onto the VM. Once the VM is created, we will need to log onto it to install PolyBase.
Install PolyBase
Unfortunately, the pre-built VM does not have PolyBase installed, so you will need to log onto the VM and install it. To connect to the VM, go to the resource in the Azure Portal and select Connect. You should see a screen like this.
Download the RDP file and enter the credentials you used when you first created the VM.
Once you have logged onto the VM, you will need to navigate to the SQL Server 2019 installation software. You can find it in C:\SQLServerFull. Double-click setup.
In the SQL Server Installation Center menu select New SQL Server stand-alone installation or add features to an existing installation.
For the Installation Type, select Add features to an existing instance of SQL Server 2019 CTP2.3.
On the Feature Selection screen select PolyBase Query Service for External Data.
On the PolyBase Configuration screen select the first option, which sets this server up as a standalone PolyBase-enabled instance. The alternative, a PolyBase scale-out group, is ideal for scenarios in which you have multiple external data sources that you want to connect to and you need to optimize performance.
We will use the defaults for Server Configuration.
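If you would rather script these wizard choices, SQL Server setup also accepts command-line parameters. The sketch below only builds the argument list; it does not run anything. It rests on several assumptions that are not confirmed by this post: that the pre-built VM uses the default instance name MSSQLSERVER, that the first wizard option corresponds to `/PBSCALEOUT=FALSE`, that the `PolyBase` feature name is accepted by the CTP2.3 installer, and that the port range shown is free. Only the C:\SQLServerFull path comes from the steps above.

```python
from typing import List

# Path taken from the post; every other value below is an assumption.
SETUP_EXE = r"C:\SQLServerFull\setup.exe"


def build_polybase_setup_args(
    instance_name: str = "MSSQLSERVER",  # assumed: default instance on the VM
    scale_out: bool = False,             # assumed: False matches the first wizard option
    port_range: str = "16450-16460",     # assumed: an unused range of at least six ports
) -> List[str]:
    """Build an unattended 'add PolyBase to an existing instance' command line."""
    return [
        SETUP_EXE,
        "/Q",                                 # quiet mode, no UI
        "/ACTION=Install",                    # add features to an instance
        "/FEATURES=PolyBase",                 # PolyBase Query Service for External Data
        f"/INSTANCENAME={instance_name}",
        f"/PBSCALEOUT={'TRUE' if scale_out else 'FALSE'}",
        f"/PBPORTRANGE={port_range}",
        "/IACCEPTSQLSERVERLICENSETERMS",      # required for unattended installs
    ]
```

Printing `" ".join(build_polybase_setup_args())` gives a command you could review before pasting it into an elevated command prompt on the VM. Check the parameters against the setup documentation for your exact build first, since CTP releases can differ.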
Review the summary and click the Install button. If all goes well, you should see the following screen when complete.
In the next post I will explain how to configure PolyBase to connect it to your MongoDB database installed on a separate VM. This setup allows you to extend your reach without overstretching by letting the data stay where it is but still making it available for integration and analytics with line of business application data.
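Besides the completion screen, you can confirm the result from the database engine itself: `SERVERPROPERTY('IsPolyBaseInstalled')` returns 1 when the feature is present. Here is a minimal sketch that runs that query through the `sqlcmd` utility from Python. It assumes `sqlcmd` is on the PATH of the VM, that you are logged in with a Windows account that can connect to the local default instance, and the helper names are illustrative only.

```python
import subprocess
from typing import List

# SET NOCOUNT ON suppresses the "(1 rows affected)" trailer in the output.
VERIFY_QUERY = "SET NOCOUNT ON; SELECT SERVERPROPERTY('IsPolyBaseInstalled');"


def build_sqlcmd_args(server: str = "localhost") -> List[str]:
    """Build the sqlcmd command line for the verification query."""
    # -S server, -E Windows authentication, -h -1 no column headers,
    # -W trim trailing spaces, -Q run the query and exit.
    return ["sqlcmd", "-S", server, "-E", "-h", "-1", "-W", "-Q", VERIFY_QUERY]


def parse_polybase_installed(sqlcmd_output: str) -> bool:
    """Interpret sqlcmd output: the first non-empty line should be 1 or 0."""
    for line in sqlcmd_output.splitlines():
        value = line.strip()
        if value:
            return value == "1"
    raise ValueError("no result row in sqlcmd output")


def polybase_installed(server: str = "localhost") -> bool:
    """Run the query against the server and report whether PolyBase is installed."""
    result = subprocess.run(
        build_sqlcmd_args(server), capture_output=True, text=True, check=True
    )
    return parse_polybase_installed(result.stdout)
```

Running `polybase_installed()` on the VM should return True after a successful install. Separating the parsing from the subprocess call keeps the logic easy to check without a live server.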
Hopefully you have found this to be another practical post.
Until next time.
Anthony