This post is how to use the Spark Connector to Connect to SQL Server.
Install Spark Connector
- spark-mssql-connector_2.12-1.2.0.jar
Install msal
- pip install msal
Connect using Azure SPN
- import msal
- global_token_cache = msal.TokenCache()
- secret = "<GET SECRET SECURELY>"
- global_spn_app = msal.ConfidentialClientApplication(
- <CLIENT_ID>, Authority='https://login.microsoftonline.com/<TENANT_ID>',
- client_credential=secret,
- token_cache=global_token_cache,
- )
- result = global_spn_app.acquire_token_for_client(scopes=['https://database.windows.net//.default'])
- jdbc_df = spark.read \
- .format("com.microsoft.sqlserver.jdbc.spark") \
- .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
- .option("query", "SELECT * FROM SOMETHING") \
- .option("accessToken", result['access_token']) \
- .option("encrypt", "true") \
- .option("hostNameInCertificate", "*.database.windows.net") \
- .load()
Connect using Domain Auth
- secret = "<GET SECRET SECURELY>"
- jdbc_df = spark.read \
- .format("com.microsoft.sqlserver.jdbc.spark") \
- .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
- .option("query", "SELECT * FROM SOMETHING") \
- .option("authentication", "ActiveDirectoryPassword") \
- .option("user", "<USER>@<DOMAIN>") \
- .option("password", "<SECRET>") \
- .load()
Connect using SQL Auth
I do not recommend SQL Auth
- secret = "<GET SECRET SECURELY>"
- jdbc_df = spark.read \
- .format("com.microsoft.sqlserver.jdbc.spark") \
- .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
- .option("query", "SELECT * FROM SOMETHING") \
- .option("user", "<USER>") \
- .option("password", "<SECRET>") \
- .load()