Spark Connector Connect to SQL Server

(Last Updated On: )

This post is how to use the Spark Connector to Connect to SQL Server.

Install Spark Connector

  1. spark-mssql-connector_2.12-1.2.0.jar

Install msal

  1. pip install msal

Connect using Azure SPN

  1. import msal
  2. global_token_cache = msal.TokenCache()
  3.  
  4. secret = "<GET SECRET SECURELY>"
  5.  
  6. global_spn_app = msal.ConfidentialClientApplication(
  7. <CLIENT_ID>, Authority='https://login.microsoftonline.com/<TENANT_ID>',
  8. client_credential=secret,
  9. token_cache=global_token_cache,
  10. )
  11.  
  12. result = global_spn_app.acquire_token_for_client(scopes=['https://database.windows.net//.default'])
  13.  
  14. jdbc_df = spark.read \
  15. .format("com.microsoft.sqlserver.jdbc.spark") \
  16. .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
  17. .option("query", "SELECT * FROM SOMETHING") \
  18. .option("accessToken", result['access_token']) \
  19. .option("encrypt", "true") \
  20. .option("hostNameInCertificate", "*.database.windows.net") \
  21. .load()

Connect using Domain Auth

  1. secret = "<GET SECRET SECURELY>"
  2.  
  3. jdbc_df = spark.read \
  4. .format("com.microsoft.sqlserver.jdbc.spark") \
  5. .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
  6. .option("query", "SELECT * FROM SOMETHING") \
  7. .option("authentication", "ActiveDirectoryPassword") \
  8. .option("user", "<USER>@<DOMAIN>") \
  9. .option("password", "<SECRET>") \
  10. .load()

Connect using SQL Auth

I do not recommend SQL Auth

  1. secret = "<GET SECRET SECURELY>"
  2.  
  3. jdbc_df = spark.read \
  4. .format("com.microsoft.sqlserver.jdbc.spark") \
  5. .option("url", 'jdbc:sqlserver://<SERVER_NAME>:<PORT>;database=<DATABASE>;') \
  6. .option("query", "SELECT * FROM SOMETHING") \
  7. .option("user", "<USER>") \
  8. .option("password", "<SECRET>") \
  9. .load()