Azure: Tags

This post is how to work with tags for Azure resources.

Install Graph Extension

az extension add --name resource-graph

Tag List

az tag list --subscription <NAME>

Query for Specific Tag Value

az graph query -q "project name, resourceGroup, type, tags | where tags.<TAGNAME>=~'<VALUE>'"

Query for Multiple Tags

az graph query -q "project name, resourceGroup, type, tags | where tags.<TAGNAME>=~'<VALUE>' | where tags.<TAGNAME>=~'<VALUE>'"

Query for Resource Groups

az graph query -q "ResourceContainers | project name, type, tags | where tags.<TAGNAME>=~'<VALUE>'"

Query For Multiple Resource Types

az graph query -q "project name, resourceGroup, type, tags | where tags.<TAGNAME>=~'<VALUE>' | where type =~ 'microsoft.sql/servers/databases' or type =~ 'microsoft.storage/storageaccounts'"

 

Synapse: Get Secret

This post is how to get a secret from a key vault in Synapse.

If you have Data Exfiltration enabled (which is recommended) then you need to have a Managed Private Endpoint setup to your KeyVault.

You also need to ensure your Synapse Managed Identity has access to your Key Vault.

You also need a un-parameterized Linked Service Created.

Then you can query your Key Vault to get the secret with the following command.

from notebookutils import mssparkutils

secret = mssparkutils.credentials.getSecret('<KEY_VAULT_NAME>', '<SECRET_KEY>', '<LINKED_SERVICE_KEYVAULT_NAME>')

 

Databricks: Get Secret

This post is how to get a secret from a key vault in Databricks.

First you need to setup dbutils.

Next you have to make sure your Databricks installation has a Key Vault integrated Scope setup.

Then you need to make sure that Databricks is allowed to communicate with your KeyVault.

Then you can query your Key Vault to get the secret with the following command.

secret = dbutils.secrets.get(scope='<SCOPE>', key='<SECRET_KEY>')

 

Azure: Python SDK

This post is how to use the Azure Python SDK.

If you are using Databricks you can get the secret by using the following Databricks: Get Secret

If you are using Synapse you can get the secret by using the following Synapse: Get Secret

Package Installations

pip install azure-identity
pip install azure-storage-file
pip install azure-storage-file-datalake

Setup Credentials

Service Principal

from azure.common.credentials import ServicePrincipalCredentials
secret = "<GET_SECRET_SECURELY>"
credential = ServicePrincipalCredential("<SPN_CLIENT_ID>", secret, tenant="<TENANT_ID>")

Token Credential

from azure.identity import ClientSecretCredential
secret = "<GET_SECRET_SECURELY>"
token_credential = ClientSecretCredential("<TENANT_ID>", "<SPN_CLIENT_ID>", secret)

Subscription Client

Client

from azure.mgmt.resource import SubscriptionClient
subscription_client = SubscriptionClient(credential)

Get List

subscriptions = subscription_client.subscriptions.list()
for subscription in subscriptions:
    print(subscription.display_name)

Storage Account

Client

from azure.mgmt.storage import StorageManagementClient
storage_client = StorageManagementClient(credential, "<SUBSCRIPTION_ID>")

Get List by Resource Group

storage_accounts = storage_client.storage_accounts.list_by_resource_group("<RESOURCE_GROUP_NAME>")
for sa in storage_accounts:
    print(sa.name)

List Containers in Storage Account

containers = storage_client.blob_containers.list("<RESOURCE_GROUP_NAME>", sa.name)

Containers

Client

from azure.storage.blob import ContainerClient
account_url_blob = f"https://{sa.name}.blob.core.windows.net"
container_client = ContainerClient.from_container_url(
    container_url=account_url_blob + "/" + container.name,
    credential=token_credential
)

Get Container Properties

container_client.get_container_properties()

List Blobs

for b in container_client.list_blobs():
    print(b)

Data Lake Service

Client

from azure.storage.filedatalake import DataLakeServiceClient
storage_account_url_dfs = f"https://{sa.name}.df.core.windows.net"
data_lake_service_client = DataLakeServiceClient(storage_account_url_dfs, token_credential)

DataLake Directory

from azure.storage.filedatalake import DataLakeDirectoryClient
data_lake_directory_client = DataLakeDirectoryClient(account_url=account_url_dfs, credential=credential)

FileSystem

Client

file_system_client = data_lake_service_client.get_file_system_client(file_system="<CONTAINER_NAME>")

Get Directory Client

directory_client = file_system_client.get_directory_client("<CONTAINER_SUB_FOLDER>")

Get Directory Access Control

acl_props = directory_client.get_access_control()

Microsoft Graph Client

Package Installations

pip install msgraph-sdk
pip install msrestazure
pip install azure-identity

Credential

from azure.identity.aio import ClientSecretCredential

secret = "<GET_SECRET_SECURELY>"
credential = ClientSecretCredential('<TENANT_ID>', '<CLIENT_ID>', secret)

Client

from msgraph import GraphServiceClient

def create_session(credential):
  scopes = ['https://graph.microsoft.com/.default']
  graph_client = GraphServiceClient(credential, scopes)
  return graph_client

graph_client = create_session(credential)

Get Groups

#This will only get you the first 100 groups. If you have more then you need to check again
groups = await graph_client.groups.get()
print(len(groups))

while groups is not None and groups.odata_next_link is not None:
  groups = await graph_client.groups.with_url(groups.odata_next_link).get()
  print(len(groups))

Get Group Members

id = '<GROUP_ID>'
group_members = await graph_client.groups.by_group_id(id).members.get()

 

Databricks: Bearer Token CLI

This post is how to get the bearer token using the CLI and setting the env variable.

First install Azure CLI.

Databricks Resource ID = 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d

Get Access Token

az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d

Set Access Token

Linux

export DATABRICKS_AAD_TOKEN="<TOKEN>"

Windows

set DATABRICKS_AAD_TOKEN="<TOKEN>"

Set Config File

Linux

export DATABRICKS_CONFIG_FILE="<LOCATION>"

Windows

set DATABRICKS_CONFIG_FILE="<LOCATION>"

 

Databricks: Rest API

This post is how to communicate with Databricks using Rest API’s.

Databricks Resource ID = 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d

Get Bearer Token for Service Principal

curl -X GET https://login.microsoft.com/<TENANTID>/oauth2/token -H 'Content-Type: application/x-www-form-urlencoded' -d'grant_type=client_credential&client_id=<CLIENTID>&resource=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d&client_secret=<SECRET>

Get Bearer Token for Service Principal Using management.core.windows.net

curl -X GET https://login.microsoftonline.com/<TENANTID>/oauth2/token -H 'Content-Type: application/x-www-form-urlencoded' -d'grant_type=client_credential&client_id=<CLIENTID>&resource=https://management.core.windows.net/&amp;client_secret=<SECRET>'

Start Cluster

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/clusters/start -d '{ "cluster_id": "<CLUSTER_ID>"}'

Stop Cluster

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/clusters/stop -d '{ "cluster_id": "<CLUSTER_ID>"}'

List Clusters

curl --location -g --trace -X --request GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/clusters/list

Job List

curl --location -g --trace -X --request GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/jobs/list

Job Python Run

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/jobs/run-now -d '{"job_id": <JOB_ID>, "python_params": [] }'

Job Get

curl --location -g --trace -X --request GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/jobs/runs/get?run_id=<JOB_RUN_ID>

Create Job

Databricks Create Job

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/jobs/create -d '<PAYLOAD>'

Create Job Payload

{
	"name": "<NAME>",
	"max_concurrent_runs": 1,
	"tasks": [
		{
			"task_key": "<TASK_KEY>",
			"run_if": "ALL_SUCCESS",
			"max_retries": 1,
			"timeout_seconds": <TIMEOUT_SECONDS>, 
			"notebook_tasks": {
				"notebook_path": "<PATH>",
				"source": "WORKSPACE",
				"base_parameters": {
					"<KEY>": "<VALUE>",
					"<KEY2>": "<VALUE2>",
				}
			},
			"libraries": [
				{
					"pypi": {
						"package": "<PACKAGE_NAME==VERSION>",
						"coordinates": ""
					}
				},
				{
					"jar": "<LOCATION>"
				}
			],
			"new_cluster": {
				"custom_tags": {
					"<TAG_NAME>": "<TAG_VALUE>"
				},
				"azure_attributes": {
					"first_on_demand": 1,
					"availability": "SPOT_AZURE",
					"spot_bid_max_price": 75
				},
				"instance_pool_id": "<WORKER_INSTANCE_POOL_ID>",
				"driver_instances_pool_id": "<DRIVER_INSTANCE_POOL_ID>",
				"data_security_mode": "SINGLE_USER",
				"spark_version": "<SPARK_VERSION>",
				"node_type_id": "<NODE_TYPE_ID>", 
				"runtime_engine": "STANDARD",
				"policy_id": "<POLICY_ID>",
				"autoscale": {
					"min_workers": <MIN_WORKERS>,
					"max_workers": <MAX_WORKERS>
				},
				"spark_conf": {
					"<CONFIG_KEY>": "<CONFIG_VALUE>"
				},
				"cluster_log_conf": {
					"dbfs": {
						"destination": "<LOG_DESTINATION>"
					}
				},
				"spark_env_vars": {
					"<ENV_NAME>": "<ENV_VALUE>"
				},
				"init_scripts": [
					{
						"volumes": {
							"destination": "<INIT_SCRIPT_LOCATION>"
						}
					}
				]
			}
		}
	],
	"format": "SINGLE_TASK"
}

Job Permission Patch

curl --location -g --trace -X --request PATCH -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/permissions/jobs/<JOB_ID> -d '{ "access_control_list": [{ "group_name": "<GROUP_NAME>", "permission_level": "<PERMISSION>"}]}'

Get Service Principal List

curl -X GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/preview/scim/v2/ServicePrincipals

Delete Service Principal List From Databricks ONLY

curl --location -g --trace -X --request DELETE -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/preview/scim/v2/ServicePrincipals/<APPLICATION_ID>

Add Service Principal To Databricks

curl --location --request POST 'https://<DATABRICKS_url>/api/2.0/preview/scim/v2/ServicePrincipals' --header 'Authorization: Bearer <TOKEN>' --header 'Content-Type: application/json' --data-raw '{ "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"], "applicationId": "<CLIENTID>", "displayName": "<DISPLAYNAME>", "groups": [{"value": "<GROUP_ID>"}], "entitlements": [{ "value": "allow-cluster-create"}] }'

List Secret Scopes

curl --location -g --trace -X --request GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/secrets/scopes/list

Create KeyVault Secret Scope

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/secrets/scopes/create -d '{"scope": "<Keyvault_name>", "scope_backend_type": "AZURE_KEYVAULT", "backend_azure_keyvault": {"resource_id": "<RESOURCE_ID>", "dns_name": "<KEYVAULT_URL>"}, "initial_manage_principal": "users"}'

IP Access Lists

curl -X GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/ip-access-lists

List Git Repos

curl --location -g --trace -X --request GET -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/repos

Update Git Repo

curl --location -g --trace -X --request POST -H 'Authorization: Bearer <TOKEN>' https://<DATABRICKS_url>/api/2.0/repos/<REPO_ID> -d '{ "branch": "<BRANCH_NAME>" }'

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Azure: Install/Configure CLI

This post will show you how to install the Azure CLI.

First you need to install the CLI.

Once it is installed you can set your config directory. This is useful for having multiple logins going at the same time.

set AZURE_CONFIG_DIR=<YOUR_DIRECTORY>

You can then login. There are different ways to do that

Way 1: This will popup a login where you enter your login credentials

az login

Way 2: This will ask you for password via the command line

az login -u <YOUR_LOGIN>

Way 3:

az login -u <YOUR_LOGIN> -p <YOUR_PASSWORD>

Way 4: logs in as a service principal

az login --service-principal --user-name <SPN_ID> --password <SPN_KEY> --tenant <TENANTID>

Show your Account

az account show

Set Account Subscription

az account set -s <SUBSCRIPTION_ID>

List Tags For A Resource

az tag list --subscription <SUBSCRIPTION_NAME>

Install Graph

az extension add --name resource-graph

Query for Anything that Has a Tag

az graph query -q "resourceGraoup, type, tags" | where tags.<TAG_NAME>=~'<VALUE>'

Query for More than One Tag

az graph query -q "resourceGraoup, type, tags" | where tags.<TAG_NAME>=~'<VALUE>' | tags.<TAG_NAME>=='<VALUE>'

Query Type

az graph query -q "resourceGroup, type, tags" | where type =~ 'microsoft.sql/servers/databases'

 

Azure: EventHub

In this tutorial I will show you how to connect to event hub from Python. Ensure you have first installed an IDE (Eclipse) and Python3.7.

Python Package Installation

pip3 install azure-eventhub

Create a Producer

This will publish events to event hub. The important part here is the “EndPoint”. You need to login to Azure Portal and get the get the endpoint from the “Shared Access Policies” from the event hub namespace.

from azure.eventhub import EventHubProducerClient, EventData, EventHubConsumerClient

connection_str = 'Endpoint=sb://testeventhubnamespace.servicebus.windows.net/;SharedAccessKeyName=<<THE_ACCESS_KEY_NAME>>;SharedAccessKey=<<THE_ACCESS_KEY>>'
eventhub_name = '<<THE_EVENT_HUB_NAME>>'
producer = EventHubProducerClient.from_connection_string(connection_str, eventhub_name=eventhub_name)

event_data_batch = producer.create_batch()

event_data_batch.add(EventData('My Test Data'))

with producer:
    producer.send_batch(event_data_batch)

Create a Consumer

This will monitor the event hub for new messages.

from azure.eventhub import EventHubProducerClient, EventData, EventHubConsumerClient

connection_str = 'Endpoint=sb://testeventhubnamespace.servicebus.windows.net/;SharedAccessKeyName=<<THE_ACCESS_KEY_NAME>>;SharedAccessKey=<<THE_ACCESS_KEY>>'
eventhub_name = '<<THE_EVENT_HUB_NAME>>'
consumer_group = '<<THE_EVENT_HUB_CONSUMER_GROUP>>'
client = EventHubConsumerClient.from_connection_string(connection_str, consumer_group, eventhub_name=eventhub_name)

def on_event(partition_context, event):
    print("Received event from partition {} - {}".format(partition_context.partition_id, event))
    partition_context.update_checkpoint(event)

with client:
    #client.receive(
    #    on_event=on_event, 
    #    starting_position="-1",  # "-1" is from the beginning of the partition.
    #)
    client.receive(
        on_event=on_event
    )