Deploying Azure Databricks with Private Endpoints Using Terraform
A production-ready, multi-environment Terraform configuration for deploying Azure Databricks with full network isolation. VNet injection, private endpoints for storage and workspace, automated subnet delegation management, and a PowerShell deployment script that handles the hard parts for you.
Why Private Endpoints for Databricks?
If you're running Databricks in an enterprise environment, especially in government or financial services, you'll quickly hit this requirement: no public network access. The default Databricks deployment exposes the workspace, storage accounts, and data plane to the public internet. For regulated industries, that's a non-starter.
The solution is a combination of VNet injection (Databricks clusters run inside your own virtual network) and private endpoints (all data plane access goes through private IPs). This gives you complete network isolation while maintaining full Databricks functionality.
The challenge? Getting all the pieces right. Subnet delegations, NSG associations, private DNS zones, storage account firewall rules, access connector identities, and the Databricks workspace itself all need to be configured in the correct order with the correct dependencies. Miss one piece and the deployment fails with cryptic error messages.
This Terraform configuration handles all of it, across multiple environments, with a single variable change.
Architecture Overview
┌──────────────────────────────────────────────────────────┐
│ Your Azure Subscription │
│ │
│ ┌─────────────────────── VNet ────────────────────────┐ │
│ │ │ │
│ │ ┌────────────┐ ┌────────────┐ ┌──────────────┐ │ │
│ │ │ Databricks │ │ Databricks │ │ Private │ │ │
│ │ │ Public │ │ Private │ │ Endpoint │ │ │
│ │ │ Subnet │ │ Subnet │ │ Subnet │ │ │
│ │ │ (delegated)│ │ (delegated)│ │ (no deleg.) │ │ │
│ │ └─────┬──────┘ └─────┬──────┘ └──────┬───────┘ │ │
│ │ └───────┬───────┘ │ │ │
│ │ │ │ │ │
│ │ ┌──────────▼──────────┐ Private Endpoints: │ │
│ │ │ Databricks │ • STG1 Blob + DFS │ │
│ │ │ Workspace │ • STG2 Blob + DFS │ │
│ │ │ (Premium, No PIP) │ • Databricks UI/API │ │
│ │ └──────────┬──────────┘ │ │ │
│ │ │ │ │ │
│ │ ┌───────────┼───────────┐ │ │ │
│ │ ▼ ▼ │ │ │
│ │ ┌──────┐ ┌──────┐ │ │ │
│ │ │ STG1 │ ◄────────────│ STG2 │ ◄─────────┘ │ │
│ │ │(Data)│ Access │ (UC) │ Access │ │
│ │ │ │ Connector │ │ Connector │ │
│ │ └──────┘ (MI) └──────┘ (MI) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Private DNS Zones: │
│ • privatelink.blob.core.windows.net │
│ • privatelink.dfs.core.windows.net │
│ • privatelink.azuredatabricks.net │
└───────────────────────────────────────────────────────────┘
What Gets Deployed
A single terraform apply creates all of the following:
- Databricks Workspace (Premium SKU) with VNet injection, no public IP, and public access disabled
- Two Storage Accounts with HNS enabled (Data Lake Gen2), public access denied, and default-deny network rules
- Six Storage Containers for data ingestion, Unity Catalog, and environment-specific workloads
- Two Access Connectors with System-Assigned Managed Identities for secure, keyless storage access
- Five Private Endpoints (2x Blob, 2x DFS, 1x Databricks UI/API)
- Private DNS Zone integration for automatic name resolution
- Network Security Group associated with both Databricks subnets
- Eight IAM Role Assignments (Storage Blob + Queue Data Contributor for connectors and admin group)
Prerequisites
Azure Infrastructure (must exist before deployment)
| Resource | Naming Pattern | Purpose |
|---|---|---|
| Virtual Network | {prefix}-{env}-cc-vnet-01 | Network boundary |
| Public Subnet | {prefix}-{env}-databricks-public-snet-01 | Databricks public nodes |
| Private Subnet | {prefix}-{env}-databricks-private-snet-01 | Databricks private nodes |
| PEP Subnet | {prefix}-{env}-pep-snet-01 | Private endpoints |
| Private DNS Zones | In a central/identity subscription | Name resolution |
| Terraform State Storage | Any storage account | State file management |
Permissions
Your service principal needs Contributor on the target subscription and Network Contributor on the VNet resource group. For cross-subscription DNS zones, it also needs Reader on the identity/DNS subscription.
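In practice, the cross-subscription DNS access means running two azurerm provider configurations side by side. A minimal sketch of that pattern (the variable names here are illustrative, not the repo's actual variables):

```hcl
# Default provider targets the workload subscription
provider "azurerm" {
  features {}
  subscription_id = var.subscription_id
}

# Aliased provider targets the central identity/DNS subscription
provider "azurerm" {
  alias           = "dns"
  features {}
  subscription_id = var.dns_subscription_id
}

# DNS zone lookups run against the aliased provider
data "azurerm_private_dns_zone" "blob" {
  provider            = azurerm.dns
  name                = "privatelink.blob.core.windows.net"
  resource_group_name = var.dns_zone_resource_group
}
```

The Reader role on the DNS subscription is enough for these data-source lookups; writing DNS records into the zones would require a role such as Private DNS Zone Contributor.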
Project Structure
databricks-private-endpoints/
├── provider.tf # Azure provider + backend config
├── variables.tf # Environment variable + all naming locals
├── resource_group.tf # Resource group
├── network.tf # VNet/subnet data sources
├── databricks.tf # Workspace + access connectors
├── storage.tf # Storage accounts, containers, IAM
├── private_endpoint.tf # Private endpoints + DNS zone refs
├── nsg.tf # Network security groups
├── deploy.ps1 # Automated deployment script
├── azure-pipelines.yml # CI/CD pipeline (Azure DevOps)
├── azure-auth.env.example # Credential template
├── terraform.tfvars.example # Variable template
└── backend-configs/ # Per-environment state file configs
├── dev.tfbackend
├── test.tfbackend
├── stage.tfbackend
├── analytics.tfbackend
└── poc.tfbackend
Core Terraform Walkthrough
Dynamic Naming with Locals
The entire configuration is driven by a single environment variable. Every resource name is generated dynamically in the locals block:
```hcl
variable "environment" {
  type    = string
  default = "test"

  validation {
    condition     = contains(["dev", "stage", "analytics", "poc", "test"], var.environment)
    error_message = "Environment must be one of: dev, stage, analytics, poc, test."
  }
}

locals {
  resource_group_name       = "contoso-${var.environment}-databricks-rg-01"
  databricks_workspace_name = "contoso-${var.environment}-databricks-wks-01"

  # Storage accounts have a 24-char limit, so abbreviate long env names
  env_abbr  = var.environment == "analytics" ? "anltcs" : var.environment
  stg1_name = "contoso${local.env_abbr}ccdbwingstg001"
  stg2_name = "contoso${local.env_abbr}ccdbwucstg001"
}
```
Azure storage account names must be 3 to 24 characters, lowercase letters and numbers only. The full "analytics" environment name would push the generated names past that limit, so it's abbreviated to "anltcs". This happens automatically in variables.tf.
The Databricks Workspace
The workspace is the centerpiece. Key settings: Premium SKU for Unity Catalog support, VNet injection with no public IP, and public network access disabled:
```hcl
resource "azurerm_databricks_workspace" "main" {
  name                                  = local.databricks_workspace_name
  resource_group_name                   = local.resource_group_name
  location                              = var.location
  sku                                   = "premium"
  public_network_access_enabled         = false
  network_security_group_rules_required = "NoAzureDatabricksRules"

  custom_parameters {
    no_public_ip        = true
    virtual_network_id  = data.azurerm_virtual_network.main.id
    public_subnet_name  = data.azurerm_subnet.databricks_public.name
    private_subnet_name = data.azurerm_subnet.databricks_private.name

    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.databricks_public_nsg_association.id
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.databricks_private_nsg_association.id
  }
}
```
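The two NSG association IDs referenced in custom_parameters come from nsg.tf. A minimal sketch of how they fit together (the NSG name below is assumed):

```hcl
resource "azurerm_network_security_group" "databricks" {
  name                = "contoso-${var.environment}-databricks-nsg-01" # assumed name
  location            = var.location
  resource_group_name = local.resource_group_name
}

resource "azurerm_subnet_network_security_group_association" "databricks_public_nsg_association" {
  subnet_id                 = data.azurerm_subnet.databricks_public.id
  network_security_group_id = azurerm_network_security_group.databricks.id
}

resource "azurerm_subnet_network_security_group_association" "databricks_private_nsg_association" {
  subnet_id                 = data.azurerm_subnet.databricks_private.id
  network_security_group_id = azurerm_network_security_group.databricks.id
}
```

Passing the association IDs (rather than the raw subnet IDs) into custom_parameters gives Terraform the dependency ordering it needs: the NSG must be attached before the workspace deployment validates the subnets.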
Storage with Managed Identity Access
Instead of using storage account keys (a security anti-pattern), we use Databricks Access Connectors with System-Assigned Managed Identities. Each connector gets Storage Blob Data Contributor and Storage Queue Data Contributor roles on its respective storage account:
```hcl
resource "azurerm_databricks_access_connector" "unity_catalog" {
  name                = local.access_connector_unity_catalog_name
  resource_group_name = local.resource_group_name
  location            = var.location

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_role_assignment" "unity_blob" {
  scope                = azurerm_storage_account.stg2.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = azurerm_databricks_access_connector.unity_catalog.identity[0].principal_id
}
```
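The storage accounts themselves live in storage.tf and follow the default-deny pattern described above. A minimal sketch of one of them, assuming the replication type (pick what your environment requires):

```hcl
resource "azurerm_storage_account" "stg2" {
  name                     = local.stg2_name
  resource_group_name      = local.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS" # assumption; choose per environment

  is_hns_enabled                = true  # Data Lake Gen2 (hierarchical namespace)
  public_network_access_enabled = false

  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]
  }
}
```

With default_action = "Deny" and public access off, the only data path left is through the private endpoints, which is exactly the posture the architecture diagram describes.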
Private Endpoints
Five private endpoints cover all data plane access. Each one connects to a private DNS zone for automatic name resolution:
```hcl
resource "azurerm_private_endpoint" "databricks_workspace" {
  name                = "contoso-${var.environment}-pep-databricks-workspace-01"
  location            = var.location
  resource_group_name = local.resource_group_name
  subnet_id           = data.azurerm_subnet.pep.id

  private_service_connection {
    name                           = "databricks-workspace-connection"
    private_connection_resource_id = azurerm_databricks_workspace.main.id
    subresource_names              = ["databricks_ui_api"]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "databricks-zone-group"
    private_dns_zone_ids = [data.azurerm_private_dns_zone.databricks_workspace.id]
  }
}
```
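The four storage endpoints follow the same shape, differing only in the target resource and the subresource name. A sketch of one blob endpoint (the endpoint and connection names here are illustrative):

```hcl
resource "azurerm_private_endpoint" "stg1_blob" {
  name                = "contoso-${var.environment}-pep-stg1-blob-01" # assumed name
  location            = var.location
  resource_group_name = local.resource_group_name
  subnet_id           = data.azurerm_subnet.pep.id

  private_service_connection {
    name                           = "stg1-blob-connection"
    private_connection_resource_id = azurerm_storage_account.stg1.id
    subresource_names              = ["blob"] # use ["dfs"] for the Data Lake endpoint
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "blob-zone-group"
    private_dns_zone_ids = [data.azurerm_private_dns_zone.blob.id]
  }
}
```

Each subresource needs its own endpoint, which is why two HNS-enabled storage accounts produce four endpoints: blob and dfs resolve through different privatelink DNS zones.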
The Deployment Script
The deploy.ps1 PowerShell script ties the whole workflow together. It automates the entire deployment, including one critical feature: automatic subnet delegation management.
Why Subnet Delegations Matter
Azure Databricks with VNet injection has strict subnet requirements:
- Databricks subnets (public and private) MUST be delegated to Microsoft.Databricks/workspaces
- Private Endpoint subnet MUST NOT have any delegation
Get this wrong and you'll see errors like "required public subnet delegation not found" or "PrivateEndpointCreationNotAllowedAsSubnetIsDelegated." The deployment script checks all three subnets before every plan or apply and fixes them automatically.
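The subnets are prerequisites here rather than Terraform-managed, but for reference, this is what a correctly delegated Databricks subnet looks like in HCL (names and address prefix below are illustrative):

```hcl
resource "azurerm_subnet" "databricks_public" {
  name                 = "contoso-dev-databricks-public-snet-01"
  resource_group_name  = "contoso-dev-network-rg-01" # assumed
  virtual_network_name = "contoso-dev-cc-vnet-01"
  address_prefixes     = ["10.0.1.0/24"] # illustrative

  delegation {
    name = "databricks-delegation"

    service_delegation {
      name = "Microsoft.Databricks/workspaces"
      actions = [
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
        "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
      ]
    }
  }
}
```

The private endpoint subnet is the mirror image: no delegation block at all, since Azure refuses to place a private endpoint on a delegated subnet.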
```powershell
# Deploy to any environment with a single command
.\deploy.ps1 -Environment dev -Action init
.\deploy.ps1 -Environment dev -Action plan
.\deploy.ps1 -Environment dev -Action apply

# The script automatically:
# 1. Loads credentials from azure-auth.env
# 2. Authenticates with Azure
# 3. Selects the correct subscription
# 4. Checks and fixes subnet delegations
# 5. Runs the Terraform command
# 6. Reports results
```
Multi-Environment Support
Each environment is completely isolated:
- Separate subscriptions per environment (configured in azure-auth.env)
- Separate state files in Azure Storage (dev.tfstate, test.tfstate, etc.)
- Separate backend configs in the backend-configs/ folder
- Dynamic resource naming prevents any cross-environment conflicts
Switching environments is a single variable change. The deploy script handles subscription switching, state file routing, and credential management automatically.
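Each .tfbackend file is just a set of azurerm backend arguments. A placeholder example for dev (all values here are illustrative):

```hcl
# backend-configs/dev.tfbackend
resource_group_name  = "contoso-tfstate-rg-01"
storage_account_name = "contosotfstate001"
container_name       = "tfstate"
key                  = "dev.tfstate"
```

Under the hood, the deploy script's init step is roughly equivalent to running terraform init -backend-config="backend-configs/dev.tfbackend", which routes state for each environment to its own blob.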
CI/CD Pipeline
The included azure-pipelines.yml defines a multi-stage pipeline: Build/Validate, then deployments to Dev, Test, UAT, and Production in sequence. Each stage uses its own environment-specific backend config, and the production stage sits behind an approval gate.
Troubleshooting Guide
The most common issues you'll encounter, and how to fix them:
Network Intent Policy Errors (During Destroy)
When you destroy a Databricks workspace, Azure leaves behind Network Intent Policies (NIPs) on the subnets. These block NSG removal. The fix: delete the workspace first, wait 5-10 minutes, delete the NIPs manually, then retry destroy.
Failed Workspace State
If a deployment fails midway, the workspace can get stuck in a "Failed" provisioning state. Delete it via Azure CLI, clean the Terraform state, then redeploy. The full troubleshooting guide is included in the repo.
Get the Code
All Terraform files, deployment script, CI/CD pipeline, and troubleshooting guide included.
View on GitHub →

Clone the repo, replace contoso with your organization's prefix in variables.tf, configure your credentials in azure-auth.env, and you're ready to deploy. The entire setup takes about 15 minutes once your networking prerequisites are in place.