Databricks Managed Location Catalog Schema Level
π Unity Catalog Setup from Scratch (Azure + Databricks)π
Using UC Connector as Managed Identity
πΉ Part 1: Azure Setupπ
1. Create a Storage Account
In Azure Portal, search Storage Accounts β Create.
Important settings:
Performance: Standard Redundancy: LRS (for testing) Enable Hierarchical Namespace (HNS) β (must be ON for Unity Catalog). Create a container inside it (e.g. uc-data).
2. Assign Permissions to UC Managed Identity
In Azure Portal β Storage Account β Access Control (IAM) β Add Role Assignment. Assign these roles to the UC Managed Identity (the connector): - Storage Blob Data Owner (read/write access). - Storage Blob Delegator (needed for ABFS driver).
Scope: Storage Account level (recommended).
π Tip: You can find the UC managed identity name in Databricks β Admin Console β Identity Federation.
πΉ Part 2: Databricks Setup (UI + SQL)π
3. Verify Metastore
Go to Databricks Admin Console β Unity Catalog β Metastores.
If you donβt have one, click Create Metastore. Region = same as Storage Account. Assign this metastore to your workspace.
π Do not skip this. Unity Catalog wonβt work without a metastore.
4. Create Storage Credential (UI or SQL)
This ties the Azure Managed Identity to Unity Catalog.
UI: Go to Catalog β Storage Credentials β Create. Select Azure Managed Identity. Enter a name (e.g., uc-cred). Paste the UC Connector Managed Identity ID.
CREATE STORAGE CREDENTIAL uc_cred
WITH AZURE_MANAGED_IDENTITY 'your-managed-identity-client-id'
COMMENT 'UC Connector credential for ADLS';
5. Register an External Location
This points Unity Catalog to your ADLS container/folder.
UI:
Go to Catalog β External Locations β Create.
Name: ext_loc_dev
Path: abfss://uc-data@
CREATE EXTERNAL LOCATION ext_loc_dev
URL 'abfss://uc-data@<storageaccount>.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL uc_cred)
COMMENT 'External location for dev data';
6. Create Catalog
Decide whether it will use:
Metastore root storage (if defined), OR Catalog-level managed location (recommended).
UI:
Go to Catalog Explorer β Create Catalog. Name: dev Managed Location:
Storage credential: uc-cred.SQL equivalent:
7. Create Schema
Schemas can also have their own managed locations (if needed).
CREATE SCHEMA dev.bronze
MANAGED LOCATION 'abfss://uc-data@<storageaccount>.dfs.core.windows.net/dev/bronze';
9. Grant Permissions
For yourself or a group (like account users):
-- Catalog usage
GRANT USAGE ON CATALOG dev TO `account users`;
-- Schema usage
GRANT USAGE ON SCHEMA dev.bronze TO `account users`;
-- External location permissions
GRANT READ FILES, WRITE FILES ON EXTERNAL LOCATION ext_loc_dev TO `account users`;
-- Table level
GRANT SELECT, MODIFY ON TABLE dev.bronze.trades TO `account users`;
10. Create Tables at Different Levels
This gets created in metastore level container because its managed and we havent specified external location at catalog level.
-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev.bronze.raw_sales (
id INT,
name STRING,
invoice_no INT,
price double
);
INSERT INTO dev_ext.bronze.raw_sales VALUES (1,'Cookies',1,200.50);
This gets created in catalog level container because the catalog associated to the schema is external.
-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev_ext.bronze.raw_sales (
id INT,
name STRING,
invoice_no INT,
price double
);
INSERT INTO dev_ext.bronze.raw_sales VALUES (1,'Cookies',1,200.50);
Creating schema in external location.
CREATE EXTERNAL LOCATION 'ext_schema'
MANAGED LOCATION 'https://adbvedanthnew.databricks.net/adb/schema/bronze_ext'
This gets created in schema level since we specified external location at schema level.
-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev_ext.bronze_ext.raw_sales (
id INT,
name STRING,
invoice_no INT,
price double
);
INSERT INTO dev_ext.bronze_Ext.raw_sales VALUES (1,'Cookies',1,200.50);
Where data is stored?π
- Stored in metastore root location
- Store in Catalog Level Ext Location
- Store in Schema Level External Location
Summaryπ
πΉ 1. Managed Table (no LOCATION specified)π
CREATE SCHEMA finance
MANAGED LOCATION 'abfss://finance@companydatalake.dfs.core.windows.net/schemas/finance';
CREATE TABLE finance.transactions (
id INT,
amount DECIMAL(10,2)
);
-
Since you didnβt give a
LOCATION
for the table: -
Unity Catalog treats this as a managed table.
- UC stores the data under the schemaβs managed location.
β
Path =
abfss://finance@companydatalake.dfs.core.windows.net/schemas/finance/transactions/
So in your example, you are exactly right.
πΉ 2. External Table (explicit LOCATION specified)π
CREATE TABLE finance.transactions_ext (
id INT,
amount DECIMAL(10,2)
)
LOCATION 'abfss://raw@companydatalake.dfs.core.windows.net/landing/transactions/';
- Here you told UC exactly where the data lives.
- This is an external table.
- UC does not move or manage the files β it just registers metadata pointing at that path.
β
Path =
abfss://raw@companydatalake.dfs.core.windows.net/landing/transactions/
πΉ What If Schema Has No Managed Location?π
If you do:
- No schema location defined.
- A managed table like:
will fall back to the metastoreβs root storage location (the one you defined when setting up Unity Catalog).
β
Path example:
abfss://uc-metastore@companydatalake.dfs.core.windows.net/finance/transactions/
πΉ Summary Tableπ
Case | Table Type | Storage Path |
---|---|---|
Schema has MANAGED LOCATION , table has no LOCATION |
Managed table | Inside schemaβs managed location (e.g., /schemas/finance/transactions/ ) |
Schema has MANAGED LOCATION , table has explicit LOCATION |
External table | Exact path you provided |
Schema has no location, table has no LOCATION |
Managed table | Falls back to metastore root storage |
β
So in your example (schema has managed location
+ table no LOCATION
) β yes, data files are stored under schemaβs storage location.
Real Case Study : Employee Data Sensitivityπ
For example, let's say your organization has a company compliance policy that requires production data relating to human resources to reside in the bucket s3://mycompany-hr-prod. In Unity Catalog, you can achieve this requirement by setting a location on a catalog level, creating a catalog called, for example hr_prod, and assigning the location s3://mycompany-hr-prod/unity-catalog to it. This means that managed tables or volumes created in the hr_prod catalog (for example, using CREATE TABLE hr_prod.default.table β¦) store their data in s3://mycompany-hr-prod/unity-catalog. Optionally, you can choose to provide schema-level locations to organize data within the hr_prod catalog at a more granular level.
If storage isolation is not required for some catalogs, you can optionally set a storage location at the metastore level. This location serves as a default location for managed tables and volumes in catalogs and schemas that don't have assigned storage. Typically, however, Databricks recommends that you assign separate managed storage locations for each catalog.