Skip to content

Databricks Managed Location Catalog Schema Level

πŸš€ Unity Catalog Setup from Scratch (Azure + Databricks)πŸ”—

Using UC Connector as Managed Identity

πŸ”Ή Part 1: Azure SetupπŸ”—

1. Create a Storage Account

In Azure Portal, search Storage Accounts β†’ Create.

Important settings:

Performance: Standard Redundancy: LRS (for testing) Enable Hierarchical Namespace (HNS) βœ… (must be ON for Unity Catalog). Create a container inside it (e.g. uc-data).

2. Assign Permissions to UC Managed Identity

In Azure Portal β†’ Storage Account β†’ Access Control (IAM) β†’ Add Role Assignment. Assign these roles to the UC Managed Identity (the connector): - Storage Blob Data Owner (read/write access). - Storage Blob Delegator (needed for ABFS driver).

Scope: Storage Account level (recommended).

πŸ‘‰ Tip: You can find the UC managed identity name in Databricks β†’ Admin Console β†’ Identity Federation.

πŸ”Ή Part 2: Databricks Setup (UI + SQL)πŸ”—

3. Verify Metastore

Go to Databricks Admin Console β†’ Unity Catalog β†’ Metastores.

If you don’t have one, click Create Metastore. Region = same as Storage Account. Assign this metastore to your workspace.

πŸ‘‰ Do not skip this. Unity Catalog won’t work without a metastore.

4. Create Storage Credential (UI or SQL)

This ties the Azure Managed Identity to Unity Catalog.

UI: Go to Catalog β†’ Storage Credentials β†’ Create. Select Azure Managed Identity. Enter a name (e.g., uc-cred). Paste the UC Connector Managed Identity ID.

CREATE STORAGE CREDENTIAL uc_cred
WITH AZURE_MANAGED_IDENTITY 'your-managed-identity-client-id'
COMMENT 'UC Connector credential for ADLS';

5. Register an External Location

This points Unity Catalog to your ADLS container/folder.

UI: Go to Catalog β†’ External Locations β†’ Create. Name: ext_loc_dev Path: abfss://uc-data@.dfs.core.windows.net/ Storage credential: uc-cred (Storage Credential is at Storage Account level so for one cred is enough)

CREATE EXTERNAL LOCATION ext_loc_dev
URL 'abfss://uc-data@<storageaccount>.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL uc_cred)
COMMENT 'External location for dev data';

6. Create Catalog

Decide whether it will use:

Metastore root storage (if defined), OR Catalog-level managed location (recommended).

UI:

Go to Catalog Explorer β†’ Create Catalog. Name: dev Managed Location:

abfss://uc-data@<storageaccount>.dfs.core.windows.net/dev
Storage credential: uc-cred.

SQL equivalent:

CREATE CATALOG dev
MANAGED LOCATION 'abfss://uc-data@<storageaccount>.dfs.core.windows.net/dev';

image

7. Create Schema

Schemas can also have their own managed locations (if needed).

CREATE SCHEMA dev.bronze
MANAGED LOCATION 'abfss://uc-data@<storageaccount>.dfs.core.windows.net/dev/bronze';

9. Grant Permissions

For yourself or a group (like account users):

-- Catalog usage
GRANT USAGE ON CATALOG dev TO `account users`;

-- Schema usage
GRANT USAGE ON SCHEMA dev.bronze TO `account users`;

-- External location permissions
GRANT READ FILES, WRITE FILES ON EXTERNAL LOCATION ext_loc_dev TO `account users`;

-- Table level
GRANT SELECT, MODIFY ON TABLE dev.bronze.trades TO `account users`;

10. Create Tables at Different Levels

CREATE SCHEMA dev.bronze
COMMENT 'This is schema in dev catalog without external location'

This gets created in metastore level container because its managed and we havent specified external location at catalog level.

-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev.bronze.raw_sales (
  id INT,
  name STRING,
  invoice_no INT,
  price double
);

INSERT INTO dev_ext.bronze.raw_sales VALUES (1,'Cookies',1,200.50);

This gets created in catalog level container because the catalog associated to the schema is external.

-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev_ext.bronze.raw_sales (
  id INT,
  name STRING,
  invoice_no INT,
  price double
);

INSERT INTO dev_ext.bronze.raw_sales VALUES (1,'Cookies',1,200.50);

Creating schema in external location.

CREATE EXTERNAL LOCATION 'ext_schema'
MANAGED LOCATION 'https://adbvedanthnew.databricks.net/adb/schema/bronze_ext'

This gets created in schema level since we specified external location at schema level.

-- CREATE A TABLE UNDER ALL THREE SCHEMA
CREATE TABLE IF NOT EXISTS dev_ext.bronze_ext.raw_sales (
  id INT,
  name STRING,
  invoice_no INT,
  price double
);

INSERT INTO dev_ext.bronze_Ext.raw_sales VALUES (1,'Cookies',1,200.50);

image

Where data is stored?πŸ”—

  1. Stored in metastore root location

image

image

  1. Store in Catalog Level Ext Location
DESC EXTENDED DEV_EXT.BRONZE.RAW_SALE;

image

  1. Store in Schema Level External Location
DESC EXTENDED DEV_EXT.BRONZE_EXT.RAW_SALE

image

SummaryπŸ”—


πŸ”Ή 1. Managed Table (no LOCATION specified)πŸ”—

CREATE SCHEMA finance
MANAGED LOCATION 'abfss://finance@companydatalake.dfs.core.windows.net/schemas/finance';

CREATE TABLE finance.transactions (
  id INT,
  amount DECIMAL(10,2)
);
  • Since you didn’t give a LOCATION for the table:

  • Unity Catalog treats this as a managed table.

  • UC stores the data under the schema’s managed location.

βœ… Path = abfss://finance@companydatalake.dfs.core.windows.net/schemas/finance/transactions/

So in your example, you are exactly right.


πŸ”Ή 2. External Table (explicit LOCATION specified)πŸ”—

CREATE TABLE finance.transactions_ext (
  id INT,
  amount DECIMAL(10,2)
)
LOCATION 'abfss://raw@companydatalake.dfs.core.windows.net/landing/transactions/';
  • Here you told UC exactly where the data lives.
  • This is an external table.
  • UC does not move or manage the files β€” it just registers metadata pointing at that path.

βœ… Path = abfss://raw@companydatalake.dfs.core.windows.net/landing/transactions/


πŸ”Ή What If Schema Has No Managed Location?πŸ”—

If you do:

CREATE SCHEMA finance;
  • No schema location defined.
  • A managed table like:
CREATE TABLE finance.transactions (id INT, amount DECIMAL(10,2));

will fall back to the metastore’s root storage location (the one you defined when setting up Unity Catalog).

βœ… Path example: abfss://uc-metastore@companydatalake.dfs.core.windows.net/finance/transactions/


πŸ”Ή Summary TableπŸ”—

Case Table Type Storage Path
Schema has MANAGED LOCATION, table has no LOCATION Managed table Inside schema’s managed location (e.g., /schemas/finance/transactions/)
Schema has MANAGED LOCATION, table has explicit LOCATION External table Exact path you provided
Schema has no location, table has no LOCATION Managed table Falls back to metastore root storage

βœ… So in your example (schema has managed location + table no LOCATION) β†’ yes, data files are stored under schema’s storage location.


Real Case Study : Employee Data SensitivityπŸ”—

For example, let's say your organization has a company compliance policy that requires production data relating to human resources to reside in the bucket s3://mycompany-hr-prod. In Unity Catalog, you can achieve this requirement by setting a location on a catalog level, creating a catalog called, for example hr_prod, and assigning the location s3://mycompany-hr-prod/unity-catalog to it. This means that managed tables or volumes created in the hr_prod catalog (for example, using CREATE TABLE hr_prod.default.table …) store their data in s3://mycompany-hr-prod/unity-catalog. Optionally, you can choose to provide schema-level locations to organize data within the hr_prod catalog at a more granular level.

If storage isolation is not required for some catalogs, you can optionally set a storage location at the metastore level. This location serves as a default location for managed tables and volumes in catalogs and schemas that don't have assigned storage. Typically, however, Databricks recommends that you assign separate managed storage locations for each catalog.