Databricks
As per the documentation here, note that we only support metadata tag extraction for Databricks versions 13.3 and higher.
In this section, we provide guides and references to use the Databricks connector.
Configure and schedule Databricks metadata and profiler workflows from the OpenMetadata UI:
- Requirements
- Unity Catalog
- Metadata Ingestion
- Query Usage
- Data Profiler
- Data Quality
- Lineage
- dbt Integration
- Troubleshooting
Ingestion Deployment
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with custom Airflow plugins to handle the workflow deployment. If you want to install it manually on an existing Airflow host, you can follow this guide.
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check the following docs to run the Ingestion Framework in any orchestrator externally.
- Run Connectors from the OpenMetadata UI: Learn how to manage your deployment to run connectors from the UI.
- Run the Connector Externally: Get the YAML to run the ingestion externally.
- External Schedulers: Get more information about running the Ingestion Framework externally.

How to Run the Connector Externally
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with custom Airflow plugins to handle the workflow deployment.
If, instead, you want to manage your workflows externally on your preferred orchestrator, you can check the following docs to run the Ingestion Framework anywhere.
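For orientation, here is a minimal sketch of what an externally-run workflow definition can look like. Everything in it is illustrative: the service name, host, token, and httpPath are placeholders, and the exact connection fields for your version are documented in the helper panel in the UI.

```yaml
# databricks.yaml -- an illustrative workflow definition, not a drop-in config
source:
  type: databricks
  serviceName: my_databricks          # hypothetical Service name
  serviceConnection:
    config:
      type: Databricks
      hostPort: <workspace-host>:443
      token: <personal-access-token>  # PAT auth; see Authentication Types below
      httpPath: <http-path>
  sourceConfig:
    config:
      type: DatabaseMetadata
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: <jwt-token>
```

Once the Ingestion Framework is installed (see Python Requirements below), a file like this can be run with `metadata ingest -c databricks.yaml`.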
Requirements
Python Requirements
We support Python versions 3.9 to 3.11.
To run the Databricks ingestion, you will need to install:
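Typically, this is the ingestion package with the Databricks plugin extra; pin the package version to match your OpenMetadata server release:

```bash
pip3 install "openmetadata-ingestion[databricks]"
```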
Authentication Types
The Databricks connector supports three authentication methods (see the configuration sketch after this list):
- Personal Access Token (PAT): Generated Personal Access Token for Databricks workspace authentication.
- Databricks OAuth (Service Principal): OAuth2 Machine-to-Machine authentication using a Service Principal.
- Azure AD Setup: Specifically for Azure Databricks workspaces that use Azure Active Directory for identity management. Uses Azure Service Principal authentication through Azure AD.
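As a rough illustration of where these credentials live in the connection configuration, the fragment below shows each method. The field names are assumptions based on typical OpenMetadata connection schemas and may differ in your version, so confirm them against the helper docs in the UI:

```yaml
# Illustrative only -- confirm the exact field names for your OpenMetadata version.
# Personal Access Token (PAT):
token: <personal-access-token>

# Databricks OAuth (Service Principal) -- assumed field names:
clientId: <service-principal-application-id>
clientSecret: <oauth-secret>

# Azure AD (Azure Databricks) -- assumed field names:
azureClientId: <azure-client-id>
azureClientSecret: <azure-client-secret>
azureTenantId: <azure-tenant-id>
```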
Permission Requirements
The required permissions vary based on the authentication method used:
Personal Access Token Permissions
When using PAT, the token inherits the permissions of the user who created it. Ensure the user has:
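- Read access to every catalog, schema, and table you plan to ingest, plus the ability to use the enclosing catalog and schema.

As a sketch, assuming Unity Catalog-style grant syntax (legacy workspaces use the older table ACL privileges such as USAGE and READ_METADATA instead):

```sql
-- Illustrative grants for the PAT user; adjust objects and privileges to your needs.
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<user>`;
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<user>`;
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<user>`;
```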
Service Principal Permissions (OAuth/Azure AD)
For Service Principal authentication, grant permissions to the Service Principal:
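The same grants apply, issued to the Service Principal, which Databricks SQL identifies by its application ID (shown below as an assumed `<application_id>` placeholder):

```sql
-- Illustrative grants for the Service Principal; <application_id> is its
-- Databricks application ID.
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<application_id>`;
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<application_id>`;
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<application_id>`;
```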
Adjust <user>, <catalog_name>, <schema_name>, and <table_name> according to your specific deployment and security requirements.
Unity Catalog
If you are using Unity Catalog in Databricks, check out the Unity Catalog connector.
Metadata Ingestion
1. Visit the Services Page
Click Settings in the side navigation bar and then Services.
The first step is to ingest the metadata from your sources. To do that, you first need to create a Service connection.
This Service will be the bridge between OpenMetadata and your source system.
Once a Service is created, it can be used to configure your ingestion workflows.

2. Add a New Service
Add a new Service from the Services page.

3. Select your Service Type
Select your Service from the list.
4. Name and Describe your Service
Provide a name and description for your Service.
Service Name
OpenMetadata uniquely identifies Services by their Service Name. Provide a name that distinguishes your deployment from other Services, including the other Databricks Services that you might be ingesting metadata from.
Note that when the name is set, it cannot be changed.

Provide a Name and description for your Service
5. Configure the Service Connection
In this step, we will configure the connection settings required for Databricks.
Please follow the instructions below to properly configure the Service to read from your sources. You will also find helper documentation on the right-hand side panel in the UI.

Configure the Service connection by filling the form