how-to-guides

No menu items for this category
OpenMetadata Documentation

Tests in the OpenMetadata UI

Here you can see all the supported tests definitions and how to configure them in the UI.

A Test Definition is a generic definition of a test. This Test Definition then gets specified in a Test Case. This Test Case is where the parameter(s) of a Test Definition are specified.

In this section, you will learn what tests we currently support and how to configure them in the OpenMetadata UI.

Tests applied on top of a Table. Here is the list of all table tests:

Validate that the total number of rows in a table exactly matches an expected value.**

  • To monitor tables where row count is expected to remain fixed (e.g., dimension tables).
  • To catch over- or under-loading issues after ETL processes.
  • To verify baseline data volumes for test/staging/prod comparisons.
PropertyDescription
Expected ValueThe exact number of rows the table should contain.
ConditionStatus
Actual row count = expected value✅ Success
Actual row count ≠ expected value❌ Failed

Ensure that the total number of rows in the table falls within an expected range.

  • To monitor for abnormal growth or shrinkage in table size.
  • To catch failed inserts, unintended truncations, or unexpected data surges.
  • To set alerts based on historical data volume expectations.
PropertyDescription
Min ValueMinimum expected number of rows (minValue)
Max ValueMaximum allowed number of rows (maxValue)
  • At least one of these values is required to run the test.
ConditionStatus
Row count is between minValue and maxValue✅ Success
Row count is outside the defined range❌ Failed

Validate that the table contains exactly the expected number of columns.

  • To detect unapproved schema changes (e.g., columns being added or dropped).
  • To enforce data contracts between teams or systems.
  • To ensure structural consistency across environments.
PropertyDescription
Expected CountExact number of columns the table must have.
ConditionStatus
Actual column count = expected count✅ Success
Actual column count ≠ expected count❌ Failed

Validate that the number of columns in a table falls within a defined range.

  • To detect schema drift or changes in table structure.
  • To ensure a table has a predictable number of columns across environments (e.g., staging vs. production).
PropertyDescription
Min ColumnsMinimum number of expected columns (minColValue)
Max ColumnsMaximum number of allowed columns (maxColValue)
ConditionStatus
Actual column count is within the defined range✅ Success
Actual column count is outside the defined range❌ Failed

Ensure that a specific column is present in the table schema.

  • To validate that required schema fields exist (e.g., order_id, customer_id).
  • To monitor schema changes that might break downstream processes.
  • To enforce critical column presence in governed datasets.
PropertyDescription
Column NameName of the column that must exist in the table.
ConditionStatus
columnName exists in the table schema✅ Success
columnName is missing from the table❌ Failed

Validate that a table’s column names match a predefined set — with or without order sensitivity.

  • To ensure schema alignment across different environments or pipeline stages.
  • To detect unexpected column additions, deletions, or reordering.
  • To enforce table contracts where the exact structure is critical.
PropertyDescription
Column NamesComma-separated list of expected column names (e.g., col1, col2, col3)
OrderedBoolean flag (true or false) — whether the order of columns must match.
OrderedConditionStatus
falseAll expected column names exist (any order)✅ Success
trueColumn names match and appear in the exact order✅ Success
falseSome columns are missing or extra❌ Failed
trueColumns are present but order is incorrect❌ Failed

Use this test to define your own validation logic using a custom SQL expression.

  • To implement logic beyond predefined test definitions.
  • To detect outliers, nulls, duplicates, or business-specific data anomalies.
  • When you need full flexibility using SQL syntax.
PropertyDescription
SQL ExpressionThe SQL query used to evaluate the test.
StrategyDefines how to interpret the result. Options: ROWS (default) or COUNT.
ThresholdThe maximum allowed rows or count before marking the test as failed. Default is 0.
StrategyConditionStatus
ROWSNumber of returned rows ≤ threshold✅ Success
ROWSNumber of returned rows > threshold❌ Failed
COUNTCount result ≤ threshold✅ Success
COUNTCount result > threshold❌ Failed

Check that the number of rows inserted during a defined time window falls within an expected range.**

  • To detect whether recent data ingestion volumes are within acceptable limits.
  • To monitor time-partitioned tables for daily/hourly/monthly data drops or spikes.
  • To validate pipeline freshness and completeness over time.
PropertyDescription
Min Row CountMinimum number of inserted rows expected in the given range.
Max Row CountMaximum number of inserted rows allowed in the given range.
Column NameTimestamp column used to filter the inserted rows.
Range TypeTime granularity: HOUR, DAY, MONTH, or YEAR.
Range IntervalNumber of units (e.g., last 1 DAY, 2 HOURS, etc.).
ConditionStatus
Row count within min and max for the interval✅ Success
Row count outside of the expected range❌ Failed

Use this test to verify data consistency between two tables, even across different platforms or services.

  • After data replication or migration (e.g., Snowflake → Redshift).
  • To validate data integrity between source and target systems.
PropertyDescription
Key ColumnsColumns used as the row-matching key. Defaults to the table's primary key if not specified.
Columns to CompareSubset of columns used for comparison. If not provided, all columns will be compared.
Second TableFully qualified name of the second table (e.g., redshift_dbt.dev.dbt_jaffle.boolean_test).
ThresholdMaximum number of mismatched rows allowed. Default is 0 (strict equality).
Filter Condition(Optional) A WHERE clause (e.g., id != 999) to limit rows involved in the comparison.
Case-Sensitive ColumnsSet to true if column name case must match exactly (default is false).
ConditionStatus
Number of differing rows ≤ threshold✅ Success
Number of differing rows > threshold❌ Failed
  • Snowflake
  • BigQuery
  • Athena
  • Redshift
  • Postgres
  • MySQL
  • MSSQL
  • Oracle
  • Trino
  • SAP Hana

Ensure that table data is being updated frequently enough to be considered fresh.

  • To monitor data pipelines for staleness or lag.
  • To detect delays in scheduled batch updates.
  • To ensure compliance with SLAs for near real-time data delivery.
PropertyDescription
ColumnThe datetime column used to determine the last update.
Time Since UpdateTime threshold (in seconds) — maximum age of the most recent data entry.
ConditionStatus
Last update time ≤ timeSinceUpdate✅ Success
Last update time > timeSinceUpdate❌ Failed

Tests applied on top of Column metrics. Here is the list of all column tests:

Ensures each value in a column appears only once.

Uniqueness

  • Primary keys or natural identifiers
  • Fields like email, username, or ID
ConditionStatus
All values are unique
Any duplicate value found

Ensures there are no NULL entries in the column.

Completeness

  • Mandatory fields such as email, amount, created_at
  • Required keys or business-critical columns
ConditionStatus
No NULLs present
Any NULL value present

This test allows us to specify how many values in a column we expect that will match a certain regex expression. Please note that for certain databases we will fall back to SQL LIKE expression. The databases supporting regex pattern as of 0.13.2 are:

  • redshift
  • postgres
  • oracle
  • mysql
  • mariaDB
  • sqlite
  • clickhouse
  • snowflake

Ensures all values match a specified regular expression pattern.

Validity

  • Emails, zip codes, IDs, structured formats
ConditionStatus
All values match regex
Any value does not match

This test allows us to specify values in a column we expect that will not match a certain regex expression. If the test find values matching the forbiddenRegex the test will fail. Please note that for certain databases we will fall back to SQL LIKE expression. The databases supporting regex pattern as of 0.13.2 are:

  • redshift
  • postgres
  • oracle
  • mysql
  • mariaDB
  • sqlite
  • clickhouse
  • snowflake

The other databases will fall back to the LIKE expression

Ensures values do not match a restricted regex pattern.

Validity

  • Prevent forbidden values, test strings, or patterns
ConditionStatus
No value matches forbidden pattern
Any value matches the pattern

Ensures values are within a predefined whitelist.

Validity

  • Enum values: status, currency, country_code
ConditionStatus
All values in set (if matchEnum = true)
Any value not in set (if matchEnum = true)
Any value from set exists (if matchEnum = false)
No values from set found (if matchEnum = false)

Ensures values are not in a specified blacklist.

Validity

  • Block invalid values like "NA", "Unknown", -1
ConditionStatus
No values from forbidden set
Any value from forbidden set found

Validates numeric values of a column are within a given range.

Accuracy

  • Username length, field input length validation
ConditionStatus
Length within [min, max]
Length < min or > max

Ensures total missing values (NULL + defined "missing" strings) match a target count.

Completeness

  • Auditing known missing values
  • Accounting for "NA", "N/A", "null"
ConditionStatus
Missing count = expected value
Missing count ≠ expected value

Ensures that the length of each string value in the column is within a defined character range.

Accuracy

  • To validate field length constraints like name, address, or description
  • To catch too-short or too-long values that may break UI or downstream logic
ConditionStatus
All values have length within [min, max]
Any value length < min or > max

Validates the maximum value of a column lies within a range.

Accuracy

  • Cap validation for score, amount, age
ConditionStatus
Max value in range [min, max]
Max < min or Max > max

Validates the minimum value of a column lies within a range.

Accuracy

  • Threshold validation for discount, price, etc.
ConditionStatus
Min value in range [min, max]
Min < min or Min > max

Validates that the mean (average) value is in the expected range.

Accuracy

  • Check dataset drift or pipeline behavior
ConditionStatus
Mean value in [min, max]
Mean < min or Mean > max

Validates the median value is in the expected range.

Accuracy

  • Median income, score, latency checks
ConditionStatus
Median in range [min, max]
Median < min or Median > max

Validates the total sum of values in a column is within a defined range.

Accuracy

  • Revenue, units sold, total scores, etc.
ConditionStatus
Sum in range [min, max]
Sum < min or Sum > max

Validates the standard deviation (spread) of values is acceptable.

Accuracy

  • Monitoring variance in numeric datasets
ConditionStatus
Std Dev in [min, max]
Std Dev < min or > max

Validates latitude/longitude values are within a defined area.

Accuracy

  • Verifying address coordinates
  • Mapping regional data
ConditionStatus
Coordinates within buffer of expected location
Any record outside allowed radius