Data Assets

A data asset represents a specific table, view, or query within a data source. It's the heart of the model: assets are profiled for stats and health, rules are authored directly on them, and those rules are run together inside a rule group. An asset can also be marked restricted (PII/PHI/PCI) to hide it from restricted viewers, or marked observable to auto-generate drift rules.

Asset types

| Type | Description |
|---|---|
| Table | A physical table or view in the database |
| Query | A custom SQL query (virtual table) |

Creating a data asset

Via the UI

  1. Navigate to Data Assets → New asset
  2. Select the data source
  3. Choose Table or Query
  4. For Table: enter the schema and table name
  5. For Query: write the SQL SELECT statement
  6. Click Save

Via the API

# Table asset
curl -X POST https://dq.your-company.com/api/v1/data-assets \
  -H "Authorization: Bearer $DQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "type": "data-asset",
      "attributes": {
        "name": "orders",
        "datasource_id": "abc123",
        "asset_type": "table",
        "schema": "public",
        "table": "orders"
      }
    }
  }'

# Query asset
curl -X POST https://dq.your-company.com/api/v1/data-assets \
  -H "Authorization: Bearer $DQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "type": "data-asset",
      "attributes": {
        "name": "recent_orders",
        "datasource_id": "abc123",
        "asset_type": "query",
        "query": "SELECT * FROM orders WHERE created_at > CURRENT_DATE - INTERVAL '\''7 days'\''"
      }
    }
  }'
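
The same two payloads can also be built programmatically, which avoids the shell quoting needed for the single quotes inside the query. The following is a minimal sketch using only the Python standard library. The helper names `asset_payload` and `build_request` are hypothetical; the endpoint, headers, and attribute names are copied from the curl examples above, and the request is prepared but not sent.

```python
import json
import urllib.request

# Placeholder host taken from the curl examples; replace with your own.
BASE_URL = "https://dq.your-company.com/api/v1"


def asset_payload(name, datasource_id, *, schema=None, table=None, query=None):
    """Build the JSON body for a table asset or a query asset (hypothetical helper)."""
    # Exactly one of `table` or `query` must be given.
    if (query is None) == (table is None):
        raise ValueError("pass exactly one of table or query")
    attributes = {"name": name, "datasource_id": datasource_id}
    if query is not None:
        attributes.update(asset_type="query", query=query)
    else:
        if schema is None:
            raise ValueError("a table asset needs a schema")
        attributes.update(asset_type="table", schema=schema, table=table)
    return {"data": {"type": "data-asset", "attributes": attributes}}


def build_request(payload, token):
    """Prepare (but do not send) the POST request for the data-assets endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/data-assets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Calling `urllib.request.urlopen(build_request(payload, token))` would submit the request; the response format is not described in this section, so it is left out of the sketch.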

Batch definitions

A batch definition controls which rows are included in a checkpoint run. This is useful for:

  • Partition filtering: only validate data for the current day's partition
  • Incremental validation: only check rows added since the last run

Example batch definition that filters daily on the created_at partition column:

{
  "batch_definition": {
    "type": "partition_key",
    "column": "created_at",
    "mode": "daily"
  }
}

Batch definition modes:

| Mode | Description |
|---|---|
| whole_table | All rows (default) |
| daily | Rows where the partition column equals today's date |
| custom_sql | Custom SQL WHERE clause appended to the asset query |
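
To make the three modes concrete, here is an illustrative sketch of how each one could map to a row filter. It is not the product's implementation: the function name `batch_filter` and the `where` key for `custom_sql` are assumptions made for this example, and the column name is interpolated without quoting for brevity.

```python
from datetime import date


def batch_filter(batch_definition, today):
    """Return a SQL predicate for the batch, or None when all rows are included."""
    mode = batch_definition.get("mode", "whole_table")  # whole_table is the default
    if mode == "whole_table":
        return None
    if mode == "daily":
        # Rows where the partition column equals today's date.
        column = batch_definition["column"]
        return f"{column} = DATE '{today.isoformat()}'"
    if mode == "custom_sql":
        # "where" is a hypothetical key name for the custom WHERE clause.
        return batch_definition["where"]
    raise ValueError(f"unknown batch mode: {mode!r}")


daily = {"type": "partition_key", "column": "created_at", "mode": "daily"}
print(batch_filter(daily, date(2024, 5, 1)))
```

Returning `None` for `whole_table` lets the caller skip the WHERE clause entirely rather than append a predicate that is always true.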

Schema discovery

Click Discover schema on a data asset to fetch the current column list from the warehouse. This populates the expectation builder's column picker.

Sampling

For large tables, you can enable sampling to speed up expectation development:

{
  "sample_size": 10000,
  "sample_method": "random"
}

Sampling affects the expectation editor preview only. Production checkpoint runs always query the full dataset (or the batch definition's filtered set).
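
The preview-only rule can be sketched as follows. This is a conceptual illustration over an in-memory list, not how the warehouse query is actually sampled; `rows_for_run` and its `preview` and `seed` parameters are hypothetical names, while `sample_size` and `sample_method` come from the configuration above.

```python
import random


def rows_for_run(rows, sampling=None, *, preview=False, seed=None):
    """Return the rows a run would evaluate under the sampling rule."""
    rows = list(rows)
    # Production runs (preview=False) ignore the sampling config entirely.
    if not preview or not sampling:
        return rows
    if sampling.get("sample_method") != "random":
        raise ValueError("only the 'random' sample_method is sketched here")
    # Never ask for more rows than exist.
    k = min(sampling["sample_size"], len(rows))
    return random.Random(seed).sample(rows, k)


config = {"sample_size": 10000, "sample_method": "random"}
preview = rows_for_run(range(50000), config, preview=True, seed=1)
production = rows_for_run(range(50000), config, preview=False)
print(len(preview), len(production))
```

With the configuration shown, the preview evaluates 10,000 of the 50,000 rows while the production call returns all of them.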