Set table metadata by dbt macro
... when the docs feature doesn't work for you
Hey, quick knowledge share. I recently wanted to use the docs feature in dbt. In particular, I wanted dbt to populate table and field descriptions to BigQuery. This should be possible by setting persist_docs to True in the dbt_profiles.yml file.
However after trying this for hours on end, without success, I finally figured out why. This config is only useful when you run `dbt docs generate` — for me, running `dbt build` it did nothing. I was not that keen on incorporating another step in our pipelines, and the docs generation failed anyway due to some customizations we’ve made.
So I decided to set up a macro that can run as a post-hook to use the same config, and set the table and field level descriptions. You can find it below:
{% macro set_table_descriptions() -%}
{% if execute %}
{% set node = graph.nodes['model.' + project_name + "." + this.identifier] %}
{% if node.config.persist_docs.relation and node.config.description %}
ALTER TABLE
{{ this }}
SET OPTIONS(description="""{{ node.config.description }}""");
{% endif %}
{% if node.config.persist_docs.columns and node.config.columns %}
ALTER TABLE
{{ this }}
{% for column in node.config.columns %}
ALTER COLUMN {{ column.name }} SET OPTIONS(description="""{{ column.description }}"""){% if not loop.last %},{% endif %}
{% endfor %}
{% endif %}
{% endif %}
{% endmacro %}This can be set as a post hook on a model, or all models with
+post-hook:
- "{{ set_table_descriptions() }}"The way it works is, on an actual execute, it pulls out the configuration of this particular model. It then checks whether persist_docs is set on the table and fields, and goes ahead and patches those.
Some improvements that are not super hard to do would be to also handle VIEWs, you can do that with e.g.:
ALTER {% if config.get('materialized') == "view" %}VIEW{% else %}TABLE{% endif %} {{ this }}You can also set labels on the table with a `labels` key within the OPTIONS call:
SET OPTIONS (
labels=[
STRUCT('invocation_id', '{{ invocation_id }}'),
],
description="""some text""
)That’s all for now.
