How To Build Powerful dbt Pipelines Using DuckDB in Just Minutes
Here's why everyone is talking about building dbt pipelines with DuckDB for streamlined data workflows...
The transformation logic in one of my dbt pipelines required dynamically generating a regex based on another column. My intuition was that it's a simple pandas apply with a lambda function that reads the value from the other column, but in SQL it's damn hard.
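To make the idea concrete, here is a minimal pandas sketch of a per-row regex built from another column. The column names (`prefix`, `raw`), the helper `extract_with_row_pattern`, and the pattern shape (prefix, optional separator, digits) are all hypothetical and only illustrate the technique, not my actual pipeline.

```python
import re

import pandas as pd


def extract_with_row_pattern(df, text_col, key_col, out_col="extracted"):
    """Build a regex per row from key_col and apply it to text_col."""

    def _extract(row):
        # Hypothetical pattern: the key, an optional separator, then digits.
        # re.escape keeps special characters in the key from acting as regex syntax.
        pattern = re.escape(str(row[key_col])) + r"[-_ ]?(\d+)"
        match = re.search(pattern, str(row[text_col]))
        return match.group(1) if match else None

    out = df.copy()
    # axis=1 passes each row to the function, so both columns are available.
    out[out_col] = out.apply(_extract, axis=1)
    return out


df = pd.DataFrame(
    {
        "prefix": ["INV", "PO"],
        "raw": ["ref INV-1042 paid", "PO_77 pending"],
    }
)
result = extract_with_row_pattern(df, text_col="raw", key_col="prefix")
print(result)
```

Row-wise apply is not fast, but for logic like this it is far easier to read and change than the equivalent SQL.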
I have been playing with DuckDB recently, so I built my first dbt Python model with the DuckDB adapter. I hit a small issue, which got resolved super quickly thanks to Josh Wills.
Simply install all the dependencies your model needs in the virtualenv where you run dbt. Then follow the dbt docs on Python models: write a file with a model() function that returns a single dataframe, and you are done.
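A rough sketch of what such a model file could look like is below. The upstream model name `stg_orders` and the columns `prefix`, `raw`, and `order_number` are made up for illustration, and I am assuming that `dbt.ref()` in the DuckDB adapter hands back a DuckDB relation that converts to pandas with `.df()`; check the adapter docs for your version.

```python
# models/orders_with_number.py  (hypothetical file name)
import re

import pandas as pd


def model(dbt, session):
    # Materialize the returned dataframe as a table.
    dbt.config(materialized="table")

    # Assumption: dbt.ref() returns a DuckDB relation; .df() converts it to pandas.
    orders = dbt.ref("stg_orders").df()

    def extract_number(row):
        # Regex generated per row from the hypothetical "prefix" column.
        pattern = re.escape(str(row["prefix"])) + r"[-_ ]?(\d+)"
        match = re.search(pattern, str(row["raw"]))
        return match.group(1) if match else None

    orders["order_number"] = orders.apply(extract_number, axis=1)

    # dbt expects model() to return a single dataframe.
    return orders
```

With the file saved under your models directory, something like `dbt run --select orders_with_number` should build it like any SQL model.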
💡: You can build dbt pipelines in a box using the dbt-duckdb plugins for Excel and Google Sheets, which let you connect dbt to both. This can be a powerful way to help non-technical data people move from their current setups into analytics engineering.