Using environment variables with components
With dg and components, you can configure components differently depending on the environment in which they run. To demonstrate this, we'll walk through setting up an example ELT pipeline with a Sling component that reads Snowflake credentials from environment variables.
For more information on using environment variables with non-component Dagster code, see Using environment variables and secrets in Dagster code.
1. Create a new Dagster components project
First, we'll set up a basic ELT pipeline using Sling in an empty Dagster components project:
uvx create-dagster@latest project ingestion
cd ingestion && source .venv/bin/activate
We'll install dagster-sling and scaffold an empty Sling connection component:
uv add dagster-sling
dg list components
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Key                                               ┃ Summary                                                          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster.DefinitionsComponent                      │ An arbitrary set of Dagster definitions.                         │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster.DefsFolderComponent                       │ A component that represents a directory containing multiple      │
│                                                   │ Dagster definition modules.                                      │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster.FunctionComponent                         │ Represents a Python function, alongside the set of assets or     │
│                                                   │ asset checks that it is responsible for executing.               │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster.PythonScriptComponent                     │ Represents a Python script, alongside the set of assets and      │
│                                                   │ asset checks that it is responsible for executing.               │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster.TemplatedSqlComponent                     │ A component which executes templated SQL from a string or file.  │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster.UvRunComponent                            │ Represents a Python script, alongside the set of assets or asset │
│                                                   │ checks that it is responsible for executing.                     │
├───────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
│ dagster_sling.SlingReplicationCollectionComponent │ Expose one or more Sling replications to Dagster as assets.      │
└───────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────┘
dg scaffold defs dagster_sling.SlingReplicationCollectionComponent ingest_to_snowflake
Creating defs at /.../ingestion/src/ingestion/defs/ingest_to_snowflake.
2. Use environment variables in a component
Next, we'll configure a Sling connection that syncs a local CSV file to a Snowflake database, with credentials supplied through environment variables:
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv
source: LOCAL
target: SNOWFLAKE

defaults:
  mode: full-refresh
  object: "{stream_table}"

streams:
  file://raw_customers.csv:
    object: "sandbox.raw_customers"
We will use the env function to template credentials into the Sling configuration in our defs.yaml file. Running dg check yaml will highlight that we need to explicitly encode these environment dependencies at the bottom of the file:
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  connections:
    SNOWFLAKE:
      type: snowflake
      account: "{{ env.SNOWFLAKE_ACCOUNT }}"
      user: "{{ env.SNOWFLAKE_USER }}"
      password: "{{ env.SNOWFLAKE_PASSWORD }}"
      database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml
dg check yaml --validate-requirements
/.../ingestion/src/ingestion/defs/ingest_files/defs.yaml:1 - requirements.env Component uses environment variables that are not specified in the component file: SNOWFLAKE_ACCOUNT, SNOWFLAKE_DATABASE, SNOWFLAKE_PASSWORD, SNOWFLAKE_USER
     |
   1 | type: dagster_sling.SlingReplicationCollectionComponent
     | ^ Component uses environment variables that are not specified in the component file: SNOWFLAKE_ACCOUNT, SNOWFLAKE_DATABASE, SNOWFLAKE_PASSWORD, SNOWFLAKE_USER
   2 |
   3 | attributes:
   4 |   connections:
   5 |     SNOWFLAKE:
   6 |       type: snowflake
   7 |       account: "{{ env.SNOWFLAKE_ACCOUNT }}"
   8 |       user: "{{ env.SNOWFLAKE_USER }}"
   9 |       password: "{{ env.SNOWFLAKE_PASSWORD }}"
  10 |       database: "{{ env.SNOWFLAKE_DATABASE }}"
  11 |   replications:
  12 |     - path: replication.yaml
     |
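To make the idea behind this check concrete, here is a minimal Python sketch of how templated environment variables could be compared against a declared list. It is an illustration only, not how dg implements validation: the helper names find_env_refs and missing_requirements and the regular expression are assumptions introduced here, and only the {{ env.NAME }} form shown in this guide is handled.

```python
import re

# Matches the `{{ env.NAME }}` template form used in defs.yaml above.
# Assumption: only this dotted form is handled; this is not dg's own parser.
ENV_REF = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def find_env_refs(yaml_text: str) -> set[str]:
    """Return the names of all environment variables templated into the text."""
    return set(ENV_REF.findall(yaml_text))


def missing_requirements(yaml_text: str, declared: set[str]) -> list[str]:
    """Return referenced variables that are not declared, sorted by name."""
    return sorted(find_env_refs(yaml_text) - declared)


# The component file from this guide, before any requirements are declared.
DEFS_YAML = """
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  connections:
    SNOWFLAKE:
      type: snowflake
      account: "{{ env.SNOWFLAKE_ACCOUNT }}"
      user: "{{ env.SNOWFLAKE_USER }}"
      password: "{{ env.SNOWFLAKE_PASSWORD }}"
      database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml
"""

if __name__ == "__main__":
    # Nothing is declared yet, so every referenced variable is reported.
    for name in missing_requirements(DEFS_YAML, declared=set()):
        print(f"undeclared environment variable: {name}")
```

Passing the four SNOWFLAKE_* names as declared makes missing_requirements return an empty list, which mirrors the effect of adding them under requirements.env in the component file.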
After adding the environment dependencies, running dg check yaml again will confirm that the file is valid:
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  connections:
    SNOWFLAKE:
      type: snowflake
      account: "{{ env.SNOWFLAKE_ACCOUNT }}"
      user: "{{ env.SNOWFLAKE_USER }}"
      password: "{{ env.SNOWFLAKE_PASSWORD }}"
      database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml

requirements:
  env:
    - SNOWFLAKE_ACCOUNT
    - SNOWFLAKE_USER
    - SNOWFLAKE_PASSWORD
    - SNOWFLAKE_DATABASE
dg check yaml
All component YAML validated successfully.
Next, you can invoke dg list env, which shows all environment variables configured or used by components in the project. Here we can see all of the Snowflake credentials we must configure in our shell or .env file in order to run our project:
dg list env
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Env Var            ┃ Value ┃ Components   ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ SNOWFLAKE_ACCOUNT  │       │ ingest_files │
│ SNOWFLAKE_DATABASE │       │ ingest_files │
│ SNOWFLAKE_PASSWORD │       │ ingest_files │
│ SNOWFLAKE_USER     │       │ ingest_files │
└────────────────────┴───────┴──────────────┘
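As a rough illustration of what a report like this involves, the following Python sketch checks which required variables have a value, looking in both .env-style text and the process environment. It is not dg's implementation: the helper names parse_dotenv and env_status are hypothetical, the parser handles only simple KEY=VALUE lines, and letting the process environment take precedence over the .env contents is an assumption made for this example.

```python
import os


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY='value' and KEY=value behave alike.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def env_status(required, dotenv_text="", environ=None):
    """Map each required variable name to True if it has a non-empty value."""
    environ = os.environ if environ is None else environ
    # Assumption: values from the process environment override the .env text.
    merged = {**parse_dotenv(dotenv_text), **environ}
    return {name: bool(merged.get(name)) for name in sorted(required)}


if __name__ == "__main__":
    required = [
        "SNOWFLAKE_ACCOUNT",
        "SNOWFLAKE_DATABASE",
        "SNOWFLAKE_PASSWORD",
        "SNOWFLAKE_USER",
    ]
    # Hypothetical .env contents with only one variable filled in.
    sample_dotenv = "SNOWFLAKE_DATABASE=sandbox\n"
    for name, is_set in env_status(required, sample_dotenv).items():
        print(f"{name:<20} {'set' if is_set else 'missing'}")
```

Reporting only whether a value is present, and never the value itself, keeps secrets such as passwords out of terminal output.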
You can edit the .env file in your project root to specify environment variables for Dagster to use when running the project locally. You can run dg list env again to see that they are now set:
echo 'SNOWFLAKE_ACCOUNT=...' >> .env
echo 'SNOWFLAKE_USER=...' >> .env
echo 'SNOWFLAKE_PASSWORD=...' >> .env
echo "SNOWFLAKE_DATABASE=sandbox" >> .env
dg list env
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Env Var            ┃ Value ┃ Components   ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ SNOWFLAKE_ACCOUNT  │ ✓     │ ingest_files │
│ SNOWFLAKE_DATABASE │ ✓     │ ingest_files │
│ SNOWFLAKE_PASSWORD │ ✓     │ ingest_files │