Advanced config types
In some cases, you may want to define a more complex config schema for your assets and ops. For example, you may want to define a config schema that takes in a list of files or complex data. In this guide, we'll walk through some common patterns for defining more complex config schemas.
Attaching metadata to config fields
Config fields can be annotated with metadata that provides additional information about the field, using Pydantic's Field function.
For example, we can annotate a config field with a description, which will be displayed in the documentation for the config field. We can also add a value range to a field, which is validated when the config is specified.
import dagster as dg
from pydantic import Field


class MyMetadataConfig(dg.Config):
    person_name: str = Field(description="The name of the person to greet")
    age: int = Field(gt=0, lt=100, description="The age of the person to greet")


# errors, since age is not in the valid range!
MyMetadataConfig(person_name="Alice", age=200)
Defaults and optional config fields
Config fields can have an attached default value. Fields with defaults are not required, meaning they do not need to be specified when constructing the config object.
For example, we can attach a default value of "hello" to the greeting_phrase field, and can construct MyAssetConfig without specifying a phrase. Fields which are marked as Optional, such as person_name, implicitly have a default value of None, but can also be explicitly set to None as in the example below:
from typing import Optional

import dagster as dg
from pydantic import Field


class MyAssetConfig(dg.Config):
    person_name: Optional[str] = None

    # can pass default to pydantic.Field to attach metadata to the field
    greeting_phrase: str = Field(
        default="hello", description="The greeting phrase to use."
    )


@dg.asset
def greeting(config: MyAssetConfig) -> str:
    if config.person_name:
        return f"{config.greeting_phrase} {config.person_name}"
    else:
        return config.greeting_phrase


asset_result = dg.materialize(
    [greeting],
    run_config=dg.RunConfig({"greeting": MyAssetConfig()}),
)
Required config fields
By default, fields which are typed as Optional are not required to be specified in the config, and have an implicit default value of None. If you want to require that a field be specified in the config, you may use an ellipsis (...) to require that a value be passed.
from typing import Optional

import dagster as dg
from pydantic import Field


class MyAssetConfig(dg.Config):
    # ellipsis indicates that even though the type is Optional,
    # an input is required
    person_first_name: Optional[str] = ...

    # ellipsis can also be used with pydantic.Field to attach metadata
    person_last_name: Optional[str] = Field(
        default=..., description="The last name of the person to greet"
    )


@dg.asset
def goodbye(config: MyAssetConfig) -> str:
    # skip name parts that were explicitly set to None
    name_parts = [config.person_first_name, config.person_last_name]
    full_name = " ".join(part for part in name_parts if part)
    if full_name:
        return f"Goodbye, {full_name}"
    else:
        return "Goodbye"


# errors, since person_first_name and person_last_name are required
goodbye(MyAssetConfig())

# works, since both person_first_name and person_last_name are provided
goodbye(MyAssetConfig(person_first_name="Alice", person_last_name=None))
Basic data structures
Basic Python data structures can be used in your config schemas along with nested versions of these data structures. The data structures which can be used are:
List
Dict
Mapping
For example, we can define a config schema that takes in a list of user names and a mapping of user names to user scores.
import dagster as dg


class MyDataStructuresConfig(dg.Config):
    user_names: list[str]
    user_scores: dict[str, int]


@dg.asset
def scoreboard(config: MyDataStructuresConfig): ...


result = dg.materialize(
    [scoreboard],
    run_config=dg.RunConfig(
        {
            "scoreboard": MyDataStructuresConfig(
                user_names=["Alice", "Bob"],
                user_scores={"Alice": 10, "Bob": 20},
            )
        }
    ),
)