Main Concepts

Dagster is a data orchestrator. The framework lets you define pipelines (DAGs) in terms of the data flow between reusable, logical components called solids. You can develop and iterate on these pipelines locally and run them anywhere.

How to use the Main Concepts section

Each page in this section contains:

  • Overview: A description of the concept
  • Relevant APIs: An index of top-level Dagster APIs that are relevant to the concept
  • Examples: Easy-to-copy code snippets for the concept
  • Patterns: A list of advanced patterns to use and anti-patterns to avoid with the concept

Sections

Solids and Pipelines

Solids and pipelines are the building blocks of Dagster code. You use them together to define execution graphs. This section covers how to define and use both solids and pipelines.
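
As a rough, library-free illustration of the idea (this is not Dagster's API; the names `SOLIDS`, `DEPENDENCIES`, and `run_pipeline` are hypothetical), a pipeline can be pictured as a set of small functions plus a mapping from each function to the upstream functions whose outputs it consumes:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical "solids": small, reusable units of computation.
def load_numbers():
    return [1, 2, 3]


def double(numbers):
    return [n * 2 for n in numbers]


def total(numbers):
    return sum(numbers)


SOLIDS = {"load_numbers": load_numbers, "double": double, "total": total}

# Hypothetical "pipeline": each solid mapped to the upstream solids it depends on.
DEPENDENCIES = {
    "load_numbers": [],
    "double": ["load_numbers"],
    "total": ["double"],
}


def run_pipeline(solids, dependencies):
    """Run every solid after its upstream solids, passing their outputs along."""
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        inputs = [results[upstream] for upstream in dependencies[name]]
        results[name] = solids[name](*inputs)
    return results


results = run_pipeline(SOLIDS, DEPENDENCIES)
```

Here the execution order falls out of the data dependencies rather than being written by hand, which is the property the description above refers to as an execution graph.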

Modes and Resources

Modes, together with resources, let you separate pipeline logic from the environments it runs in, which makes it easier to test and develop data pipelines across different environments.
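
The separation can be sketched in plain Python. This is only an analogy under stated assumptions, not Dagster's mode or resource API: `MODES`, `InMemoryStore`, `PrintingStore`, and `run_in_mode` are hypothetical names. The pipeline logic asks for a `store` resource by key, and the chosen mode decides which implementation it gets:

```python
class InMemoryStore:
    """Stand-in store for local development and unit tests."""

    def __init__(self):
        self.rows = []

    def write(self, row):
        self.rows.append(row)


class PrintingStore:
    """Stand-in for a production store; it only prints what it would write."""

    def write(self, row):
        print(f"writing {row!r} to the production store")


# Hypothetical "modes": each one names the resources available in one environment.
MODES = {
    "test": {"store": InMemoryStore},
    "prod": {"store": PrintingStore},
}


def save_greeting(resources, name):
    """Pipeline logic: depends only on the 'store' resource, not on the environment."""
    resources["store"].write(f"hello, {name}")


def run_in_mode(mode, name):
    """Build the resources for the chosen mode, then run the same logic."""
    resources = {key: factory() for key, factory in MODES[mode].items()}
    save_greeting(resources, name)
    return resources


test_resources = run_in_mode("test", "dagster")
```

Because `save_greeting` never names a concrete store, a test can run it in the `test` mode and inspect `test_resources["store"].rows` without touching any production system.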

Testing

Dagster enables you to build testable and maintainable data applications. This section shows how to unit-test your data pipelines, separate business logic from environments for testing, and run data quality tests.

Configuration System

Dagster provides a configuration system that makes your data applications reliable and maintainable. This section demonstrates how configurations work with different Dagster entities.

IO Management

IO Managers are user-provided objects that store solid outputs and load them as inputs to downstream solids. This section explains how Dagster thinks about IO management and shows how to define and use IO managers and other IO-related features.
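
A minimal sketch of the store-then-load split described above, written in plain Python. The class `InMemoryIOManager` and its simplified method signatures are assumptions made for illustration and are not Dagster's actual interface:

```python
class InMemoryIOManager:
    """Hypothetical IO manager: keeps each step's output in a dict keyed by step name."""

    def __init__(self):
        self._values = {}

    def handle_output(self, step_name, value):
        # Store the output produced by a step.
        self._values[step_name] = value

    def load_input(self, upstream_step_name):
        # Load a stored output so a downstream step can consume it.
        return self._values[upstream_step_name]


def make_numbers():
    return [1, 2, 3]


def sum_numbers(numbers):
    return sum(numbers)


io_manager = InMemoryIOManager()

# The upstream step's output is handed to the IO manager instead of
# being passed directly to the downstream step.
io_manager.handle_output("make_numbers", make_numbers())

# The downstream step receives its input from the IO manager.
total = sum_numbers(io_manager.load_input("make_numbers"))
```

Neither `make_numbers` nor `sum_numbers` knows where values are kept, so swapping the dict for a file or object store would not require changing the computation itself.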

Dagit

Dagit is a web-based interface for viewing and interacting with Dagster objects. This section walks you through Dagit's features.

Repositories and Workspaces

A workspace is a collection of user-defined repositories and information about where to find them. Dagster tools, like Dagit and the Dagster CLI, use workspaces to load user code. This section shows how to define and when to use repositories and workspaces.

Schedules, Sensors, and Partitions

Schedules launch runs on a fixed interval, while sensors launch runs in response to external state changes. This section demonstrates how to define both and how to use related capabilities such as partitioning and backfilling.

Logging

Dagster includes a rich and extensible logging system. This section showcases Dagster's built-in logger and shows how you can customize loggers to fit your logging and monitoring infrastructure.

Executors

Executors are responsible for executing steps within a pipeline run. This section explains how to define an executor to meet your computation needs.