github.com/jessewei/sqlg-airflow

ETL templating tool for Apache Airflow, with demo

Contributors: 1
Lines of Code: 903
From: 2020-10-28
To: 2020-12-20

About jessewei/sqlg-airflow

sqlg-airflow is a Docker-based deployment of Apache Airflow designed to simplify ETL pipeline orchestration. The project provides pre-configured Docker images and compose files that bundle Airflow with popular database drivers including PostgreSQL, Oracle, and SQL Server, reducing setup complexity for users who need to work with multiple data sources.

The repository includes multiple Docker Compose configurations supporting different Airflow executors: SequentialExecutor for single-machine setups, LocalExecutor for parallel task execution, and CeleryExecutor for distributed processing with Redis as a message broker. Users can customize their deployments by specifying Airflow extra packages and Python dependencies at build time, and the included entrypoint script automatically handles package installation and Airflow configuration through environment variables.
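As a rough illustration of configuring an executor through environment variables, the sketch below relies on Airflow's documented convention that a variable named `AIRFLOW__{SECTION}__{KEY}` overrides the matching configuration option. The helper names (`airflow_env_var`, `executor_env`), the Redis host name `redis`, and database index `1` are assumptions made for this example; the variable names the repository's own entrypoint script reads may differ.

```python
# Minimal sketch, not the project's entrypoint logic.
# Assumption: helper names, the "redis" host, and DB index 1 are hypothetical.

ALLOWED_EXECUTORS = {"SequentialExecutor", "LocalExecutor", "CeleryExecutor"}


def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow maps to [section] key."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def executor_env(executor: str, redis_host: str = "redis", redis_port: int = 6379) -> dict:
    """Return the environment variables that select the given executor."""
    if executor not in ALLOWED_EXECUTORS:
        raise ValueError(f"unsupported executor: {executor}")
    env = {airflow_env_var("core", "executor"): executor}
    if executor == "CeleryExecutor":
        # Only the Celery executor needs a message broker URL.
        env[airflow_env_var("celery", "broker_url")] = (
            f"redis://{redis_host}:{redis_port}/1"
        )
    return env


if __name__ == "__main__":
    for name, value in executor_env("CeleryExecutor").items():
        print(f"{name}={value}")
```

The printed `NAME=value` pairs could be placed in the `environment` section of a Compose service or in an env file, depending on how the deployment is laid out.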

The project emphasizes ease of use through environment variable configuration for database connections, automatic fernet key generation for encrypted passwords, and straightforward volume mounting for custom plugins and Python requirements. It includes Docker Compose files that pre-configure PostgreSQL and Redis backends with sensible defaults, and provides scaling capabilities for worker processes in distributed setups.
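For readers who prefer to supply their own key instead of relying on automatic generation, a Fernet key is 32 random bytes encoded as URL-safe base64. The sketch below produces one with the standard library only; it is an illustration under that assumption, not the mechanism the project's entrypoint script uses, and the function name `generate_fernet_key` is hypothetical.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return a key in Fernet format: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # Airflow reads the key from the [core] fernet_key option, which can be
    # set through the AIRFLOW__CORE__FERNET_KEY environment variable.
    print(f"AIRFLOW__CORE__FERNET_KEY={generate_fernet_key()}")
```

All containers that share a metadata database need the same key; otherwise connection passwords encrypted by one container cannot be decrypted by another.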
