Defining Config

⚠️

These docs are still a work in progress. Contact us if you find anything missing or out of place.

Each pipeline is configured at the code level through its own config file. The config file is a simple JSON file with top-level keys only, and it defines the pipeline's metadata and execution parameters.

Every pipeline needs a config file. It is typically named after the pipeline, with a .config.json extension, e.g. store_fetch.config.json.
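The naming convention can be sketched in a few lines of Python. This is only an illustration: the helper name `config_path_for` is hypothetical, and it assumes the pipeline code lives in a single file such as store_fetch.py.

```python
from pathlib import Path


def config_path_for(pipeline_file: str) -> Path:
    """Map a pipeline file to its conventional config file name.

    Hypothetical helper: replaces the file's extension with
    ".config.json", e.g. store_fetch.py -> store_fetch.config.json.
    """
    return Path(pipeline_file).with_suffix(".config.json")


print(config_path_for("store_fetch.py"))
```

The same call keeps any leading directory, so a pipeline stored in a subfolder gets its config file next to it.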

runner.config.json
{
  "name": "Example to Bucket",
  "runFile": "runner.py",
  "timeout": 600,
  "cpus": 1,
  "memory": 1,
  "retries": 3,
  "loader": false
}
| Parameter | Description |
|---|---|
| name | The name of the pipeline. This is used to identify the pipeline in the frontend UI. |
| runFile | The name of the file that contains the pipeline code and, most crucially, the runner function. |
| timeout | Limits how long a pipeline execution can run. The current maximum for a single execution is 60 minutes. For longer jobs, we recommend parallelising your task so that it runs across multiple executions. |
| cpus | The number of vCPUs to use for the pipeline execution. The available options are: 1, 2, 4, 8, 16 (vCPUs). |
| memory | The amount of memory to use for the pipeline execution. The available options are: 1, 2, 4, 8, 16, 32, 64, 128 (GB). |
| retries | The number of times an execution is automatically re-triggered if an error occurs. The maximum is 5. |
| loader | A boolean indicating whether the pipeline includes a loader function to execute first. |
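The limits in the parameter table can be checked before a pipeline is deployed. The following is a minimal sketch, not part of the platform: `validate_config` is a hypothetical helper, it only checks keys that are present, and it does not enforce the 60-minute timeout limit because the unit of `timeout` is not stated here.

```python
import json

# Allowed values copied from the parameter table.
ALLOWED_CPUS = {1, 2, 4, 8, 16}
ALLOWED_MEMORY_GB = {1, 2, 4, 8, 16, 32, 64, 128}
MAX_RETRIES = 5


def _is_int(value) -> bool:
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_config(config: dict) -> list:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    if "cpus" in config:
        if not _is_int(config["cpus"]) or config["cpus"] not in ALLOWED_CPUS:
            problems.append("cpus must be one of 1, 2, 4, 8, 16")
    if "memory" in config:
        if not _is_int(config["memory"]) or config["memory"] not in ALLOWED_MEMORY_GB:
            problems.append("memory must be one of 1, 2, 4, 8, 16, 32, 64, 128")
    if "retries" in config:
        retries = config["retries"]
        if not _is_int(retries) or not 0 <= retries <= MAX_RETRIES:
            problems.append("retries must be an integer from 0 to 5")
    if "timeout" in config:
        timeout = config["timeout"]
        # Unit is not documented here, so only require a positive integer.
        if not _is_int(timeout) or timeout <= 0:
            problems.append("timeout must be a positive integer")
    if "loader" in config and not isinstance(config["loader"], bool):
        problems.append("loader must be true or false")
    return problems


example = json.loads("""
{
  "name": "Example to Bucket",
  "runFile": "runner.py",
  "timeout": 600,
  "cpus": 1,
  "memory": 1,
  "retries": 3,
  "loader": false
}
""")
print(validate_config(example))
```

Returning a list of problems rather than raising on the first one lets all issues in a config file be reported at once. The sketch does not check whether a given vCPU and memory pair is a supported combination.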

Supported Configurations

vCPUs + Memory

Memory: 1GB, 2GB, 4GB, 8GB, 16GB, 32GB, 64GB, 128GB
1 vCPU: X X X X X
2 vCPU: X X X
4 vCPU: X X X
8 vCPU: X X X
16 vCPU: X X X