Loading Data via Pipelines

A pipeline in Databend Cloud allows for automatic discovery of file updates from Amazon S3 and automatically loads them into a table. Here are some scenarios where using a pipeline is recommended:

You have a large number of CSV or Parquet files in an S3 bucket and want to load them to Databend Cloud in one go for further analysis.
AWS automatically loads data into your S3 buckets, such as billing data, which can be automatically loaded into Databend Cloud for visualization and further analysis.
You have a continuous stream of user behavior logs being stored in S3, which can be automatically loaded into Databend Cloud using pipelines for further analysis.

Pipeline Now Free

During public beta, pipeline is offered free of charge on Databend Cloud. Limit of 4 Pipelines per organization. Contact us by submitting a ticket for additional requests.

Creating a Pipeline

To create a pipeline in Databend Cloud, you must first create a table that will serve as the target for the data to be imported into. The table schema must match the structure of the data to be imported in order for the pipeline to work properly.

To create a pipeline:

On the Data page, navigate to and select your target table, then select the Pipeline tab on the right.

Alt text

Click Configuration to create a pipeline, then provide the required information to enable access to your files in an Amazon S3 bucket.
If you import CSV files, you can specify details about the file format by clicking Settings.

Alt text

Click OK. The data loading process will begin only if all the connection information is accurate. After the loading is complete, you will be able to view the import logs like this on the page:

Alt text

Activating or Deactivating a Pipeline

Once created successfully, a pipeline is activated by default. The pipeline will periodically detect your file changes on Amazon S3 and automatically load them into the table in Databend Cloud until you deactivate it.

To deactivate a pipeline, go to the Pipeline tab and toggle the Active button.

Alt text

Loading Data via Pipelines

Creating a Pipeline

Activating or Deactivating a Pipeline

加入我们的社区

GitHub

Slack

Twitter

YouTube

开始使用 Databend Cloud

Creating a Pipeline​

Activating or Deactivating a Pipeline​

加入我们的社区

GitHub

Slack

Twitter

YouTube

开始使用 Databend Cloud

Creating a Pipeline

Activating or Deactivating a Pipeline