跳到主要内容

Virtual Column

企业版功能
VIRTUAL COLUMN是企业版功能。 如需获取许可证,请联系 Databend 支持团队

A virtual column is a construct formed by extracting nested fields within Variant data and storing that data in separate storage files. Consider using virtual columns when you regularly query specific nested fields within Variant data to realize the following benefits:

  • Accelerated Query Processing: Virtual columns streamline the querying process by eliminating the need to traverse the entire nested structure to locate the desired data. Direct data retrieval from virtual columns parallels the process of accessing regular columns, resulting in a significant acceleration of query execution.

  • Reduced Memory Usage: Variant data often includes numerous internal fields, and reading all of them can lead to substantial memory consumption. By transitioning to reading virtual columns, there is a notable reduction in memory usage, mitigating the risk of potential memory overflows.

Alt text

Managing Virtual Columns

Databend provides a variety of commands to manage virtual columns. For details, see VIRTUAL COLUMN.

Usage Examples

This example demonstrates the practical use of virtual columns and their impact on query execution:

-- Create a table named 'test' with columns 'id' and 'val' of type Variant.
CREATE TABLE test(id int, val variant);

-- Create virtual columns for specific elements in the 'val' column.
CREATE VIRTUAL COLUMN (
val ['name'], -- Extract the 'name' field.
val ['tags'] [0], -- Extract the first element in the 'tags' array.
val ['pricings'] [0] ['type'] -- Extract the 'type' field from the first pricing in the 'pricings' array.
) FOR test;

-- Insert a sample record into the 'test' table with Variant data.
INSERT INTO
test
VALUES
(
1,
'{"id":1,"name":"databend","tags":["powerful","fast"],"pricings":[{"type":"Standard","price":"Pay as you go"},{"type":"Enterprise","price":"Custom"}]}'
);

-- Explain the query execution plan for selecting specific fields from the table.
EXPLAIN
SELECT
val ['name'],
val ['tags'] [0],
val ['pricings'] [0] ['type']
FROM
test;

-[ EXPLAIN ]-----------------------------------
TableScan
├── table: default.default.test
├── output columns: [val['name'] (#2), val['tags'][0] (#3), val['pricings'][0]['type'] (#4)]
├── read rows: 1
├── read bytes: 203
├── partitions total: 1
├── partitions scanned: 1
├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1, bloom pruning: 0 to 0>]
├── push downs: [filters: [], limit: NONE, virtual_columns: [val['name'], val['pricings'][0]['type'], val['tags'][0]]]
└── estimated rows: 1.00

-- Explain the query execution plan for selecting only the 'name' field from the table.
EXPLAIN
SELECT
val ['name']
FROM
test;

-[ EXPLAIN ]-----------------------------------
TableScan
├── table: default.default.test
├── output columns: [val['name'] (#2)]
├── read rows: 1
├── read bytes: 203
├── partitions total: 1
├── partitions scanned: 1
├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1, bloom pruning: 0 to 0>]
├── push downs: [filters: [], limit: NONE, virtual_columns: [val['name']]]
└── estimated rows: 1.00

-- Display all the virtual columns defined in the system.
SHOW VIRTUAL COLUMNS;

┌─────────────────────────────────────────────────────────────────────────────┐
databasetable │ virtual_columns │
├──────────┼────────┼─────────────────────────────────────────────────────────┤
default │ test │ val['name'], val['pricings'][0]['type'], val['tags'][0]
└─────────────────────────────────────────────────────────────────────────────┘

-- Drop the virtual columns associated with the 'test' table.
DROP VIRTUAL COLUMN FOR test;
这篇文章对您有帮助吗?
Yes
No
开始使用 Databend Cloud
低成本
快速分析
多种数据源
弹性扩展
注册