HISTOGRAM

Introduced or updated: v1.2.377

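Computes the distribution of the data. It uses an "equal height" bucketing strategy to generate the histogram. The function returns either an empty string or a JSON string.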
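To make "equal height" concrete, here is a minimal Python sketch of one way such bucketing can work. It is an illustration under stated assumptions, not Databend's actual implementation: the function name `equal_height_histogram`, the rule that a distinct value is never split across buckets, and the skipping of NULLs are all assumptions made for this example. Bounds are kept as native values here, whereas HISTOGRAM returns them as strings.

```python
from collections import Counter
from math import ceil


def equal_height_histogram(values, max_num_buckets=128):
    """Group sorted values into buckets holding roughly the same number of rows.

    Hypothetical helper for illustration only.
    """
    values = [v for v in values if v is not None]  # assumption: NULLs are skipped
    if not values:
        return []

    # Target rows per bucket so that at most max_num_buckets buckets are produced.
    target = ceil(len(values) / max_num_buckets)

    buckets = []
    current = None
    pre_sum = 0  # rows already placed in earlier buckets
    for value, count in sorted(Counter(values).items()):
        if current is None:
            current = {"lower": value, "upper": value,
                       "ndv": 0, "count": 0, "pre_sum": pre_sum}
        # Assumption: all rows with the same value stay in one bucket.
        current["upper"] = value
        current["ndv"] += 1
        current["count"] += count
        if current["count"] >= target:
            buckets.append(current)
            pre_sum += current["count"]
            current = None

    if current is not None:  # last, possibly under-filled bucket
        buckets.append(current)
    return buckets


if __name__ == "__main__":
    for bucket in equal_height_histogram([30, 33, 13, 23, 33, 30], max_num_buckets=2):
        print(bucket)
```

With a small bucket limit, neighbouring values are merged until each bucket reaches the target row count; with a limit larger than the number of distinct values, every distinct value ends up in its own bucket.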

Syntax

HISTOGRAM(<expr>)
HISTOGRAM(<expr> [, max_num_buckets])

max_num_buckets is the maximum number of buckets that can be used. The default is 128.

For example:

SELECT HISTOGRAM(c_id) FROM histagg;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  histogram(c_id)                                                  │
│                                                  Nullable(String)                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [{"lower":"1","upper":"1","ndv":1,"count":6,"pre_sum":0},{"lower":"2","upper":"2","ndv":1,"count":6,"pre_sum":6}] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Parameters

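Parameter	Description
`<expr>`	The data type of `<expr>` must be sortable.
`max_num_buckets`	Optional constant positive integer giving the maximum number of buckets that can be used.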

Return Type

Nullable(String)

Examples

Create a table and insert sample data

CREATE TABLE histagg (
  c_id INT,
  c_tinyint TINYINT,
  c_smallint SMALLINT,
  c_int INT
);

INSERT INTO histagg VALUES
  (1, 10, 20, 30),
  (1, 11, 21, 33),
  (1, 11, 12, 13),
  (2, 21, 22, 23),
  (2, 31, 32, 33),
  (2, 10, 20, 30);

Query example 1

SELECT HISTOGRAM(c_int) FROM histagg;

Result

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                                                              histogram(c_int)                                                                                                             │
│                                                                                                              Nullable(String)                                                                                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [{"lower":"13","upper":"13","ndv":1,"count":1,"pre_sum":0},{"lower":"23","upper":"23","ndv":1,"count":1,"pre_sum":1},{"lower":"30","upper":"30","ndv":1,"count":2,"pre_sum":2},{"lower":"33","upper":"33","ndv":1,"count":2,"pre_sum":4}] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

The query result, formatted for readability:

[
  {
    "lower": "13",
    "upper": "13",
    "ndv": 1,
    "count": 1,
    "pre_sum": 0
  },
  {
    "lower": "23",
    "upper": "23",
    "ndv": 1,
    "count": 1,
    "pre_sum": 1
  },
  {
    "lower": "30",
    "upper": "30",
    "ndv": 1,
    "count": 2,
    "pre_sum": 2
  },
  {
    "lower": "33",
    "upper": "33",
    "ndv": 1,
    "count": 2,
    "pre_sum": 4
  }
]

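Field descriptions:

buckets: all buckets
- lower: lower bound of the bucket
- upper: upper bound of the bucket
- count: number of elements in the bucket
- pre_sum: total number of elements in all preceding buckets
- ndv: number of distinct values in the bucket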
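As a rough illustration of how a client might consume these fields, the following Python sketch parses the result string from the HISTOGRAM(c_int) query above and checks that each bucket's pre_sum equals the running total of count over the buckets before it. The helper name `parse_histogram` is hypothetical, and treating an empty or NULL result as "no buckets" is an assumption made for this example.

```python
import json

# Result string copied from the HISTOGRAM(c_int) example above.
result = (
    '[{"lower":"13","upper":"13","ndv":1,"count":1,"pre_sum":0},'
    '{"lower":"23","upper":"23","ndv":1,"count":1,"pre_sum":1},'
    '{"lower":"30","upper":"30","ndv":1,"count":2,"pre_sum":2},'
    '{"lower":"33","upper":"33","ndv":1,"count":2,"pre_sum":4}]'
)


def parse_histogram(text):
    """Return the list of bucket dicts (hypothetical helper).

    Assumption: an empty string or NULL (None) means there are no buckets.
    """
    return json.loads(text) if text else []


buckets = parse_histogram(result)

# pre_sum of each bucket should equal the sum of count over earlier buckets.
running_total = 0
for bucket in buckets:
    assert bucket["pre_sum"] == running_total
    running_total += bucket["count"]

total_rows = running_total  # rows covered by the histogram
print(len(buckets), "buckets covering", total_rows, "rows")
```

Note that lower and upper arrive as JSON strings, so numeric comparisons need an explicit conversion such as `int(bucket["lower"])`.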
