SQL Statements

当前支持的 SQL 语义和 ClickHouse 社区版比较类似,但还是建议参考手册中的示例进行使用。文中的一些示例和内容参考了社区文档进行修改,来确保可以在 ByteHouse 中正常使用。

📘

Note

Symbols between token [] are optional. For example [IF NOT EXISTS]
Token | means or
Token ... means can repeat more times

Alter Statement

ADD COLUMN

Adds a new column to the table.
Syntax

ALTER TABLE [tableIdentifier] ADD COLUMN [IF NOT EXISTS] [tableColumnDfnt] [AFTER name_after]
  • IF NOT EXISTS clause is included, the query won’t return an error if the column already exists.

  • AFTER name_after (the name of another column), the column is added after the specified one in the list of table columns.

Example

# Step 1: create a table 
CREATE TABLE db_name.table_name
(
    `order_by_column ` String
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column `)

# Step 2: add column
ALTER TABLE db_name.table_name ADD COLUMN IF NOT EXISTS column_name String COMMENT 'column comment here' AFTER order_by_column

MODIFY COLUMN

Syntax

Modify Column

ALTER TABLE [tableIdentifier] MODIFY COLUMN [IF EXISTS] [tableColumnDfnt] 

If the modify column is:

  • Order by column: can modify its default_expr, comment, codec.

  • Partition by column: can modify its comment.

  • Normal column: can modify itstype, default_expr, comment, codec.

Example

# Step 1: create a table 
CREATE TABLE db_name.table_name
(
    `order_by_column ` String,
    `normal_column` Int64 DEFAULT 0
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column `)

# Step 2: modify column normal_column_name & update its data type, comment and default value
ALTER TABLE db_name.table_name MODIFY COLUMN IF EXISTS normal_column String DEFAULT 'new_default_value' COMMENT 'new comment' 

DROP COLUMN

Deletes a column in the table. Partition by columns and order by columns are not allowed to drop.
Syntax

ALTER TABLE [tableIdentifier] DROP COLUMN [IF EXISTS] column_name;
  • IF EXISTS clause is specified, the query won’t return an error if the column does not exist.

Example

# Step 1: create a table 
CREATE TABLE db_name.table_name
(
    `order_by_column` String,
    `normal_column` Int64
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# Step 2: drop the normal column
ALTER TABLE db_name.table_name DROP COLUMN IF EXISTS normal_column

RENAME COLUMN

Rename a column in the table. There are some usage notes:

  • Cannot rename column to an existing column.

  • Cannot rename column to itself.

  • Rename from column must exists.

Syntax

ALTER TABLE [tableIdentifier] RENAME COLUMN [IF EXISTS] column_name_from TO column_name_to;
  • IF EXISTS clause is specified, the query won’t return an error if the column does not exist.

Example

# Step 1: create a table 
CREATE TABLE db_name.table_name
(
    `order_by_column` String,
    `old_column_name` Int64
)
ENGINE = `CnchMergeTree`
ORDER BY (`old_column_name`)

# Step 2: rename column
ALTER TABLE db_name.table_name RENAME COLUMN old_column_name TO new_column_name

ADD CONSTRAINT

Adds a constraint to the table.
Syntax

ALTER TABLE [tableIdentifier] ADD CONSTRAINT [IF NOT EXISTS] constraint_name CHECK columnExpr;
  • IF NOT EXISTS clause is included, the query won’t return an error if the constraint already exists.

Example

# Step 1: create a table
CREATE TABLE `example_table`
(
    `order_by_column` String
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# Step 2: add constraint check_str
ALTER TABLE example_table ADD CONSTRAINT IF NOT EXISTS check_str CHECK order_by_column != 'forbidden_string'

DROP CONSTRAINT

Deletes a constraint to the table.
Syntax

ALTER TABLE [tableIdentifier] DROP CONSTRAINT [IF EXISTS] constraint_name;
  • IF EXISTS clause is specified, the query won’t return an error if the constraint does not exist.

Example

# Step 1: create a table with the constraint check_str
CREATE TABLE `example_table`
(
    `order_by_column` String,
    CONSTRAINT check_str CHECK order_by_column != 'forbidden_string'
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# Step 2: drop constraint check_str
ALTER TABLE example_table DROP CONSTRAINT IF EXISTS check_str

MODIFYTTL

Change table TTL. Columns used in ttl Clause must be in partition by columns.
Syntax

ALTER TABLE [tableIdentifier] MODIFY [ttlClause];

Example

# Step 1: create a table with table TTL to be 1 day
CREATE TABLE `example_table`
(
    `d` DateTime
)
ENGINE = `CnchMergeTree`
PARTITION BY d
ORDER BY `d`
TTL d + INTERVAL 1 DAY

# Step 2: modify table ttl & change the TTL to 2 month
ALTER TABLE example_table
    MODIFY TTL d + INTERVAL 2 MONTH

REMOVETTL

Remove table TTL.
Syntax

ALTER TABLE [tableIdentifier] REMOVE TTL;

Example

# Step 1: create a table with table TTL to be 1 day
CREATE TABLE `example_table`
(
    `d` DateTime
)
ENGINE = `CnchMergeTree`
PARTITION BY d
ORDER BY d
TTL d + INTERVAL 1 DAY

# Step 2: remove table ttl
ALTER TABLE example_table
    REMOVE TTL

AST Statement

Shows the execution plan of a statement. Dumps the query AST.
Syntax

AST query

Output

ColumnDescription
explainThe parsedASTof the statement

Example

AST SELECT 1;
Explain ParsedAST (children 1)
 SelectWithUnionQuery (children 1)
 ExpressionList (children 1)
  SelectQuery (children 1)
  ExpressionList (children 1)
   Literal UInt64_1

Create Statement

CREATE DATABASE

Creates a new database
Syntax

CREATE DATABASE [IF NOT EXISTS] [databaseIdentifier] [ENGINE=Cnch]
  • IF NOT EXISTS : query won’t return an error if the database already exists.

Example

  1. Create database with default database engine.
CREATE DATABASE IF NOT EXISTS test;
  1. Create database with optional engine clause. Currently only Cnch engine is allowed.
CREATE DATABASE IF NOT EXISTS test ENGINE=Cnch;

CREATE TABLE

Creates a new table.

Create Table With Explicit Schema

Syntax

CREATE TABLE [IF NOT EXISTS] [tableIdentifier] [UUID uuid]
(
    [tableColumnDfnt],
    [CONSTRAINT constraint_name CHECK columnExpr,]
    ...
) [engineClause]
  • IF NOT EXISTS : query won’t return an error if the table already exists.

  • UUID : table will have an uuid provided by user. Otherwise, a generated uuid will be used.

  • [CONSTRAINT constraint_name CHECK columnExpr] : add a constraint to table.

    • columnExpr after CHECK should be a boolean expression.

    • If constraints are checked for every row in INSERT query. Exception will be raised if any constraint is not satisfied.

    • Adding large amount of constraints can negatively affect performance of big INSERT queries.

Example

  1. Create Table
CREATE TABLE IF NOT EXISTS test.createTable(
    id UInt32,
    name String DEFAULT '',
    CONSTRAINT constraint1 CHECK id > 10
) 
ENGINE=CnchMergeTree
ORDER BY id

CREATE VIEW

CREATE NORMAL VIEW

Normal views don’t store any data. They just perform a read from another table on each access. In other words, a normal view is nothing more than a saved query. When reading from a view, this saved query is used as a subquery in the FROM clause.
Syntax

CREATE [OR REPLACE] VIEW [IF NOT EXISTS] [tableIdentifier] [UUID uuid] AS [selectUnionStmt]
  • OR REPLACE is included, ByteHouse will create a new view and replace the old view with the same name.

  • IF NOT EXISTS clause is included, the query won’t return an error if the table already exists.

  • UUID is specified, table will have an uuid provided by user. Otherwise, a generated uuid will be used.

Example

# Step 1: create a table
CREATE TABLE `example_table`
(
    `order_by_column` String
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# step 2: create a view based on the created table
CREATE VIEW example_view AS SELECT * FROM example_table

Describe Statement

Returns the columns definition of a table
Syntax

DESC|DESCRIBE [TABLE] [tableIdentifier]

Example

DESCRIBE example_view
N ameTypeDefaultTypeDefaultExpressionCommentCodecExpressionTTLExpressionLastQueriedAtLastQueriedBy
order_by_columnString

Explain Statement

EXPLAIN SYNTAX

checks the syntax validity.
Syntax

EXPLAIN SYNTAX query

Example

EXPLAIN SYNTAX SELECT 1
syntax_correctsyntax_messagehas_joinhas_asterisk
100

Drop Statement

DROP DATABASE

Deletes the Database.
Syntax

DROP DATABASE [IF EXISTS] [databaseIdentifier]

Example

# create a database
CREATE DATABASE example_db

# drop a database
DROP DATABASE example_db

DROP TABLE

Deletes the table.
Syntax

DROP TABLE [IF EXISTS] [tableIdentifier]

Example

# step 1: create a table
CREATE TABLE `example_table`
(
    `order_by_column` String
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# step 2: drop a table
DROP TABLE example_table

Insert Statement

Adds data into a table.

INSERT VALUES

Data can be inserted with this basic query format.
Syntax

INSERT INTO [TABLE] [tableIdentifier] [columnsClause] VALUES

Example

# step 1: create a table
CREATE TABLE `example_table`
(
    `a` Int8,
    `b` String,
    `c` Date
)
ENGINE = `CnchMergeTree`
ORDER BY (`a`)

# step 2: insert 2 rows into the table
INSERT INTO example_table VALUES (1, 'a', '2021-07-27'), (2, 'b', '2021-07-27')

INSERT FORMAT

Data can be passed to the INSERT in aformatsupported by ByteHouse.
Syntax

INSERT INTO [TABLE] [tableIdentifier] [columnsClause] FORMAT format_name

Example

# step 1: create a table
CREATE TABLE `example_table`
(
    `a` Int8,
    `b` String,
    `c` Date
)
ENGINE = `CnchMergeTree`
ORDER BY (`a`) 

# step 2: insert 2 rows in value format into the table
INSERT INTO example_table FORMAT VALUES (1, 'a', '2021-07-27'), (2, 'b', '2021-07-27');

INSERT SELECT

Inserts the results of a SELECT query. Columns are mapped according to their position in the SELECT clause. However, their names in the SELECT expression and the table for INSERT may differ. If necessary, typecasting is performed.
Syntax

INSERT INTO [TABLE] [tableIdentifier] [columnsClause] [selectUnionStmt]

Example

INSERT INTO example_table SELECT * FROM example_table

INSERT INFILE

Insert data to a table from a file. gateway-client only
Supported file format:

  • .csv

  • .json

  • .avro

  • .parquet

Syntax

INSERT INTO [TABLE] [tableIdentifier] [columnsClause] [FORMAT format_name] INFILE filepath

Example

INSERT INTO my_table FORMAT csvwithnames INFILE '/Users/my_name/Downloads/ETH-USD.csv'

Rename Statement

RENAME TABLE

Renames one or more tables.
Syntax

RENAME TABLE [tableIdentifier] TO [tableIdentifier], [tableIdentifier] TO [tableIdentifier]

Example

# step 1: create a table
CREATE TABLE `example_table`
(
    `a` Int8,
    `b` String,
    `c` Date
)
ENGINE = `CnchMergeTree`
ORDER BY (`a`)

# step 2: rename the new created table
RENAME TABLE example_table to new_table_name

Select Union Statement

SELECT query, possibly with UNION ALL.
Syntax

[selectStmt] [UNION ALL [selectStmt]...

Example

CREATE TABLE IF NOT EXISTS test.example_table1 (column1 Date, column2 UInt32) ENGINE=CnchMergeTree ORDER BY column1;
CREATE TABLE IF NOT EXISTS test.example_table2 (column1 Date, column2 UInt32) ENGINE=CnchMergeTree ORDER BY column1;
INSERT INTO test.example_table1 VALUES ('2015-08-07',1);
INSERT INTO test.example_table2 VALUES ('2015-08-08',2);
SELECT * FROM test.example_table1 UNION ALL SELECT * FROM test.example_table2;
column1column2
2015-08-071
column1column2
2015-08-082

Select Statement

SELECT queries perform data retrieval from tables.
Syntax

[withClause]
SELECT [DISTINCT][columnExprList])
[fromClause]
[arrayJoinClause]
[joinClause]
[prewhereClause]
[whereClause]
[groupByClause]
[havingClause]
[orderByClause]
[limitByClause]
[limitClause]
[settingsClause]
  • DISTINCT , only unique rows will remain in a query result. It works with NULL as if NULL were a specific value, and NULL==NULL .

  • Asterisk symbol (*) , An asterisk can be put in any part of a query instead of an expression. When the query is analyzed, the asterisk is expanded to a list of all table columns (excluding the MATERIALIZED and ALIAS columns).

Set Statement

Set a list of settings for the current session.
Syntax

SET [settingExprList]

Example

# set virtual warehouse
SET WAREHOUSE your_warehouse_name

# set role
SET ROLE AccountAdmin

# set if to choose ansi mode or not
SET ansi_sql = 1

Show Statement

SHOW DATABASES

Prints a list of all databases. The command does not require a running warehouse to execute.
Syntax

SHOW DATABASES

Examples

SHOW DATABASES
NameCreatedAtCreatedByUpdatedAtUpdatedByLastQueriedAtLastQueriedByCommentsEngine
db_name1162736652512345616273665251234561627366525123456comment here
db_name2162736652512345616273665251234561627366525123456comment here

SHOW TABLES

Displays a list of tables. The command does not require a running warehouse to execute.
Syntax

SHOW TABLES [FROM [databaseIdentifier] [LIKE STRING_LITERAL]

If the FROM clause is not specified, the query returns the list of tables from the current database.
Example

SHOW TABLES FROM my_db LIKE '%table'
NameCreatedAtCreatedByUpdatedAtUpdatedByLastQueriedAtLastQueriedByTypeHasUniqueKey
sells_table162736652512345616273665251234561627366525123456TABLE0
view_table162736652512345616273665251234561627366525123456VIEW0

SHOW CREATE TABLE

Display table create sql
Syntax

SHOW CREATE [TABLE] [tableIdentifier]

Example

# step 1: create a table
CREATE TABLE `example_table`
(
    `order_by_column` String
)
ENGINE = `CnchMergeTree`
ORDER BY (`order_by_column`)

# step 2: show create table
SHOW CREATE TABLE example_table
Query
CREATE TABLE example_db.example_tableCOMMENT '',(, order_by_column String,),ENGINE=CnchMergeTree,PRIMARY KEY order_by_column,ORDER BY order_by_column;

Truncate Statement

Removes all data from a table.
Syntax

TRUNCATE TABLE [IF EXISTS] [tableIdentifier]

Example

TRUNCATE TABLE IF EXISTS example_table

Use Statement

Sets the current database for the session. The current database is used for searching for tables if the database is not explicitly defined in the query with a dot before the table name.
Note: This query can’t be made when using the HTTP protocol, since there is no concept of a session.

Syntax

USE [databaseIdentifier];

Example

USE example_db

Database Identifier

Syntax

database_name
  • database_name: String. Name of the database.

Table Identifier

Syntax

[database_name.]table_name
  • database_name: String. Name of the database.

  • table_name: String. Name of the table.

  • If database_name is not provided, current database will be used.

Table Column Definition

Column definition

Syntax

column_name column_type [tableColumnPropertyExpr] [COMMENT comment] [codecExpr]
column_name [column_type] [tableColumnPropertyExpr] [COMMENT comment] [codecExpr]
  • If column_type is not explicit declared, [tableColumnPropertyExpr] must be provided for column type inference.

Example

  1. Column with explicit type
id UInt32
  1. Column with inexplicit type but type can be inferred from [tableColumnPropertyExpr] .
id DEFAULT 1
  1. Column with comment
id UInt32 COMMENT 'id'
  1. Column with codec
id UInt32 CODEC(LZ4)

Table Column Property Expression

Column properties.

Syntax

DEFAULT [columnExpr]
  • DEFAULT columnExpr : Normal default value. If the INSERT query doesn’t specify the corresponding column, it will be filled in by computing the corresponding expression.

Example

  1. Column with default expression
id UInt32 DEFAULT 1

Setting Expression List

Syntax

settingExprList

settingExpr [,settingExpr]...

settingExpr

setting_name=setting_val
  • setting_name: String. Name of the setting.

  • setting_val: number or string. Value of the setting.

EngineClause

Syntax

ENGINE=engine_name
-- below are optinal field for table engine only
[orderByClause]
[partitionByClause]
[primaryKeyClause]
[uniqueKeyClause]
[sampleByClause]
[ttlClause]
[settingsClause]
  • For database engine, engine_name can only be Cnch.

  • For table engine, engine_name can only be CnchMergeTree.

orderByClause

Syntax

orderByClause

ORDER BY orderExpr [,orderExpr]...

orderExpr

[columnExpr] [ASCENDING|ASC|DESCENDING|DESC] [NULLS [FIRST|LAST] [COLLATE STRING_LITERAL]
  • [ASCENDING|ASC|DESCENDING|DESC] : determines the sorting direction. If the direction is not specified, ASC is assumed.

  • [NULLS [FIRST|LAST] : determines NaN and NULL sorting order.

    • By default or with the NULLS LAST modifier: first the values, then NaN , then NULL .

    • With the NULLS FIRST modifier: first NULL , then NaN , then other values.

  • [COLLATE STRING_LITERAL] : For sorting by String values, you can specify collation.

    • Collate is supported in LowCardinality, Nullable, Array and Tuple

    • When using COLLATE , sorting is always case-insensitive.

    • only recommend using COLLATE for final sorting of a small number of rows, since sorting with COLLATE is less efficient than normal sorting by bytes.

Example

  1. Order by multiple orderExpr
CREATE TABLE IF NOT EXISTS test.orderByClause (id UInt32, val UInt32) ENGINE=CnchMergeTree() ORDER BY id;
INSERT INTO TABLE test.orderByClause VALUES (1,5),(1,15),(2,5),(2,15);
SELECT * FROM test.orderByClause ORDER BY id ASC, val DESC;
  1. Order by NULLS FIRST
CREATE TABLE IF NOT EXISTS test.orderByClause (id UInt32, val Nullable(UInt32)) ENGINE=CnchMergeTree() ORDER BY id;
INSERT INTO test.orderByClause VALUES (1,2),(2,NULL),(3,2),(3,3),(3,NULL);
SELECT * FROM test.orderByClause ORDER BY val DESC NULLS FIRST;
  1. Order by with COLLATE
CREATE TABLE IF NOT EXISTS test.orderByClause (x UInt32, s Nullable(String)) ENGINE=CnchMergeTree ORDER BY x;
INSERT INTO test.orderByClause VALUES (1,'bca'),(2,NULL),(3,'ABC'),(4,'123a'),(5,'abc'),(6,NULL),(7,'BCA');
SELECT * FROM test.orderByClause ORDER BY s ASC COLLATE 'ru';

partitionByClause

Used by engineClause to define the partition key. Partition key can be any expression from the table columns.
Syntax

PARTITION BY [columnExpr]

Example

  1. Partition key defined by a columnExpr
CREATE TABLE test.partitionByClause
(
    VisitDate Date,
    Hour UInt8,
    ClientID UUID
)
ENGINE = CnchMergeTree
PARTITION BY VisitDate
ORDER BY Hour;

In this example, records will be partitioned by the VisitDate.
2. Partition key defined in a tuple of columnExpr

CREATE TABLE test.partitionByClause
(
    VisitDate Date,
    Hour UInt8,
    ClientID UUID
)
ENGINE = CnchMergeTree
PARTITION BY (VisitDate,Hour)
ORDER BY Hour;

In this example, records will be partitioned by the hour of the visitDate.

primaryKeyClause

Used by engineClause to define the primary key if it differs from order by key.
By default the primary key is the same as the order by key. Thus in most cases it is unnecessary to specify a separate primaryKeyClause.
Syntax

PRIMARY KEY [columnExpr]

Example

CREATE TABLE test.primaryKeyClause
(
    VisitDate Date,
    Hour UInt8,
    ClientID UUID
)
ENGINE = CnchMergeTree
ORDER BY (VisitDate,Hour)
PRIMARY KEY VisitDate;

uniqueKeyClause

Used by engineClause to define the unique key. If specified, the table creates a partition-level constraint that all unique key values in the same partition must be distinct. If you try to insert a row with a unique key that matches an existing row, it will first delete the old row and then insert the new row.
Syntax

UNIQUE KEY [columnExpr]

Note that only expressions with the following data types can be used in UNIQUE KEY

  • Primary data type: [U]Int8/16/32/64, Boolean, Date, DateTime, String

  • Composite date type: Tuple composed of elements in supported primary data types

Example

  1. Unique key defined by a single column
CREATE TABLE test.uniqueKeyClause
(
  `event_date` Date,
  `order_id` UInt64,
  `state` UInt32,
  `amount` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY event_date
ORDER BY state
UNIQUE KEY order_id;

-- rows with the same key will replace previous rows
INSERT INTO test.uniqueKeyClause VALUES
('2021-03-01', 1001, 1, 100),
('2021-03-01', 1002, 1, 200),
('2021-03-01', 1001, 2, 100),
('2021-03-02', 1001, 1, 400);

SELECT * FROM test.uniqueKeyClause ORDER BY event_date, order_id;
┌─event_date─┬─order_id─┬─state─┬─amount─┐
│ 2021-03-01 │     1001 │     2 │    100 │
│ 2021-03-01 │     1002 │     1 │    200 │
│ 2021-03-02 │     1001 │     1 │    400 │
└────────────┴──────────┴───────┴────────┘

INSERT INTO test.uniqueKeyClause VALUES
('2021-03-01', 1002, 2, 200),
('2021-03-02', 1001, 2, 400),
('2021-03-02', 1002, 1, 300);

SELECT * FROM test.uniqueKeyClause ORDER BY event_date, order_id;
┌─event_date─┬─order_id─┬─state─┬─amount─┐
│ 2021-03-01 │     1001 │     2 │    100 │
│ 2021-03-01 │     1002 │     2 │    200 │
│ 2021-03-02 │     1001 │     2 │    400 │
│ 2021-03-02 │     1002 │     1 │    300 │
└────────────┴──────────┴───────┴────────┘
  1. Composite unique key
CREATE TABLE test.uniqueKeyClause
(
  `event_date` Date,
  `region` UInt64,
  `state` String,
  `amount` UInt64
)
ENGINE = CnchMergeTree
PARTITION BY event_date
ORDER BY state
UNIQUE KEY (region, state);

sampleByClause

Used by engineClause to define the sample key. The sampling expression must contain primary key and result of sampling expression must be unsigned integer.
Syntax

SAMPLE BY [columnExpr]

Example

CREATE TABLE IF NOT EXISTS test.sampleByClause 
(
    id UInt32
) 
ENGINE=CnchMergeTree 
ORDER BY id 
SAMPLE BY id;

ttlClause

Expression to specify storage duration of rows.

  • columnExpr return results must have one Date or DateTime column.

  • Columns used in ttlClause must be in partition by columns.

Syntax

TTL [columnExpr]

Example

CREATE TABLE test.ttlClause
(
    VisitDate Date,
    Hour UInt8,
    ClientID UUID
)
ENGINE = CnchMergeTree
ORDER BY Hour
PARTITION BY VisitDate
TTL VisitDate + INTERVAL 1 DAY;

settingsClause

Syntax

SETTINGS [settingExprList]

columnsClause

Used by insertStmt to represent a list of columns.
Syntax

(column_name[,column_name]...)

withClause

Bytehouse supports Common Table Expressions( CTE ). The results of WITH clause can be used in the remaining SELECT query. There are certain limitations to the support, including

  • Recursions are not allowed in CTE

  • Subqueries are not allowed in CTE

We only support below syntax.

Syntax

WITH [columnExpr] AS identifier

Example

  1. Using constant expression as “variable”
WITH '2019-08-01 15:23:00' as ts SELECT ts
  1. Using results of a scalar subquery (1 row)
/* this example would return TOP 10 of most huge tables */
WITH
(
    SELECT 1
) AS number
SELECT number;

limitClause

Syntax

select the first m rows from the result.

LIMIT m

select the m rows from the result after skipping the first n rows.

LIMIT n, m
LIMIT m OFFSET n
  • n and m must be non-negative integers.

  • If there is no orderByClause that explicitly sorts results, the choice of rows for the result may be arbitrary and non-deterministic.

Example

  1. Example of Limit m.
CREATE TABLE IF NOT EXISTS test.limitClause (id UInt32) engine=CnchMergeTree() order by id;
INSERT INTO test.limitClause VALUES (1),(2),(3),(4),(5);
SELECT * FROM test.limitClause LIMIT 2; -- first 2 values will be return
  1. Example of Limit n,m
CREATE TABLE IF NOT EXISTS test.limitClause (id UInt32) engine=CnchMergeTree() order by id;
INSERT INTO test.limitClause VALUES (1),(2),(3),(4),(5);
SELECT * FROM test.limitClause LIMIT 1,2; -- skip first value,next 2 values will be return
SELECT * FROM test.limitClause LIMIT 2 OFFSET 1; -- skip first value,next 2 values will be return

limitByClause

select the m rows for each distinct value of expressions
Syntax

select the first m rows from the result for each distinct value of columnExpr.

LIMIT m BY [columnExprList]

Example

  1. Example of Limit m by columnExpr
CREATE TABLE test.limitByClause(id UInt32, val UInt32) ENGINE = CnchMergeTree() order by id;
INSERT INTO test.limitByClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT * FROM test.limitByClause ORDER BY id, val LIMIT 2 BY id; -- expect 4 rows. (1, 10), (1, 11), (2, 20), (2, 21) 

In this example, for each id (include value 1 and 2), we need to return 2 rows.

havingClause

Filtering the aggregation results produced by groupByClause. It is similar to the whereClause, but the difference is that WHERE is performed before aggregation, while HAVING is performed after it.
Syntax

HAVING [columnExpr](http://columnexpr/)

Example

CREATE TABLE test.havingClause(id UInt32, val UInt32) ENGINE = CnchMergeTree() ORDER BY id;
INSERT INTO test.havingClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT id FROM test.havingClause GROUP BY id HAVING count(id)>2; -- only 1 is expected. 

arrayJoinClause

For table contains array column, array join can produce a new table that has a column with each individual array element of that initial column, while values of other columns are duplicated
Syntax

[LEFT] ARRAY JOIN [columnExprList]
  • You can specify only one arrayJoinClause in a SELECT query.

  • [LEFT] ARRAY JOIN : types of ARRAY JOIN

    • ARRAY JOIN - In base case, empty arrays are not included in the result of JOIN .

    • LEFT ARRAY JOIN - The result of JOIN contains rows with empty arrays. The value for an empty array is set to the default value for the array element type (usually 0, empty string or NULL).

Example

CREATE TABLE test.arrayJoinClause(s String, arr Array(UInt8)) ENGINE = CnchMergeTree ORDER BY s;
INSERT INTO test.arrayJoinClause VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
SELECT s, arr FROM test.arrayJoinClause ARRAY JOIN arr;

prewhereClause

Prewhere is an optimization to apply filtering more efficiently. It is enabled by default even if PREWHERE clause is not specified explicitly. It works by automatically moving part of WHERE condition to prewhere stage. The role of PREWHERE clause is only to control this optimization if you think that you know how to do it better than it happens by default.
With prewhere optimization, at first only the columns necessary for executing prewhere expression are read. Then the other columns are read that are needed for running the rest of the query, but only those blocks where the prewhere expression is “true” at least for some rows. If there are a lot of blocks where prewhere expression is “false” for all rows and prewhere needs less columns than other parts of query, this often allows to read a lot less data from disk for query execution.
Syntax

PREWHERE [columnExpr](http://columnexpr/)

Example

CREATE TABLE test.prewhereClause(id UInt32, val UInt32) ENGINE = CnchMergeTree() ORDER BY id;
INSERT INTO test.prewhereClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT * FROM test.prewhereClause PREWHERE id=1 WHERE val>10; -- (1, 11), (1, 12) expected

whereClause

whereClause allows to filter the data that is coming from fromClause of SELECT . It must contain an expression with the UInt8 type. This is usually an expression with comparison and logical operators.
Syntax

WHERE [columnExpr]

Example

CREATE TABLE test.whereClause(id UInt32, val UInt32) ENGINE = CnchMergeTree() ORDER BY id;
INSERT INTO test.whereClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT * FROM test.whereClause WHERE val>10; -- (1, 11), (1, 12), (2, 20), (2, 21) expected

fromClause

The fromClause specifies the source to read data from:

  • Table

  • Subquery

  • Table Function

Syntax

Read data from table:

FROM [tableIdentifier] [FINAL] [sampleClause] 

Read data from subquery:

FROM ([selectUnionStmt]) [FINAL] [sampleClause]

Read data from table function:

FROM tableFunctionExpr [FINAL] [sampleClause]
  • FINAL : When FINAL is specified, ByteHouse fully merges the data before returning the result and thus performs all data transformations that happen during merges.

Example

  1. Select from table
CREATE TABLE test.fromClause (id UInt32, val UInt32) ENGINE=CnchMergeTree() ORDER BY id;
INSERT INTO test.fromClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT * FROM test.fromClause; -- expect 5 rows 
  1. Select from subquery
CREATE TABLE test.fromClause (id UInt32, val UInt32) ENGINE=CnchMergeTree() ORDER BY id;
INSERT INTO test.fromClause VALUES (1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
SELECT * FROM (SELECT * FROM test.fromClause LIMIT 3); -- expect 3 rows 
  1. Select from tableFunctionExpr
SELECT * FROM numbers(10); -- expect 10 rows 

We use table function numbers here to generate a table with 10 rows.

groupByClause

groupByClause switches the SELECT query into an aggregation mode. [columnExprList] under groupByClause acts as grouping key and result of aggregating SELECT query will contain as many rows as there were unique values of grouping key in source table.
Syntax

GROUP BY [(][columnExprList][)] [WITH ROLLUP|WITH CUBE] [WITH TOTALS]
  • NULL processing : ByteHouse interprets NULL as a value. If group by keys contains NULL value, it will appear in the result.

  • [WITH ROLLUP] : calculate subtotals for the key expressions, based on their order in the GROUP BY list.

    • The subtotals rows are added after the result table.

    • In the subtotals rows the values of already "grouped" key expressions are set to 0 or empty line.

  • [WITH CUBE] : calculate subtotals for every combination of the key expressions in the GROUP BY list.

  • [WITH TOTALS] : calculate subtotals for a combination of all key expressions in the GROUP BY list.

Example

  1. Group by key contains NULL value.
CREATE TABLE IF NOT EXISTS test.groupByClause (x UInt32, y Nullable(UInt32)) ENGINE=CnchMergeTree ORDER BY x;
INSERT INTO test.groupByClause VALUES (1,2),(2,NULL),(3,2),(3,3),(3,NULL);
SELECT sum(x), y FROM test.groupByClause GROUP BY y; 
sum(x)y
42
33
5NULL
  1. Group by WITH ROLLUP modifier
CREATE TABLE IF NOT EXISTS test.groupByClause (year UInt32, month UInt32, day UInt32) ENGINE=CnchMergeTree ORDER BY year;
INSERT INTO test.groupByClause VALUES (2019,1,5),(2019,1,15),(2020,1,5),(2020,1,15),(2021,1,5),(2021,1,15);
SELECT year, month, day, count() FROM test.groupByClause GROUP BY year, month, day WITH ROLLUP;

As GROUP BY section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left:

  • GROUP BY year, month, day ;

  • GROUP BY year, month (and day column is filled with zeros);

  • GROUP BY year (now month, day columns are both filled with zeros);

  • and totals (and all three key expression columns are zeros).

  1. Group by WITH CUBE modifier
CREATE TABLE IF NOT EXISTS test.groupByClause (year UInt32, month UInt32, day UInt32) ENGINE=CnchMergeTree ORDER BY year;
INSERT INTO test.groupByClause VALUES (2019,1,5),(2019,1,15),(2020,1,5),(2020,1,15),(2021,1,5),(2021,1,15);
SELECT year, month, day, count() FROM test.groupByClause GROUP BY year, month, day WITH CUBE;

As GROUP BY section has three key expressions, the result contains eight tables with subtotals for all key expression combinations:

  • GROUP BY year, month, day

  • GROUP BY year, month

  • GROUP BY year, day

  • GROUP BY year

  • GROUP BY month, day

  • GROUP BY month

  • GROUP BY day

  • and totals. (and all three key expression columns are zeros).

  1. Group by WITH TOTAL modifier
CREATE TABLE IF NOT EXISTS test.groupByClause (year UInt32, month UInt32, day UInt32) ENGINE=CnchMergeTree ORDER BY year;
INSERT INTO test.groupByClause VALUES (2019,1,5),(2019,1,15),(2020,1,5),(2020,1,15),(2021,1,5),(2021,1,15);
SELECT year, month, day, count() FROM test.groupByClause GROUP BY year, month, day WITH TOTALS;

If the WITH TOTALS modifier is specified, another row will be calculated.

  • GROUP BY year, month, day

  • and totals. (and all three key expression columns are zeros).

subqueryClause

Syntax

AS [selectUnionStmt]

sampleClause

The sampleClause allows for approximated SELECT query processing. With sampleClause enabled, query is not performed on all the data, but only on a certain fraction of data. This is useful when:

  • Have strict timing requirements (like <100ms) but you can’t justify the cost of additional hardware resources to meet them.

  • Raw data is not accurate, so approximation does not noticeably degrade the quality.

  • Business requirements target approximate results (for cost-effectiveness, or to market exact results to premium users).

Note: To use sampling, you must declare sampling expression was specified during CnchMergeTree table creation, see [sampleByClause].
Syntax

Sample K. K is the number from 0 to 1.

SAMPLE K

Sample N. N is integer larger than 1.

SAMPLE N

SAMPLE K OFFSET M. K and M are numbers from 0 to 1.

SAMPLE K OFFSET M

Example

  1. Sample K
CREATE TABLE IF NOT EXISTS test.sampleClause (id UInt32) ENGINE=CnchMergeTree ORDER BY id SAMPLE BY id;
INSERT INTO test.sampleClause SELECT * FROM numbers(1000);
SELECT COUNT() FROM test.sampleClause SAMPLE 0.1; -- 1000 is expected

In this example, 10% of data will be used for approximation.
2. Sample N

CREATE TABLE IF NOT EXISTS test.sampleClause (id UInt32) ENGINE=CnchMergeTree ORDER BY id SAMPLE BY id;
INSERT INTO test.sampleClause SELECT * FROM numbers(1000);
SELECT COUNT() FROM test.sampleClause SAMPLE 2; -- 1000 is expected

In this example 2 rows of data will be used for approximation.
3. SAMPLE K OFFSET M

CREATE TABLE IF NOT EXISTS test.sampleClause (id UInt32) ENGINE=CnchMergeTree ORDER BY id SAMPLE BY id;
INSERT INTO test.sampleClause SELECT * FROM numbers(1000);
SELECT COUNT() FROM test.sampleClause SAMPLE 0.1 OFFSET 0.2;

In this example, 10% of data will be used for approximation after skipping 20% of data.

joinClause

Join produces a new table by combining columns from one or multiple tables by using values common to each.
Syntax

[GLOBAL|LOCAL] [ANY|ALL|ASOF] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN [tableIdentifier] ON|USING [columnExprList]
  • [GLOBAL|LOCAL] :

    - `GLOBAL` , broadcast Join. Broadcast joins cannot be used when joining two large DataFrames. 
      
    - `LOCAL` , local join. Using distribution keys on the join columns can use local join. 
    
  • [ANY|ALL|ASOF] :

    • ANY , If the right table has several matching rows, only the first one found is joined. If the right table has only one matching row, the results of ANY and ALL are the same.

    • ALL , If the right table has several matching rows, ClickHouse creates a Cartesian product from matching rows. This is the normal JOIN behaviour from standard SQL.

    • ASOF , For joining sequences with an uncertain match.

  • [INNER|LEFT|RIGHT|FULL|CROSS] : All standard SQL JOIN types

    • INNER JOIN , only matching rows are returned.

    • LEFT OUTER JOIN , non-matching rows from left table are returned in addition to matching rows.

    • RIGHT OUTER JOIN , non-matching rows from right table are returned in addition to matching rows.

    • FULL OUTER JOIN , non-matching rows from both tables are returned in addition to matching rows.

    • CROSS JOIN , produces cartesian product of whole tables, “join keys” are not specified.

  • ON|USING : Expressions from ON clause and columns from USING clause are called “join keys”.

Example

CREATE TABLE IF NOT EXISTS test.joinClause (number UInt64) ENGINE=CnchMergeTree ORDER BY number;
INSERT INTO test.joinClause SELECT * FROM numbers(10);

SELECT number, joined FROM test.joinClause ANY LEFT JOIN (SELECT number * 2 AS number, number * 10 + 1 AS joined FROM test.joinClause LIMIT 10) js2 USING number LIMIT 10
numberjoined
01
1NULL
221
3NULL
441
5NULL
661
7NULL
881
9NULL

columnExprList

A list of columnExpr seperate by Comma.
Syntax

[columnExpr] [columnExpr]...

columnExpr

A columnExpr is a function, identifier, literal, application of an operator, expression in brackets, subquery, or asterisk. It can also contain an alias.