当前支持的 SQL 语义和 ClickHouse 社区版比较类似,但还是建议参考手册中的示例进行使用。文中的一些示例和内容参考了[社区文档](🔗)进行修改,来确保可以在 ByteHouse 中正常使用。

Note

Symbols between token [] are optional. For example [IF NOT EXISTS] Token | means or Token ... means can repeat more times

## Alter Statement

### ADD COLUMN

Adds a new column to the table. ** Syntax**


  • `IF NOT EXISTS` clause is included, the query won’t return an error if the column already exists.

  • `AFTER name_after` (the name of another column), the column is added after the specified one in the list of table columns.

**Example**



### MODIFY COLUMN

** Syntax**

Modify Column



If the modify column is:

  • Order by column: can modify its default_expr, comment, codec.

  • Partition by column: can modify its comment.

  • Normal column: can modify itstype, default_expr, comment, codec.

**Example**



### DROP COLUMN

Deletes a column in the table. Partition by columns and order by columns are not allowed to drop. ** Syntax**


  • `IF EXISTS` clause is specified, the query won’t return an error if the column does not exist.

**Example**



### RENAME COLUMN

Rename a column in the table. There are some usage notes:

  • Cannot rename column to an existing column.

  • Cannot rename column to itself.

  • Rename from column must exists.

** Syntax**


  • `IF EXISTS` clause is specified, the query won’t return an error if the column does not exist.

**Example**



### ADD CONSTRAINT

Adds a constraint to the table. ** Syntax**


  • `IF NOT EXISTS` clause is included, the query won’t return an error if the constraint already exists.

**Example**



### DROP CONSTRAINT

Deletes a constraint to the table. ** Syntax**


  • `IF EXISTS` clause is specified, the query won’t return an error if the constraint does not exist.

**Example**



### MODIFYTTL

Change table TTL. Columns used in ttl Clause must be in partition by columns. ** Syntax**



**Example**



### REMOVETTL

Remove table TTL. ** Syntax**



**Example**



###

## AST Statement

Shows the execution plan of a statement. Dumps the query AST. ** Syntax**



**Output**

**Column****Description**
explainThe parsedASTof the statement

**Example**



## Create Statement

### CREATE DATABASE

Creates a new database ** Syntax**


  • `IF NOT EXISTS` : query won’t return an error if the database already exists.

**Example**

  1. Create database with default database engine.


  1. Create database with optional engine clause. Currently only Cnch engine is allowed.



### CREATE TABLE

Creates a new table.

#### Create Table With Explicit Schema

** Syntax**


  • `IF NOT EXISTS` : query won’t return an error if the table already exists.

  • `UUID` : table will have an uuid provided by user. Otherwise, a generated uuid will be used.

  • `[CONSTRAINT constraint_name CHECK columnExpr]` : add a constraint to table.

    • columnExpr after `CHECK` should be a boolean expression.

    • If constraints are checked for every row in `INSERT` query. Exception will be raised if any constraint is not satisfied.

    • Adding large amount of constraints can negatively affect performance of big `INSERT` queries.

**Example**

  1. Create Table



### CREATE VIEW

#### CREATE NORMAL VIEW

Normal views don’t store any data. They just perform a read from another table on each access. In other words, a normal view is nothing more than a saved query. When reading from a view, this saved query is used as a subquery in the FROM clause. ** Syntax**


  • `OR REPLACE` is included, ByteHouse will create a new view and replace the old view with the same name.

  • `IF NOT EXISTS` clause is included, the query won’t return an error if the table already exists.

  • `UUID` is specified, table will have an uuid provided by user. Otherwise, a generated uuid will be used.

**Example**



## Describe Statement

Returns the columns definition of a table ** Syntax**



**Example**


**N** **ame****Type****DefaultType****DefaultExpression****Comment****CodecExpression****TTLExpression****LastQueriedAt****LastQueriedBy**
order_by_columnString







## Explain Statement

### EXPLAIN SYNTAX

checks the syntax validity. ** Syntax**



**Example**


**syntax_correct****syntax_message****has_join****has_asterisk**
1
00

## Drop Statement

### DROP DATABASE

Deletes the Database. ** Syntax**



**Example**



### DROP TABLE

Deletes the table. ** Syntax**



**Example**



## Insert Statement

Adds data into a table.

### INSERT VALUES

Data can be inserted with this basic query format. ** Syntax**



**Example**



### INSERT FORMAT

Data can be passed to the INSERT in aformatsupported by ByteHouse. ** Syntax**



**Example**



### INSERT SELECT

Inserts the results of a SELECT query. Columns are mapped according to their position in the SELECT clause. However, their names in the SELECT expression and the table for INSERT may differ. If necessary, typecasting is performed. ** Syntax**



**Example**



### INSERT INFILE

Insert data to a table from a file. gateway-client only Supported file format:

  • .csv

  • .json

  • .avro

  • .parquet

** Syntax**



**Example**



## Rename Statement

### RENAME TABLE

Renames one or more tables. ** Syntax**



**Example**



## Select Union Statement

SELECT query, possibly with UNION ALL. ** Syntax**



**Example**


**column1****column2**
2015-08-071
**column1****column2**
2015-08-082

## Select Statement

`SELECT` queries perform data retrieval from tables. ** Syntax**


  • `DISTINCT` , only unique rows will remain in a query result. It works with NULL as if `NULL` were a specific value, and `NULL==NULL` .

  • `Asterisk symbol (*)` , An asterisk can be put in any part of a query instead of an expression. When the query is analyzed, the asterisk is expanded to a list of all table columns (excluding the `MATERIALIZED` and `ALIAS` columns).

## Set Statement

Set a list of settings for the current session. ** Syntax**



**Example**



## Show Statement

### SHOW DATABASES

Prints a list of all databases. The command does not require a running warehouse to execute. ** Syntax**



**Example**s


**Name****CreatedAt****CreatedBy****UpdatedAt****UpdatedBy****LastQueriedAt****LastQueriedBy****Comments****Engine**
db_name1162736652512345616273665251234561627366525123456comment here
db_name2162736652512345616273665251234561627366525123456comment here

### SHOW TABLES

Displays a list of tables. The command does not require a running warehouse to execute. ** Syntax**



If the `FROM` clause is not specified, the query returns the list of tables from the current database. **Example**


**Name****CreatedAt****CreatedBy****UpdatedAt****UpdatedBy****LastQueriedAt****LastQueriedBy****Type****HasUniqueKey**
sells_table162736652512345616273665251234561627366525123456TABLE0
view_table162736652512345616273665251234561627366525123456VIEW0

### SHOW CREATE TABLE

Display table create sql ** Syntax**



**Example**


Query
`CREATE TABLE `example_db`.`example_table`COMMENT '',(, `order_by_column` String,),ENGINE=`CnchMergeTree`,PRIMARY KEY `order_by_column`,ORDER BY `order_by_column`;`

## Truncate Statement

Removes all data from a table. ** Syntax**



**Example**



## Use Statement

Sets the current database for the session. The current database is used for searching for tables if the database is not explicitly defined in the query with a dot before the table name. Note: This query can’t be made when using the HTTP protocol, since there is no concept of a session.

**Syntax**



**Example**



## Database Identifier

**Syntax**


  • database_name: String. Name of the database.

## Table Identifier

**Syntax**


  • database_name: String. Name of the database.

  • table_name: String. Name of the table.

  • If database_name is not provided, current database will be used.

## Table Column Definition

Column definition

**Syntax**


  • If column_type is not explicit declared, [tableColumnPropertyExpr] must be provided for column type inference.

**Example**

  1. Column with explicit type


  1. Column with inexplicit type but type can be inferred from [tableColumnPropertyExpr] .


  1. Column with comment


  1. Column with codec



## Table Column Property Expression

Column properties.

**Syntax**


  • `DEFAULT columnExpr` : Normal default value. If the INSERT query doesn’t specify the corresponding column, it will be filled in by computing the corresponding expression.

**Example**

  1. Column with default expression



## Setting Expression List

**Syntax**

settingExprList



settingExpr


  • setting_name: String. Name of the setting.

  • setting_val: number or string. Value of the setting.

## EngineClause

**Syntax**


  • For database engine, engine_name can only be Cnch.

  • For table engine, engine_name can only be CnchMergeTree.

## orderByClause

**Syntax**

orderByClause



orderExpr


  • `[ASCENDING|ASC|DESCENDING|DESC]` : determines the sorting direction. If the direction is not specified, `ASC` is assumed.

  • `[NULLS [FIRST|LAST]` : determines `NaN` and `NULL` sorting order.

    • By default or with the `NULLS LAST` modifier: first the values, then `NaN` , then `NULL` .

    • With the `NULLS FIRST` modifier: first `NULL` , then `NaN` , then other values.

  • `[COLLATE STRING_LITERAL]` : For sorting by String values, you can specify collation.

    • Collate is supported in LowCardinality, Nullable, Array and Tuple

    • When using `COLLATE` , sorting is always case-insensitive.

    • only recommend using `COLLATE` for final sorting of a small number of rows, since sorting with `COLLATE` is less efficient than normal sorting by bytes.

**Example**

  1. Order by multiple orderExpr


  1. Order by NULLS FIRST


  1. Order by with COLLATE



## partitionByClause

Used by engineClause to define the partition key. Partition key can be any expression from the table columns. ** Syntax**



**Example**

  1. Partition key defined by a columnExpr



In this example, records will be partitioned by the VisitDate.

  1. Partition key defined in a tuple of columnExpr



In this example, records will be partitioned by the hour of the visitDate.

## primaryKeyClause

Used by engineClause to define the primary key if it differs from order by key. By default the primary key is the same as the order by key. Thus in most cases it is unnecessary to specify a separate primaryKeyClause. ** Syntax**



**Example**



## uniqueKeyClause

Used by engineClause to define the unique key. If specified, the table creates a partition-level constraint that all unique key values in the same partition must be distinct. If you try to insert a row with a unique key that matches an existing row, it will first delete the old row and then insert the new row. ** Syntax**



Note that only expressions with the following data types can be used in UNIQUE KEY

  • Primary data type: `[U]Int8/16/32/64, Boolean, Date, DateTime, String`

  • Composite date type: `Tuple` composed of elements in supported primary data types

**Example**

  1. Unique key defined by a single column


  1. Composite unique key



## sampleByClause

Used by engineClause to define the sample key. The sampling expression must contain primary key and result of sampling expression must be unsigned integer. ** Syntax**



**Example**



## ttlClause

Expression to specify storage duration of rows.

  • columnExpr return results must have one `Date` or `DateTime` column.

  • Columns used in ttlClause must be in partition by columns.

** Syntax**



**Example**



## settingsClause

**Syntax**



## columnsClause

Used by insertStmt to represent a list of columns. **Syntax**



## withClause

Bytehouse supports Common Table Expressions( [CTE](🔗) ). The results of WITH clause can be used in the remaining `SELECT` query. There are certain limitations to the support, including

  • Recursions are not allowed in CTE

  • Subqueries are not allowed in CTE

We only support below syntax.

** Syntax**



**Example**

  1. Using constant expression as “variable”


  1. Using results of a scalar subquery (1 row)



## limitClause

** Syntax**

select the first `m` rows from the result.



select the `m` rows from the result after skipping the first `n` rows.


  • `n` and `m` must be non-negative integers.

  • If there is no orderByClause that explicitly sorts results, the choice of rows for the result may be arbitrary and non-deterministic.

**Example**

  1. Example of Limit m.


  1. Example of Limit n,m



## limitByClause

select the `m` rows for each distinct value of `expressions` ** Syntax**

select the first `m` rows from the result for each distinct value of columnExpr.



**Example**

  1. Example of Limit m by columnExpr



In this example, for each id (include value 1 and 2), we need to return 2 rows.

## havingClause

Filtering the aggregation results produced by groupByClause. It is similar to the whereClause, but the difference is that `WHERE` is performed before aggregation, while `HAVING` is performed after it. ** Syntax**



**Example**



## arrayJoinClause

For table contains array column, array join can produce a new table that has a column with each individual array element of that initial column, while values of other columns are duplicated ** Syntax**


  • You can specify only one arrayJoinClause in a `SELECT` query.

  • `[LEFT] ARRAY JOIN` : types of ARRAY JOIN

    • `ARRAY JOIN` - In base case, empty arrays are not included in the result of `JOIN` .

    • `LEFT ARRAY JOIN` - The result of `JOIN` contains rows with empty arrays. The value for an empty array is set to the default value for the array element type (usually 0, empty string or NULL).

**Example**



## prewhereClause

Prewhere is an optimization to apply filtering more efficiently. It is enabled by default even if `PREWHERE` clause is not specified explicitly. It works by automatically moving part of WHERE condition to prewhere stage. The role of `PREWHERE` clause is only to control this optimization if you think that you know how to do it better than it happens by default. With prewhere optimization, at first only the columns necessary for executing prewhere expression are read. Then the other columns are read that are needed for running the rest of the query, but only those blocks where the prewhere expression is “true” at least for some rows. If there are a lot of blocks where prewhere expression is “false” for all rows and prewhere needs less columns than other parts of query, this often allows to read a lot less data from disk for query execution. ** Syntax**



**Example**



## whereClause

whereClause allows to filter the data that is coming from fromClause of `SELECT` . It must contain an expression with the `UInt8` type. This is usually an expression with comparison and logical operators. ** Syntax**



**Example**



## fromClause

The fromClause specifies the source to read data from:

  • Table

  • Subquery

  • Table Function

** Syntax**

Read data from table:



Read data from subquery:



Read data from table function:


  • `FINAL` : When `FINAL` is specified, ByteHouse fully merges the data before returning the result and thus performs all data transformations that happen during merges.

**Example**

  1. Select from table


  1. Select from subquery


  1. Select from tableFunctionExpr



We use table function numbers here to generate a table with 10 rows.

## groupByClause

`groupByClause` switches the `SELECT` query into an aggregation mode. [columnExprList] under `groupByClause` acts as grouping key and result of aggregating `SELECT` query will contain as many rows as there were unique values of grouping key in source table. ** Syntax**


  • `NULL processing` : ByteHouse interprets NULL as a value. If group by keys contains NULL value, it will appear in the result.

  • `[WITH ROLLUP]` : calculate subtotals for the key expressions, based on their order in the `GROUP BY` list.

    • The subtotals rows are added after the result table.

    • In the subtotals rows the values of already "grouped" key expressions are set to `0` or empty line.

  • `[WITH CUBE]` : calculate subtotals for every combination of the key expressions in the `GROUP BY` list.

  • `[WITH TOTALS]` : calculate subtotals for a combination of all key expressions in the `GROUP BY` list.

**Example**

  1. Group by key contains NULL value.


**sum(x)****y**
42
33
5NULL
  1. Group by WITH ROLLUP modifier



As `GROUP BY` section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left:

  • `GROUP BY year, month, day` ;

  • `GROUP BY year, month` (and `day` column is filled with zeros);

  • `GROUP BY year` (now `month, day` columns are both filled with zeros);

  • and totals (and all three key expression columns are zeros).

  1. Group by WITH CUBE modifier



As `GROUP BY` section has three key expressions, the result contains eight tables with subtotals for all key expression combinations:

  • `GROUP BY year, month, day`

  • `GROUP BY year, month`

  • `GROUP BY year, day`

  • `GROUP BY year`

  • `GROUP BY month, day`

  • `GROUP BY month`

  • `GROUP BY day`

  • and totals. (and all three key expression columns are zeros).

  1. Group by WITH TOTAL modifier



If the `WITH TOTALS` modifier is specified, another row will be calculated.

  • `GROUP BY year, month, day`

  • and totals. (and all three key expression columns are zeros).

## subqueryClause

**Syntax**



## sampleClause

The sampleClause allows for approximated `SELECT` query processing. With sampleClause enabled, query is not performed on all the data, but only on a certain fraction of data. This is useful when:

  • Have strict timing requirements (like \<100ms) but you can’t justify the cost of additional hardware resources to meet them.

  • Raw data is not accurate, so approximation does not noticeably degrade the quality.

  • Business requirements target approximate results (for cost-effectiveness, or to market exact results to premium users).

Note: To use sampling, you must declare sampling expression was specified during CnchMergeTree table creation, see [sampleByClause]. **Syntax**

Sample K. K is the number from 0 to 1.



Sample N. N is integer larger than 1.



SAMPLE K OFFSET M. K and M are numbers from 0 to 1.



**Example**

  1. Sample K



In this example, 10% of data will be used for approximation.

  1. Sample N



In this example 2 rows of data will be used for approximation.

  1. SAMPLE K OFFSET M



In this example, 10% of data will be used for approximation after skipping 20% of data.

## joinClause

Join produces a new table by combining columns from one or multiple tables by using values common to each. ** Syntax**


  • `[GLOBAL|LOCAL]` :

    
  • `[ANY|ALL|ASOF]` :

    • `ANY` , If the right table has several matching rows, only the first one found is joined. If the right table has only one matching row, the results of ANY and ALL are the same.

    • `ALL` , If the right table has several matching rows, ClickHouse creates a Cartesian product from matching rows. This is the normal JOIN behaviour from standard SQL.

    • `ASOF` , For joining sequences with an uncertain match.

  • `[INNER|LEFT|RIGHT|FULL|CROSS]` : All standard [SQL JOIN](🔗) types

    • `INNER JOIN` , only matching rows are returned.

    • `LEFT OUTER JOIN` , non-matching rows from left table are returned in addition to matching rows.

    • `RIGHT OUTER JOIN` , non-matching rows from right table are returned in addition to matching rows.

    • `FULL OUTER JOIN` , non-matching rows from both tables are returned in addition to matching rows.

    • `CROSS JOIN` , produces cartesian product of whole tables, “join keys” are not specified.

  • `ON|USING` : Expressions from `ON` clause and columns from `USING` clause are called “join keys”.

**Example**


**number****joined**
01
1NULL
221
3NULL
441
5NULL
661
7NULL
881
9NULL

## columnExprList

A list of columnExpr seperate by Comma. ** Syntax**



## columnExpr

A columnExpr is a function, identifier, literal, application of an operator, expression in brackets, subquery, or asterisk. It can also contain an alias.