Analyzer
In ClickHouse version 24.3, the new query analyzer was enabled by default.
You can read more details about how it works here.
Known incompatibilities
Despite fixing a large number of bugs and introducing new optimizations, it also introduces some breaking changes in ClickHouse behaviour. Please read the following changes to determine how to rewrite your queries for the new analyzer.
Invalid queries are no longer optimized
The previous query planning infrastructure applied AST-level optimizations before the query validation step. Optimizations could rewrite the initial query to be valid and executable.
In the new analyzer, query validation takes place before the optimization step. This means that invalid queries which were previously possible to execute, are now unsupported. In such cases, the query must be fixed manually.
Example 1
The following query uses column number in the projection list when only toString(number) is available after the aggregation.
In the old analyzer, GROUP BY toString(number) was optimized into GROUP BY number, making the query valid.
Example 2
The same problem occurs in this query. Column number is used after aggregation with another key.
The previous query analyzer fixed this query by moving the number > 5 filter from the HAVING clause to the WHERE clause.
To fix the query, you should move all conditions that apply to non-aggregated columns to the WHERE section to conform to standard SQL syntax:
CREATE VIEW with an invalid query
The new analyzer always performs type-checking.
Previously, it was possible to create a VIEW with an invalid SELECT query.
It would then fail during the first SELECT or INSERT (in the case of MATERIALIZED VIEW).
It is no longer possible to create a VIEW in this way.
Example
Known incompatibilities of the JOIN clause
JOIN using a column from a projection
An alias from the SELECT list can not be used as a JOIN USING key by default.
A new setting, analyzer_compatibility_join_using_top_level_identifier, when enabled, alters the behavior of JOIN USING to prefer resolving identifiers based on expressions from the projection list of the SELECT query, rather than using the columns from the left table directly.
For example:
With analyzer_compatibility_join_using_top_level_identifier set to true, the join condition is interpreted as t1.a + 1 = t2.b, matching the behavior of the earlier versions.
The result will be 2, 'two'.
When the setting is false, the join condition defaults to t1.b = t2.b, and the query will return 2, 'one'.
If b is not present in t1, the query will fail with an error.
Changes in behavior with JOIN USING and ALIAS/MATERIALIZED columns
In the new analyzer, using * in a JOIN USING query that involves ALIAS or MATERIALIZED columns will include those columns in the result-set by default.
For example:
In the new analyzer, the result of this query will include the payload column along with id from both tables.
In contrast, the previous analyzer would only include these ALIAS columns if specific settings (asterisk_include_alias_columns or asterisk_include_materialized_columns) were enabled,
and the columns might appear in a different order.
To ensure consistent and expected results, especially when migrating old queries to the new analyzer, it is advisable to specify columns explicitly in the SELECT clause rather than using *.
Handling of type modifiers for columns in the USING clause
In the new version of the analyzer, the rules for determining the common supertype for columns specified in the USING clause have been standardized to produce more predictable outcomes,
especially when dealing with type modifiers like LowCardinality and Nullable.
LowCardinality(T)andT: When a column of typeLowCardinality(T)is joined with a column of typeT, the resulting common supertype will beT, effectively discarding theLowCardinalitymodifier.Nullable(T)andT: When a column of typeNullable(T)is joined with a column of typeT, the resulting common supertype will beNullable(T), ensuring that the nullable property is preserved.
For example:
In this query, the common supertype for id is determined as String, discarding the LowCardinality modifier from t1.
Projection column names changes
During projection names computation, aliases are not substituted.
Incompatible function arguments types
In the new analyzer, type inference happens during initial query analysis.
This change means that type checks are done before short-circuit evaluation; thus, the if function arguments must always have a common supertype.
For example, the following query fails with There is no supertype for types Array(UInt8), String because some of them are Array and some of them are not:
Heterogeneous clusters
The new analyzer significantly changes the communication protocol between servers in the cluster. Thus, it's impossible to run distributed queries on servers with different enable_analyzer setting values.
Mutations are interpreted by previous analyzer
Mutations are still using the old analyzer.
This means some new ClickHouse SQL features can't be used in mutations. For example, the QUALIFY clause.
The status can be checked here.
Unsupported features
The list of features that the new analyzer currently doesn't support is given below:
- Annoy index.
- Hypothesis index. Work in progress here.
- Window view is not supported. There are no plans to support it in the future.
Cloud Migration
We are enabling the new query analyzer on all instances where it is currently disabled to support new functional and performance optimizations. This change enforces stricter SQL scoping rules, requiring customers to manually update non-compliant queries.
Migration workflow
- Identify the query by filtering
system.query_logusing thenormalized_query_hash:
- Run the query with the analyzer enabled by adding these settings.
- Refactor and verify the query results to ensure they match the output generated when the analyzer is disabled.
Please refer to the most frequent incompatibilities encountered during internal testing.
Unknown expression identifier
Error: Unknown expression identifier ... in scope ... (UNKNOWN_IDENTIFIER). Exception code: 47
Cause: Queries that rely on non-standard, permissive legacy behaviors such as referencing calculated aliases in filters, ambiguous subquery projections, or "dynamic" CTE scoping are now correctly identified as invalid and rejected immediately.
Solution: Update your SQL patterns as follows:
- Filter logic: Move logic from WHERE to HAVING if filtering on results, or duplicate the expression in WHERE if filtering on source data.
- Subquery scope: Explicitly select all columns needed by the outer query.
- JOIN keys: Use ON with full expressions instead of USING if the key is an alias.
- In outer queries, refer to the alias of the Subquery/CTE itself, not the tables inside it.
Non-Aggregated Columns in GROUP BY
Error: Column ... is not under aggregate function and not in GROUP BY keys (NOT_AN_AGGREGATE). Exception code: 215
Cause: The old analyzer allowed selecting columns not present in the GROUP BY clause (often picking an arbitrary value). The new analyzer adheres to standard SQL: every selected column must be either an aggregate or a grouping key.
Solution: Wrap the column in any(), argMax(), or add it to the GROUP BY.
Duplicate CTE names
Error: CTE with name ... already exists (MULTIPLE_EXPRESSIONS_FOR_ALIAS). Exception code: 179
Cause: The old analyzer permitted defining multiple Common Table Expressions (WITH ...) with the same name shadowing the earlier one. The new analyzer forbids this ambiguity.
Solution: Rename duplicate CTEs to be unique.
Ambiguous column identifiers
Error: JOIN [JOIN TYPE] ambiguous identifier ... (AMBIGUOUS_IDENTIFIER) Exception code: 207
Cause: The query references a column name present in multiple tables within a JOIN without specifying the source table. The old analyzer often guessed the column based on internal logic, the new analyzer requires explicit name.
Solution: Fully qualify the column with table_alias.column_name.
Invalid usage of FINAL
Error: Table expression modifiers FINAL are not supported for subquery... or Storage ... doesn't support FINAL (UNSUPPORTED_METHOD). Exception codes: 1, 181
Cause: FINAL is a modifier for table storage (specifically [Shared]ReplacingMergeTree). The new analyzer rejects FINAL when applied to:
- Subqueries or derived tables (e.g., FROM (SELECT ...) FINAL).
- Table engines that do not support it (e.g., SharedMergeTree).
Solution: Apply FINAL only to the source table inside the subquery, or remove it if the engine does not support it.
countDistinct() function case-insensitivity
Error: Function with name countdistinct does not exist (UNKNOWN_FUNCTION). Exception code: 46
Cause: Function names are case-sensitive or strictly mapped in the new analyzer. countdistinct (all lowercase) is no longer resolved automatically.
Solution: Use the standard countDistinct (camelCase) or the ClickHouse specific uniq.